How Do I Restore the Latest tablestatus File That Has Been Lost or Damaged When TableStatus Versioning Is Enabled?
Question
When the TableStatus versioning feature is enabled, how do I restore the latest tablestatus file if it is lost or damaged because of an exception?
Answer
Use the latest available tablestatus file to restore data in the following scenarios:
Scenario 1: The CarbonData data files and .segment files of the current batch are damaged and cannot be restored.
- Log in to the client node and run the following commands to list the tablestatus files of the table on HDFS and find the latest available tablestatus version number:
cd Client installation path
source bigdata_env
source Spark/component_env
kinit Component service user (You do not need to run the kinit command for normal clusters.)
hdfs dfs -ls /user/hive/warehouse/hrdb.db/car01/Metadata
In the example output, the tablestatus_1669028899548 file of the current batch is damaged, so the previous file, tablestatus_1669028852132, is required.
- Go to Spark SQL and run the following command to set latestversion to the latest available version found in the previous step:
alter table car01 set SERDEPROPERTIES ('latestversion'='1669028852132');
Exit the current session, reconnect, and then run the query again. This method restores as much data as possible. In power-off scenarios, segment data files in production environments generally cannot be restored.
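The two steps of Scenario 1 can be sketched as small shell helpers. This is only an illustration: the function names tablestatus_versions and latestversion_sql are hypothetical, and the sketch assumes that versioned files are named tablestatus_<timestamp>, as in the example above. The helpers only read text and print text; they do not change anything on HDFS.

```shell
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print the
# tablestatus version numbers, newest first.
# Assumes versioned files are named tablestatus_<digits>.
tablestatus_versions() {
  awk '{ print $NF }' |                                   # last column is the file path
    sed -n 's#^.*/tablestatus_\([0-9][0-9]*\)$#\1#p' |    # keep only the numeric suffix
    sort -rn                                              # largest timestamp first
}

# Hypothetical helper: print the ALTER TABLE statement for a table and version.
latestversion_sql() {
  table=$1
  version=$2
  # Reject anything that is not a plain number to avoid a malformed statement.
  case $version in
    ''|*[!0-9]*) echo "version must be numeric: $version" >&2; return 1 ;;
  esac
  printf "alter table %s set SERDEPROPERTIES ('latestversion'='%s');\n" "$table" "$version"
}

# Usage on a client node (commented out because it requires HDFS access):
#   hdfs dfs -ls /user/hive/warehouse/hrdb.db/car01/Metadata | tablestatus_versions
#   latestversion_sql car01 1669028852132
```

The first number printed by tablestatus_versions is the newest file. If that file is the damaged one, the next number in the list is the version to pass to latestversion_sql.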
Scenario 2: The CarbonData data files and .segment files of the current batch are complete and can be restored.
Use the TableStatusRecovery tool to restore non-partitioned tables. Log in to the Spark client node and run the following commands:
cd Client installation path
source bigdata_env
source Spark/component_env
kinit Component service user (You do not need to run the kinit command for normal clusters.)
spark-submit --master yarn --class org.apache.carbondata.recovery.tablestatus.TableStatusRecovery Spark/spark/carbonlib/carbondata-spark_*.jar hrdb car01
Parameter description: hrdb is the database name and car01 is the table name.
Restrictions on using TableStatusRecovery for restoration:
- After segments are merged, if the tablestatus file is lost or damaged, this tool cannot restore the merged segments, because the segment merge information is stored only in the tablestatus file.
- After segments are deleted by ID or date, if the tablestatus file is lost or damaged, the deleted segment information cannot be restored because only the tablestatus file contains the segment deletion information.
- This tool cannot be used on materialized view tables.
- If the latest tablestatus file is faulty and queries still fail after this tool is used for restoration, remove the latest file and use the previous tablestatus file for restoration.
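For the last restriction, the following sketch shows one cautious way to take the faulty latest file out of the Metadata directory. It is an assumption, not the documented procedure: the function name print_move_latest_tablestatus and the backup directory are hypothetical, and the file is moved to a backup location rather than deleted. The helper only prints the hdfs dfs -mv command so that it can be reviewed before it is run.

```shell
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print (not run)
# an `hdfs dfs -mv` command that moves the newest tablestatus_<digits> file
# into a backup directory given as the first argument.
# Moving instead of deleting is an assumption made for safety.
print_move_latest_tablestatus() {
  backup_dir=$1
  if [ -z "$backup_dir" ]; then
    echo "usage: print_move_latest_tablestatus <backup dir>" >&2
    return 1
  fi
  latest=$(awk '{ print $NF }' |                    # last column is the file path
    grep -E '/tablestatus_[0-9]+$' |                # versioned tablestatus files only
    awk -F'tablestatus_' '{ print $NF, $0 }' |      # prefix each path with its version
    sort -rn | head -n 1 | cut -d' ' -f2-)          # keep the path of the newest one
  if [ -z "$latest" ]; then
    echo "no versioned tablestatus file found" >&2
    return 1
  fi
  printf 'hdfs dfs -mv %s %s/\n' "$latest" "$backup_dir"
}

# Usage on a client node (commented out because it requires HDFS access;
# /tmp/tablestatus_backup is a hypothetical directory that must already exist):
#   hdfs dfs -ls /user/hive/warehouse/hrdb.db/car01/Metadata \
#     | print_move_latest_tablestatus /tmp/tablestatus_backup
```

After the printed command has been reviewed and run, the previous tablestatus file becomes the newest one in the Metadata directory.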