Data in ro and rt Tables Cannot Be Synchronized to a MOR Table Recreated After Being Deleted Using Spark SQL
Question
After a MOR table is deleted using Spark SQL and then recreated, data in ro and rt tables cannot be synchronized to the MOR table in real time. The following warning is logged:
WARN HiveSyncTool: Got runtime exception when hive syncing, but continuing as ignoreExceptions config is set
java.lang.IllegalArgumentException: Failed to get schema for table hudi_table2_ro does not exist
    at org.apache.hudi.hive.HoodieHiveClient.getTableSchema(HoodieHiveClient.java:183)
    at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:286)
    at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:213)
Answer
Cause:
To reduce access to Hive Metastore, a cache mechanism is added for Hudi tables, and cached data is kept for 1 hour by default. If a MOR table is deleted using Spark SQL and recreated within that period, the cached data is still in effect, so data in ro and rt tables cannot be synchronized to the MOR table in real time.
Solution:
Set hoodie.datasource.hive_sync.interval to 0.
set hoodie.datasource.hive_sync.interval=0;
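The following Spark SQL sequence is a minimal sketch of where the setting fits when a MOR table is dropped and recreated. The table name hudi_table2 is inferred from the error message above; the column definitions and the inserted row are hypothetical, and the sketch assumes that Hive sync is enabled for the session and that the parameter is set in the same session that writes to the table.

```sql
-- Assumption: run in the same Spark SQL session that writes to the table.
-- Set the interval to 0, as described in the solution.
set hoodie.datasource.hive_sync.interval=0;

-- Drop and recreate the MOR table (column definitions are hypothetical).
drop table if exists hudi_table2;
create table hudi_table2 (
  id int,
  name string,
  price double,
  ts bigint
) using hudi
tblproperties (
  type = 'mor',
  primaryKey = 'id',
  preCombineField = 'ts'
);

-- Write a row so that a Hive sync is triggered (assuming sync is enabled).
insert into hudi_table2 values (1, 'a1', 10.0, 1000);

-- Check whether the ro and rt tables are visible again.
show tables like 'hudi_table2*';
```

If the ro and rt tables still do not appear after the write, check that the parameter took effect in the current session by running set hoodie.datasource.hive_sync.interval; without a value.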