Task Failed Due to Concurrent Writes to One Table or Partition
Symptom
When Hive executes an INSERT statement, the task fails with an error indicating that a file or directory in HDFS already exists or has been cleared. The error details are as follows:
Cause Analysis
- Check the HiveServer audit logs to determine the start time and end time of the failed task.
- Check whether any other task inserted data into the same table or partition during that time window.
- Hive does not support concurrent data insertion into the same table or partition. When inserts overlap, multiple tasks operate on the same temporary data directory, and one task moves the data of another task, causing the other task to fail.
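The overlap check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the task start and end times have already been extracted from the HiveServer audit logs by hand or by a separate script, and the tuple layout, the `find_overlapping_inserts` name, and the `"db.table/partition"` target strings are hypothetical, not part of Hive.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations


def find_overlapping_inserts(tasks):
    """Return (target, task_a, task_b) for tasks that overlapped on one target.

    tasks: iterable of (task_id, target, start, end) tuples. `target` is a
    hypothetical label such as "db.table/partition"; start/end are datetimes
    taken from the audit logs.
    """
    # Group the tasks by the table or partition they wrote to.
    by_target = defaultdict(list)
    for task_id, target, start, end in tasks:
        by_target[target].append((task_id, start, end))

    conflicts = []
    for target, entries in by_target.items():
        for (id_a, start_a, end_a), (id_b, start_b, end_b) in combinations(entries, 2):
            # Two time windows overlap when each starts before the other ends.
            if start_a < end_b and start_b < end_a:
                conflicts.append((target, id_a, id_b))
    return conflicts


if __name__ == "__main__":
    # Illustrative data only; real values come from the audit logs.
    sample = [
        ("job-1", "sales.orders/dt=2024-01-01", datetime(2024, 1, 2, 1, 0), datetime(2024, 1, 2, 1, 20)),
        ("job-2", "sales.orders/dt=2024-01-01", datetime(2024, 1, 2, 1, 10), datetime(2024, 1, 2, 1, 30)),
    ]
    for conflict in find_overlapping_inserts(sample):
        print(conflict)
```

Any pair reported for the same target is a candidate for the concurrent-write failure described in this section.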
Solution
Modify the service logic so that data is inserted into the same table or partition in single-thread mode, that is, only one INSERT task writes to a given table or partition at a time.
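One possible way to enforce this on the client side is a per-target lock, sketched below in Python. It is a minimal sketch under stated assumptions: all INSERT statements are submitted from a single client process, `run_insert_serialized` and the target strings are hypothetical names, and `execute_insert` stands for whatever function the service already uses to submit the statement to Hive. If several processes or hosts submit inserts, an in-process lock is not enough and the serialization has to be done by a scheduler or another external coordination mechanism.

```python
import threading
from collections import defaultdict

# Guards access to the lock registry itself.
_registry_guard = threading.Lock()
# One lock per table or partition; created on first use.
_target_locks = defaultdict(threading.Lock)


def run_insert_serialized(target, execute_insert):
    """Run execute_insert() while holding the lock for `target`.

    target: hypothetical label for the table or partition being written,
            for example "db.table/partition".
    execute_insert: zero-argument callable that submits the INSERT statement
            (assumed to be provided by the calling service).
    """
    with _registry_guard:
        lock = _target_locks[target]
    # Inserts into the same target wait here; different targets do not block
    # each other.
    with lock:
        return execute_insert()
```

Inserts into different tables or partitions still run in parallel, so only the conflicting writes are serialized.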