Recovery from Failures
- System-Level
DLI uses an architecture that separates storage from compute resources. If a system fault occurs, a compute cluster is recovered automatically through the Kubernetes resource scheduling and failover mechanisms.
- Job-Level
You can enable automatic restart and recovery for Flink and Spark jobs. Once this function is enabled, a job is automatically restarted and recovered if an exception occurs.
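To illustrate the general idea behind job-level restart and recovery, the following minimal Python sketch restarts a failed job and resumes it from its last saved progress. It is not DLI's implementation and does not use any DLI, Flink, or Spark API: the names `run_with_restart`, `make_flaky_job`, `TransientJobError`, and the `state` dictionary are all hypothetical and exist only for this example.

```python
import time


class TransientJobError(Exception):
    """Hypothetical exception type standing in for a recoverable job failure."""


def run_with_restart(job, state, max_restarts=3, delay_seconds=0.0):
    """Run job(state); on a transient failure, restart it with the same state.

    Because the job records its progress in `state`, each restart resumes
    from the last saved position instead of starting over.
    Returns (result, number_of_restarts).
    """
    restarts = 0
    while True:
        try:
            return job(state), restarts
        except TransientJobError:
            if restarts >= max_restarts:
                raise  # give up once the restart budget is used
            restarts += 1
            time.sleep(delay_seconds)


def make_flaky_job(records, fail_at):
    """Build a job that sums records and fails once at index `fail_at`."""
    failed = {"done": False}

    def job(state):
        for i in range(state["offset"], len(records)):
            if i == fail_at and not failed["done"]:
                failed["done"] = True
                raise TransientJobError(f"simulated failure at record {i}")
            state["total"] += records[i]
            state["offset"] = i + 1  # save progress after each record
        return state["total"]

    return job


if __name__ == "__main__":
    state = {"offset": 0, "total": 0}
    job = make_flaky_job([1, 2, 3, 4, 5], fail_at=3)
    total, restarts = run_with_restart(job, state)
    print(f"total={total}, restarts={restarts}")
```

In this sketch the saved `state` plays the role of a checkpoint: the job fails once partway through, and the restarted run continues from the saved offset, so no record is counted twice.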