Updated on 2022-09-14 GMT+08:00

Restrictions on Restoring a Spark Application from a Checkpoint

Question

A Spark application can be restored from a checkpoint so that it resumes from the point where the previous run stopped, ensuring that no data is lost. In some cases, however, the application cannot be restored from the checkpoint.

Answer

The checkpoint stores the serialized objects, the task execution status, and the configuration of the Spark application. If a class is changed without an explicit serialVersionUID, the objects serialized in the checkpoint no longer match the new class and cannot be deserialized. Therefore, the Spark application cannot be restored from the checkpoint in either of the following cases:

  1. The service code has been changed and the changed class does not declare a serialVersionUID.
  2. An internal Spark class has been changed and the changed class does not declare a serialVersionUID.
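
The first case can be reduced, for service code, by declaring a serialVersionUID explicitly. The following is a minimal sketch using only the JDK, not Spark; the class names OrderEvent and SerialVersionDemo are hypothetical and chosen for illustration. It serializes an object, reads it back, and prints the declared ID. An explicit ID only helps for changes that Java serialization treats as compatible (for example, adding a field); it does not make arbitrary code changes safe to restore.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerialVersionDemo {
    // Serialize an object to bytes, as happens when state is written out.
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialize bytes; this fails with InvalidClassException when the
    // stored serialVersionUID differs from the one of the loaded class.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Return the serialVersionUID that the JVM associates with a class.
    static long declaredId(Class<?> cls) {
        return ObjectStreamClass.lookup(cls).getSerialVersionUID();
    }

    public static void main(String[] args) {
        OrderEvent restored = (OrderEvent) deserialize(serialize(new OrderEvent("A-1", 42L)));
        System.out.println(restored.orderId + " " + restored.amount);
        System.out.println("serialVersionUID = " + declaredId(OrderEvent.class));
    }
}

// Hypothetical service class whose instances could end up in a checkpoint.
class OrderEvent implements Serializable {
    // Explicit ID: without it the JVM derives one from the class structure,
    // so editing the class would change the ID.
    private static final long serialVersionUID = 1L;

    final String orderId;
    final long amount;

    OrderEvent(String orderId, long amount) {
        this.orderId = orderId;
        this.amount = amount;
    }
}
```

Internal Spark classes are outside the control of the service code, so this approach does not address the second case.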

In addition, configuration items are stored in the checkpoint. If you modify a configuration item of the service, the change may not take effect when the service is restored from the checkpoint, because the value saved in the checkpoint is used instead. Currently, only the following configuration items are reloaded when the service is restored from the checkpoint:

"spark.yarn.app.id",
"spark.yarn.app.attemptId",
"spark.driver.host",
"spark.driver.bindAddress",
"spark.driver.port",
"spark.master",
"spark.yarn.jars",
"spark.yarn.keytab",
"spark.yarn.principal",
"spark.yarn.credentials.file",
"spark.yarn.credentials.renewalTime",
"spark.yarn.credentials.updateTime",
"spark.ui.filters",
"spark.mesos.driver.frameworkId"

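The effect of this rule can be illustrated with a simplified model. The sketch below is not Spark code: CheckpointConfDemo, restore, and the sample values are hypothetical, and only a few of the reloaded keys are included to keep it short. It assumes that the restored configuration starts from the values saved in the checkpoint and that only reloaded keys are overwritten with the current values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CheckpointConfDemo {
    // A few of the reloaded keys from the list above (subset for illustration).
    static final List<String> RELOADED_KEYS = Arrays.asList(
        "spark.master",
        "spark.driver.host",
        "spark.driver.port",
        "spark.yarn.keytab",
        "spark.yarn.principal");

    // Model of a restore: start from the checkpointed values, then take
    // the current value only for keys that are on the reload list.
    static Map<String, String> restore(Map<String, String> checkpointed,
                                       Map<String, String> current) {
        Map<String, String> result = new LinkedHashMap<>(checkpointed);
        for (String key : RELOADED_KEYS) {
            if (current.containsKey(key)) {
                result.put(key, current.get(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> checkpointed = new LinkedHashMap<>();
        checkpointed.put("spark.driver.host", "old-host");
        checkpointed.put("spark.streaming.backpressure.enabled", "false");

        Map<String, String> current = new LinkedHashMap<>();
        current.put("spark.driver.host", "new-host");
        current.put("spark.streaming.backpressure.enabled", "true");

        // The driver host is reloaded; the other setting keeps its old value.
        System.out.println(restore(checkpointed, current));
    }
}
```

In this model, a setting that is not on the reload list keeps its checkpointed value even though the service now specifies a different one, which is the behavior described above.
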
Solution

Manually delete the checkpoint directory and restart the service program.

Deleting a file is a high-risk operation. Ensure that the files are no longer needed before performing this operation.