What Do I Do If a Cluster Is Always in the Snapshot Creation State?
Possible causes are as follows:
- The cluster is heavily loaded, and snapshot creation takes a long time.
The default snapshot creation speed of a single node is 40 MB/s, and the actual speed is lower when the cluster is busy. You can query the status of a snapshot by referring to preceding sections.
To check how many shards are still being backed up, run GET _snapshot/repo_auto/snapshot-name. You can also terminate snapshot creation by calling the API.
Solution: Wait for the snapshot creation to complete, or terminate the task.
- Failed to update snapshot information.
Elasticsearch stores information about ongoing snapshots in the cluster state. After a snapshot is created, its state must be updated, but the update may fail when memory usage is high. Elasticsearch does not retry failed updates, so the snapshot remains in the Creating state.
Solution: Call the snapshot deletion API.
- The temporary AK and SK have expired.
CSS uses an agency to write Elasticsearch data to OBS. When a snapshot repository is created, the agency is used to obtain a temporary AK and a temporary SK, which are configured in the repository. The temporary AK and SK are valid for 24 hours, so a snapshot that does not complete within 24 hours fails. In this case, the repository cannot be updated, queried, or deleted, and the residual snapshot information in the cluster state cannot be deleted manually or by a rolling restart.
Solution: Currently, residual snapshot information can only be deleted by performing a normal restart of the cluster. CSS will provide a termination interface to rectify the fault.
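The 40 MB/s per-node default and the 24-hour validity period mentioned above can be combined into a rough check of whether a snapshot is likely to outlast its temporary credentials. The following Python sketch is illustrative only: the function names are hypothetical, it assumes that data is spread evenly across nodes and that every node sustains the same speed, and it treats 1 GB as 1024 MB. The pending_shards helper assumes the response of GET _snapshot/repo_auto/snapshot-name contains a "shards" object with "total", "successful", and "failed" counts, as in recent Elasticsearch versions; check this against the response your cluster actually returns.

```python
# Values taken from the text above.
DEFAULT_NODE_SPEED_MB_S = 40      # default snapshot speed of a single node
CREDENTIAL_VALIDITY_HOURS = 24    # validity period of the temporary AK/SK


def estimate_snapshot_hours(data_gb, node_count,
                            node_speed_mb_s=DEFAULT_NODE_SPEED_MB_S):
    """Best-case snapshot duration in hours.

    Assumption: data is evenly distributed and all nodes back up in
    parallel at node_speed_mb_s. A busy cluster will be slower.
    """
    if data_gb < 0 or node_count <= 0 or node_speed_mb_s <= 0:
        raise ValueError("data_gb must be >= 0; node_count and speed must be > 0")
    total_mb = data_gb * 1024  # 1 GB taken as 1024 MB
    seconds = total_mb / (node_count * node_speed_mb_s)
    return seconds / 3600


def may_outlive_credentials(data_gb, node_count,
                            node_speed_mb_s=DEFAULT_NODE_SPEED_MB_S):
    """True if even the best-case estimate reaches the AK/SK validity period."""
    hours = estimate_snapshot_hours(data_gb, node_count, node_speed_mb_s)
    return hours >= CREDENTIAL_VALIDITY_HOURS


def pending_shards(snapshot_response):
    """Number of shards not yet finished, from a parsed get-snapshot response.

    Assumption: the response has the shape
    {"snapshots": [{"shards": {"total": ..., "successful": ..., "failed": ...}}]}.
    """
    shards = snapshot_response["snapshots"][0]["shards"]
    return shards["total"] - shards["successful"] - shards["failed"]


if __name__ == "__main__":
    # Hypothetical cluster: 3 nodes holding 8000 GB of data.
    print(round(estimate_snapshot_hours(8000, 3), 1), "hours (best case)")
    print("risk of AK/SK expiry:", may_outlive_credentials(8000, 3))
```

Under these assumptions, three nodes at the default speed can back up at most 10,125 GB within 24 hours. Because a busy cluster runs slower than the default, treat the estimate as a lower bound on duration rather than a prediction.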