Automatic Recovery of Extended Primary/Standby Replication Delay
Scenario
The primary/standby replication delay of a DB instance became long, kept increasing for a period of time, and then recovered automatically.
The following figure is an example showing how the real-time replication delay metric changes on the Cloud Eye console.
Possible Causes
According to Primary/Standby Replication Delay Scenarios and Solutions and How Primary/Standby Replication Works, this problem is caused by large transactions or DDL operations.
You can analyze full logs or slow query logs to check whether there are large transactions or DDL operations.
In the example shown in the following figure, the slow query logs recorded a DDL operation that added an index to a table containing hundreds of millions of data records, and the operation took about one day to execute. While the DDL operation was being replayed on the read replica or standby node, the replication delay kept increasing. After the replay finished, the replication delay dropped back to the normal range.
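The slow query log check described above can be sketched in Python. This is a minimal illustration, not a tool provided by the service. It assumes the slow query log has been downloaded as text in the MySQL-style format, where each entry has a `# Query_time:` header line. The function name `find_long_ddl`, the 60-second threshold, and the sample log entries are all hypothetical.

```python
import re

# Matches the "# Query_time: <seconds>" header line of a MySQL-style slow log entry.
QUERY_TIME_RE = re.compile(r"^# Query_time:\s*([0-9.]+)")
# Statements treated as DDL in this sketch (not an exhaustive list).
DDL_RE = re.compile(r"^\s*(ALTER|CREATE|DROP|TRUNCATE|RENAME)\b", re.IGNORECASE)


def find_long_ddl(log_text, min_seconds=60.0):
    """Return (query_time_seconds, statement) pairs for DDL entries
    whose execution time is at least min_seconds."""
    hits = []
    query_time = None
    for line in log_text.splitlines():
        match = QUERY_TIME_RE.match(line)
        if match:
            query_time = float(match.group(1))
            continue
        # Skip other header lines and the bookkeeping statements that
        # precede the actual query in each entry.
        lowered = line.lower()
        if line.startswith("#") or lowered.startswith(("set timestamp", "use ")):
            continue
        if query_time is None or not line.strip():
            continue
        # Only the first line of the statement is inspected.
        if query_time >= min_seconds and DDL_RE.match(line):
            hits.append((query_time, line.strip()))
        query_time = None
    return hits


# Hypothetical sample data in the assumed log format.
SAMPLE_LOG = """\
# Time: 2024-01-01T00:00:00.000000Z
# User@Host: app[app] @ [10.0.0.1]  Id: 8
# Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 100000
SET timestamp=1704067200;
SELECT * FROM orders WHERE user_id = 42;
# Time: 2024-01-02T00:00:01.000000Z
# User@Host: admin[admin] @ [10.0.0.2]  Id: 9
# Query_time: 86400.500000  Lock_time: 0.000200 Rows_sent: 0  Rows_examined: 0
SET timestamp=1704153601;
ALTER TABLE orders ADD INDEX idx_user_id (user_id);
"""

for seconds, statement in find_long_ddl(SAMPLE_LOG):
    print(f"{seconds:.1f}s  {statement}")
```

A long-running DDL statement reported this way is a candidate cause if its execution window lines up with the period in which the replication delay kept increasing. Large transactions are not covered by this sketch and need to be checked separately.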
Solution
- Wait until the DDL operation has been replayed on the read replica or standby node. The replication delay then recovers automatically.
- Add indexes during off-peak hours.