Connection to a Primary/Standby DB Instance Suspended Using pt-osc
Scenario
The pt-online-schema-change (pt-osc) tool can be used to perform online DDL operations on a local single instance, but cannot be used on an RDS for MySQL primary/standby instance and the connection is suspended, as shown in the following figure.
Possible Causes
How pt-osc works:
- Create an empty table with the same structure as the original table but with the suffix _new added to the table name.
- Modify the structure of the empty table created in 1.
- Add three triggers to the original table: delete, update, and insert. The triggers are used to execute the statements to be executed in the original table in the new table during data copy.
- Copy the data in the original table to the new table in the form of data chunks.
- Rename the original table, change the name of the new table to that of the original table, and delete the original table.
- Delete the triggers.
A large amount of data needs to be copied. There will be a replication delay between the RDS for MySQL primary and standby instances. Workloads running on the standby instance may be affected. Considering the replication delay, the pt-osc provides the following options:
- --max-lag
- --check-interval
- --recursion-method
- --check-slave-lag
If the replication delay of the standby instance exceeds the value of max-lag, the tool stops copying data for check-interval seconds. If you specify check-slave-lag, the tool only monitors that particular server for replication delay. It does not monitor other servers. recursion-method is used to control exactly which servers the tool monitors. Its values include processlist (default value, monitoring the primary/standby replication delay), hosts, dsn, and none (ignoring the primary/standby replication delay). For more information, see pt-online-schema-change.
In this case:
- When pt-osc is used to connect to the RDS for MySQL primary/standby instance, the connection is suspended because there is a primary/standby replication delay and the tool stops copying data. You can add --recursion-method=none to solve the problem.
- If the primary/standby replication delay is ignored, data copy becomes fast. To minimize the impact on workloads, you can set the --max-load configuration item.
Solution
Add the --recursion-method=none configuration item to the pt-osc command to ignore the replication delay.
- Adding a field
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "ADD COLUMN content text" D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
- Deleting a field
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "DROP COLUMN content " D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute
- Modifying a field
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "MODIFY COLUMN age TINYINT NOT NULL DEFAULT 0" D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute
- Renaming a field
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "CHANGE COLUMN age address varchar(30)" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute
- Adding an index
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "ADD INDEX idx_address(address)" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
- Deleting an index
pt-online-schema-change --user=root --password=xxx --host=xxx --alter "DROP INDEX idx_address" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute
If the primary/standby replication delay is important for your workloads, adjust the following parameters as required: max-lag, check-interval, recursion-method, and check-slave-lag. For more information, see pt-online-schema-change.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot