Help Center> Relational Database Service> Troubleshooting> RDS for MySQL> Other Issues> Connection to a Primary/Standby DB Instance Suspended Using pt-osc
Updated on 2023-03-06 GMT+08:00

Connection to a Primary/Standby DB Instance Suspended Using pt-osc

Scenario

The pt-online-schema-change (pt-osc) tool can be used to perform online DDL operations on a local single instance, but cannot be used on an RDS for MySQL primary/standby instance and the connection is suspended, as shown in the following figure.

Possible Causes

How pt-osc works:

  1. Create an empty table with the same structure as the original table but with the suffix _new added to the table name.
  2. Modify the structure of the empty table created in 1.
  3. Add three triggers to the original table: delete, update, and insert. The triggers are used to execute the statements to be executed in the original table in the new table during data copy.
  4. Copy the data in the original table to the new table in the form of data chunks.
  5. Rename the original table, change the name of the new table to that of the original table, and delete the original table.
  6. Delete the triggers.

A large amount of data needs to be copied. There will be a replication delay between the RDS for MySQL primary and standby instances. Workloads running on the standby instance may be affected. Considering the replication delay, the pt-osc provides the following options:

  • --max-lag
  • --check-interval
  • --recursion-method
  • --check-slave-lag

If the replication delay of the standby instance exceeds the value of max-lag, the tool stops copying data for check-interval seconds. If you specify check-slave-lag, the tool only monitors that particular server for replication delay. It does not monitor other servers. recursion-method is used to control exactly which servers the tool monitors. Its values include processlist (default value, monitoring the primary/standby replication delay), hosts, dsn, and none (ignoring the primary/standby replication delay). For more information, see pt-online-schema-change.

In this case:

  • When pt-osc is used to connect to the RDS for MySQL primary/standby instance, the connection is suspended because there is a primary/standby replication delay and the tool stops copying data. You can add --recursion-method=none to solve the problem.
  • If the primary/standby replication delay is ignored, data copy becomes fast. To minimize the impact on workloads, you can set the --max-load configuration item.

Solution

Add the --recursion-method=none configuration item to the pt-osc command to ignore the replication delay.

Common uses of pt-osc:
  • Adding a field

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "ADD COLUMN content text" D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

  • Deleting a field

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "DROP COLUMN content " D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute

  • Modifying a field

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "MODIFY COLUMN age TINYINT NOT NULL DEFAULT 0" D=aaa,t=tmp_test --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute

  • Renaming a field

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "CHANGE COLUMN age address varchar(30)" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --quiet --execute

  • Adding an index

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "ADD INDEX idx_address(address)" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

  • Deleting an index

    pt-online-schema-change --user=root --password=xxx --host=xxx --alter "DROP INDEX idx_address" D=aaa,t=tmp_test --no-check-alter --no-check-replication-filters --alter-foreign-keys-method=auto --recursion-method=none --print --execute

If the primary/standby replication delay is important for your workloads, adjust the following parameters as required: max-lag, check-interval, recursion-method, and check-slave-lag. For more information, see pt-online-schema-change.