Help Center/ MapReduce Service/ Component Operation Guide (LTS) (Ankara Region)/ Using ClickHouse/ ClickHouse FAQ/ What Should I Do If an Exception Occurred in the replication_queue and Data Is Inconsistent Between Replicas After a ClickHouse Cluster Is Powered On from a Sudden Poweroff?
Updated on 2024-11-29 GMT+08:00

What Should I Do If an Exception Occurred in the replication_queue and Data Is Inconsistent Between Replicas After a ClickHouse Cluster Is Powered On from a Sudden Poweroff?

Symptom

ClickHouseServer replicas failed to be synchronized. The following error message was displayed:

When the system table system.replication_queue is queried, the following error message was displayed:
 Code: 234. DB::Exception: No active replica has part 1694228400_0_3742_2242 or covering part (cannot execute queue-0000010315: GET_PART with virtual parts [1694228400_0_3742_2242]). (NO_REPLICA_HAS_PART)

Procedure

  1. Log in to the ClickHouse client node and connect to ClickHouseServer by referring to Using ClickHouse from Scratch.
  2. Run SELECT * from system.replication_queue to query the information in the system.replication_queue system table. The following error message is displayed in the last_exception column:

    Code: 234. DB::Exception: No active replica has part 1694228400_0_3742_2242 or covering part (cannot execute queue-0000010315: GET_PART with virtual parts [1694228400_0_3742_2242]). (NO_REPLICA_HAS_PART)

  3. Run the following statement to obtain the ZooKeeper storage path and locate all abnormal nodes:

    SELECT replica_path || '/queue/' || node_name, last_exception FROM system.replication_queue JOIN system.replicas USING (database, table) WHERE create_time < now();

  4. Log in to the ZooKeeper client by referring to Using ZooKeeper from Scratch.
  5. Run the delete command to delete all abnormal nodes obtained in 3. The following is an example.

    delete /clickhouse/tables/68/test0722/test156/replicas/1/queue/queue-0000010315

    The command parameters are as follows:

    • /clickhouse/tables/68/test0722/test156/replicas/1/queue indicates the path of the abnormal nodes obtained in 3.
    • queue-0000010315 indicates the abnormal nodes obtained in 3.

  6. Log in to the ClickHouse client node, connect to ClickHouseServer, and run the following statement to restart the replicas:

    SYSTEM RESTART REPLICAS