ALM-45433 ClickHouse AZ Topology Exception
Alarm Description
Enabling the cross-AZ HA function for a cluster where ClickHouse has already been deployed does not change the existing ClickHouse topology. This alarm is generated when cross-AZ HA cannot take effect because the backup nodes of the same shard are in the same AZ.
This alarm is automatically cleared when the system detects that all shards meet the cross-AZ HA deployment requirements.
Alarm Attributes
Alarm ID | Alarm Severity | Auto Cleared |
---|---|---|
45433 | Critical | Yes |
Alarm Parameters
Parameter | Description |
---|---|
Source | Specifies the cluster for which the alarm was generated. |
ServiceName | Specifies the service for which the alarm was generated. |
HostName | Specifies the host for which the alarm was generated. |
Impact on the System
The current deployment of the ClickHouse service does not support cross-AZ HA.
Possible Causes
After cross-AZ HA is enabled, all backup nodes of a shard are in the same AZ.
Handling Procedure
Modify the AZ of backup nodes.
- Log in to the node where the client is installed as the client installation user. Run the following command to switch to the client installation directory:
cd {Client installation path}
- Run the following command to configure environment variables:
source bigdata_env
- Run the following command to authenticate the user (skip this step in normal mode):
kinit Component service user
- Run the following command to log in to the client tool:
zkCli.sh -server Service IP address of the node where the ZooKeeper instance resides:Client port
- Run the following command to view the current topology:
get /clickhouse/topo
If multiple ClickHouse services are installed, run the get /clickhouse{-n}/topo command instead. For example, for the ClickHouse-1 service, run the get /clickhouse-1/topo command.
[zk: 192.168.20.36:24002(CONNECTED) 0] get /clickhouse/topo
<topo>
  <mcluster>
    <shard id="14" index="1">
      <server id="15">
        <replica>1</replica>
        <az>AZ1</az>
        <host>192-168-20-205</host>
        <port>21427</port>
      </server>
      <server id="16">
        <replica>2</replica>
        <az>AZ1</az>
        <host>192-168-20-2205</host>
        <port>21427</port>
      </server>
    </shard>
  </mcluster>
</topo>
- Select a host from the shard whose backup nodes are all in the same AZ, and deploy that host in another AZ.
- Log in to FusionInsight Manager, click Host, select the host that you deployed in another AZ, and choose More > Reinstall to reinstall the host.
- Choose Cluster > Cross-AZ HA, click Configure AZ and Policy, and change the AZ information of the reinstalled host to the AZ planned for it when you redeployed the host.
- Wait for five minutes and check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to Collect fault information.
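The topology returned by get /clickhouse/topo can also be checked with a short script when the cluster has many shards. The following Python sketch is illustrative only: it assumes the topology XML has the same element layout as the sample output in this procedure (shard, server, az, host), and the function name shards_in_single_az is hypothetical. It lists every shard whose servers all report the same AZ, which is the condition this alarm describes.

```python
import xml.etree.ElementTree as ET

# Sample topology copied from the zkCli.sh output shown in this procedure.
SAMPLE_TOPO = """
<topo>
  <mcluster>
    <shard id="14" index="1">
      <server id="15"><replica>1</replica><az>AZ1</az>
        <host>192-168-20-205</host><port>21427</port></server>
      <server id="16"><replica>2</replica><az>AZ1</az>
        <host>192-168-20-2205</host><port>21427</port></server>
    </shard>
  </mcluster>
</topo>
"""


def shards_in_single_az(topo_xml):
    """Return (shard id, AZ, hosts) for each shard whose servers share one AZ.

    Assumption: the XML layout matches the sample output above.
    """
    root = ET.fromstring(topo_xml.strip())
    flagged = []
    for shard in root.iter("shard"):
        servers = shard.findall("server")
        azs = {server.findtext("az") for server in servers}
        # A shard is flagged when it has servers and they all report one AZ.
        if servers and len(azs) == 1:
            hosts = [server.findtext("host") for server in servers]
            flagged.append((shard.get("id"), azs.pop(), hosts))
    return flagged


if __name__ == "__main__":
    for shard_id, az, hosts in shards_in_single_az(SAMPLE_TOPO):
        print(f"shard {shard_id}: all backup nodes in {az}: {', '.join(hosts)}")
```

To use it on a live cluster, paste the XML printed by the get command in place of SAMPLE_TOPO. Any host of a flagged shard is a candidate for redeployment in another AZ.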
Collect fault information.
- On FusionInsight Manager, choose O&M > Log > Download.
- Expand the drop-down list next to the Service field. In the Services dialog box that is displayed, select ClickHouseServer for the target cluster.
- Expand the Hosts list. In the Select Host dialog box that is displayed, select the abnormal host, and click OK.
- Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
- Contact O&M personnel and provide the collected logs.
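The log collection window described above (1 hour before to 1 hour after the alarm generation time) can be computed as follows. This is a minimal Python sketch; the function name log_window and the example alarm time are hypothetical and only illustrate the arithmetic, including the case where the window crosses midnight.

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin=timedelta(hours=1)):
    """Return (start, end) covering `margin` before and after the alarm time."""
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time, for illustration only.
    alarm = datetime(2024, 1, 15, 0, 30)
    start, end = log_window(alarm)
    print("Start Date:", start.isoformat(sep=" "))
    print("End Date:  ", end.isoformat(sep=" "))
```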
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None