Updated on 2023-05-06 GMT+08:00

Accessing ClickHouse Through ELB

ClickHouse is currently deployed in cluster mode regardless of whether replication or sharding is used. When ClickHouse provides services externally, multiple ClickHouse service nodes are exposed and there is no unified access entry. ClickHouse provides the BalancedClickhouseDataSource class, which balances load by randomly distributing client requests across multiple nodes. However, its fault detection is weak. In particular, the client must proactively detect changes to the cluster nodes during scale-in or scale-out.

MRS works with Elastic Load Balance (ELB) to address the preceding issues. Figure 1 shows the deployment architecture. In this architecture, ELB automatically distributes user access traffic evenly across multiple backend ClickHouse nodes, which expands external service capabilities and improves fault tolerance. When a backend ClickHouse node becomes faulty, ELB automatically fails over access traffic to another node that is running properly.

Figure 1 Accessing ClickHouse nodes through ELB
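
The failover behavior described above can be pictured with a small model. The following Python sketch is only an illustration and does not reflect how ELB is implemented internally: the class name FailoverBalancer, its methods, and the node names are all hypothetical. It rotates through the backend nodes and skips any node that is currently marked unhealthy.

```python
from itertools import cycle


class FailoverBalancer:
    """Toy model of a balancer that skips unhealthy backends (not the ELB implementation)."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._healthy = set(self._nodes)
        self._ring = cycle(self._nodes)

    def mark(self, node, healthy):
        """Record the result of a (hypothetical) health check for one node."""
        if healthy:
            self._healthy.add(node)
        else:
            self._healthy.discard(node)

    def next_node(self):
        """Return the next healthy node in rotation, or raise if none is healthy."""
        for _ in range(len(self._nodes)):
            node = next(self._ring)
            if node in self._healthy:
                return node
        raise RuntimeError("no healthy backend available")


# Hypothetical node names; ch-node-2 is reported as faulty.
balancer = FailoverBalancer(["ch-node-1", "ch-node-2", "ch-node-3"])
balancer.mark("ch-node-2", healthy=False)
picks = [balancer.next_node() for _ in range(4)]
print(picks)
```

In this model, requests keep flowing to ch-node-1 and ch-node-3 while ch-node-2 is marked unhealthy, and the client never has to know which node failed.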

Table 1 lists the advantages of the ELB-based deployment over BalancedClickhouseDataSource.

Table 1 Differences between ELB-based deployment and BalancedClickhouseDataSource

Load Balancing Method

Difference

ELB-based deployment

  • Supports multiple request policies.
  • Supports automatic fault detection and failover.
  • Allows you to add ClickHouse backend nodes simply by changing ELB configurations.

BalancedClickhouseDataSource

  • Causes load imbalance due to random allocation of requests.
  • Lacks sufficient fault detection capabilities.
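
To make the drawbacks of client-side random allocation concrete, the following Python sketch simulates them. It is a simplified model, not the BalancedClickhouseDataSource implementation: the function names, the node names, and the request count are assumptions chosen for illustration. Each request goes to a uniformly random node with no health awareness, so requests keep landing on a node that is down and the per-node counts are generally uneven.

```python
import random
from collections import Counter


def pick_node_randomly(nodes, rng):
    """Choose one node uniformly at random, without any health check."""
    return rng.choice(nodes)


def simulate(nodes, down, requests, seed=0):
    """Send `requests` requests at random and count those that hit a node in `down`."""
    rng = random.Random(seed)
    sent = Counter()
    failed = 0
    for _ in range(requests):
        node = pick_node_randomly(nodes, rng)
        sent[node] += 1
        if node in down:
            failed += 1
    return sent, failed


# Hypothetical node names; ch-node-2 is assumed to be down.
nodes = ["ch-node-1", "ch-node-2", "ch-node-3"]
sent, failed = simulate(nodes, down={"ch-node-2"}, requests=1000)
print("requests per node:", dict(sent))
print("requests sent to the faulty node:", failed)
```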

Table 2 lists the supported protocols and ports for accessing ClickHouse through ELB. Configure them as required.

Table 2 Supported protocols and ports for accessing ClickHouse nodes through ELB

Protocol | Port Number | Used When
---------|-------------|----------
TCP | 9000 | A client request is sent to ELB to connect to ClickHouse. For example, if you run the clickhouse client command to connect to ClickHouse, set host to the private IP address of ELB.
HTTP | 8123 | An HTTP request is sent to ELB to connect to ClickHouse.
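
As a rough illustration of how a client might choose the endpoint for each protocol, the following Python sketch maps the two protocols to their ports and builds a query URL for the ClickHouse HTTP interface, which accepts the statement in the query URL parameter. The helper names and the address 192.168.0.10 are placeholders, not values from this guide. Replace the address with the private IP address of your load balancer.

```python
from urllib.parse import quote, urlencode

# Ports from Table 2.
PORTS = {"tcp": 9000, "http": 8123}


def elb_endpoint(elb_ip, protocol):
    """Return the (host, port) pair to use for the given protocol."""
    return elb_ip, PORTS[protocol.lower()]


def http_query_url(elb_ip, sql):
    """Build a URL that sends `sql` to ClickHouse over HTTP through ELB."""
    host, port = elb_endpoint(elb_ip, "http")
    params = urlencode({"query": sql}, quote_via=quote)
    return f"http://{host}:{port}/?{params}"


# 192.168.0.10 is a placeholder for the private IP address of ELB.
url = http_query_url("192.168.0.10", "SELECT 1")
print(url)  # http://192.168.0.10:8123/?query=SELECT%201
```

The resulting URL can be passed to any HTTP client. Authentication parameters are omitted here and depend on how the cluster is configured.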

This section describes how to use a client to access ClickHouse through ELB. The procedure is as follows:

  • Step 1: Buy an ELB and obtain its private IP address.
  • Step 2: Add an ELB listener and configure its protocol and port.
  • Step 3: Add backend ClickHouse servers to the ELB.
  • Step 4: Use a client to access ClickHouse through ELB.

Prerequisites

  • You have created an MRS cluster and its ClickHouse instances are running properly.
  • The MRS client has been installed in a directory, for example, /opt/client. The client directory in the following operations is only an example. Change it to the actual installation directory.

Buying an ELB and Connecting It to ClickHouse Nodes

Buying an ELB and obtaining its private IP address

For details, see Creating a Shared Load Balancer.

  1. Log in to the ELB console and click Buy Elastic Load Balancer.
  2. On the Buy Elastic Load Balancer page, set Type to Shared, set VPC and Subnet to the same values as those of the MRS cluster, and retain the default values for other parameters.
  3. Click Next, confirm the configurations, and click Submit.
  4. On the Load Balancers page, obtain the private IP address of the newly created load balancer.

Adding an ELB listener

For details, see Adding a TCP Listener.

  1. On the Load Balancers page, click the name of the created load balancer to go to its details page.
  2. Click the Listeners tab and then Add Listener.

  3. On the Add Listener page, complete the configuration as prompted.

    1. Configure the listener.
      Set Frontend Protocol/Port to TCP and 9000, respectively. Retain the default values for other parameters. Click Next.

      To access ClickHouse over HTTP through ELB, set Frontend Protocol/Port to HTTP and 8123 instead.

    2. Configure the backend server group.

      Set Load Balancing Algorithm to Weighted round robin and click Finish. On the displayed page, click OK.
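
For readers unfamiliar with weighted round robin, the following Python sketch shows one common way to implement it (the "smooth" variant). It is an illustration under stated assumptions, not a description of how ELB implements the algorithm: the function name, node names, and weights are hypothetical. Over each full cycle, every backend receives a share of requests proportional to its weight.

```python
from collections import Counter
from itertools import islice


def weighted_round_robin(weights):
    """Yield backend names forever, in proportion to their positive integer weights."""
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be a non-empty mapping of positive numbers")
    total = sum(weights.values())
    current = {name: 0 for name in weights}
    while True:
        # Every backend accumulates its weight, the largest total is served next.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        yield chosen


# Hypothetical weights: ch-node-1 is assumed to have twice the capacity of the others.
weights = {"ch-node-1": 2, "ch-node-2": 1, "ch-node-3": 1}
first_eight = list(islice(weighted_round_robin(weights), 8))
counts = Counter(first_eight)
print(first_eight)
print(dict(counts))
```

When all backends have the same weight, this reduces to plain round robin.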

Adding ClickHouse backend servers

  1. Switch to the MRS console and click the MRS cluster to be interconnected.
  2. On the displayed page, click the Nodes tab and expand ClickHouse to obtain its node names and IP addresses.

  3. Switch to the ELB console, locate the created load balancer, and click its name.
  4. Click the Listeners tab and then Backend Server Groups. Click Add.

  5. On the Add Backend Server page, select the backend servers based on the node names and IP addresses of ClickHouse obtained in step 2. Click Next.
  6. On the displayed page, set Batch Add Ports to 9000 and click OK. Confirm your configurations and click Finish.

    To access ClickHouse over HTTP through ELB, set Batch Add Ports to 8123 instead.

  7. Configure the security group.

    After the configuration is complete, go to the Backend Server Groups tab on the Listeners page. The Health Check Result of the backend servers is Unhealthy because the security group of the backend servers does not yet allow traffic from ELB.

    To resolve this issue, add an inbound rule to the security group of the ClickHouse backend servers that allows access from 100.125.0.0/16. The procedure is as follows:

    1. Go back to the Listeners tab. Click the Backend Server Groups subtab and then the name of a backend server.
    2. On the displayed page, click the Security Groups tab and then Manage Rule. Next, click the Inbound Rules tab and then Add Rule.
    3. On the Add Inbound Rule page, set Protocol & Port to TCP and 9000, and IP address to 100.125.0.0/16. Click OK.

      To access ClickHouse over HTTP through ELB, set Protocol & Port to TCP and 8123 instead.

    4. Go back to the ELB console, navigate to the details page of the created load balancer, and refresh the page. Click the Listeners tab and then the Backend Server Groups subtab. The Health Check Result changes to Healthy.
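
The security group requirement above can be checked offline before you open the console. The following Python sketch uses the standard ipaddress module to test whether a set of inbound rules covers the whole 100.125.0.0/16 range on a given port. The rule format (a dictionary with protocol, port, and cidr keys) is an assumption made for this example and is not the format used by the console or any API.

```python
import ipaddress

# Source range that must be allowed, as stated in the procedure above.
HEALTH_CHECK_RANGE = ipaddress.ip_network("100.125.0.0/16")


def health_check_permitted(rules, port):
    """Return True if any TCP inbound rule on `port` covers the whole health check range."""
    for rule in rules:
        if rule["protocol"] != "tcp" or rule["port"] != port:
            continue
        allowed = ipaddress.ip_network(rule["cidr"])
        if HEALTH_CHECK_RANGE.subnet_of(allowed):
            return True
    return False


# Hypothetical inbound rules in an assumed dictionary format.
rules = [
    {"protocol": "tcp", "port": 9000, "cidr": "100.125.0.0/16"},
    {"protocol": "tcp", "port": 8123, "cidr": "192.168.0.0/24"},
]
print(health_check_permitted(rules, 9000))  # True
print(health_check_permitted(rules, 8123))  # False
```

In this example, port 9000 is covered, but port 8123 is open only to a different subnet, so backend servers registered on port 8123 would be expected to stay Unhealthy until a matching rule is added.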

Accessing ClickHouse through ELB

  1. Use the client to log in to the node where the ClickHouse service instance is deployed. For details, see Using ClickHouse from Scratch. Note that the host parameter in the clickhouse client command must be set to the private IP address of ELB.
  2. Check the connection result on the client.

    If you manually run a client command to connect to the ClickHouse node, there may be only a few concurrent requests. In this case, the ELB may always send the requests to the same backend ClickHouse node. This is normal.

    If there are a large number of concurrent requests, ELB distributes them across multiple ClickHouse nodes in round-robin mode.
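
Before you run the clickhouse client command, it can help to confirm that the listener port on ELB is reachable from the client node. The following Python sketch is a minimal TCP connectivity probe that uses only the standard library. The function name can_connect is hypothetical, and a successful TCP connection shows only that the port accepts connections, not that ClickHouse authentication will succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, can_connect("192.168.0.10", 9000) probes the TCP listener, where 192.168.0.10 is a placeholder for the private IP address of ELB. Use port 8123 to probe the HTTP listener instead.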