Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Situation Awareness
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive

ClickHouse Client Practices

Updated on 2024-10-09 GMT+08:00

ClickHouse is a column-based database oriented to online analysis and processing. It supports SQL query and provides good query performance. The aggregation analysis and query performance based on large and wide tables is excellent, which is one order of magnitude faster than other analytical databases.

ClickHouse uses the ReplicatedMergeTree engine and ZooKeeper to replicate tables. When creating a table, you can specify the engine to enable table replication for high availability. Shards and replicas of each table are independent from each other.

ClickHouse supports distributed tables by using the Distributed engine. Views are created on all shards (local tables) to ease distributed queries. ClickHouse supports data sharding, which is one of the features of distributed storage. This allows for parallel read and write for faster queries.

This topic uses the cluster client to quickly connect to the ClickHouse service in a cluster.

Prerequisites

  • The cluster client has been installed, for example, in the /opt/client directory.
  • For clusters with Kerberos authentication enabled, you have created a user with ClickHouse operation permissions on Manager, for example, clickhouseuser.

Using a ClickHouse Client

  1. Install the cluster client. For details, see Installing a Client.
  2. Log in to the node where the client is installed as the client installation user.
  3. Run the following command to go to the client installation directory:

    cd /opt/client

  4. Run the following command to configure environment variables:

    source bigdata_env

  5. If Kerberos authentication has been enabled for the cluster, authenticate the user first. If not, skip this step.

    The service user must have the permission to create ClickHouse tables. For details, see Creating a ClickHouse Role. Create a user on Manager and associate the user to the required role.

    kinit Component service user

    The following provides an example:

    kinit clickhouseuser

  6. Run the clickhouse client command to connect to the ClickHouse server.

    • Using a non-SSL method for login when Kerberos authentication is disabled for the ClickHouse cluster

      clickhouse client --host IP address of the ClickHouse instance --port 9000 --user Username --password

    • Using SSL for login when Kerberos authentication is enabled for the current cluster:

      clickhouse client --host IP address of the ClickHouse instance --port 9440 --user Username --password

    Table 1 Parameters of the clickhouse client command

    Parameter

    Description

    --host

    Name of the server node. You can use the host name or IP address of the node where the ClickHouse instance is deployed.

    To obtain the IP address of the ClickHouseServer instance, log in to FusionInsight Manager of the cluster and choose Cluster > Services > ClickHouse > Instances.

    --port

    Connection port.

    • Clusters with Kerberos authentication enabled use SSL connections by default. The default port is 9440, and the --secure parameter must be specified.

      You can also obtain the port number from the tcp_port_secure parameter of the ClickHouseServer instance.

      If a non-SSL connection is required in this scenario, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse > Configurations, search for SSL_NONESSL_BOTH_ENABLE, change the value to true, and restart all ClickHouse instances.

    • Clusters with Kerberos authentication disabled use non-SSL connections by default. The default port is 9000, and the --secure parameter is not required.

      You can also obtain the port number from the tcp_port parameter of the ClickHouseServer instance.

      If an SSL connection is required in this scenario, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse > Configurations, search for SSL_NONESSL_BOTH_ENABLE, change the value to true, and restart all ClickHouse instances.

    NOTE:

    The preceding ports are open-source ports. If a custom port policy is used during cluster creation, replace the ports by referring to Common Troubleshooting.

    You can run the clickhouse -h command to view the command help of the ClickHouse component.

    --user

    Connection username.

    • If Kerberos authentication has been enabled for the cluster (the cluster is in security mode) and you have passed the kinit authentication, the --user and --password parameters are optional for client login. You must create a user with this name on Manager because there is no default user for Kerberos authentication.
    • If Kerberos authentication is disabled for the cluster (the cluster is in normal mode), do not use the ClickHouse user created on FusionInsight Manager to log in to the client. You need to execute the create user SQL statement on the client to create a ClickHouse user.

      Alternatively, you can use the default user for login.

    --password

    User password for connection. This parameter is used together with the --user parameter.

    --query

    Query to process when using non-interactive mode.

    --database

    The current database is used by default.

    This parameter uses the default configuration on the server.

    --multiline

    If this parameter is specified, multiline queries are allowed. (Enter only indicates line feed and does not indicate that the query statement is complete.)

    --multiquery

    If this parameter is specified, multiple queries separated with semicolons (;) can be processed. This parameter is valid only in non-interactive mode.

    --format

    Specified default format used to output the result.

    --vertical

    If this parameter is specified, the result is output in vertical format by default. In this format, each value is printed on a separate line, which helps to display a wide table.

    --time

    If this parameter is specified, the query execution time is printed to stderr in non-interactive mode.

    --stacktrace

    If this parameter is specified, stack trace information will be printed when an exception occurs.

    --config-file

    Name of the configuration file.

    --secure

    If this parameter is specified, the server will be connected in SSL mode.

    --history_file

    Path of files that record command history.

    --param_<name>

    Query with parameters. Pass values from the client to the server.

    For details, see https://clickhouse.tech/docs/en/interfaces/cli/#cli-queries-with-parameters.

  7. Run the quit; command to exit the ClickHouse server connection.

Viewing Environment Parameters (cluster) of ClickHouse

  1. Use the ClickHouse client to connect to the ClickHouse server.
  2. Query the cluster identifier and other information about the environment parameters.

    select cluster,shard_num,replica_num,host_name from system.clusters;
    SELECT 
        cluster, 
        shard_num, 
        replica_num, 
        host_name
    FROM system.clusters
    
    ┌─cluster───────────┬─shard_num─┬─replica_num─┬─host_name──────── ┐
    │ default_cluster_1             │         1   │           1   │ node-master1dOnG           │
    │ default_cluster_1             │         1   │           2   │ node-group-1tXED0001       │
    │ default_cluster_1             │         2   │           1   │ node-master2OXQS           │
    │ default_cluster_1             │         2   │           2   │ node-group-1tXED0002       │
    │ default_cluster_1             │         3   │           1   │ node-master3QsRI           │
    │ default_cluster_1             │         3   │           2   │ node-group-1tXED0003       │
    └─────────────── ┴────── ┴─────── ┴──────────────┘
    
    6 rows in set. Elapsed: 0.001 sec. 

  3. Query the shard and replica identifiers.

    select * from system.macros;
    SELECT *
    FROM system.macros
    
    ┌─macro───┬─substitution─────┐
    │ id          │ 76                     │
    │ replica     │ 2                      │
    │ shard       │ 3                      │
    └────── ┴────────────┘
    
    3 rows in set. Elapsed: 0.001 sec. 

Creating a Local Replicated Table and a Distributed Table

  1. Create a replicated table using the ReplicatedMergeTree engine.

    For details about the syntax, see https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables.

    For example, to create the ReplicatedMergeTree table test in the default database on all default_cluster_1 nodes, run the following command:

    CREATE TABLE default.test ON CLUSTER default_cluster_1

    (

    `EventDate` DateTime,

    `id` UInt64

    )

    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/test', '{replica}')

    PARTITION BY toYYYYMM(EventDate)

    ORDER BY id;

    • ON CLUSTER is a distributed DDL clause that creates the same local table on all instances in the cluster in one execution.
    • default_cluster_1 is the cluster identifier used for viewing environment parameters of the ClickHouse service.
      CAUTION:
      ReplicatedMergeTree engine receives the following two parameters:
      • Storage path of the table data in ZooKeeper

        The path must be in the /clickhouse directory. Otherwise, data insertion may fail due to insufficient ZooKeeper quota.

        To avoid data conflict between different tables in ZooKeeper, the directory must be in the following format:

        /clickhouse/tables/{shard}/default/test, in which /clickhouse/tables/{shard} is fixed, default indicates the database name, and text indicates the name of the created table.

      • Replica name: Generally, {replica} is used.
    CREATE TABLE default.test ON CLUSTER default_cluster_1
    (
        `EventDate` DateTime, 
        `id` UInt64
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/test', '{replica}')
    PARTITION BY toYYYYMM(EventDate)
    ORDER BY id
    
    ┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
    │ node-group-1tXED0002                   │  9000  │      0   │         │                   5   │                3   │
    │ node-group-1tXED0003                   │  9000  │      0   │         │                   4   │                3   │
    │ node-master1dOnG                       │  9000  │      0   │         │                   3   │                3   │
    └────────────────────┴────┴─────┴──── ┴─────────── ┴──────────┘
    ┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
    │ node-master3QsRI                       │  9000  │      0   │         │                   2   │                0   │
    │ node-group-1tXED0001                   │  9000  │      0   │         │                   1   │                0   │
    │ node-master2OXQS                       │  9000  │      0   │         │                   0   │                0   │
    └────────────────────┴────┴─────┴──── ┴─────────── ┴──────────┘
    
    6 rows in set. Elapsed: 0.189 sec. 

  2. Create a distributed table using the Distributed engine.

    For example, run the following statement to create the test_all table in the default database on all default_cluster_1 nodes:

    CREATE TABLE default.test_all ON CLUSTER default_cluster_1

    (

    `EventDate` DateTime,

    `id` UInt64

    )

    ENGINE = Distributed(default_cluster_1, default, test, rand());

    CREATE TABLE default.test_all ON CLUSTER default_cluster_1
    (
        `EventDate` DateTime, 
        `id` UInt64
    )
    ENGINE = Distributed(default_cluster_1, default, test, rand())
    
    ┌─host─────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
    │ node-group-1tXED0002                   │  9000  │      0   │         │                   5   │                0   │
    │ node-master3QsRI                       │  9000  │      0   │         │                   4   │                0   │
    │ node-group-1tXED0003                   │  9000  │      0   │         │                   3   │                0   │
    │ node-group-1tXED0001                   │  9000  │      0   │         │                   2   │                0   │
    │ node-master1dOnG                       │  9000  │      0   │         │                   1   │                0   │
    │ node-master2OXQS                       │  9000  │      0   │         │                   0   │                0   │
    └────────────────────┴────┴─────┴──── ┴─────────── ┴──────────┘
    
    6 rows in set. Elapsed: 0.115 sec. 
    
    NOTE:

    Distributed requires the following parameters:

    • default_cluster_1 is the cluster identifier used for viewing environment parameters of the ClickHouse service.
    • default indicates the name of the database where the local table is located.
    • test is the name of the local table.
    • (Optional) Sharding key

      This key and the weight configured in the config.xml file determine the route for writing data to the distributed table, that is, the physical table to which the data is written. It can be the original data (for example, site_id) of a column in the table or the result of the function call, for example, rand() is used in the preceding SQL statement. Note that data must be evenly distributed in this key. Another common operation is to use the hash value of a column with a large difference, for example, intHash64(user_id).

ClickHouse Table Data Operations

  1. Insert data to the local table.

    For example, insert data to the test table.

    insert into test values(toDateTime(now()), rand());

  2. Query local table information.

    For example, query data in the test table.

    select * from test;

    SELECT *
    FROM test
    
    ┌───────────EventDate─┬─────────id─┐
    │ 2020-11-05 21:10:42             │ 1596238076           │
    └──────────────── ┴───────────┘
    
    1 rows in set. Elapsed: 0.002 sec. 
     

  3. Query the distributed table.

    The following example queries the distributed tabletest_all, which is created based on test, and gets the same result as querying the test table.

    select * from test_all;

    SELECT *
    FROM test_all
    
    ┌───────────EventDate─┬─────────id─┐
    │ 2020-11-05 21:10:42             │ 1596238076           │
    └──────────────── ┴───────────┘
    
    1 rows in set. Elapsed: 0.004 sec. 

  4. Switch to the shard node with the same shard_num and query the information about the current table. The same table data can be queried.

    For example, run the exit; command to exit the original node.

    Run the following command to switch to the node-group-1tXED0003 node:

    clickhouse client --host node-group-1tXED0003 --multiline --port 9440 --secure;

    show tables;

    SHOW TABLES
    
    ┌─name─────┐
    │ test           │
    │ test_all       │
    └────────┘
     

  5. Query local table data. For example, query data in the test table on the node-group-1tXED0003 node.

    select * from test;
    SELECT *
    FROM test
    
    ┌───────────EventDate─┬─────────id─┐
    │ 2020-11-05 21:10:42             │ 1596238076           │
    └──────────────── ┴───────────┘
    
    1 rows in set. Elapsed: 0.005 sec. 
    

  6. Switch to the shard node with different shard_num value and query the data of the created table.

    For example, exit the node-group-1tXED0003 node.

    exit;

    Switch to the node-group-1tXED0001 node.

    clickhouse client --host node-group-1tXED0001 --multiline --port 9440 --secure;

    Query the local table test. The test table data cannot be queried on the different shard nodes.

    select * from test;

    SELECT *
    FROM test
    
    Ok.

    Query data in the distributed table test_all. The data can be queried.

    select * from test_all;

    SELECT *
    FROM test
    
    ┌───────────EventDate─┬─────────id─┐
    │ 2020-11-05 21:12:19             │ 3686805070           │
    └──────────────── ┴───────────┘
    
    1 rows in set. Elapsed: 0.002 sec. 
     

Common Troubleshooting

After the command for connecting to the ClickHouse client is executed, error message "Connection refused" is displayed.

Check whether the cluster uses custom ports (Component Port is set to Custom during cluster creation). If custom ports are used, replace the ports used in the command for connecting to the ClickHouse component client with the default custom ports in the following table.

Parameter

Default Open Source Port

Default Custom Port

Port Description

interserver_http_port

9009

9009

HTTP port for the communication between ClickHouse servers.

interserver_https_port

9010

9010

HTTPS port for the communication between ClickHouse servers.

http_port

8123

8123

Port for connecting to the ClickHouse server through HTTP.

https_port

8443

8443

Port for connecting to the ClickHouse server through HTTPS.

tcp_port

9000

9000

Port for connecting the client to the ClickHouse server through TCP.

tcp_port_secure

9440

9440

Port for connecting the client to the ClickHouse server through TCP SSL.

lb_tcp_port

21424

21424

TCP port listened by ClickHouseBalancer

lb_http_port

21425

21425

HTTP port listened by ClickHouseBalancer

lb_https_port

21426

21426

HTTPS port listened by ClickHouseBalancer

lb_tcp_secure_port

21428

21428

TCP SSL port listened by ClickHouseBalancer

We use cookies to improve our site and your experience. By continuing to browse our site you accept our cookie policy. Find out more

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback