Updated on 2024-08-10 GMT+08:00

ClickHouse Application Development

ClickHouse is a column-oriented database for online analytical processing. It supports SQL query and provides good query performance. The aggregation analysis and query performance based on large and wide tables is excellent, which is one order of magnitude faster than other analytical databases.

ClickHouse features:

  • High data compression ratio
  • Multi-core parallel computing
  • Vectorized computing engine
  • Support for nested data structure
  • Support for sparse indexes
  • Support for INSERT and UPDATE

ClickHouse application scenarios:

  • Real-time data warehouse

    The streaming computing engine (such as Flink) is used to write real-time data to ClickHouse. With the excellent query performance of ClickHouse, multi-dimensional and multi-mode real-time query and analysis requests can be responded within subseconds.

  • Offline query

    Large-scale service data is imported to ClickHouse to construct a large wide table with hundreds of millions to tens of billions of records and hundreds of dimensions. It supports personalized statistics collection and continuous exploratory query and analysis at any time to assist business decision-making and provide excellent query experience.

MRS provides sample application development projects based on ClickHouse JDBC. This practice provides guidance for you to obtain and import a sample project after creating an MRS cluster and then conduct building and commissioning locally. In this sample project, you can create and delete ClickHouse tables, and insert and query data in the MRS cluster.

Creating an MRS ClickHouse Cluster

  1. Create and purchase an MRS cluster that contains ClickHouse. For details, see Buying a Custom Cluster.

    In this practice, an MRS 3.2.0-LTS.1 cluster, with ClickHouse installed and with Kerberos authentication enabled, is used as an example.

  2. Click Buy Now and wait until the MRS cluster is created.

Preparing an Application Authentication User

For an MRS cluster with Kerberos authentication enabled, prepare a user who has the operation permission on related components for application authentication.

The following ClickHouse permission configuration example is for reference only. You can modify the configuration as you need.

  1. After the cluster is created, log in to FusionInsight Manager.
  2. Choose System > Permission > Role and click Create Role in the right pane.

    1. Enter a role name, for example, developrole, and click OK.
    2. In Configure Resource Permission, select the desired cluster, choose ClickHouse, and select SUPER_USER_GROUP.

  3. Choose System > Permission > User, click Create in the right pane to create a human-machine user, for example, developuser, and add the developrole role.

    After the user is created, log in to FusionInsight Manager as developuser and change the initial password as prompted.

Obtaining the Sample Project

  1. Obtain the sample project from Huawei Mirrors.

    Download the Maven project source code and configuration files of the sample project, and configure related development tools on your local PC. For details, see Obtaining Sample Projects from Huawei Mirrors.

    Select a branch based on the cluster version and download the required MRS sample project.

    For example, the sample project suitable for this practice is clickhouse-examples, which can be obtained at https://github.com/huaweicloud/huaweicloud-mrs-example/tree/mrs-3.2.0.1/src/clickhouse-examples.

  2. Use IDEA to import the sample project and wait for the Maven project to download the dependency packages. For details, see Configuring and Importing Sample Projects.

    Figure 1 ClickHouse sample project

    After you configure Maven and SDK parameters on the local PC, the sample project automatically loads related dependency packages.

  3. In this sample project, the application connects to the ClickHouse server through the IP address and user information in the configuration file. Therefore, after the project is imported, you need to modify the clickhouse-example.properties file in the conf directory of the sample project based on the actual environment information.

    loadBalancerIPList=192.168.64.10,192.168.64.122
    sslUsed=true
    loadBalancerHttpPort=21425
    loadBalancerHttpsPort=21426
    CLICKHOUSE_SECURITY_ENABLED=true
    user=developuser
    # Passwords stored in plaintext pose security risks. Store them in ciphertext in configuration files or environment variables.
    password=User password
    isMachineUser=false
    isSupportMachineUser=false
    clusterName=default_cluster
    databaseName=testdb
    tableName=testtb
    batchRows=10000
    batchNum=10
    clickhouse_dataSource_ip_list=192.168.64.10:21426,192.168.64.122:21426
    native_dataSource_ip_list=192.168.64.10:21424,192.168.64.122:21424
    Table 1 Configuration description

    Configuration Item

    Description

    loadBalancerIPList

    Addresses of the ClickHouseBalancer instances.

    To view the instance IP addresses, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click Instance.

    In this example, set this parameter to 192.168.64.10,192.168.64.122.

    sslUsed

    Whether to enable SSL encryption. Set this parameter to true for clusters in security mode.

    loadBalancerHttpPort

    HTTP and HTTPS port numbers of the load balancer.

    Log in to FusionInsight Manager and choose Cluster > Services > ClickHouse. Click Logical Cluster, locate the row containing the desired logical cluster, and view Port and Ssl Port in the HTTP Balancer Port column.

    loadBalancerHttpsPort

    CLICKHOUSE_SECURITY_ENABLED

    Whether to enable the ClickHouse security mode.

    In this example, set this parameter to true.

    user

    Authentication information of the development user. For a machine-machine user, leave password empty.

    Passwords stored in plaintext pose security risks. Store them in ciphertext in configuration files or environment variables.

    password

    isMachineUser

    Whether the authentication user is a machine-machine user.

    isSupportMachineUser

    Whether to support authentication of a machine-machine user. In this example, set this parameter to false.

    clusterName

    Name of the ClickHouse logical cluster connected to the application. In this example, retain the default value default_cluster.

    databaseName

    Names of the database and table to be created in the sample project. You can change the names based on site requirements.

    tableName

    batchRows

    Number of data records to be written in a batch. In this example, set this parameter to 10.

    batchNum

    Total number of batches for writing data. Retain the default value in this example.

    clickhouse_dataSource_ip_list

    Addresses and HTTP ports of the ClickHouseBalancer instances.

    Log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click Logical Cluster. This example uses a cluster in security mode. Therefore, locate the row containing the desired logical cluster, and view Ssl Port in the HTTP Balancer Port column.

    In this example, set this parameter to 192.168.64.10:21426,192.168.64.122:21426.

    native_dataSource_ip_list

    Addresses and TCP ports of the ClickHouseBalancer instances.

    Log in to FusionInsight Manager and choose Cluster > Services > ClickHouse. Click Logical Cluster, locate the row containing the desired logical cluster, and view Port in the TCP Balancer Port column.

    In this example, set this parameter to 192.168.64.10:21424,192.168.64.122:21424.

  4. Develop the application in this sample project through the clickhouse-jdbc API. For details about the code snippets of each function, see ClickHouse Sample Code.

    • Setting up a connection: Set up a connection to the ClickHouse service instance.

      During connection setup, the user information configured in Table 1 is passed as the authentication credential for security authentication on the server.

      clickHouseProperties.setPassword(userPass);
      clickHouseProperties.setUser(userName);
      BalancedClickhouseDataSource balancedClickhouseDataSource = new BalancedClickhouseDataSource(JDBC_PREFIX + UriList, clickHouseProperties);
    • Creating a database: Create a ClickHouse database.
      Execute the on cluster statement to create a database in the cluster.
      private void createDatabase(String databaseName, String clusterName) throws Exception  {    
           String createDbSql = "create database if not exists " + databaseName + " on cluster " + clusterName;    
           util.exeSql(createDbSql);
      }
    • Creating a table: Create a table in the ClickHouse database.

      Execute the on cluster statement to create a ReplicatedMerge table and a Distributed table in the cluster.

      private void createTable(String databaseName, String tableName, String clusterName) throws Exception { 
         String createSql = "create table " + databaseName + "." + tableName + " on cluster " + clusterName  + " (name String, age UInt8, date Date)engine=ReplicatedMergeTree('/clickhouse/tables/{shard}/" + databaseName + "." + tableName + "'," + "'{replica}') partition by toYYYYMM(date) order by age"; 
         String createDisSql = "create table " + databaseName + "." + tableName + "_all" + " on cluster " + clusterName + " as " + databaseName + "." + tableName + " ENGINE = Distributed(default_cluster," + databaseName + "," + tableName + ", rand());";    ArrayList<String> sqlList = new ArrayList<String>();   
         sqlList.add(createSql);   
         sqlList.add(createDisSql);  
         util.exeSql(sqlList);
      }
    • Inserting data: Insert data into the ClickHouse table.
      Insert data into the created table. The table created in this example has three columns, String, UInt8, and Date.
      String insertSql = "insert into " + databaseName + "." + tableName + " values (?,?,?)";
      PreparedStatement preparedStatement = connection.prepareStatement(insertSql);
      long allBatchBegin = System.currentTimeMillis();
      for (int j = 0; j < batchNum; j++) {
          for (int i = 0; i < batchRows; i++) { 
             preparedStatement.setString(1, "huawei_" + (i + j * 10));
             preparedStatement.setInt(2, ((int) (Math.random() * 100)));
             preparedStatement.setDate(3, generateRandomDate("2018-01-01", "2021-12-31")); 
             preparedStatement.addBatch();
          }
         long begin = System.currentTimeMillis();
         preparedStatement.executeBatch();    
         long end = System.currentTimeMillis();
         log.info("Inert batch time is {} ms", end - begin);
      }
      long allBatchEnd = System.currentTimeMillis();
      log.info("Inert all batch time is {} ms", allBatchEnd - allBatchBegin);

Building and Running the Application

If you can access the MRS cluster from your local PC, you can commission and run the application locally.

  1. In the clickhouse-examples project of IntelliJ IDEA, click Run 'Demo' to run the application project.

    Figure 2 Running the ClickHouse Demo application

  2. View the output on the console, as shown in the following figure. You can see that the ClickHouse table is created and data is inserted successfully.

    ...
    2023-06-03 11:30:27,127 | INFO  | main | Execute query:create table testdb.testtb on cluster default_cluster (name String, age UInt8, date Date)engine=ReplicatedMergeTree('/clickhouse/tables/{shard}/testdb.testtb','{replica}') partition by toYYYYMM(date) order by age | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68)
    2023-06-03 11:30:27,412 | INFO  | main | Execute time is 284 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72)
    2023-06-03 11:30:27,412 | INFO  | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63)
    2023-06-03 11:30:28,426 | INFO  | main | Execute query:create table testdb.testtb_all on cluster default_cluster as testdb.testtb ENGINE = Distributed(default_cluster,testdb,testtb, rand()); | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68)
    2023-06-03 11:30:28,686 | INFO  | main | Execute time is 259 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72)
    2023-06-03 11:30:28,686 | INFO  | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.insertData(Util.java:137)
    2023-06-03 11:30:29,784 | INFO  | main | Insert batch time is 227 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:31,490 | INFO  | main | Insert batch time is 200 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:33,337 | INFO  | main | Insert batch time is 335 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:35,295 | INFO  | main | Insert batch time is 454 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:37,077 | INFO  | main | Insert batch time is 275 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:38,811 | INFO  | main | Insert batch time is 218 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:40,468 | INFO  | main | Insert batch time is 144 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:42,216 | INFO  | main | Insert batch time is 238 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:43,977 | INFO  | main | Insert batch time is 257 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:45,756 | INFO  | main | Insert batch time is 277 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154)
    2023-06-03 11:30:47,270 | INFO  | main | Inert all batch time is 17720 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:158)
    2023-06-03 11:30:47,271 | INFO  | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63)
    2023-06-03 11:30:47,828 | INFO  | main | Execute query:select * from testdb.testtb_all order by age limit 10 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68)
    2023-06-03 11:30:47,917 | INFO  | main | Execute time is 89 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72)
    2023-06-03 11:30:47,918 | INFO  | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63)
    2023-06-03 11:30:48,580 | INFO  | main | Execute query:select toYYYYMM(date),count(1) from testdb.testtb_all group by toYYYYMM(date) order by count(1) DESC limit 10 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68)
    2023-06-03 11:30:48,680 | INFO  | main | Execute time is 99 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72)
    2023-06-03 11:30:48,682 | INFO  | main | name	age	date	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,682 | INFO  | main | huawei_89	3	2021-02-21	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,682 | INFO  | main | huawei_81	3	2020-05-27	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,682 | INFO  | main | huawei_70	4	2021-10-28	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,682 | INFO  | main | huawei_73	4	2020-03-23	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_44	5	2020-12-10	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_29	6	2021-10-12	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_74	6	2021-03-03	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_38	7	2020-05-30	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_57	8	2020-09-27	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | huawei_23	8	2020-08-08	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,683 | INFO  | main | toYYYYMM(date)	count()	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,684 | INFO  | main | 202005	8	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,684 | INFO  | main | 202007	7	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,684 | INFO  | main | 202004	6	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,684 | INFO  | main | 202009	6	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,684 | INFO  | main | 202103	6	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,685 | INFO  | main | 202012	6	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,685 | INFO  | main | 202010	5	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,685 | INFO  | main | 202112	5	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,685 | INFO  | main | 202003	5	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,685 | INFO  | main | 202104	4	 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159)
    2023-06-03 11:30:48,689 | INFO  | main | Use HA module. | ru.yandex.clickhouse.BalancedClickhouseDataSource.<init>(BalancedClickhouseDataSource.java:122)
    2023-06-03 11:30:51,651 | INFO  | main | Name is: huawei_89, age is: 3 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,651 | INFO  | main | Name is: huawei_81, age is: 3 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,651 | INFO  | main | Name is: huawei_70, age is: 4 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,651 | INFO  | main | Name is: huawei_73, age is: 4 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,652 | INFO  | main | Name is: huawei_44, age is: 5 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,652 | INFO  | main | Name is: huawei_29, age is: 6 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,652 | INFO  | main | Name is: huawei_74, age is: 6 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,652 | INFO  | main | Name is: huawei_38, age is: 7 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,654 | INFO  | main | Name is: huawei_57, age is: 8 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    2023-06-03 11:30:51,654 | INFO  | main | Name is: huawei_23, age is: 8 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73)
    ...

  3. Install the MRS cluster client and log in to the ClickHouse client.

    For example, if the client installation directory is /opt/client, log in to the node where the client is installed as the client installation user.

    cd /opt/client

    source bigdata_env

    kinit developuser

  4. Run the following command to connect to the ClickHouse server:

    clickhouse client --host IP address of the ClickHouseServer instance --port Connection port --secure

    To obtain the IP address of the ClickHouse instance, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click the Instance tab. You can obtain the connection port by searching for the tcp_port_secure parameter in the ClickHouse service configurations.

    For example, run the following command:

    clickhouse client --host 192.168.64.10 --port 21427 --secure

  5. Run the following command to view the table content created by the application:

    select * from testdb.testtb;

    ┌─name──────┬─age─┬───────date─┐
    │ huawei_70 │   4 │ 2021-10-28 │
    │ huawei_29 │   6 │ 2021-10-12 │
    │ huawei_16 │  28 │ 2021-10-04 │
    │ huawei_15 │  29 │ 2021-10-03 │
    └───────────┴─────┴────────────┘
    ...