ClickHouse Application Development
ClickHouse is a column-oriented database for online analytical processing. It supports SQL query and provides good query performance. The aggregation analysis and query performance based on large and wide tables is excellent, which is one order of magnitude faster than other analytical databases.
ClickHouse features:
- High data compression ratio
- Multi-core parallel computing
- Vectorized computing engine
- Support for nested data structure
- Support for sparse indexes
- Support for INSERT and UPDATE
ClickHouse application scenarios:
- Real-time data warehouse
The streaming computing engine (such as Flink) is used to write real-time data to ClickHouse. With the excellent query performance of ClickHouse, multi-dimensional and multi-mode real-time query and analysis requests can be responded within subseconds.
- Offline query
Large-scale service data is imported to ClickHouse to construct a large wide table with hundreds of millions to tens of billions of records and hundreds of dimensions. It supports personalized statistics collection and continuous exploratory query and analysis at any time to assist business decision-making and provide excellent query experience.
MRS provides sample application development projects based on ClickHouse JDBC. This practice provides guidance for you to obtain and import a sample project after creating an MRS cluster and then conduct building and commissioning locally. In this sample project, you can create and delete ClickHouse tables, and insert and query data in the MRS cluster.
Creating an MRS ClickHouse Cluster
- Create and purchase an MRS cluster that contains ClickHouse. For details, see Buying a Custom Cluster.
In this practice, an MRS 3.2.0-LTS.1 cluster, with ClickHouse installed and with Kerberos authentication enabled, is used as an example.
- Click Buy Now and wait until the MRS cluster is created.
Preparing an Application Authentication User
For an MRS cluster with Kerberos authentication enabled, prepare a user who has the operation permission on related components for application authentication.
The following ClickHouse permission configuration example is for reference only. You can modify the configuration as you need.
- After the cluster is created, log in to FusionInsight Manager.
- Choose System > Permission > Role and click Create Role in the right pane.
- Enter a role name, for example, developrole, and click OK.
- In Configure Resource Permission, select the desired cluster, choose ClickHouse, and select SUPER_USER_GROUP.
- Choose System > Permission > User, click Create in the right pane to create a human-machine user, for example, developuser, and add the developrole role.
After the user is created, log in to FusionInsight Manager as developuser and change the initial password as prompted.
Obtaining the Sample Project
- Obtain the sample project from Huawei Mirrors.
Download the Maven project source code and configuration files of the sample project, and configure related development tools on your local PC. For details, see Obtaining Sample Projects from Huawei Mirrors.
Select a branch based on the cluster version and download the required MRS sample project.
For example, the sample project suitable for this practice is clickhouse-examples, which can be obtained at https://github.com/huaweicloud/huaweicloud-mrs-example/tree/mrs-3.2.0.1/src/clickhouse-examples.
- Use IDEA to import the sample project and wait for the Maven project to download the dependency packages. For details, see Configuring and Importing Sample Projects.
Figure 1 ClickHouse sample project
After you configure Maven and SDK parameters on the local PC, the sample project automatically loads related dependency packages.
- In this sample project, the application connects to the ClickHouse server through the IP address and user information in the configuration file. Therefore, after the project is imported, you need to modify the clickhouse-example.properties file in the conf directory of the sample project based on the actual environment information.
loadBalancerIPList=192.168.64.10,192.168.64.122 sslUsed=true loadBalancerHttpPort=21425 loadBalancerHttpsPort=21426 CLICKHOUSE_SECURITY_ENABLED=true user=developuser # Passwords stored in plaintext pose security risks. Store them in ciphertext in configuration files or environment variables. password=User password isMachineUser=false isSupportMachineUser=false clusterName=default_cluster databaseName=testdb tableName=testtb batchRows=10000 batchNum=10 clickhouse_dataSource_ip_list=192.168.64.10:21426,192.168.64.122:21426 native_dataSource_ip_list=192.168.64.10:21424,192.168.64.122:21424
Table 1 Configuration description Configuration Item
Description
loadBalancerIPList
Addresses of the ClickHouseBalancer instances.
To view the instance IP addresses, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click Instance.
In this example, set this parameter to 192.168.64.10,192.168.64.122.
sslUsed
Whether to enable SSL encryption. Set this parameter to true for clusters in security mode.
loadBalancerHttpPort
HTTP and HTTPS port numbers of the load balancer.
Log in to FusionInsight Manager and choose Cluster > Services > ClickHouse. Click Logical Cluster, locate the row containing the desired logical cluster, and view Port and Ssl Port in the HTTP Balancer Port column.
loadBalancerHttpsPort
CLICKHOUSE_SECURITY_ENABLED
Whether to enable the ClickHouse security mode.
In this example, set this parameter to true.
user
Authentication information of the development user. For a machine-machine user, leave password empty.
Passwords stored in plaintext pose security risks. Store them in ciphertext in configuration files or environment variables.
password
isMachineUser
Whether the authentication user is a machine-machine user.
isSupportMachineUser
Whether to support authentication of a machine-machine user. In this example, set this parameter to false.
clusterName
Name of the ClickHouse logical cluster connected to the application. In this example, retain the default value default_cluster.
databaseName
Names of the database and table to be created in the sample project. You can change the names based on site requirements.
tableName
batchRows
Number of data records to be written in a batch. In this example, set this parameter to 10.
batchNum
Total number of batches for writing data. Retain the default value in this example.
clickhouse_dataSource_ip_list
Addresses and HTTP ports of the ClickHouseBalancer instances.
Log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click Logical Cluster. This example uses a cluster in security mode. Therefore, locate the row containing the desired logical cluster, and view Ssl Port in the HTTP Balancer Port column.
In this example, set this parameter to 192.168.64.10:21426,192.168.64.122:21426.
native_dataSource_ip_list
Addresses and TCP ports of the ClickHouseBalancer instances.
Log in to FusionInsight Manager and choose Cluster > Services > ClickHouse. Click Logical Cluster, locate the row containing the desired logical cluster, and view Port in the TCP Balancer Port column.
In this example, set this parameter to 192.168.64.10:21424,192.168.64.122:21424.
- Develop the application in this sample project through the clickhouse-jdbc API. For details about the code snippets of each function, see ClickHouse Sample Code.
- Setting up a connection: Set up a connection to the ClickHouse service instance.
During connection setup, the user information configured in Table 1 is passed as the authentication credential for security authentication on the server.
clickHouseProperties.setPassword(userPass); clickHouseProperties.setUser(userName); BalancedClickhouseDataSource balancedClickhouseDataSource = new BalancedClickhouseDataSource(JDBC_PREFIX + UriList, clickHouseProperties);
- Creating a database: Create a ClickHouse database.
- Creating a table: Create a table in the ClickHouse database.
Execute the on cluster statement to create a ReplicatedMerge table and a Distributed table in the cluster.
private void createTable(String databaseName, String tableName, String clusterName) throws Exception { String createSql = "create table " + databaseName + "." + tableName + " on cluster " + clusterName + " (name String, age UInt8, date Date)engine=ReplicatedMergeTree('/clickhouse/tables/{shard}/" + databaseName + "." + tableName + "'," + "'{replica}') partition by toYYYYMM(date) order by age"; String createDisSql = "create table " + databaseName + "." + tableName + "_all" + " on cluster " + clusterName + " as " + databaseName + "." + tableName + " ENGINE = Distributed(default_cluster," + databaseName + "," + tableName + ", rand());"; ArrayList<String> sqlList = new ArrayList<String>(); sqlList.add(createSql); sqlList.add(createDisSql); util.exeSql(sqlList); }
- Inserting data: Insert data into the ClickHouse table.
Insert data into the created table. The table created in this example has three columns, String, UInt8, and Date.
String insertSql = "insert into " + databaseName + "." + tableName + " values (?,?,?)"; PreparedStatement preparedStatement = connection.prepareStatement(insertSql); long allBatchBegin = System.currentTimeMillis(); for (int j = 0; j < batchNum; j++) { for (int i = 0; i < batchRows; i++) { preparedStatement.setString(1, "huawei_" + (i + j * 10)); preparedStatement.setInt(2, ((int) (Math.random() * 100))); preparedStatement.setDate(3, generateRandomDate("2018-01-01", "2021-12-31")); preparedStatement.addBatch(); } long begin = System.currentTimeMillis(); preparedStatement.executeBatch(); long end = System.currentTimeMillis(); log.info("Inert batch time is {} ms", end - begin); } long allBatchEnd = System.currentTimeMillis(); log.info("Inert all batch time is {} ms", allBatchEnd - allBatchBegin);
- Setting up a connection: Set up a connection to the ClickHouse service instance.
Building and Running the Application
If you can access the MRS cluster from your local PC, you can commission and run the application locally.
- In the clickhouse-examples project of IntelliJ IDEA, click Run 'Demo' to run the application project.
Figure 2 Running the ClickHouse Demo application
- View the output on the console, as shown in the following figure. You can see that the ClickHouse table is created and data is inserted successfully.
... 2023-06-03 11:30:27,127 | INFO | main | Execute query:create table testdb.testtb on cluster default_cluster (name String, age UInt8, date Date)engine=ReplicatedMergeTree('/clickhouse/tables/{shard}/testdb.testtb','{replica}') partition by toYYYYMM(date) order by age | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68) 2023-06-03 11:30:27,412 | INFO | main | Execute time is 284 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72) 2023-06-03 11:30:27,412 | INFO | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63) 2023-06-03 11:30:28,426 | INFO | main | Execute query:create table testdb.testtb_all on cluster default_cluster as testdb.testtb ENGINE = Distributed(default_cluster,testdb,testtb, rand()); | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68) 2023-06-03 11:30:28,686 | INFO | main | Execute time is 259 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72) 2023-06-03 11:30:28,686 | INFO | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.insertData(Util.java:137) 2023-06-03 11:30:29,784 | INFO | main | Insert batch time is 227 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:31,490 | INFO | main | Insert batch time is 200 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:33,337 | INFO | main | Insert batch time is 335 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:35,295 | INFO | main | Insert batch time is 454 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:37,077 | INFO | main | Insert batch time is 275 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:38,811 | INFO | main | Insert batch time is 218 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:40,468 | INFO | main | Insert batch time is 144 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:42,216 | INFO | main | Insert batch time is 238 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:43,977 | INFO | main | Insert batch time is 257 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:45,756 | INFO | main | Insert batch time is 277 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:154) 2023-06-03 11:30:47,270 | INFO | main | Inert all batch time is 17720 ms | com.huawei.clickhouse.examples.Util.insertData(Util.java:158) 2023-06-03 11:30:47,271 | INFO | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63) 2023-06-03 11:30:47,828 | INFO | main | Execute query:select * from testdb.testtb_all order by age limit 10 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68) 2023-06-03 11:30:47,917 | INFO | main | Execute time is 89 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72) 2023-06-03 11:30:47,918 | INFO | main | Current load balancer is 192.168.64.10:21426 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:63) 2023-06-03 11:30:48,580 | INFO | main | Execute query:select toYYYYMM(date),count(1) from testdb.testtb_all group by toYYYYMM(date) order by count(1) DESC limit 10 | com.huawei.clickhouse.examples.Util.exeSql(Util.java:68) 2023-06-03 11:30:48,680 | INFO | main | Execute time is 99 ms | com.huawei.clickhouse.examples.Util.exeSql(Util.java:72) 2023-06-03 11:30:48,682 | INFO | main | name age date | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,682 | INFO | main | huawei_89 3 2021-02-21 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,682 | INFO | main | huawei_81 3 2020-05-27 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,682 | INFO | main | huawei_70 4 2021-10-28 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,682 | INFO | main | huawei_73 4 2020-03-23 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_44 5 2020-12-10 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_29 6 2021-10-12 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_74 6 2021-03-03 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_38 7 2020-05-30 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_57 8 2020-09-27 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | huawei_23 8 2020-08-08 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,683 | INFO | main | toYYYYMM(date) count() | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,684 | INFO | main | 202005 8 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,684 | INFO | main | 202007 7 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,684 | INFO | main | 202004 6 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,684 | INFO | main | 202009 6 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,684 | INFO | main | 202103 6 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,685 | INFO | main | 202012 6 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,685 | INFO | main | 202010 5 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,685 | INFO | main | 202112 5 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,685 | INFO | main | 202003 5 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,685 | INFO | main | 202104 4 | com.huawei.clickhouse.examples.Demo.queryData(Demo.java:159) 2023-06-03 11:30:48,689 | INFO | main | Use HA module. | ru.yandex.clickhouse.BalancedClickhouseDataSource.<init>(BalancedClickhouseDataSource.java:122) 2023-06-03 11:30:51,651 | INFO | main | Name is: huawei_89, age is: 3 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,651 | INFO | main | Name is: huawei_81, age is: 3 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,651 | INFO | main | Name is: huawei_70, age is: 4 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,651 | INFO | main | Name is: huawei_73, age is: 4 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,652 | INFO | main | Name is: huawei_44, age is: 5 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,652 | INFO | main | Name is: huawei_29, age is: 6 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,652 | INFO | main | Name is: huawei_74, age is: 6 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,652 | INFO | main | Name is: huawei_38, age is: 7 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,654 | INFO | main | Name is: huawei_57, age is: 8 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) 2023-06-03 11:30:51,654 | INFO | main | Name is: huawei_23, age is: 8 | com.huawei.clickhouse.examples.ClickhouseJDBCHaDemo.queryData(ClickhouseJDBCHaDemo.java:73) ...
- Install the MRS cluster client and log in to the ClickHouse client.
For example, if the client installation directory is /opt/client, log in to the node where the client is installed as the client installation user.
cd /opt/client
source bigdata_env
kinit developuser
- Run the following command to connect to the ClickHouse server:
clickhouse client --host IP address of the ClickHouseServer instance --port Connection port --secure
To obtain the IP address of the ClickHouse instance, log in to FusionInsight Manager, choose Cluster > Services > ClickHouse, and click the Instance tab. You can obtain the connection port by searching for the tcp_port_secure parameter in the ClickHouse service configurations.
For example, run the following command:
clickhouse client --host 192.168.64.10 --port 21427 --secure
- Run the following command to view the table content created by the application:
select * from testdb.testtb;
┌─name──────┬─age─┬───────date─┐ │ huawei_70 │ 4 │ 2021-10-28 │ │ huawei_29 │ 6 │ 2021-10-12 │ │ huawei_16 │ 28 │ 2021-10-04 │ │ huawei_15 │ 29 │ 2021-10-03 │ └───────────┴─────┴────────────┘ ...
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot