Quickly Creating a ClickHouse Cluster and Performing Statistical Analysis
ClickHouse offers easy-to-use, flexible, and stable hosting services in the cloud. A data warehouse can be created in minutes for massive real-time data query and analysis, improving the overall efficiency of data value mining. By leveraging the massively parallel processing (MPP) architecture, ClickHouse can query data several times faster than conventional data warehouses.
Background Information
Assume that there is a student score table and you need to use ClickHouse to perform the following operations:
- Create the user information table demo_t.
- Add the user gender and subject to the user information.
- Query basic user information by user ID.
- Delete the user information table after the service ends.
Table 1 Score table user_id
name
sex
subject
score
time
10000
A
1
Chinese
89
2023-07-01 09:00:00
10001
B
0
Math
132
2023-07-01 09:00:00
10002
C
0
Math
90
2023-07-02 09:00:00
10003
D
0
English
120
2023-07-01 14:00:00
10004
E
1
Chinese
101
2023-07-01 09:00:00
10005
F
1
Chinese
110
2023-07-01 09:00:00
Table 2 Description Parameter
Value
Description
user_id
10000
User ID, which uniquely identifies a user.
name
2023-07-01 09:00:00
Data import time
sex
A
Student name
subject
1
Gender female (0 indicates male).
score
Chinese
Discipline
time
89
Score
Preparations
- Sign up for a HUAWEI ID and enable Huawei Cloud services. For details, see Signing Up for a HUAWEI ID and Enabling Huawei Cloud Services. The account cannot be in arrears or frozen.
- Create a VPC and subnet. For details, see Creating a VPC and Subnet.
Step 1: Buying a ClickHouse Cluster
- Log in to the CloudTable console.
- Select a region in the upper left corner.
- Click Cluster Management.
- Click Buy Cluster in the upper right corner of the Cluster Management page and set related parameters. For details about how to configure ports for security group rules, see ClickHouse security group rules.
- Click Buy Now. On the displayed page, confirm the specifications and click Finish.
- Return to the cluster list to view the cluster creation progress. If the cluster status is Running, the cluster is successfully created. For details, see Creating a ClickHouse Cluster.
Table 3 ClickHouse security group rules Direction
Action
Port/Range
Type
Destination/Source Address
Description
Outbound
Allow
All
IPv4/IPv6
0.0.0.0/0
Permit in the outbound direction
Inbound
Allow
8123
Security group of the CloudTable ClickHouse cluster
ClickHouse HTTP port number
Allow
9000
ClickHouse TCP port number
Allow
8443
ClickHouse HTTPS port number
Allow
9440
Secure TCP security port of ClickHouse
Allow
2181
ZooKeeper client connection monitoring port
Step 2: Downloading the ClickHouse Client and Verification File.
- Log in to the CloudTable console.
- Select a region in the upper left corner.
- Click Help in the navigation pane.
- Choose Download the ClickHouse Client under Helpful Links on the right of the help page to download the client installation package.
- Click Download Client Verification File to download the verification file.
Step 3: Preparing an ECS
- Purchase an ECS and log in to the ECS console.
- Select a region in the upper left corner.
- In the service list on the left, choose Computing > Elastic Cloud Server. The Elastic Cloud Server page is displayed.
- Click Buy ECS in the upper right corner. The parameter configuration page is displayed.
- Configure ECS parameters, including basic settings, instance, OS, storage replica, network, security group, public access, ECS management, advanced settings, and quantity.
- Check the configurations, select the agreement, and click Submit. After the ECS is created, it will be started by default.
For details, see Purchasing an ECS.
To ensure successful connection of the cluster to the VPC, the security group configurations must align with those of the ECS.
Step 4: Adding a Security Group
- Obtain the IP address of the local host. Press Win+R. The Run dialog box is displayed.
- Enter cmd in the text box and click OK. The cmd window is displayed.
- Enter ipconfig in the command window and press Enter to query the IP address of the local host.
- Log in to the ECS console.
- On the ECS list page, click the ECS name. On the Basic Information tab page, click the Security Group tab. On the displayed page, click Inbound Rules.
- Click Add Rule in the upper right corner of the page.
- Enter the local IP address obtained in 3 as the source IP address. Click OK. The security group is added.
Step 5: Installing and Verifying the ClickHouse Client
You can manually install the client on an ECS.
- Use the SSH login tool (such as PuTTY) to log in to the Linux ECS through the EIP.
For details about how to log in to the ECS, see "Remotely Logging In to a Linux ECS (Using an SSH Password)" in Logging In to a Linux ECS of the Elastic Cloud Server User Guide.
- Upload the client downloaded in 2 to the Linux ECS.
- Install the client and connect to the cluster.
- Use the SSH login tool to remotely log in to the Linux ECS through the EIP.
For details, see Login Using an SSH Password in the Elastic Cloud Server User Guide.
- Go to the root directory of the SSH login tool.
cd /
- Create a folder in the root directory.
mkdir Folder name
- Go to the directory of the created folder.
cd /Folder name/
- Place the client in the directory.
- Decompress the client package.
tar -zxf Client package name
- Decompress the client verification file to the same directory as the client.
- Decompress the client verification file.
cd <Path for storing the client verification file > tar xzvf Client_sha256.tar.gz
- Obtain the client verification code.
sha256sum ClickHouse_Client_23.3.tar.gz
- Check the verification code in the client verification file and compare it with the client verification code. A match indicates no tampering, while a mismatch suggests tampering.
less ClickHouse_Client_23.3.tar.gz.sha256
- Decompress the client verification file.
- Load the .so file.
sh install.sh
- Go to the bin directory.
cd bin/
Grant the 700 permission to the directory.
chmod 700 clickhouse
- Connect to the port of the ClickHouse cluster.
Use the following command to connect to a normal cluster.
./clickhouse client --host Private IP address of the cluster --port 9000 --user admin --password Password
For details about the security cluster connection commands, see Configuring Secure Channel Encryption for ClickHouse Clusters.
./clickhouse client --host Private IP address of the cluster --port 9440 --user admin --password Password --secure --config-file /root/config.xml
- Private IP Address: cluster access address on the cluster details page. Replace it with the access address of the cluster you purchased.
- Password: the password set when you purchase the cluster. If there are special characters, use backslashes (\) to escape them. If the password is enclosed in single quotation marks ('), the special characters do not need to be escaped.
- Use the SSH login tool to remotely log in to the Linux ECS through the EIP.
Step 6: Inserting Data
- Create a database.
create database DB_demo;
- Use the database.
use DB_demo;
- Create a table.
create table DB_demo_t(user_id Int32,name String,sex Tinyint ,subject String,score Int32,time datetime)engine=TinyLog;
- Insert data.
insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10000','A','1','Chinese','89','2023-07-01 09:00:00'); insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10001','B','0','Math','132','2023-07-01 09:00:00'); insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10002','C','0','Math','90','2023-07-02 09:00:00'); insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10003','D','0','English','120','2023-07-01 14:00:00'); insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10004','E','1','Chinese','101','2023-07-01 09:00:00'); insert into DB_demo_t(user_id,name,sex,subject,score,time) values('10005','F','1','Chinese','110','2023-07-01 09:00:00');
- Query the data.
- Query the imported data.
host-172-16-13-95 :) select * from DB_demo_t; SELECT * FROM DB_demo_t Query id: 4e119f77-0592-4131-bbe2-31f42bc069a1 ┌─user_id─┬─name─┬─sex─┬─subject─┬─score─┬────────────────time─┐ │ 10000 │ A │ 1 │ Chinese │ 89 │ 2023-07-01 09:00:00 │ │ 10001 │ B │ 0 │ Math │ 132 │ 2023-07-01 09:00:00 │ │ 10002 │ C │ 0 │ Math │ 90 │ 2023-07-02 09:00:00 │ │ 10003 │ D │ 0 │ English │ 120 │ 2023-07-01 14:00:00 │ │ 10004 │ E │ 1 │ Chinese │ 101 │ 2023-07-01 09:00:00 │ │ 10005 │ F │ 1 │ Chinese │ 110 │ 2023-07-01 09:00:00 │ └─────────┴──────┴─────┴────────┴───────┴─────────────────────┘ 6 rows in set. Elapsed: 0.004 sec.
- Query the imported data.
- Delete data.
- Delete the table.
drop table DB_demo_t;
- Delete the database.
drop database DB_demo;
- Delete the table.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot