Updated on 2024-11-12 GMT+08:00

Running ClickHouse in CCE

ClickHouse is a columnar database management system for online analytical processing (OLAP). It is suitable for real-time query and analysis of large-scale datasets. There are four ways to deploy ClickHouse on containers. For details, see Table 1. ClickHouse Operator is a tool for deploying and managing ClickHouse in Kubernetes clusters. It can replicate clusters and manage users, configuration files, and persistent volumes. These functions simplify application configuration, management, and monitoring.

Table 1 ClickHouse deployment on containers

Deployment Method    | Difficulty in Deployment | Difficulty in Management
Native Kubectl       | Difficult                | Difficult
Kubectl and Operator | Medium                   | Medium
Helm                 | Easy                     | Difficult
Helm and Operator    | Easy                     | Easy

The following describes how to deploy ClickHouse in a CCE cluster using Kubectl and Operator. For details, see https://github.com/Altinity/clickhouse-operator.

Prerequisites

A CCE cluster is available, and kubectl has been configured to access the cluster.

Procedure for Deploying ClickHouse

The following describes how to deploy ClickHouse in a CCE Turbo cluster running v1.29. For details about the cluster parameters, see Table 2.

Table 2 Cluster parameters

Parameter          | Value
Type               | CCE Turbo Cluster
Cluster Version    | 1.29
Region             | AP-Singapore
Container Engine   | Containerd
Network Model      | Cloud Native Network 2.0
Request Forwarding | iptables

  1. Create a ClickHouse Operator.

    1. Download the YAML file clickhouse-operator-install-bundle.yaml from https://github.com/Altinity/clickhouse-operator/blob/master/deploy/operator/clickhouse-operator-install-bundle.yaml.

      clickhouse-operator-install-bundle.yaml deploys the ClickHouse Operator in the kube-system namespace, where it monitors resources in all Kubernetes namespaces. If the ClickHouse Operator is deployed in another namespace, it only monitors resources in that namespace (a sketch of this variant follows the installation output below).

      Use the downloaded file to deploy the ClickHouse Operator:

      kubectl apply -f clickhouse-operator-install-bundle.yaml

      Information similar to the following is displayed:

      customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
      customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
      customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
      ...
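
      If you want the operator to watch only a single namespace, you can deploy it into that namespace instead of kube-system. A minimal sketch, assuming a namespace name of your own choosing (clickhouse-operator-system here is hypothetical); the bundle hard-codes kube-system, so its namespace references must be rewritten before applying:

      # Hypothetical namespace; replace with your own.
      OPERATOR_NS=clickhouse-operator-system
      kubectl create namespace "${OPERATOR_NS}"
      # Rewrite the hard-coded kube-system references in the bundle and apply the result.
      sed "s/namespace: kube-system/namespace: ${OPERATOR_NS}/g" \
        clickhouse-operator-install-bundle.yaml | kubectl apply -f -
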
    2. Check whether the ClickHouse Operator is successfully created.
      kubectl get pod -n kube-system | grep clickhouse

      If the pod status is Running, the ClickHouse Operator was successfully created.

      clickhouse-operator-656d67bd4d-k64gm          2/2      Running    4 (15m ago)    3d23h
    3. Check all CRD resources related to ClickHouse in the cluster.
      kubectl get crd | grep clickhouse

      Information similar to the following is displayed:

      clickhouseinstallations.clickhouse.altinity.com                2024-08-20T09:30:30Z
      clickhouseinstallationtemplates.clickhouse.altinity.com        2024-08-20T09:30:30Z
      clickhousekeeperinstallations.clickhouse-keeper.altinity.com   2024-08-20T09:30:30Z
      clickhouseoperatorconfigurations.clickhouse.altinity.com       2024-08-20T09:30:30Z
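
      Optionally, kubectl explain can be used to browse the fields these CRDs accept, which is helpful when writing ClickHouseInstallation manifests later. A minimal sketch:

      # Show the documented fields under spec.configuration of the ClickHouseInstallation CRD.
      kubectl explain clickhouseinstallations.spec.configuration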

  2. Create a namespace named test-clickhouse-operator. To facilitate verification, all subsequent operations are performed in this namespace.

    kubectl create namespace test-clickhouse-operator

  3. Create a ClickHouse cluster.

    1. Create a YAML file named simple-01.yaml. You can obtain simple-01.yaml from https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/docs/chi-examples/01-simple-layout-01-1shard-1repl.yaml.
      vim simple-01.yaml

      ClickHouseInstallation is a custom resource (CR) introduced by the ClickHouse Operator. After a ClickHouseInstallation resource is created or updated, the ClickHouse Operator automatically creates and manages the corresponding Kubernetes resources, such as StatefulSets, Services, and PersistentVolumeClaims, to ensure that the ClickHouse cluster runs as expected.

      The file content is as follows:

      apiVersion: "clickhouse.altinity.com/v1"
      kind: "ClickHouseInstallation"
      metadata:
         name: "simple-01"
      spec:
        configuration:
          users:
            # printf 'test_password' | sha256sum
            test_user/password_sha256_hex: 10a6e6cc8311a3e2bcc09bf6c199adecd5dd59408c343e926b129c4914f3cb01
            test_user/password: test_password
            # to allow access outside from kubernetes
            test_user/networks/ip:
            - 0.0.0.0/0
          clusters:
          - name: "simple"

      Use the preceding file to create a ClickHouse cluster.

      kubectl apply -n test-clickhouse-operator -f simple-01.yaml
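
      Optionally, you can check the ClickHouseInstallation (CHI) object itself to see how the operator is reconciling it. A minimal sketch; the output columns depend on the operator version:

      # List CHI objects in the namespace (the CRD also registers the short name "chi").
      kubectl get clickhouseinstallations -n test-clickhouse-operator
      # Describe the object to see the StatefulSets, Services, and PVCs it owns, plus recent events.
      kubectl describe clickhouseinstallation simple-01 -n test-clickhouse-operator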

  4. Check whether ClickHouse resources are successfully created.

    1. Check the pods of the test-clickhouse-operator namespace. If all pods are in the Running state, the pods were successfully created.
      kubectl get pod -n test-clickhouse-operator

      Information similar to the following is displayed:

      NAME                               READY   STATUS    RESTARTS      AGE
      chi-simple-01-simple-0-0-0         2/2     Running   0             3d7h
    2. Check the Service resources created for the ClickHouse cluster.
      kubectl get service -n test-clickhouse-operator

      Information similar to the following is displayed:

      NAME                         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                      AGE
      chi-simple-01-simple-0-0     ClusterIP   None          <none>        9000/TCP,8123/TCP,9009/TCP   3d7h
      clickhouse-simple-01         ClusterIP   None          <none>        9000/TCP,8123/TCP            3d8h

  5. Connect to the ClickHouse database.

    kubectl -n test-clickhouse-operator exec -ti chi-simple-01-simple-0-0-0 -- clickhouse-client

    If the following information is displayed, the connection is successful. To exit the ClickHouse database, enter exit and press Enter.

    ClickHouse client version 24.8.2.3 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 24.8.2.
    
    Warnings:
     * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
    
    chi-simple-01-simple-0-0-0.chi-simple-01-simple-0-0.test-clickhouse-operator.svc.cluster.local :) 
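
    Optionally, you can run a query as the test_user account defined in simple-01.yaml to confirm that the server accepts authenticated connections. A minimal sketch:

    # Run a one-off query as test_user (credentials come from simple-01.yaml).
    kubectl -n test-clickhouse-operator exec chi-simple-01-simple-0-0-0 -- \
      clickhouse-client --user test_user --password test_password \
      --query "SELECT version(), currentUser()"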

  6. Clear ClickHouse cluster resources.

    Run the following command to delete the ClickHouse cluster:

    kubectl delete -f simple-01.yaml -n test-clickhouse-operator

    Information similar to the following is displayed:

    clickhouseinstallation.clickhouse.altinity.com "simple-01" deleted

Example 1: Creating a ClickHouse Cluster with a PV Provisioned Dynamically

The following example describes how to create a ClickHouse cluster with a dynamically provisioned PV, using EVS disks as the underlying storage.

The VolumeClaimTemplate can only be used to provision EVS disks and local PVs to StatefulSets.

  1. Create a StorageClass.

    1. Create a YAML file named csi-disk-ssd.yaml.
      vim csi-disk-ssd.yaml

      By default, CCE provides a StorageClass for SAS disks. To use another type of disk, create the corresponding StorageClass. For details about StorageClass parameters, see Table 3.

      allowVolumeExpansion: true
      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: csi-disk-ssd
      provisioner: everest-csi-provisioner
      parameters:
        csi.storage.k8s.io/csi-driver-name: disk.csi.everest.io
        csi.storage.k8s.io/fstype: ext4
        everest.io/disk-volume-type: SSD
        everest.io/passthrough: "true"
      reclaimPolicy: Delete
      volumeBindingMode: Immediate
      Table 3 StorageClass parameters

      provisioner
        Specifies the storage resource provider, which is the Everest add-on for CCE. Set this parameter to everest-csi-provisioner.

      parameters
        Specifies the storage parameters, which vary with storage types.
        NOTICE: everest.io/disk-volume-type indicates the cloud disk type, which can be any of the following:
        • SAS: high I/O
        • SSD: ultra-high I/O
        • GPSSD: general-purpose SSD
        • ESSD: extreme SSD
        • GPSSD2: general-purpose SSD v2, which is supported when the Everest version is 2.4.4 or later and the everest.io/disk-iops and everest.io/disk-throughput annotations are configured.
        • ESSD2: extreme SSD v2, which is supported when the Everest version is 2.4.4 or later and the everest.io/disk-iops annotation is configured.
        Default: SAS

      reclaimPolicy
        Specifies the value of persistentVolumeReclaimPolicy for creating a PV. The value can be Delete or Retain. If reclaimPolicy is not specified when a StorageClass object is created, the value defaults to Delete.
        • Delete: indicates that a dynamically provisioned PV will be automatically deleted when the PVC is deleted.
        • Retain: indicates that a dynamically provisioned PV will be retained when the PVC is deleted.

      volumeBindingMode
        Specifies when a PV is dynamically provisioned. The value can be Immediate or WaitForFirstConsumer.
        • Immediate: The PV is dynamically provisioned when a PVC is created.
        • WaitForFirstConsumer: The PV is dynamically provisioned when the PVC is used by the workload.
    2. Use csi-disk-ssd.yaml to create a StorageClass named csi-disk-ssd.
      kubectl create -f csi-disk-ssd.yaml
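
      To confirm that the StorageClass has been registered, list it (a quick check; the output is omitted here):

      kubectl get storageclass csi-disk-ssd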

  2. Create a ClickHouse cluster with a PV dynamically provisioned.

    1. Create a YAML file named pv-simple.yaml.
      vim pv-simple.yaml

      For details about the file content, see https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/03-persistent-volume-01-default-volume.yaml.

      EVS disks can be mounted as read-write by a single node, so accessModes must be set to ReadWriteOnce.

      apiVersion: "clickhouse.altinity.com/v1"
      kind: "ClickHouseInstallation"
      metadata:
        name: "pv-simple"
        namespace: test-clickhouse-operator
      spec:
        defaults:
          templates:
            dataVolumeClaimTemplate: data-volume-template
            logVolumeClaimTemplate: log-volume-template
        configuration:
          clusters:
            - name: "simple"
              layout:
                shardsCount: 1
                replicasCount: 1
        templates:
          volumeClaimTemplates:                   # Dynamic provisioning
            - name: data-volume-template          # Template for defining a data storage volume
              spec:
                accessModes:
                  - ReadWriteOnce                 # EVS disks can be mounted as read-write by a single node, so accessModes must be set to ReadWriteOnce.
                resources:
                  requests:
                    storage: 10Gi
                storageClassName: csi-disk-ssd    # Specify the newly created csi-disk-ssd as the StorageClass.
            - name: log-volume-template           # Template for defining the log storage volume
              spec:
                accessModes:
                  - ReadWriteOnce                 # EVS disks can be mounted as read-write by a single node, so accessModes must be set to ReadWriteOnce.
                resources:
                  requests:
                    storage: 10Gi
                storageClassName: csi-disk-ssd    # Specify the newly created csi-disk-ssd as the StorageClass.
    2. Use pv-simple.yaml to create a ClickHouse cluster.
      kubectl -n test-clickhouse-operator create -f pv-simple.yaml

  3. Check whether the ClickHouse cluster is successfully created and whether the PV is successfully provisioned.

    1. Check the pods of the test-clickhouse-operator namespace. If all pods are in the Running state, the pods were successfully created.
      kubectl get pod -n test-clickhouse-operator

      If the following information is displayed, the pods are successfully created.

      NAME                            READY   STATUS    RESTARTS   AGE
      chi-pv-simple-simple-0-0-0      2/2     Running   0          5m2s
      chi-simple-01-simple-0-0-0      1/1     Running   0          3d7h
    2. Check whether the PVCs created from the data-volume-template and log-volume-template templates exist and are bound.
      kubectl get pvc -n test-clickhouse-operator

      If STATUS is Bound, the PVC is bound successfully.

      NAME                                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
      data-volume-template-chi-pv-simple-simple-0-0-0   Bound    pvc-981b1d73-a13e-41d5-aade-ea8c6b1199d7   10Gi       RWO            csi-disk-ssd   <unset>                 28s
      log-volume-template-chi-pv-simple-simple-0-0-0    Bound    pvc-fcf70a2e-131d-4da1-a9c2-eddd89887b45   10Gi       RWO            csi-disk-ssd   <unset>                 28s
    3. Check whether the PV is mounted to the cluster.

      Go to the CLI of the chi-pv-simple-simple-0-0-0 container.

      kubectl -n test-clickhouse-operator exec -ti chi-pv-simple-simple-0-0-0 -c clickhouse -- bash

      Check whether the PV is mounted to the container:

      df -h

      The command output shows that the PV has been mounted to the container. You can press Ctrl+D to exit the CLI.

      Filesystem                    Size    Used   Avail  Use% Mounted on
      overlay                        99G    5.1G     89G    6% /
      tmpfs                          64M       0     64M    0% /dev
      tmpfs                          3.9G      0    3.9G    0% /sys/fs/cgroup
      /dev/mapper/vgpaas-share       99G    5.1G     89G    6% /etc/hosts
      shm                            64M       0     64M    0% /dev/shm
      /dev/sdb                       9.8G    66M    9.8G    1% /var/lib/clickhouse
      /dev/sda                       9.8G    37M    9.8G    1% /var/log/clickhouse-server
      tmpfs                          6.3G     12K    6.3G   1% /run/secrets/kubernetes.io/serviceaccount
      tmpfs                          3.9G       0    3.9G   0% /proc/acpi
      tmpfs                          3.9G       0    3.9G   0% /proc/scsi
      tmpfs                          3.9G       0    3.9G   0% /sys/firmware

  4. Connect to the ClickHouse database.

    kubectl -n test-clickhouse-operator exec -ti chi-pv-simple-simple-0-0-0 -- clickhouse-client

    If the following information is displayed, you have successfully connected to the ClickHouse database. To exit the ClickHouse database, enter exit and press Enter.

    Defaulted container "clickhouse" out of: clickhouse, clickhouse-log
    ClickHouse client version 24.8.2.3 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 24.8.2.
    
    Warnings:
     * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
    
    chi-pv-simple-simple-0-0-0.chi-pv-simple-simple-0-0.test-clickhouse-operator.svc.cluster.local :) 
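
    Optionally, the following is a minimal sketch to confirm that data written by ClickHouse ends up on the mounted data volume (/var/lib/clickhouse). The default.demo table name is illustrative:

    # Create a small MergeTree table, insert a row, and read it back.
    kubectl -n test-clickhouse-operator exec chi-pv-simple-simple-0-0-0 -c clickhouse -- \
      clickhouse-client --query "CREATE TABLE IF NOT EXISTS default.demo (id UInt64) ENGINE = MergeTree ORDER BY id"
    kubectl -n test-clickhouse-operator exec chi-pv-simple-simple-0-0-0 -c clickhouse -- \
      clickhouse-client --query "INSERT INTO default.demo VALUES (1)"
    kubectl -n test-clickhouse-operator exec chi-pv-simple-simple-0-0-0 -c clickhouse -- \
      clickhouse-client --query "SELECT count() FROM default.demo"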

  5. Clear ClickHouse cluster resources.

    Run the following command to delete the ClickHouse cluster with a PV provisioned dynamically:

    kubectl delete -f pv-simple.yaml -n test-clickhouse-operator

    Information similar to the following is displayed:

    clickhouseinstallation.clickhouse.altinity.com "pv-simple" deleted

Example 2: Creating a ClickHouse Cluster with a LoadBalancer Service

The following example describes how to create a ClickHouse cluster with a LoadBalancer Service. The LoadBalancer Service allows you to access the ClickHouse cluster from the Internet.

  1. Create a ClickHouse cluster with a LoadBalancer Service so that you can access the ClickHouse cluster from the Internet.

    1. Create a YAML file named elb.yaml.
      vim elb.yaml

      For details about parameters in kubernetes.io/elb.autocreate, see Table 4.

      apiVersion: "clickhouse.altinity.com/v1"
      kind: "ClickHouseInstallation"
      metadata:
        name: "ck-elb"
        namespace: test-clickhouse-operator
      spec:
        defaults:
          templates:
            dataVolumeClaimTemplate: data-volume-nas                    
            serviceTemplate: chi-service-elb                        
        configuration:
          clusters:
            - name: "ck-elb"
              templates:
                podTemplate: pod-template-with-nas
              layout:
                shardsCount: 1
                replicasCount: 1
        templates:
          podTemplates:
            - name: pod-template-with-nas
              spec:
                containers:
                  - name: clickhouse
                    image: clickhouse/clickhouse-server:23.8
                    volumeMounts:
                      - name: data-volume-nas
                        mountPath: /var/lib/clickhouse
          volumeClaimTemplates:               # Specify the storage access mode, requested storage size, and StorageClass.
            - name: data-volume-nas
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 20Gi
                storageClassName: csi-disk-ssd
          serviceTemplates:                  # Service template
            - name: chi-service-elb 
              metadata:
                annotations:
                  # Load balancer type. union (default value) indicates shared load balancers, and performance indicates dedicated load balancers.
                  kubernetes.io/elb.class: union   
                  # Automatically create a load balancer associated with the Service and define the load balancer parameters.
                  kubernetes.io/elb.autocreate: >-                             
                    {"type":"public","bandwidth_name":"cce-bandwidth-ck","bandwidth_chargemode":"bandwidth","bandwidth_size":5,"bandwidth_sharetype":"PER","eip_type":"5_bgp"}
              spec:
                ports:
                  - name: http
                    port: 8123
                  - name: client
                    port: 9000
                type: LoadBalancer             # Set the Service type to LoadBalancer.
      Table 4 Parameters in the kubernetes.io/elb.autocreate annotation

      type (optional, String)
        Network type of the load balancer.
        • public: public network load balancer
        • inner: private network load balancer
        Default: inner

      bandwidth_name (mandatory for public network load balancers, String)
        Bandwidth name. The default value is cce-bandwidth-******.
        The value can contain 1 to 64 characters. Only letters, digits, underscores (_), hyphens (-), and periods (.) are allowed.

      bandwidth_chargemode (optional, String)
        Bandwidth billing mode.
        • bandwidth: billed by bandwidth
        • traffic: billed by traffic
        Default: bandwidth

      bandwidth_size (mandatory for public network load balancers, Integer)
        Bandwidth size. The value ranges from 1 Mbit/s to 2000 Mbit/s by default. Configure this parameter based on the bandwidth range allowed in your region.
        The minimum increment for bandwidth adjustment varies depending on the bandwidth range:
        • If the allowed bandwidth does not exceed 300 Mbit/s, the minimum increment is 1 Mbit/s.
        • If the allowed bandwidth is greater than 300 Mbit/s but less than or equal to 1000 Mbit/s, the minimum increment is 50 Mbit/s.
        • If the allowed bandwidth exceeds 1000 Mbit/s, the minimum increment is 500 Mbit/s.

      bandwidth_sharetype (mandatory for public network load balancers, String)
        Bandwidth sharing mode. PER indicates that the bandwidth is dedicated.

      eip_type (mandatory for public network load balancers, String)
        EIP type.
        • 5_bgp: Dynamic BGP
        • 5_sbgp: Static BGP
        The available types vary by region. For details, see the EIP console.

    2. Use elb.yaml to create a ClickHouse cluster.
      kubectl create -f elb.yaml -n test-clickhouse-operator

  2. Check whether the ClickHouse cluster is successfully created and associated with a LoadBalancer Service.

    1. Check the pods of the test-clickhouse-operator namespace. If all pods are in the Running state, the pods were successfully created.
      kubectl get pod -n test-clickhouse-operator

      If the following information is displayed, the ClickHouse cluster is successfully created:

      NAME                            READY   STATUS    RESTARTS   AGE
      chi-ck-elb-ck-elb-0-0-0         1/1     Running   0          3m4s
      chi-pv-simple-simple-0-0-0      2/2     Running   0          33m
      chi-simple-01-simple-0-0-0      1/1     Running   0          3d7h
    2. Check whether the LoadBalancer Service is successfully created:
      kubectl get svc -n test-clickhouse-operator

      If the following information is displayed, the LoadBalancer Service is successfully created:

      NAME                          TYPE          CLUSTER-IP     EXTERNAL-IP    PORT(S)                          AGE
      chi-ck-elb-ck-elb-0-0         ClusterIP     None           <none>         9000/TCP,8123/TCP,9009/TCP       2s
      chi-pv-simple-simple-0-0      ClusterIP     None           <none>         9000/TCP,8123/TCP,9009/TCP       35m
      chi-simple-01-simple-0-0      ClusterIP     None           <none>         9000/TCP,8123/TCP,9009/TCP       38m
      clickhouse-pv-simple          ClusterIP     None           <none>         8123/TCP,9000/TCP                35m
      clickhouse-simple-01          ClusterIP     None           <none>         8123/TCP,9000/TCP                3d7h
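
      Once the load balancer has been provisioned, the Service created from the chi-service-elb template is of the LoadBalancer type and exposes an EXTERNAL-IP (the EIP bound to the load balancer). A minimal connectivity sketch; <EXTERNAL-IP> is a placeholder for the address shown in the EXTERNAL-IP column:

      # Find the LoadBalancer Service generated from the chi-service-elb template.
      kubectl get svc -n test-clickhouse-operator | grep LoadBalancer
      # ClickHouse answers "Ok." on the /ping endpoint of its HTTP port (8123).
      curl http://<EXTERNAL-IP>:8123/ping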

  3. Connect to the ClickHouse database.

    kubectl -n test-clickhouse-operator exec -ti chi-ck-elb-ck-elb-0-0-0 -- clickhouse-client

    If the following information is displayed, you have successfully connected to the ClickHouse database. To exit the ClickHouse database, enter exit and press Enter.

    ClickHouse client version 23.8.16.16 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 23.8.16 revision 54465.
    
    Warnings:
     * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
    
    chi-ck-elb-ck-elb-0-0-0.chi-ck-elb-ck-elb-0-0.test-clickhouse-operator.svc.cluster.local :) 

  4. Clear ClickHouse cluster resources.

    Run the following command to delete the ClickHouse cluster associated with the LoadBalancer Service:

    kubectl delete -f elb.yaml -n test-clickhouse-operator

    Information similar to the following is displayed:

    clickhouseinstallation.clickhouse.altinity.com "ck-elb1" deleted

Follow-up Procedure: Clearing Other ClickHouse Resources

  1. Delete the test-clickhouse-operator namespace.
    kubectl delete namespace test-clickhouse-operator

    Information similar to the following is displayed:

    namespace "test-clickhouse-operator" deleted
  2. Delete the ClickHouse Operator:
    kubectl delete -f clickhouse-operator-install-bundle.yaml

    Information similar to the following is displayed:

    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallations.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseinstallationtemplates.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhouseoperatorconfigurations.clickhouse.altinity.com" deleted
    customresourcedefinition.apiextensions.k8s.io "clickhousekeeperinstallations.clickhouse-keeper.altinity.com" deleted
    ...
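
    To confirm that all ClickHouse-related CRDs have been removed, you can run the following command again; it should return no output:

    kubectl get crd | grep clickhouse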