Updated on 2023-07-20 GMT+08:00

Using ClickHouse from Scratch

ClickHouse is a column-oriented database for online analytical processing (OLAP). It supports SQL queries and delivers good query performance. It performs especially well for aggregation, analysis, and queries on large, wide tables, where it can be an order of magnitude faster than other analytical databases.

Prerequisites

The client has been installed, for example, in /opt/client. This directory is used only as an example in the following operations; replace it with your actual installation directory. Before using the client, download and update the client configuration file, and ensure that the active management node of Manager is available.

Procedure

  1. Install the client. For details, see Installing a Client.
  2. Log in to the node where the client is installed as the client installation user.
  3. Run the following command to go to the client installation directory:

    cd /opt/client

  4. Run the following command to configure environment variables:

    source bigdata_env

  5. If Kerberos authentication has been enabled for the current cluster, run the following command to authenticate the user. The current user must have the permission to create ClickHouse tables. To grant this permission, bind the required roles to the user as described in ClickHouse User and Permission Management. If Kerberos authentication is disabled for the current cluster, skip this step.

    For an MRS 3.1.0 cluster, run the export CLICKHOUSE_SECURITY_ENABLED=true command first.

    kinit Component service user

    Example: kinit clickhouseuser
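Steps 3 to 5 can be collected into one small helper. The following is a minimal sketch that only prints the commands instead of running them, so that they can be reviewed first. The function name print_setup_commands and its arguments are hypothetical, and /opt/client and clickhouseuser are the example values used above.

```shell
# Hypothetical helper: print the setup commands for steps 3 to 5 without running them.
# Arguments: client directory, MRS version, Kerberos flag (true/false), user name.
print_setup_commands() (
  client_dir="$1"; mrs_version="$2"; kerberos="$3"; user="$4"
  echo "cd $client_dir"
  echo "source bigdata_env"
  # Step 5 applies only when Kerberos authentication is enabled.
  if [ "$kerberos" = "true" ]; then
    # An MRS 3.1.0 cluster needs this variable to be exported before kinit.
    if [ "$mrs_version" = "3.1.0" ]; then
      echo "export CLICKHOUSE_SECURITY_ENABLED=true"
    fi
    echo "kinit $user"
  fi
)

# Example: MRS 3.1.0 cluster with Kerberos authentication enabled.
print_setup_commands /opt/client 3.1.0 true clickhouseuser
```

For a cluster with Kerberos authentication disabled, pass false as the third argument; only the cd and source commands are then printed.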

  6. Run the client command of the ClickHouse component.

    Run the clickhouse -h command to view the command help of ClickHouse.

    The command output is as follows:

    Use one of the following commands:
    clickhouse local [args] 
    clickhouse client [args] 
    clickhouse benchmark [args] 
    clickhouse server [args] 
    clickhouse performance-test [args] 
    clickhouse extract-from-config [args] 
    clickhouse compressor [args] 
    clickhouse format [args] 
    clickhouse copier [args] 
    clickhouse obfuscator [args]
    ...

    For MRS 3.1.0, run the clickhouse client command to connect to the ClickHouse server.

    • Using non-SSL mode for login when Kerberos authentication is disabled for the current cluster:

      clickhouse client --host IP address of the ClickHouse instance --port 9000 --user Username --password

      Enter the user password.

    • Using SSL for login when Kerberos authentication is enabled for the current cluster:

      There are no default users in clusters with Kerberos authentication enabled. You must create a user on FusionInsight Manager. For details about how to create a user, see ClickHouse User and Permission Management.

      After the user is authenticated, you do not need to specify the --user and --password parameters when logging in to the client as the authenticated user.

      clickhouse client --host IP address of the ClickHouse instance --port 9440 --secure

    For MRS 3.1.2 or later, run the clickhouse client command to connect to the ClickHouse server.

    • Using non-SSL mode for login when Kerberos authentication is disabled for the current cluster:

      clickhouse client --host IP address of the ClickHouse instance --port 9000 --user Username --password

      Enter the user password.

    • Using SSL for login when Kerberos authentication is enabled for the current cluster:

      There are no default users in clusters with Kerberos authentication enabled. You must create a user on FusionInsight Manager. For details about how to create a user, see ClickHouse User and Permission Management.

      clickhouse client --host IP address of the ClickHouse instance --port 9440 --user Username --password --secure

      Enter the user password.

    Run the quit; command to close the connection to the ClickHouse server.
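The choice between the login commands above depends on the MRS version and on whether Kerberos authentication is enabled. The following is a minimal sketch of that decision, which prints the matching command rather than running it. The function name print_client_command is hypothetical, 192.0.2.10 is a placeholder address, and 9000 and 9440 are the default ports, which may differ in your cluster.

```shell
# Hypothetical helper: print the clickhouse client command for a given cluster setup.
# Arguments: MRS version, Kerberos flag (true/false), host, user name.
print_client_command() (
  mrs_version="$1"; kerberos="$2"; host="$3"; user="$4"
  if [ "$kerberos" != "true" ]; then
    # Kerberos disabled: non-SSL port; the client prompts for the password.
    echo "clickhouse client --host $host --port 9000 --user $user --password"
  elif [ "$mrs_version" = "3.1.0" ]; then
    # MRS 3.1.0 with Kerberos: the user was already authenticated with kinit.
    echo "clickhouse client --host $host --port 9440 --secure"
  else
    # MRS 3.1.2 or later with Kerberos: user name and password are still passed.
    echo "clickhouse client --host $host --port 9440 --user $user --password --secure"
  fi
)

# Example: MRS 3.1.2 cluster with Kerberos authentication enabled.
print_client_command 3.1.2 true 192.0.2.10 clickhouseuser
```

Any version string other than 3.1.0 is treated as "3.1.2 or later" in this sketch, so it does not validate the version.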

    Table 1 describes related parameters.

    Table 1 Parameters of the clickhouse client command

    Parameter

    Description

    --host

    Host name of the server. The default value is localhost. You can use the host name or IP address of the node where the ClickHouse instance is located.

    NOTE:

    You can log in to FusionInsight Manager and choose Cluster > Services > ClickHouse > Instance to obtain the service IP address of the ClickHouseServer instance.

    --port

    Port for connection.

    • If an SSL connection is used, the default port number is 9440 and the --secure parameter must be specified. To check the port number, search for the tcp_port_secure parameter in the ClickHouseServer instance configuration.
    • If a non-SSL connection is used, the default port number is 9000 and the --secure parameter is not required. To check the port number, search for the tcp_port parameter in the ClickHouseServer instance configuration.

    --user

    Username.

    You can create a user on FusionInsight Manager and bind roles to it. For details about how to create a user, see ClickHouse User and Permission Management.

    • If Kerberos authentication has been enabled for the current cluster (the cluster is in security mode) and the user has been authenticated, you do not need to specify the --user and --password parameters when logging in to the client as the authenticated user. Because a cluster with Kerberos authentication enabled has no default user, you must first create this user on Manager.
    • If Kerberos authentication has not been enabled for the current cluster (the cluster is in normal mode), a ClickHouse user created on FusionInsight Manager cannot be used to log in to the client with a username and password. Instead, run the CREATE USER SQL statement on the client to create a ClickHouse user. If you log in to the client without specifying a username and password, the default user is used.

    --password

    Password. The default password is an empty string. This parameter is used together with the --user parameter. You can set a password when creating a user on Manager.

    --query

    Query to process when using non-interactive mode.

    --database

    Current default database. The default value is default, which is the default configuration on the server.

    --multiline

    If this parameter is specified, multiline queries are allowed. (Pressing Enter only inserts a line break and does not mark the end of the query statement.)

    --multiquery

    If this parameter is specified, multiple queries separated with semicolons (;) can be processed. This parameter is valid only in non-interactive mode.

    --format

    Default format used to output the result.

    --vertical

    If this parameter is specified, the result is output in vertical format by default. In this format, each value is printed on a separate line, which helps to display a wide table.

    --time

    If this parameter is specified, the query execution time is printed to stderr in non-interactive mode.

    --stacktrace

    If this parameter is specified, stack trace information will be printed when an exception occurs.

    --config-file

    Name of the configuration file.

    --secure

    If this parameter is specified, the server will be connected in SSL mode.

    --history_file

    Path of the file that records the command history.

    --param_<name>

    Value of a parameter in a query with parameters. The value is passed from the client to the server. For details, see https://clickhouse.tech/docs/en/interfaces/cli/#cli-queries-with-parameters.
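Several of the parameters in Table 1 are typically combined in non-interactive mode. The following is a minimal sketch, assuming a cluster in normal mode that is reached as the default user over the non-SSL port. The run wrapper, the DRY_RUN variable, and the address 192.0.2.10 are hypothetical and not part of the client. By default the commands are only printed; set DRY_RUN=0 to run them.

```shell
# Hypothetical dry-run switch: 1 prints each command, 0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    # Printing with $* drops the original quoting; this is for review only.
    echo "$*"
  else
    "$@"
  fi
}

HOST=192.0.2.10   # placeholder; use the IP address of your ClickHouse instance

# --query and --format: run one statement and output the result as CSV.
run clickhouse client --host "$HOST" --port 9000 --query "SELECT 1" --format CSV

# --multiquery: process several statements separated with semicolons.
run clickhouse client --host "$HOST" --port 9000 --multiquery --query "SELECT 1; SELECT 2"

# --param_<name>: pass the value 5 for the {id:UInt32} placeholder in the query.
run clickhouse client --host "$HOST" --port 9000 --param_id=5 --query "SELECT {id:UInt32}"
```

On a cluster with Kerberos authentication enabled, the same calls would also need --port 9440 and --secure, plus --user and --password where the login commands above require them.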