Updated on 2024-12-13 GMT+08:00

Interconnecting ClickHouse with RDS for MySQL

ClickHouse provides efficient data analysis in OLAP scenarios. Through a database engine such as MySQL, it can map tables on a remote database server to the ClickHouse cluster so that the data can be analyzed in ClickHouse. This section describes how to interconnect a ClickHouse cluster with an RDS for MySQL DB instance.

Prerequisites

  • You have prepared the RDS DB instance to interconnect with and obtained the database username and password. For details, see Creating and Connecting to an RDS DB Instance.
  • A ClickHouse cluster has been created and is running properly.

Constraints

  • The RDS database instance and ClickHouse cluster are in the same VPC and subnet.
  • Before synchronizing data, you need to evaluate the impact on the performance of the source and destination databases. You are advised to synchronize data during off-peak hours.
  • Currently, ClickHouse can interconnect with MySQL and PostgreSQL instances of RDS, but cannot interconnect with SQL Server instances.

Interconnecting ClickHouse with RDS Using the MySQL Engine

The MySQL engine is used to map tables on the remote MySQL server to ClickHouse and allows you to run INSERT and SELECT statements on tables to facilitate data exchange between ClickHouse and MySQL.

Syntax for using the MySQL engine:
CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
ENGINE = MySQL('host:port', ['database' | database], 'user', 'password')

Parameters of the MySQL engine:

  • host:port: IP address and port number of the RDS MySQL database instance.
  • database: Name of the RDS MySQL database.
  • user: Username of the RDS MySQL database.
  • password: Password of the RDS MySQL database user.
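
The following sketch shows how the four parameters fit into the statement. The function name build_mysql_engine_ddl and the sample values passed to it are hypothetical. The helper only prints the statement so that it can be reviewed before being run in the ClickHouse client, and it assumes that none of the values contain single quotes or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: prints a CREATE DATABASE statement for the MySQL engine.
# Assumption: the values contain no single quotes or backslashes,
# because no escaping is performed here.
build_mysql_engine_ddl() {
    ch_db="$1"      # name of the database to create in ClickHouse
    host="$2"       # IP address of the RDS MySQL DB instance
    port="$3"       # port of the RDS MySQL DB instance
    mysql_db="$4"   # name of the RDS MySQL database
    user="$5"       # RDS MySQL username
    password="$6"   # password of the RDS MySQL user
    printf "CREATE DATABASE IF NOT EXISTS %s ENGINE = MySQL('%s:%s', '%s', '%s', '%s');\n" \
        "$ch_db" "$host" "$port" "$mysql_db" "$user" "$password"
}
```

For example, build_mysql_engine_ddl mysql_db 192.0.2.10 3306 testdb rds_user secret prints a statement that maps the testdb database on 192.0.2.10:3306 to a ClickHouse database named mysql_db.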

Example of using the MySQL engine:

  1. Connect to the MySQL database of RDS. For details, see Connecting to a DB Instance.
  2. Create a table in the MySQL database and insert data into the table.

    Create table mysql_table.

    CREATE TABLE `mysql_table` (
      `int_id` INT NOT NULL AUTO_INCREMENT,
      `float` FLOAT NOT NULL,
      PRIMARY KEY (`int_id`));

    Insert data into the table.

    INSERT INTO mysql_table (`int_id`, `float`) VALUES (1,2);

  3. Log in to the node where the ClickHouse client is installed. Run the following command to go to the client installation directory:

    cd /opt/client

  4. Run the following command to configure environment variables:

    source bigdata_env

  5. If Kerberos authentication is enabled for the cluster, run the following commands to authenticate the current user. The user must have permission to create ClickHouse tables, so bind the corresponding role to the user first. For details, see Creating a ClickHouse Role. If Kerberos authentication is disabled for the cluster, skip this step.

    1. Run the following command if it is an MRS 3.1.0 cluster:

      export CLICKHOUSE_SECURITY_ENABLED=true

    2. Run the following command to authenticate the user:

      kinit Component service user

      Example: kinit clickhouseuser

  6. Run the client command to connect to ClickHouse.

    clickhouse client --host IP address of the ClickHouse instance --user Username --password --port Port number

    Enter the user password.

  7. Create a database in ClickHouse that uses the MySQL engine. After the database is created, it automatically exchanges data with the MySQL server.

    CREATE DATABASE mysql_db ENGINE = MySQL('IP address of the RDS MySQL database instance:Port number of the MySQL database instance', 'MySQL database name', 'MySQL database username', 'Password of the MySQL database user');

  8. Switch to the created database mysql_db and query data in the table.

    USE mysql_db;

    Query the MySQL table data from ClickHouse.

    SELECT * FROM mysql_table;

    ┌─int_id─┬─float─┐
    │      1 │     2 │
    └────────┴───────┘

    Insert a row from ClickHouse, then query the table again to confirm that the new row is returned.

    INSERT INTO mysql_table VALUES (3,4);

    SELECT * FROM mysql_table;
    ┌─int_id─┬─float─┐
    │      1 │     2 │
    │      3 │     4 │
    └────────┴───────┘
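
The query in the last step can also be run without opening an interactive session. The sketch below assumes that the client accepts the --query option of the standard ClickHouse client in addition to the options used in the connection step. The function name run_query and the CH_CLIENT variable are hypothetical. As in the connection step, --password is passed without a value, so the client is expected to prompt for the password.

```shell
#!/bin/sh
# CH_CLIENT is a hypothetical override hook: set it to "echo" to preview
# the arguments instead of contacting a ClickHouse instance.
CH_CLIENT="${CH_CLIENT:-clickhouse client}"

# Hypothetical helper: runs one SQL statement through the ClickHouse client.
run_query() {
    # $1: IP address of the ClickHouse instance
    # $2: port number
    # $3: username
    # $4: SQL text to run
    # $CH_CLIENT is intentionally unquoted so that "clickhouse client"
    # splits into the command and its subcommand.
    $CH_CLIENT --host "$1" --port "$2" --user "$3" --password --query "$4"
}
```

For example, run_query 192.0.2.20 9000 chuser "SELECT * FROM mysql_db.mysql_table" queries the mapped table directly. The address, port, and username here are placeholders.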

Enabling mysql_port for ClickHouse

Configure the mysql_port parameter of ClickHouse so that a MySQL client can connect to ClickHouse.

This operation is available for MRS 3.1.2 only.

  1. Log in to FusionInsight Manager and choose Cluster > Services > ClickHouse. Click Configurations and then All Configurations. Search for the clickhouse-config-customize parameter and add a configuration item with Name set to mysql_port and Value set to 9004.

    The value is customizable.

    Click Save.

  2. Click the Dashboard tab, click More, and select Restart Instance; alternatively, click More and select Instance Rolling Restart.
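
After the instances restart, a MySQL client should be able to reach ClickHouse on the configured port. The sketch below assembles such a command. The function name mysql_connect_cmd and the sample address and username are hypothetical, and the port defaults to the 9004 value used above. The helper only prints the command so that it can be checked before being run on a node where the mysql client is installed.

```shell
#!/bin/sh
# Hypothetical helper: prints a mysql client command for connecting to
# ClickHouse over the port configured as mysql_port.
mysql_connect_cmd() {
    # $1: IP address of the ClickHouse instance
    # $2: ClickHouse username
    # $3: mysql_port value (optional, defaults to 9004)
    # -p makes the mysql client prompt for the password.
    printf 'mysql --protocol=tcp -h %s -u %s -P %s -p\n' "$1" "$2" "${3:-9004}"
}
```

For example, mysql_connect_cmd 192.0.2.20 chuser prints a command that connects to 192.0.2.20 on port 9004 as chuser.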