Updated on 2024-05-29 GMT+08:00

Flink Doris Connector

Flink Doris Connector allows you to perform operations (read, insert, modify, and delete) on data stored in Doris through Flink.

Only tables of the Unique Key model can be modified or deleted.

Prerequisites

  • A cluster containing the Doris service has been created, and all services in the cluster are running properly.
  • The node to be connected to the Doris database can communicate with the MRS cluster.
  • A user with Doris management permission has been created.
    • Kerberos authentication is enabled for the cluster (the cluster is in security mode)

      Log in to FusionInsight Manager, create a human-machine user (for example, dorisuser), create a role with Doris administrator permissions, and bind the role to the user.

      Log in to FusionInsight Manager as the new user dorisuser and change the initial password.

    • Kerberos authentication is disabled for the cluster (the cluster is in normal mode)

      After connecting to Doris as user admin, create a role with administrator permissions, and bind the role to the user.

  • The MySQL client has been installed. For details, see Installing a MySQL Client.
  • The Flink client has been installed.

Procedure

Perform operations on Doris.

  1. Log in to the node where the MySQL client is installed and run the following commands to connect to the Doris database:

    If Kerberos authentication is enabled for the cluster (the cluster is in security mode), run the following command first:

    export LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1

    mysql -uDatabase login username -pDatabase login password -PDatabase connection port -hIP address of the Doris FE instance

    • The database connection port is the query connection port of the Doris FE. To obtain it, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and check the value of query_port of the Doris service.
    • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the IP address of any FE instance.
    • You can also use the MySQL connection software or Doris WebUI to connect to the database.
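
The connection step above can be sketched as a pair of small shell helpers. This is only an illustration: the function names doris_mysql_args and prepare_doris_env, the user dorisuser, and the address 192.0.2.10 are hypothetical, and the sketch passes a bare -p so that the MySQL client prompts for the password instead of taking it on the command line.

```shell
#!/usr/bin/env bash
# Sketch: assemble the mysql client arguments for a Doris FE connection.
# All names and values here are placeholders; use those of your own cluster.

# Print the mysql arguments, one per line.
# A bare -p makes the client prompt for the password.
doris_mysql_args() {
  local user="$1" host="$2" port="$3"
  printf '%s\n' "-u${user}" "-p" "-P${port}" "-h${host}"
}

# Export the cleartext plugin variable only in security mode
# (Kerberos authentication enabled). Argument: "true" or "false".
prepare_doris_env() {
  local security_mode="$1"
  if [ "$security_mode" = "true" ]; then
    export LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1
  fi
}

# Usage (not executed here), with query_port taken from the Doris configuration:
#   prepare_doris_env true
#   mysql $(doris_mysql_args dorisuser 192.0.2.10 "$QUERY_PORT")
```

Keeping the environment setup separate from the argument list makes it possible to reuse the same connection arguments in both security mode and normal mode.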

  2. Run the following commands to create a database and switch to it:

    create database if not exists testdb;

    use testdb;

  3. Run the following commands to create the z_test table and insert data into the table:

    create table z_test(id int, name string) distributed by hash(id) buckets 10;

    insert into z_test values(123, 'aaa'), (234, 'bbb'), (345, 'ccc');

  4. Run the following command to create the z_test_sink_3 table:

    create table z_test_sink_3(id int, name string) distributed by hash(id) buckets 10;
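
The example tables above do not declare a key model, while this page states that only tables of the Unique Key model can be modified or deleted. If updates or deletes are needed, a Unique Key table could be created instead. The following is a minimal sketch that prints such a statement; the function name unique_key_ddl and the table name z_test_unique are hypothetical, and the column layout simply mirrors z_test.

```shell
#!/usr/bin/env bash
# Sketch: print a CREATE TABLE statement for a Unique Key variant of the
# example table. The table name is passed as an argument (hypothetical).
unique_key_ddl() {
  local table="$1"
  printf '%s\n' \
    "create table if not exists ${table}(id int, name string)" \
    "unique key(id)" \
    "distributed by hash(id) buckets 10;"
}

# Usage (not executed here): pipe the statement to the MySQL client
# connected to the testdb database, for example:
#   unique_key_ddl z_test_unique | mysql -u<user> -p -P<query_port> -h<FE IP> testdb
```

The key column id is listed first in the schema, which Doris requires for key columns.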

Perform operations on Flink.

  1. Log in to the node where the Flink client is installed as the client installation user and run the following commands:

    cd Client installation directory

    source bigdata_env

    kinit Component service user

    If Kerberos authentication is disabled for the cluster (the cluster is in normal mode), skip the kinit command.

  2. Run the following command to log in to the Flink SQL client:

    cd Flink/flink/bin/

    sql-client.sh

  3. Create a stream or batch Flink SQL job on the Flink client. The following statements are an example:

    CREATE TABLE flink_doris_source (id INT, name STRING) WITH (
      'connector' = 'doris',
      'fenodes' = 'FE instance IP address:29991',
      'table.identifier' = 'testdb.z_test',
      'username' = 'user',
      'password' = 'password',
      'doris.enable.https' = 'true',
      'doris.ignore.https.ca' = 'true'
    );

    CREATE TABLE flink_doris_sink (id INT, name STRING) WITH (
      'connector' = 'doris',
      'fenodes' = 'FE instance IP address:29991',
      'table.identifier' = 'testdb.z_test_sink_3',
      'username' = 'user',
      'password' = 'password',
      'sink.label-prefix' = 'doris_label_6',
      'doris.enable.https' = 'true',
      'doris.ignore.https.ca' = 'true'
    );

    Run the following statement to insert data:

    INSERT INTO flink_doris_sink
    SELECT id, name
    FROM flink_doris_source;

    • If HTTPS is enabled, add the following configuration parameters to the WITH clause of the CREATE TABLE statement:
      • 'doris.enable.https' = 'true'
      • 'doris.ignore.https.ca' = 'true'
    • The fields in the Flink source and sink tables must match those in the corresponding Doris tables.
    • The port number in fenodes is the HTTPS port of the FE service (for clusters with Kerberos authentication enabled) or the HTTP port (for clusters with Kerberos authentication disabled). To obtain it, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and enter https_port or http_port in the search box.
    • When you create a Flink job, set username to the Doris user and password to the password of the Doris user.
    • If Kerberos authentication (security mode) has been enabled for the cluster, only the HTTPS mode can be configured.
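
The notes above imply that the table definition differs between security mode and normal mode only in the port used in fenodes and in the two HTTPS parameters. The following is a minimal sketch of that difference as a shell function that prints the source table definition for either mode. The function name render_doris_source and its argument order are hypothetical, and the user and password values are the same placeholders used in the example above.

```shell
#!/usr/bin/env bash
# Sketch: print a Flink SQL source table definition for the Doris connector.
# Arguments: <flink_table> <doris_table> <fe_host> <fe_port> <https: true|false>
# Pass the HTTPS port with https=true, or the HTTP port with https=false.
render_doris_source() {
  local flink_table="$1" doris_table="$2" fe_host="$3" fe_port="$4" https="$5"
  printf "CREATE TABLE %s (id INT, name STRING) WITH (\n" "$flink_table"
  printf "  'connector' = 'doris',\n"
  printf "  'fenodes' = '%s:%s',\n" "$fe_host" "$fe_port"
  printf "  'table.identifier' = '%s',\n" "$doris_table"
  printf "  'username' = 'user',\n"
  if [ "$https" = "true" ]; then
    # HTTPS enabled: append the two HTTPS parameters after the password.
    printf "  'password' = 'password',\n"
    printf "  'doris.enable.https' = 'true',\n"
    printf "  'doris.ignore.https.ca' = 'true'\n"
  else
    # HTTP only: the password is the last option, so no trailing comma.
    printf "  'password' = 'password'\n"
  fi
  printf ");\n"
}

# Usage (not executed here):
#   render_doris_source flink_doris_source testdb.z_test <FE IP> 29991 true
#   render_doris_source flink_doris_source testdb.z_test <FE IP> <http_port> false
```

The printed statement can be pasted into the Flink SQL client. A sink table would additionally need the 'sink.label-prefix' option shown in the example above.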