Updated on 2024-11-29 GMT+08:00

Flink Doris Connector

Flink Doris Connector allows you to read, insert, modify, and delete data stored in Doris through Flink.

Data can be modified or deleted only in tables that use the Unique Key model.

Prerequisites

  • A cluster containing the Doris service has been created, and all services in the cluster are running properly.
  • The nodes to be connected to the Doris database can communicate with the MRS cluster.
  • A user with Doris management permission has been created.
    • Kerberos authentication is enabled for the cluster (the cluster is in security mode)

      Log in to FusionInsight Manager and create a human-machine user, for example, dorisuser. Then create a role with Doris administrator permissions and bind the role to the user.

      Log in to FusionInsight Manager as the new user dorisuser and change the initial password.

    • Kerberos authentication is disabled for the cluster (the cluster is in normal mode)

      After connecting to Doris as user admin, create a role with administrator permissions, and bind the role to the user.

  • The MySQL client has been installed. For details, see Installing a MySQL Client.
  • The Flink client has been installed.

Procedure

Perform the following operations on the Doris side:

  1. Log in to the node where the MySQL client is installed and connect to the Doris database.

    If Kerberos authentication is enabled for the cluster (the cluster is in security mode), run the following command to connect to the Doris database:

    export LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1

    mysql -uDatabase login username -pDatabase login password -PConnection port for FE queries -hIP address of the Doris FE instance

    • To obtain the query connection port of the Doris FE instance, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and check the value of query_port of the Doris service.
    • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the IP address of any FE instance.
    • You can also use MySQL connection software or the Doris web UI to connect to the database.

  2. Run the following statements to create a database and switch to it:

    create database if not exists testdb;

    use testdb;

  3. Run the following statements to create the z_test table and insert data into the table:

    create table z_test(id int, name string) distributed by hash(id) buckets 10;

    insert into z_test values(123, 'aaa'), (234, 'bbb'), (345, 'ccc');

  4. Run the following statement to create the z_test_sink_3 table:

    create table z_test_sink_3(id int, name string) distributed by hash(id) buckets 10;
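The example tables above do not declare a key model explicitly. Because data can be modified or deleted only in Unique Key tables, a sink table intended for updates or deletes might be declared as in the following sketch. The table name z_test_sink_uniq and the choice of id as the key column are assumptions made for illustration; they are not part of the original procedure.

```sql
-- Hypothetical table name; the Unique Key model is declared explicitly on id.
-- Key columns must be listed first in the column definition.
create table if not exists z_test_sink_uniq(
    id int,
    name string
)
unique key(id)
distributed by hash(id) buckets 10;
```

With this model, rows that share the same id are treated as versions of one record, which is what allows later writes to replace earlier ones.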

Perform the following operations on the Flink side:

  1. Log in to the node where the Flink client is installed as the client installation user and run the following commands:

    cd Client installation directory

    source bigdata_env

    kinit Component service user

    Skip the kinit command if Kerberos authentication is disabled for the cluster (the cluster is in normal mode).

  2. Run the following commands to log in to the Flink SQL client:

    cd Flink/flink/bin/

    sql-client.sh

  3. Create a Flink stream or batch SQL job on the Flink client. The following statement is an example:

    CREATE TABLE flink_doris_source (id INT, name STRING) WITH (
      'connector' = 'doris',
      'fenodes' = 'IP address of the FE instance:29991',
      'table.identifier' = 'testdb.z_test',
      'username' = 'user',
      'password' = 'password',
      'doris.enable.https' = 'true',
      'doris.ignore.https.ca' = 'true'
    );

    CREATE TABLE flink_doris_sink (id INT, name STRING) WITH (
      'connector' = 'doris',
      'fenodes' = 'IP address of the FE instance:29991',
      'table.identifier' = 'testdb.z_test_sink_3',
      'username' = 'user',
      'password' = 'password',
      'sink.label-prefix' = 'doris_label_6',
      'doris.enable.https' = 'true',
      'doris.ignore.https.ca' = 'true'
    );

    Run the following statement to insert data:

    INSERT INTO flink_doris_sink
    SELECT id, name
    FROM flink_doris_source;

    • If HTTPS is enabled for Doris, add the following configuration parameters to the WITH clause of each CREATE TABLE statement:
      • 'doris.enable.https' = 'true'
      • 'doris.ignore.https.ca' = 'true'
    • The fields in the Flink source and sink tables must match those of the corresponding Doris tables.
    • 29991 is the HTTPS port of the FE service. If the FE service uses HTTP instead, change the port number to 29980. To check the port, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and search for http.
    • When you create a Flink job, set username to the name of the Doris user and password to the password of that user.
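As an illustration of the notes above, the following sketch shows how the sink table definition might look when the FE service uses HTTP rather than HTTPS: the port changes to 29980 and the two HTTPS parameters are left out. The Flink table name flink_doris_sink_http and the label prefix value doris_label_http are hypothetical, and the fenodes, username, and password values are placeholders to be replaced with your own.

```sql
-- Sketch only: HTTP variant of the sink table from the example above.
-- flink_doris_sink_http and doris_label_http are illustrative names.
CREATE TABLE flink_doris_sink_http (id INT, name STRING) WITH (
  'connector' = 'doris',
  'fenodes' = 'IP address of the FE instance:29980',  -- HTTP port instead of 29991
  'table.identifier' = 'testdb.z_test_sink_3',
  'username' = 'user',
  'password' = 'password',
  'sink.label-prefix' = 'doris_label_http'
  -- 'doris.enable.https' and 'doris.ignore.https.ca' are omitted for HTTP
);
```

Using a label prefix that differs from the one in other jobs is a cautious choice here, since the prefix is used to label the load operations that the sink submits to Doris.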