Updated on 2024-05-29 GMT+08:00

Using Doris from Scratch

Doris is a high-performance, real-time analytical database based on the massively parallel processing (MPP) architecture. It supports both high-concurrency point queries and high-throughput complex analysis.

This document uses examples to describe how to use an MRS Doris cluster to perform basic table creation and query operations.

Doris database names and table names are case sensitive.

Prerequisites

  • A cluster containing the Doris service has been created, and all services in the cluster are running properly.
  • The node to be connected to the Doris database can communicate with the MRS cluster.
  • The MySQL client has been installed. For details, see Installing a MySQL Client.

Procedure

  1. Create a user with the Doris management permission.

    • Kerberos authentication is enabled for the cluster (the cluster is in security mode)
      1. Log in to FusionInsight Manager and choose System. In the navigation pane on the left, click Permission > Role, and click Create Role. On the displayed page, enter the role name, for example, dorisrole. In the Configure Resource Permission area, select target cluster > Doris, select Doris Admin Privilege, and click OK.
      2. Choose User > Create, enter a username, for example, dorisuser, set User Type to Human-Machine, retain the default value for Password Policy, enter the user password, confirm the password, associate the user with the dorisrole role, and click OK.
      3. Log in to FusionInsight Manager as the new dorisuser user and change the initial password of the user.
    • Kerberos authentication is disabled for the cluster (the cluster is in normal mode)
      1. Log in to the node where the MySQL client is installed and connect to the Doris service as user admin.

        mysql -uadmin -PDatabase connection port -hIP address of Doris FE instance

        • The default password of user admin is empty.
        • The database connection port is the query connection port of the Doris FE. To obtain it, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and check the value of query_port of the Doris service.
        • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the IP address of any FE instance.
        • You can also use the MySQL connection software or Doris WebUI to connect to the database.
        • Only clusters of MRS 3.3.0 and later versions support role assignment on FusionInsight Manager. If the cluster version is earlier than MRS 3.3.0, connect to the database as user root (the default password is empty) regardless of whether Kerberos authentication is enabled.
      2. Run the following command to create a role:

        CREATE ROLE dorisrole;

      3. Run the following command to grant permissions to the role. For details about the permissions, see Introduction to User Rights. For example, to grant the ADMIN_PRIV permission to the role, run the following command:

        GRANT ADMIN_PRIV ON *.*.* TO ROLE 'dorisrole';

      4. Run the following command to create a user and bind the user to the role:

        CREATE USER 'dorisuser'@'%' IDENTIFIED BY 'password' DEFAULT ROLE 'dorisrole';

        There can be security risks if a command contains the authentication password. You are advised to disable the command recording function (history) before running the command.
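One way to keep the password off the command line altogether is to generate the statement from a variable and pipe it to the MySQL client. The following is a minimal sketch, not part of the MRS tooling: the helper name build_create_user_sql and the variables DORIS_NEW_PWD, DORIS_QUERY_PORT, and DORIS_FE_HOST are hypothetical, and the helper does not escape its arguments.

```shell
# Hypothetical helper: print the CREATE USER statement used in this step.
# Arguments: user name, password, role name.
# Limitation: values are not escaped, so they must not contain single quotes.
build_create_user_sql() {
    printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s' DEFAULT ROLE '%s';\n" "$1" "$2" "$3"
}

# Usage sketch (not executed here; host and port variables are placeholders):
#   stty -echo; printf 'New user password: '; read -r DORIS_NEW_PWD; stty echo; echo
#   build_create_user_sql dorisuser "$DORIS_NEW_PWD" dorisrole |
#       mysql -uadmin -P"$DORIS_QUERY_PORT" -h"$DORIS_FE_HOST"
```

Because the password is read from a prompt and passed through a pipe, it does not appear in the typed command. Disabling history as advised above remains a reasonable extra precaution.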

  2. Log in to the node where the MySQL client is installed and connect to the Doris database.

    If Kerberos authentication is enabled for the cluster (the cluster is in security mode), first run the following command to enable the cleartext authentication plugin of the MySQL client:

    export LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1

    Then run the following command to connect to the Doris database:

    mysql -uDatabase login user -pDatabase login user password -PDatabase connection port -hDoris FE instance IP address

    • The database connection port is the query connection port of the Doris FE. You can log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and query the value of query_port of the Doris service.
    • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the IP address of any FE instance.
    • You can also use the MySQL connection software or Doris WebUI to connect to the database.

  3. Run the following commands to check the running status of the FE and BE nodes:

    SHOW FRONTENDS\G

    SHOW BACKENDS\G
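When a cluster has many nodes, the vertical output can be scanned with a small filter instead of by eye. The sketch below assumes that the output contains an Alive field whose healthy value is true, as in Apache Doris; the function name count_not_alive is hypothetical.

```shell
# Reads \G-formatted output on stdin and prints how many rows report
# an Alive value other than "true".
# Assumption: the status column is named "Alive".
count_not_alive() {
    awk -F': ' '$1 ~ /^ *Alive$/ && $2 != "true" { n++ } END { print n + 0 }'
}

# Usage sketch (not executed here; connection options are placeholders):
#   mysql -uDORIS_USER -p -PQUERY_PORT -hFE_HOST -e 'SHOW BACKENDS\G' | count_not_alive
```

A result of 0 suggests that every listed node reports itself as alive. Any other value is the number of nodes to investigate.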

  4. Create a database.

    create database if not exists mrs_demo;

    use mrs_demo;

    For more information about Doris SQL commands and syntax, see the Doris SQL Manual.

  5. After the database is created, create a table.

    CREATE TABLE IF NOT EXISTS mrs_table
    (
        `user_id` LARGEINT NOT NULL COMMENT "User ID",
        `date` DATE NOT NULL COMMENT "Data import date",
        `city` VARCHAR(20) COMMENT "City",
        `age` SMALLINT COMMENT "Age",
        `sex` TINYINT COMMENT "Gender",
        `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "Last access time of the user",
        `cost` BIGINT SUM DEFAULT "0" COMMENT "Total consumption",
        `max_dwell_time` INT MAX DEFAULT "0" COMMENT "Maximum dwell time",
        `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "Minimum dwell time"
    )
    AGGREGATE KEY(`user_id`, `date`, `city`, `age`, `sex`)
    DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
    PROPERTIES (
        "replication_allocation" = "tag.location.default: 1"
    );
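The table above uses the aggregate model: rows whose key columns (user_id, date, city, age, sex) are identical are merged, and each value column is combined by its REPLACE, SUM, MAX, or MIN keyword. The following sketch imitates that merge locally with awk so the effect can be seen without a cluster. It is an illustration only, not how Doris implements aggregation. The function name simulate_aggregate and the two sample rows are hypothetical, and it assumes that a later input line stands in for a later load.

```shell
# Sketch of how rows with identical key columns would be merged in mrs_table.
# Fields 1-5 are the key, field 6 is REPLACE, field 7 is SUM,
# field 8 is MAX, field 9 is MIN.
# Assumption: later input lines stand in for later loads (REPLACE keeps the last one).
simulate_aggregate() {
    awk '
    BEGIN { FS = OFS = "," }
    {
        key = $1 FS $2 FS $3 FS $4 FS $5
        if (!(key in last)) { max[key] = $8 + 0; min[key] = $9 + 0 }
        last[key] = $6                                  # REPLACE
        sum[key] += $7                                  # SUM
        if ($8 + 0 > max[key]) max[key] = $8 + 0        # MAX
        if ($9 + 0 < min[key]) min[key] = $9 + 0        # MIN
    }
    END { for (k in last) print k, last[k], sum[k], max[k], min[k] }
    ' | sort
}

# Two hypothetical rows that share the same key columns:
printf '%s\n' \
    '10000,2017-10-01,city1,20,0,2017-10-01 06:00:00,20,10,10' \
    '10000,2017-10-01,city1,20,0,2017-10-01 07:00:00,15,2,2' |
    simulate_aggregate
# prints: 10000,2017-10-01,city1,20,0,2017-10-01 07:00:00,35,10,2
```

In the merged row, cost is the sum (20 + 15), max_dwell_time is the larger value, min_dwell_time is the smaller value, and last_visit_date comes from the later row.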

  6. Create the test.csv file in any directory on the current node. The file content is as follows:

    10000,2017-10-01,city1,20,0,2017-10-01 06:00:00,20,10,10
    10000,2017-10-01,city2,20,0,2017-10-01 07:00:00,15,2,2
    10001,2017-10-01,city3,30,1,2017-10-01 17:05:45,2,22,22
    10002,2017-10-02,city4,20,1,2017-10-02 12:59:12,200,5,5
    10003,2017-10-02,city5,32,0,2017-10-02 11:20:00,30,11,11
    10004,2017-10-01,city6,35,0,2017-10-01 10:00:15,100,3,3
    10004,2017-10-03,city7,35,0,2017-10-03 10:20:22,11,6,6
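The file can also be created directly from the shell. The sketch below writes the rows shown above into test.csv in the current directory and then checks that every row has the nine comma-separated fields that mrs_table expects. The check is an addition for convenience and is not required by the procedure.

```shell
# Write the sample rows to test.csv in the current directory.
cat > test.csv <<'EOF'
10000,2017-10-01,city1,20,0,2017-10-01 06:00:00,20,10,10
10000,2017-10-01,city2,20,0,2017-10-01 07:00:00,15,2,2
10001,2017-10-01,city3,30,1,2017-10-01 17:05:45,2,22,22
10002,2017-10-02,city4,20,1,2017-10-02 12:59:12,200,5,5
10003,2017-10-02,city5,32,0,2017-10-02 11:20:00,30,11,11
10004,2017-10-01,city6,35,0,2017-10-01 10:00:15,100,3,3
10004,2017-10-03,city7,35,0,2017-10-03 10:20:22,11,6,6
EOF

# Sanity check: every row must have exactly 9 comma-separated fields.
if awk -F, 'NF != 9 { bad++ } END { exit (bad > 0) }' test.csv; then
    echo "test.csv: $(wc -l < test.csv | tr -d ' ') rows, 9 fields each"
else
    echo "test.csv: at least one row does not have 9 fields" >&2
fi
```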

  7. Import the data in the test.csv file into the table created in step 5 using Stream Load.

    cd Directory where test.csv is stored

    curl -k --location-trusted -u Doris username:Doris user password -H "label:table1_20230217" -H "column_separator:," -T test.csv http://Doris FE instance IP address:HTTP port/api/mrs_demo/mrs_table/_stream_load

    • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the IP address of any FE instance.
    • To view the HTTP port number, log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and search for http_port.
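The curl command returns a JSON response, and a label can be used only once per database, so repeating the command with the same label is rejected. The sketch below shows two small helpers for scripting the load. It assumes that the response contains a "Status" field whose value is "Success" when the load completes, as in Apache Doris. The function names and the DORIS_USER, DORIS_FE_HOST, and DORIS_HTTP_PORT variables are hypothetical.

```shell
# Succeeds if the Stream Load JSON response on stdin reports "Status": "Success".
# Assumption: the response format follows Apache Doris Stream Load.
stream_load_succeeded() {
    grep -Eq '"Status"[[:space:]]*:[[:space:]]*"Success"'
}

# Prints a label made of the given prefix and the current timestamp,
# so that repeated loads do not reuse the same label.
make_load_label() {
    printf '%s_%s\n' "$1" "$(date +%Y%m%d_%H%M%S)"
}

# Usage sketch (not executed here; curl prompts for the password when only
# the user name is given to -u):
#   curl -k --location-trusted -u "$DORIS_USER" \
#        -H "label:$(make_load_label mrs_table)" -H "column_separator:," \
#        -T test.csv \
#        "http://$DORIS_FE_HOST:$DORIS_HTTP_PORT/api/mrs_demo/mrs_table/_stream_load" |
#       stream_load_succeeded && echo "load ok"
```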

  8. Query the data.

    select * from mrs_table where city='city1';