Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive

Configuring Cold and Hot Data Separation

Updated on 2024-11-29 GMT+08:00

This topic describes how to configure Doris cold and hot separation.

Prerequisite

The Doris cluster can communicate with OBS.

Creating an OBS Parallel File System and Obtaining the AK/SK and Domain ID

Create an OBS parallel file system.

  1. Log in to the OBS console.
  2. Choose Parallel File Systems > Create Parallel File System.
  3. Enter a file system name, for example, doris-obs.

    The name of an enterprise project must be the same as that of the MRS cluster. Set other parameters.

  4. Click Create Now.
  5. In the parallel file system list, click the name of the one you just created and click Overview to obtain the endpoint information.

    NOTE:

    After a service is deleted or a cluster is uninstalled, dirty data may remain in the parallel file system created in 2 to 4. You need to delete the dirty data.

Obtain AK/SK information.

  1. Click the username in the upper right corner and select My Credentials from the drop-down list.
  2. On the API Credentials page, obtain the Account ID whish is used as the domain ID.
  3. Click Access Keys, click Create Access Key, and enter the verification code or password. Click OK and download the access keys. Obtain the AK/SK information from the .csv file.

Creating a Cloud Service Agency and Binding It to a Cluster

  1. Log in to the management console.
  2. In the service list, choose Management & Governance > Identity and Access Management.
  3. In the navigation pane on the left, choose Agencies and click Create Agency. On the displayed page, set the following parameters and click Next:

    • Agency Name: Enter an agency name, for example, mrs_ecs_obs.
    • Agency Type: Select Cloud service.
    • Cloud Service: Select Elastic Cloud Server (ECS) and Bare Metal Server (BMS).
    • Validity Period: Select Unlimited.

  4. On the displayed page, search for the OBS OperateAccess policy and select it.
  5. Click Next, click Show More, select Global services, and click OK.
  6. In the displayed dialog box, click OK to start authorization. Click Finish after the message "Authorization successful." is displayed.
  7. On the MRS console, choose Active Clusters in the navigation pane on the left and click the name of the target cluster to access its details page.
  8. On the Dashboard page, click Synchronize on the right of IAM User Sync to synchronize the IAM user.
  9. Click Manage Agency on the right of Agency, select the created agency, for example, mrs_ecs_obs, and click OK to bind the agency to the cluster.

Creating a Common Account Agency and Binding It to a Cluster

  1. Log in to the management console.
  2. In the service list, choose Management & Governance > Identity and Access Management.
  3. Choose Permissions > Policies/Roles, and click Create Custom Policy. Set the following parameters, and click OK:

    • Policy Name: Enter a policy name, for example, doris-policy.
    • Policy View: Select JSON.
    • Policy Content: Enter the following content.
      {
          "Version": "1.1",
          "Statement": [
              {
                  "Effect":"Allow",
                  "Action":[
                      "OBS:*:*"
                      ]
              }
              ]
      }

  4. In the navigation pane on the left, click Agencies. Click Create Agency. On the displayed page, set the following parameters and click Next:

    • Agency Name: Enter an agency name, for example, agency-MRS-to-OBS.
    • Agency Type: Select Account.
    • Enter your cloud account in the Delegated Account field, that is, the account you register using your mobile phone number. It cannot be a federated user or an IAM user created using your cloud account.
    • Validity Period: Select Unlimited.

  5. In the search box on the displayed Authorize Agency page, search for the custom policy created in 3 and select it, for example, doris-policy.
  6. Click Next, click Show More, select Global services, and click OK.
  7. Check and record the agency ID.
  8. On the Identity and Access Management page, click Agencies.
  9. Click the name of the cloud service agency created in 3, for example, mrs_ecs_obs.
  10. Click the Permissions tab and click Authorize. On the displayed page, click Create Policy in the upper right corner, and set the parameters as follows:

    • Policy Name: Enter a policy name, for example, doris-assume-policy.
    • Policy View: Select JSON.
    • Policy Content: Enter the following content. {Agency ID} indicates the ID recorded in 7.
      {
          "Version": "1.1",
          "Statement": [
              {
                  "Action": [
                      "iam:agencies:assume"
                  ],
                  "Resource": {
                      "uri": [
                          "/iam/agencies/{Agency ID}"
                      ]
                  },
                  "Effect": "Allow"
              }
          ]
      }

  11. Click Next. On the Select Policy/Role page, select the policy created in 10.
  12. Click Next, click Show More, select Global services, and click OK.

Enabling Cold and Hot Data Separation

By default, cold and hot data separation is disabled. To use this function, perform the following operations:

  1. Log in to FusionInsight Manager and choose Cluster > Services > Doris. Click the Configurations tab.
  2. Search for the obs_cooldown_enable parameter and set it to true.
  3. (Optional) If the data on the local disk is cooled down and stored on OBS, and related data needs to be stored on the local disk in a certain period of time, select All Configurations > BE(Role) > Customization, add the moveback_enable parameter to the customized parameter be.conf.customized.configs and set the parameter value to true.
  4. Click Save and then OK.
  5. Click Instances, select the affected FE and BE instances, choose More > Restart Instance, and enter the password of the current user to restart the FE and BE instances.

Use Case

  1. Log in to the node where MySQL is installed and run the following command to connect to the Doris database:

    If Kerberos authentication is enabled for the cluster (the cluster is in security mode), run the following command to connect to the Doris database:

    export LIBMYSQL_ENABLE_CLEARTEXT_PLUGIN=1

    mysql -uDatabase login username -pDatabase login password -PConnection port for FE queries -hIP address of the Doris FE instance

    NOTE:
    • To obtain the query connection port of the Doris FE instance, you can log in to FusionInsight Manager, choose Cluster > Services > Doris > Configurations, and query the value of query_port of the Doris service.
    • To obtain the IP address of the Doris FE instance, log in to FusionInsight Manager of the MRS cluster and choose Cluster > Services > Doris > Instances to view the service IP address of any FE instance.
    • You can also use the MySQL connection software or Doris web UI to connect to the database.

  2. Create a resource.

    • Create a resource by configuring an agency.
      1. Log in to FusionInsight Manager and choose Cluster > Services > Doris. Click Configurations then All Configurations, click OBS, and search for and modify the following parameters:
        • obs_authentication_method: Change the value to agency.
        • obs_endpoint: endpoint information queried in 5, for example, obs.XXX.
        • obs_iam_endpoint: The value is the endpoint value queried in https://iam.Endpoint queried in 5.
        • obs_iam_domain_id: domain ID queried in 7.
        • obs_agency_name: common account agency created in Creating a Common Account Agency and Binding It to a Cluster, for example, agency-MRS-to-OBS.
      2. Connect the node where the MySQL client is installed to Doris. For details, see Using Doris from Scratch.
      3. Run the following statement to create a resource:

        CREATE RESOURCE IF NOT EXISTS resource_obs_hot_cold PROPERTIES (

        "type" = "obs",

        "obs.bucket" = "Name of the created OBS parallel file system",

        "obs.root.path" = "Root directory for storing data"

        );

    • Create a resource using AK/SK.

      CREATE RESOURCE IF NOT EXISTS resource_obs_hot_cold PROPERTIES (

      "type" = "obs",

      "obs.endpoint" = "xxx",

      "obs.region" = "xxx",

      "obs.bucket" = "xxx",

      "obs.root.path" = "xxx",

      "obs.access_key" = "xxx",

      "obs.secret_key" = "xxx",

      'obs_validity_check' = 'false'

      );

      NOTE:
      • type: data storage type. The value is obs.
      • obs.endpoint: endpoint information viewed in 5
      • obs.region: region of the cluster where the Doris service is deployed
      • obs.bucket: name of the OBS parallel file system created in 3
      • obs.root.path: root directory for storing data
      • obs.access_key: AK information obtained in 8
      • obs.secret_key: SK information obtained in 8

  3. Set the data cooling policy using the storage policy.

    • Set the time to live (TTL) to cool down data.

      CREATE STORAGE POLICY IF NOT EXISTS policy_doris_hot_cold PROPERTIES("storage_resource" = "resource_obs_hot_cold", "cooldown_ttl" = "1d");

      If the value of cooldown_ttl is 1d, newly imported data will be cooled one day later and the cooled data will be stored in the OBS path configured during resource creation in 2.

    • Set a time point to cool down data.

      In addition to setting the TTL, you can also set a time point in the cooling policy.

      CREATE STORAGE POLICY IF NOT EXISTS policy_doris_hot_cold2 PROPERTIES("storage_resource" = "resource_obs_hot_cold", "cooldown_datetime" = "2024-01-01 10:00:00");

  4. Set the storage policy of a table or partition.

    • Set the storage policy when creating a table.

      CREATE TABLE ORDERS (

      ORDER_ID VARCHAR(50),

      USER_ID BIGINT,

      PRODUCT_ID VARCHAR(10),

      PRICE DECIMAL(15,2),

      CHANNEL VARCHAR(20),

      CREATE_DT DATE

      )

      DUPLICATE KEY(`ORDER_ID`)

      PARTITION BY RANGE(`CREATE_DT`)

      (

      PARTITION `p202401` VALUES LESS THAN ("2024-02-01"),

      PARTITION `p202402` VALUES LESS THAN ("2024-03-01")

      )

      DISTRIBUTED BY HASH(`ORDER_ID`) BUCKETS 3

      PROPERTIES (

      "replication_num" = "3",

      "storage_policy" = "policy_doris_hot_cold "

      );

    • Modify the properties of an existing table.

      ALTER TABLE ORDERS SET("storage_policy" = "policy_doris_hot_cold");

    • Set the cold and hot separation policy for a partition when creating a table.

      CREATE TABLE ORDERS (

      ORDER_ID VARCHAR(50),

      USER_ID BIGINT,

      PRODUCT_ID VARCHAR(10),

      PRICE DECIMAL(15,2),

      CHANNEL VARCHAR(20),

      CREATE_DT DATE

      )

      DUPLICATE KEY(`ORDER_ID`)

      PARTITION BY RANGE(`CREATE_DT`)

      (

      PARTITION `p202401` VALUES LESS THAN ("2024-02-01") ("storage_policy" = "policy_doris_hot_cold"),

      PARTITION `p202402` VALUES LESS THAN ("2024-03-01")

      )

      DISTRIBUTED BY HASH(`ORDER_ID`) BUCKETS 3

      PROPERTIES (

      "replication_num" = "3"

      );

    • Modify the partition properties of an existing table.

      ALTER TABLE ORDERS MODIFY PARTITION (`p202401`) SET("storage_policy"="policy_doris_hot_cold");

    NOTE:
    • A single table or partition can be associated with only one storage policy. Associated storage policies cannot be deleted before disassociation.
    • Information about the object associated with a storage policy cannot be modified, such as bucket, endpoint, and root_path.
    • A storage policy can be created, modified, and deleted. Before deleting a storage policy, ensure that no table references the storage policy.
    • When the Merge-on-Write feature is enabled for the Unique model, storage policies cannot be set.

  5. Run the following statement to query data:

    show tablets from ORDERS;

    This command views the tablet information of the table. In the tablet information, LocalDataSize and RemoteDataSize are distinguished. LocalDataSize indicates the data stored locally, and RemoteDataSize indicates the data that has been cooled and stored on OBS.

    Before the data is cooled, the tablet information of the table is as follows.

    After the data is cooled, the tablet information of the table is as follows.

We use cookies to improve our site and your experience. By continuing to browse our site you accept our cookie policy. Find out more

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback