- What's New
- Function Overview
- Product Bulletin
- Service Overview
- Billing
- Getting Started
-
Kernel
- TaurusDB Kernel Version Release History
-
Common Kernel Functions
- Parallel Query
- Near Data Processing
- DDL Optimization
- Fast Binlog Positioning
- Backward Index Scan
- Statement Outline
- Idle Transaction Disconnection
- LIMIT...OFFSET Pushdown
- Conversion of IN Predicates Into Subqueries
- DISTINCT Optimization for Multi-Table Joins
- Diagnosis on Large Transactions
- Enhanced Partitioned Tables
- Hot Row Update
- Multi-tenancy
- Column Compression
- Table Recycle Bin
- Cold Data Preloading for Read Replicas
- Self-Healing of Read Replicas upon Replication Latency
-
User Guide
- Permissions Management
- Buying a DB Instance
- Connecting to a DB Instance
- Database Usage
- Data Migration
-
Instance Management
- Viewing the Overall Status of DB Instances
- Viewing Metrics
- Instance Lifecycle Management
-
Configuration Changes
- Changing the vCPUs and Memory of a DB Instance or Node
- Changing the Storage Space of a DB Instance
- Configuring Auto Scaling Policies for a DB Instance
- Changing the Maintenance Window of a DB Instance
- Customizing Displayed Items of the Instance List
- Upgrading the Minor Kernel Version of a DB Instance
- Updating the OS of a DB Instance
- Data Backups
- Data Restorations
- Serverless Instances
- Multi-primary Instances (OBT)
- Read Replicas
-
Database Proxy (Read/Write Splitting)
- What Is Database Proxy?
- Creating a Proxy Instance for Read/Write Splitting
-
Changing Configurations of a Proxy Instance
- Changing the Consistency Level of a Proxy Instance
- Enabling the Connection Pool for a Proxy Instance
- Enabling Transaction Splitting for a Proxy Instance
- Modifying the Routing Policy of a Proxy Instance
- Changing Read Weights of Nodes
- Changing the Multi-statement Processing Mode of a Proxy Instance
- Enabling Automatic Association of New Nodes with a Proxy Instance
- Enabling Access Control for a Proxy Instance
- Changing the Specifications of a Proxy Instance
- Changing the Number of Nodes for a Proxy Instance
- Applying for a Private Domain Name for a Proxy Instance
- Changing the Port of a Proxy Instance
- Changing the Proxy Address of a Proxy Instance
- Modifying Parameters of a Proxy Instance
- Binding an EIP to a Proxy Instance
- Proxy Instance Lifecycle
- Proxy Instance Kernel Versions
- Using Hints for Read/Write Splitting
- DBA Assistant
- Parameter Management
- Security and Encryption
- Cold and Hot Data Separation (OBT)
- Application Lossless and Transparent (ALT)
-
HTAP Analysis (Standard Edition)
- What Is HTAP of Standard Edition?
- Connecting to an HTAP Instance for Complex OLAP Queries
- Connecting to a Standard HTAP Instance
-
Standard HTAP Instance Management
- Rebooting a Standard HTAP Instance
- Rebooting a Node of a Standard HTAP Instance
- Changing Storage Space of a Standard HTAP Instance
- Adding Read Replicas to a Standard HTAP Instance
- Deleting a Standard HTAP Instance
- Adjusting Blacklisted or Whitelisted Tables of a Standard HTAP Instance and Repairing Tables
- Standard HTAP Account Management
- Viewing Metrics of a Standard HTAP Instance or Nodes
- Syntax and Data Type Mappings Between HTAP and TaurusDB Instances
- Performance Tuning
-
RegionlessDB Clusters (OBT)
- What Is a RegionlessDB Cluster?
- Using a RegionlessDB Cluster for Remote Multi-Active DR
- Using a RegionlessDB Cluster for Remote DR
- Performing a Primary/Standby Switchover or Failover in a RegionlessDB Cluster
- Removing a Standby Instance from a RegionlessDB Cluster
- Deleting a RegionlessDB Cluster
- Viewing the Replication Latency and Traffic of a RegionlessDB Cluster
- Monitoring and Alarms
- Logs and Auditing
- Task Center
- Tag Management
- Quota Management
- Best Practices
- Performance White Paper
-
API Reference
- Before You Start
- API Overview
- Calling APIs
-
APIs (Recommended)
- DB Engine Version Queries
- Database Specification Queries
-
Instance Management
- Creating a DB Instance
- Rebooting a DB Instance
- Deleting/Unsubscribing from a DB Instance
- Creating a Read Replica
- Deleting or Unsubscribing from a Read Replica
- Scaling up Storage of a Yearly/Monthly DB Instance
- Changing a DB Instance Name
- Resetting a Database Password
- Changing DB Instance Specifications
- Querying Dedicated Resource Pools
- Querying Dedicated Resources
- Configuring the Monitoring By Seconds Function
- Querying the Configuration of Monitoring by Seconds
- Rebooting a Node
- Upgrading the Kernel Version of a DB Instance
- Enabling or Disabling SSL
- Binding an EIP
- Unbinding an EIP
- Promoting a Read Replica to Primary
- Changing a Maintenance Window
- Changing a Security Group
- Changing a Private IP Address
- Changing a Database Port
- Changing a DB Instance Description
- Applying for a Private Domain Name
- Changing a Private Domain Name
- Querying the Kernel Version
- Modifying Auto Scaling Policies
- Querying Auto Scaling Policies
- Pre-Checking Resources
- Querying DB Instances
- Querying Details of a DB Instance
- Querying Details of DB Instances in Batches
- Modifying a Recycling Policy
- Querying a Recycling Policy
- Querying Instances in the Recycle Bin
- Changing Node Names in Batches
- Querying Historical Records of Auto Scaling
- Setting a Policy for a Serverless DB Instance
- Changing the Failover Priority of a Read Replica
- Querying an EIP
-
Backup Management
- Configuring a Same-Region Backup Policy
- Creating a Manual Backup
- Querying Full Backups
- Querying an Automated Backup Policy
- Deleting a Manual Backup
- Restoring Data to the Original Instance or an Existing Instance
- Querying the Restoration Time Range
- Enabling or Disabling Encrypted Backup
- Checking Whether Encrypted Backup Is Enabled
- Querying Incremental Backups
- Configuring a Cross-Region Backup Policy
- Restoring Tables to a Specified Point in Time
- Querying Tables for Table-level Point-in-Time Recovery
-
Parameter Template Management
- Querying Parameter Templates
- Creating a Parameter Template
- Deleting a Parameter Template
- Obtaining Details About a Parameter Template
- Modifying Parameters in a Parameter Template
- Applying a Parameter Template
- Replicating a Parameter Template
- Comparing Parameter Templates
- Querying Instances That a Parameter Template Can Be Applied To
- Viewing Parameter Change History
- Obtaining Parameter Information of a Specified DB Instance
- Modifying Parameters of a Specified DB Instance
- Replicating the Parameter Template of a DB Instance
- Querying Application Records of a Parameter Template
- Quota Management
-
Database Proxy
- Creating a Proxy Instance
- Deleting a Proxy Instance
- Querying Proxy Instances
- Querying Proxy Instance Specifications
- Adding Proxy Nodes
- Deleting Proxy Nodes
- Changing the Specifications of a Proxy Instance
- Assigning Read Weights
- Changing the Routing Policy of a Proxy Instance
- Enabling or Disabling Transaction Splitting for a Proxy Instance
- Enabling or Disabling Automatic Association of New Nodes with Proxy Instances
- Changing Session Consistency of a Proxy Instance
- Changing the Connection Pool Type of a Proxy Instance
- Changing the Port of a Proxy Instance
- Upgrading the Kernel Version of a Proxy Instance
- Modifying the Name of a Proxy Instance
- Querying Access Control Settings of a Proxy Instance
- Querying the Minor Version of a Proxy Instance
- Modifying Parameters of a Proxy Instance
- Querying Kernel Parameters of a Proxy Instance
- Enabling or Disabling Access Control
- Configuring Access Control Rules
- Enabling or Disabling SSL for a Proxy Instance
- Rebooting a Proxy Instance
-
Log Management
- Enabling or Disabling SQL Explorer
- Querying Whether SQL Explorer Is Enabled
- Querying Slow Query Logs
- Querying Error Logs
- Obtaining the Temporary Link for Downloading Full SQL
- Querying LTS Configurations of an Instance
- Deleting LTS Configurations in Batches
- Creating LTS Configurations in Batches
- Querying Whether Show Original Log Is Enabled
- Enabling or Disabling Show Original Log
- Querying Slow Query Log Statistics
- Obtaining Links for Downloading Slow Query Logs
- Tag Management
- Database User Management
- Database Management
- Traffic Management
- Task Center
- Intelligent Diagnosis
-
HTAP (Standard Edition)
- Restoring a Data Synchronization Task for a StarRocks Instance
- Stopping a Data Synchronization Task for a StarRocks Instance
- Checking Table Configurations for HTAP Data Synchronization
- Creating a StarRocks Instance
- Querying a StarRocks Instance
- Deleting a StarRocks Instance
- Rebooting a StarRocks Instance
- Rebooting a StarRocks Node
- Checking StarRocks Resources
- Querying HTAP Engine Resources
- Obtaining the Storage Type of an HTAP Instance
- Querying Specifications of an HTAP Instance
- Querying HTAP Instances
- Creating a Data Synchronization Task for a StarRocks Instance
- Deleting a Data Synchronization Task for a StarRocks Instance
- Querying Data Synchronization Tasks of a StarRocks Instance
- Checking Database Configurations for HTAP Data Synchronization
- Querying Configurations of a StarRocks Data Synchronization Task
- Querying Database Parameter Settings for StarRocks Data Synchronization
- Querying Databases of a StarRocks Instance
- Querying Database Accounts
- Creating a Database Account
- Deleting a Database Account
- Changing the Password of a Database Account
- Changing Permissions of a Database Account
- Changing the Specifications of a StarRocks Instance
- Querying Parameters
- Modifying Parameters
- Enabling Assign Requests to Row and Column Store Nodes for a StarRocks Instance
- Comparing Parameters
- Multi-tenancy
-
APIs (Unavailable Soon)
- DB Engine Version Queries
- Database Specification Queries
-
Instance Management
- Creating a DB Instance
- Querying DB Instances
- Querying DB Instances
- Deleting a DB Instance
- Querying Details of a DB Instance
- Querying Details of DB Instances in Batches
- Querying Details of a DB Instance
- Creating a Read Replica
- Deleting a Read Replica
- Scaling up Storage of a Yearly/Monthly DB Instance
- Changing a DB Instance Name
- Resetting a Database Password
- Modifying DB Instance Specifications
- Backup Management
- Parameter Template Management
- Quota Management
- Database Proxy
- Log Management
- Task Information Queries
- Permissions Policies and Supported Actions
- Appendix
- SDK Reference
-
FAQs
- Product Consulting
- Resource Freezing, Unfreezing, Release, Deletion, and Unsubscription
-
Database Connections
- What Should I Do If I Can't Connect to My TaurusDB Instance?
- What Should I Do If an ECS Can't Connect to a TaurusDB Instance?
- Can an External Server Access a TaurusDB Instance?
- What Is the Maximum Number of Connections to a TaurusDB Instance?
- What Do I Do If There Are Too Many Database Connections?
- Are There Any Risks If There Are Too Many Connections to a TaurusDB Instance?
- What Should I Do If the Network Connectivity Test Fails?
- Can I Access a TaurusDB Instance over an Intranet Connection Across Regions?
- How Do I Check the Connections to a TaurusDB Instance?
- How Do I Enable Availability Detection for a Connection Pool in the Service Code?
- Client Installation
- Database Migration
- Database Permissions
-
Database Performance
- What Should I Do If the CPU Usage of My TaurusDB Instance Is High?
- How Do I Handle Slow SQL Statements Caused by Inappropriate Composite Index Settings?
- How Do I Handle a Large Number of Temporary Tables Being Generated for Long Transactions and High Memory Usage?
- What Should I Do If Locks on Long Transactions Block the Execution of Subsequent Transactions?
- How Can I Use Temporary Disks of TaurusDB?
- What Is the CPU Usage of a TaurusDB Instance with Empty Load?
-
Database Usage
- Why Are the Results Inconsistent After the MATCH AGAINST Statement Is Executed on Primary Nodes and Read Replicas, Respectively?
- How Do I Add Columns Using INSTANT?
- How Do I Use LOAD DATA to Import Local Data?
- How Do I Write Data to or Create Indexes for an Ultra-large Table?
- What Are the Risks of Deleting an Index from an Ultra-large Table?
- Backups
-
Database Parameter Modification
- How Do I Change the Time Zone of a TaurusDB Instance?
- How Do I Configure a Password Expiration Policy for TaurusDB Instances?
- How Do I Ensure that the Database Character Set of a TaurusDB Instance Is Correct?
- How Do I Use the utf8mb4 Character Set to Store Emojis in a TaurusDB Instance?
- How Do I Set Case Sensitivity for TaurusDB Table Names?
- Can I Use SQL Commands to Modify Global Parameters of My TaurusDB Instance?
- Network Security
-
Log Management
- Can I Enable general_log for TaurusDB?
- How Do I View All SQL Statements Executed by TaurusDB?
- How Do I Enable and View Binlog of My TaurusDB Instance?
- How Do I Change the Binlog Retention Period?
- How Do I View Deadlock Logs of TaurusDB?
- Why Are Slow SQL Statements Displayed in Slow Query Log Monitoring, but No Information About Them Is Displayed on the Slow Query Logs Tab Page in the Logs Module?
- Version Upgrade
-
Troubleshooting
-
Backup and Restoration Issues
- Insufficient Permissions During Data Export Using mysqldump
- How Do I Use mysqlbinlog to Obtain Binlog Files?
- Canal Fails to Parse Binlogs
- Precautions for Exporting Large Tables Through mysqldump
- Commands for Exporting Data Through mysqldump
- System Inaccessible After Field Addition to a Database Table
- SQL Statements Such as SET @@SESSION.SQL_LOG_BIN Displayed After You Run mysqldump
- Insufficient Privileges Reported for Canal
-
Connection Issues
- Login Failed After ssl_type of root Is Changed to ANY
- Failed to Connect to a DB Instance Using SSL
- Description of Each IP Address
- SSL Connection Failed Due to Inconsistent TLS Versions
- Error Message "connection established slowly"
- "Access denied" Displayed During Database Connection
- Failed to Connect to a Database Using mariadb-connector in SSL Mode
- Failed to Connect to a Database as User root
- Client Automatically Disconnected from a DB Instance
- Disconnection Occurs Every 45 Days Due to the istio-citadel Certificate Mechanism
-
SQL Issues
- Invalid TIMESTAMP Default Value during Table Creation
- Failed to Change the VARCHAR Length Due to the Index Length Limit
- Slow SQL Queries After a Large Amount of Data Is Deleted from a Large Table
- Error 1366 Reported When Data Containing Emojis Is Updated
- Slow Stored Procedure Execution Due to Inconsistent Collations
- ERROR [1412] Reported for a DB Instance
- Failed to Delete a Table with a Foreign Key
- Incorrect GROUP_CONCAT Results
- Error Message "Too many keys specified" Displayed When a Secondary Index Is Created
- DISTINCT and GROUP BY Optimization
- Equivalent Comparison Failures with Floating-Point Numbers
- A Large Number of SELECT Requests Routed to the Primary Instance After Database Proxy Is Enabled
- Tablespace Bloat
- ERROR 1396 Reported When a User Is Created
- Error Message Reported When alter table xxx discard/import tablespace Is Executed
- Native Error 1461 Reported by a DB Instance
- "Row size too large" Reported When a Table Failed to Be Created
- Duplicate Data Exists After ORDER BY LIMIT Is Executed
- Error Message Reported When select * from sys.innodb_lock_waits Is Executed
- Parameter-related Issues
-
Performance Issues
- High CPU Usage
- Out of Memory (OOM) Errors
- Tablespace Bloat
- Read Replica Uses Far More Storage Than the Primary Node
- Slow SQL Execution Due to Hot and Cold Data Problems
- Full Storage Caused by Complex Queries
- Slow Response Due to Deadlocks
- CPU Usage Increase
- CPU Resource Exhaustion Caused by Too Many Concurrent Slow Queries
-
Basic Issues
- How Do I View the Used Storage of My TaurusDB Instance?
- Renaming Databases and Tables
- Character Set and Collation Settings
- Auto-Increment Field Value Jump
- Starting Value and Increment of AUTO_INCREMENT
- Changing the AUTO_INCREMENT Value of a Table
- Failed to Insert Data Because Values for the Auto-increment Primary Key Field Reach the Upper Limit
- Auto-increment Field Values
- AUTO_INCREMENT Not Displayed in the Table Structure
- Impact of Creating an Empty Username
- No Scanned Rows Recorded in Slow Query Logs
- "handle_sync_msg_from_slave my_net_read error:-1" Displayed on the Error Logs Tab Page
- ERROR 1290 (HY000): The MySQL server is running with the --sql-replica-on option so it cannot execute this statement
- Videos
- General Reference
Connecting to an HTAP Instance for Complex OLAP Queries
An application can connect directly to an HTAP instance to run complex OLAP queries.
Operation Process
Step 1: Buy a Standard HTAP Instance
Step 2: Synchronize TaurusDB Instance Data to the Standard HTAP Instance
Prerequisites
- Parameters have been configured for a TaurusDB instance according to the following table.
Table 1 Parameter description

- default_authentication_plugin: mysql_native_password
- binlog_expire_logs_seconds: 86400
  NOTE: It is recommended that the binlog retention period be greater than one day (86,400s = 60 seconds x 60 minutes x 24 hours). This prevents incremental replication failures caused by a short binlog retention period.
- log_bin: ON
  NOTE: To use this parameter, ensure that the TaurusDB kernel version is earlier than 2.0.45.230900.
- rds_global_sql_log_bin: ON
  NOTE: To use this parameter, ensure that the TaurusDB kernel version is 2.0.45.230900 or later.
- binlog_format: ROW
  Run the SHOW VARIABLES; command to check the parameter value. If you need to change it, contact customer service.
- binlog_row_image: FULL
  Run the SHOW VARIABLES; command to check the parameter value. If you need to change it, contact customer service.
- log_bin_use_v1_row_events: OFF
  Run the SHOW VARIABLES; command to check the parameter value. If you need to change it, contact customer service.
- Databases and tables have been created for the TaurusDB instance.
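A quick way to confirm the prerequisites is to compare the SHOW VARIABLES output against the values in Table 1. The following is a minimal sketch (the helper function and the dict shape are assumptions for illustration, not a TaurusDB API); which of log_bin or rds_global_sql_log_bin applies depends on the kernel version:

```python
# Required TaurusDB parameter values for HTAP synchronization (from Table 1).
# log_bin / rds_global_sql_log_bin are kernel-version-dependent and not listed here.
REQUIRED = {
    "default_authentication_plugin": "mysql_native_password",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
    "log_bin_use_v1_row_events": "OFF",
}

def check_parameters(variables: dict) -> list:
    """Return the names of parameters whose current value does not match
    the required value; an empty list means the instance is ready."""
    mismatches = [
        name for name, expected in REQUIRED.items()
        if str(variables.get(name, "")).upper() != expected.upper()
    ]
    # Binlog retention should be at least one day (86,400 seconds).
    if int(variables.get("binlog_expire_logs_seconds", 0)) < 86400:
        mismatches.append("binlog_expire_logs_seconds")
    return mismatches
```

Feed the function the rows returned by SHOW VARIABLES (name-to-value) before creating the synchronization task.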
Procedure
Step 1: Buy a Standard HTAP Instance
- Log in to the management console.
- Click the icon in the upper left corner and select a region and project.
- Click the icon in the upper left corner of the page and choose Databases > TaurusDB.
- On the Instances page, locate a TaurusDB instance and click its name to access the Basic Information page.
- In the navigation pane, choose HTAP Analysis. On the displayed page, click Create HTAP Instance.
- In the DB Instance Information area, check the current TaurusDB instance information.
Figure 1 Checking TaurusDB instance information
- Set parameters for the HTAP instance.
Figure 2 Creating a standard HTAP instance
Table 2 Parameter description

- Billing Mode: Select Pay-per-use.
- HTAP Instance Type: Select Single or Cluster.
  - Single: There is only one FE node and one BE node. This type is intended only for feature trials and testing and does not come with an SLA.
  - Cluster: There are at least 3 and at most 10 FE or BE nodes.
- Storage Type: Select Extreme SSD or Ultra-high I/O.
  - Extreme SSD: uses a 25GE network and RDMA to provide up to 1 million random read/write IOPS per disk with low latency.
  - Ultra-high I/O: uses multi-disk striping to balance I/O load across multiple disks, improving read/write bandwidth. The maximum throughput is 1.7 GB/s.
- AZ Type: Only single-AZ is available.
- AZ: Select an AZ as needed.
- Time Zone: Select a time zone for your instance based on the region hosting it. The time zone is selected during instance creation and cannot be changed after the instance is created.
- Instance Specifications: Only general-enhanced is available.
- Backend Node Specifications: Select the BE node specifications. BE nodes store data and perform SQL computing.
- Backend Node Storage (GB): Select the storage for BE nodes. The default is 50 GB, and it can be expanded to up to 32,000 GB.
- Backend Nodes: A single-node instance has only one BE node. A cluster instance has 3 to 10 BE nodes; you can apply for a maximum of 10 nodes at a time.
- Frontend Node Specifications: Select the FE node specifications. FE nodes manage metadata and client connections, and plan and schedule queries.
- Frontend Node Storage (GB): Select the storage for FE nodes. The default is 50 GB, and it can be expanded to up to 1,000 GB.
- Frontend Nodes: A single-node instance has only one FE node. A cluster instance has 3 to 10 FE nodes; you can apply for a maximum of 10 nodes at a time.
- Administrator: The default username is root.
- Administrator Password: The password must consist of 8 to 32 characters and contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters (~!@#%^*-_=+?,()&$|.). Use a strong password and change it periodically to defend against threats such as brute-force cracking.
- Confirm Password: Enter the administrator password again.
- After configuration, click Next.
- Confirm the configuration and click Submit.
- On the HTAP instance list page, view and manage the HTAP instance.
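The administrator password rule above (8 to 32 characters, at least three of four character classes) can be checked locally before the form is submitted. A minimal sketch, with a hypothetical helper name:

```python
import re

# Special characters permitted by the rule described above.
SPECIALS = r"[~!@#%^*\-_=+?,()&$|.]"

def is_valid_admin_password(pw: str) -> bool:
    """Check the HTAP administrator password rule: 8-32 characters and
    at least three of: uppercase, lowercase, digit, special character."""
    if not 8 <= len(pw) <= 32:
        return False
    classes = (r"[A-Z]", r"[a-z]", r"[0-9]", SPECIALS)
    return sum(1 for c in classes if re.search(c, pw)) >= 3
```

This mirrors the console-side validation only as described in Table 2; the console remains authoritative.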
Step 2: Synchronize TaurusDB Instance Data to the Standard HTAP Instance
- On the Instances page, locate a TaurusDB instance and click its name to access the Basic Information page.
- In the navigation pane, choose HTAP Analysis.
- Click the name of an HTAP instance to access the Basic Information page.
- In the navigation pane, choose Data Synchronization. On the displayed page, click Create Synchronization Task.
- Configure required parameters.
Figure 3 Creating a synchronization task
- Currently, databases with Chinese names cannot be synchronized. The destination database name and task name cannot contain Chinese characters, and the destination database name must contain at least three characters.
- Synchronize Read Replica Data: Select Yes and then select a read replica. Full data is synchronized from the selected read replica, which prevents query load on the primary node during full synchronization. If there is only one read replica, it is selected by default. Ensure that the read replica remains available during full synchronization; otherwise, the synchronization fails and must be performed again.
- Synchronization Task Name: The name can contain 3 to 128 characters. Only letters, digits, and underscores (_) are allowed.
- Destination Database: The name can contain 3 to 128 characters. Only letters, digits, and underscores (_) are allowed.
- Database to be Synchronized: From the drop-down list, select the database whose data is to be synchronized. You can modify the database parameters as required.
Figure 4 Setting databases to be synchronized
- Synchronization Scope: Select All Tables or Some Tables.
- Blacklist and Whitelist: If Synchronization Scope is set to Some Tables, configure a blacklist or a whitelist for the selected tables.
NOTE:
- You can set either a blacklist or a whitelist. If you select the whitelist, only the tables in the whitelist are synchronized.
- The tables to be synchronized must contain a primary key or a non-null unique key; otherwise, they cannot be synchronized to the HTAP instance.
- Extra disk space may be used during backend data combination and query. You are advised to reserve 50% of the disk space for the system.
- When setting the table blacklist or whitelist, you can enter multiple tables in the search box at a time, separated by commas (,), spaces, or line breaks. After entering them, click the search icon. The tables are then selected by default and displayed in the Selected Table area.
- Configure Table Operations: Enable or disable it as required.
- If you select Enabled:
- Select a synchronized table on the left and perform operations on its columns. The operations include order by, key columns, distributed by, partition by, data_model, buckets, replication_num, and enable_persistent_index. Multiple operations are separated by semicolons (;).
For details about the syntax, see Table 3.
Table 3 Operation syntax

- order by: order by (column1, column2) or order by column1,column2
- key columns: key columns (column1, column2) or key columns column1,column2
- distributed by: distributed by (column1, column2) buckets 3
  NOTE: buckets is optional. If it is not set, the default value is used.
- partition by: There are expression partitions and list partitions. For details, see the partition syntax example.
- data_model: Specifies the table type. The value can be primary key, duplicate key, or unique key. Syntax: data_model=primary key, data_model=duplicate key, or data_model=unique key
- replication_num: replication_num=3
  NOTE: The value cannot exceed the number of BE nodes, or the verification fails.
- enable_persistent_index: Specifies whether to make the index persistent. Syntax: enable_persistent_index=true or enable_persistent_index=false
- Combined scenario: data_model=duplicate key;key columns column1, column2;
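The operation string entered in the console is a semicolon-separated list of the settings in Table 3. The following hypothetical parser sketch shows how such a string decomposes; the function and its behavior are illustrative, not the console's actual validation:

```python
# Prefixes of operations that are written without '=' (see Table 3).
PREFIX_OPS = ("key columns", "order by", "distributed by", "partition by")

def parse_operations(ops: str) -> dict:
    """Split a semicolon-separated operation string such as
    'data_model=duplicate key;replication_num=3' into a dict (sketch)."""
    result = {}
    for item in ops.split(";"):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            key, _, value = item.partition("=")
            result[key.strip()] = value.strip()
        else:
            # e.g. 'key columns column1, column2' or 'order by (c1, c2)'
            for prefix in PREFIX_OPS:
                if item.startswith(prefix):
                    result[prefix] = item[len(prefix):].strip(" ()")
                    break
    return result

# The combined scenario from Table 3 decomposes into two settings.
parsed = parse_operations("data_model=duplicate key;key columns column1, column2;")
```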
Partition syntax example:
You only need to set a partition expression (a time function expression or a column expression) when creating a table. During data import, an HTAP instance automatically creates partitions based on the data and the rule defined in the partition expression.
Partitioning based on a time function expression: If data is often queried and managed based on a continuous date range, you only need to specify a partition column of the date type (DATE or DATETIME) and a partition granularity (year, month, day, or hour) in the time function expression. An HTAP instance automatically creates partitions and sets the start time and end time of the partitions based on the imported data and partition expression.
Syntax:

PARTITION BY expression
[ PROPERTIES ( 'partition_live_number' = 'xxx' ) ]

expression ::= { date_trunc ( <time_unit> , <partition_column> )
               | time_slice ( <partition_column> , INTERVAL <N> <time_unit> [ , boundary ] ) }
Table 4 Parameter description

- expression (mandatory): Currently, only the date_trunc and time_slice functions are supported. If you use time_slice, you do not need to configure the boundary parameter, which can only be set to floor.
- time_unit (mandatory): Partition granularity. Currently, the value can only be hour, day, month, or year. If the partition granularity is hour, the partition column can only be of the DATETIME type.
- partition_column (mandatory): Partition column.
  - Only the date types (DATE or DATETIME) are supported. If date_trunc is used, the partition column can be of the DATE or DATETIME type; if time_slice is used, it can only be of the DATETIME type. The value of the partition column can be NULL.
  - If the partition column is of the DATE type, the value range is 0000-01-01 to 9999-12-31. If it is of the DATETIME type, the value range is 0000-01-01 01:01:01 to 9999-12-31 23:59:59.
  - Currently, only one partition column can be specified.
Example: If you often query data by day, use the partition expression date_trunc('day', event_day): set the partition column to event_day and the partition granularity to day during table creation. Data is then automatically partitioned by date when imported, and data of the same day is stored in the same partition. Partition pruning can significantly accelerate queries.
CREATE TABLE site_access1 (
    event_day DATETIME NOT NULL,
    site_id INT DEFAULT '10',
    city_code VARCHAR(100),
    user_name VARCHAR(32) DEFAULT '',
    pv BIGINT DEFAULT '0'
)
DUPLICATE KEY(event_day, site_id, city_code, user_name)
PARTITION BY date_trunc('day', event_day)
DISTRIBUTED BY HASH(event_day, site_id);
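The PARTITION BY date_trunc('day', event_day) clause above assigns each row to the partition of its truncated timestamp. A sketch of that mapping in Python (illustrative only, not the HTAP implementation):

```python
from datetime import datetime

def date_trunc(unit: str, ts: datetime) -> datetime:
    """Truncate a timestamp to the given partition granularity,
    mirroring the date_trunc semantics described above."""
    if unit == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {unit}")

# Rows with the same truncated value land in the same partition:
rows = [datetime(2022, 4, 1, 9, 30), datetime(2022, 4, 1, 23, 59), datetime(2022, 4, 2, 0, 0)]
partitions = {date_trunc("day", r) for r in rows}  # two distinct day partitions
```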
Partitioning based on the column expression: If you often query and manage data based on enumerated values, you only need to specify the column representing the type as the partition column. An HTAP instance automatically divides and creates partitions based on the partition column value of the imported data.
Syntax:

PARTITION BY expression
[ PROPERTIES ( 'partition_live_number' = 'xxx' ) ]

expression ::= ( <partition_columns> )
partition_columns ::= <column> [, <column> [, ...] ]
Table 5 Parameter description

- partition_columns (mandatory): Partition columns.
  - The value can be a character (BINARY is not supported), date, integer, or Boolean value. The value cannot be NULL.
  - A partition that is automatically created during import can contain only one value of each partition column. If a partition needs to contain multiple values of each partition column, use list partitioning.
Example: If you often query the equipment room billing details by date range and city, you can use a partition expression to specify the date and city columns as partition columns when creating a table. Data of the same date and city is then grouped into the same partition, and partition pruning can significantly accelerate queries.

CREATE TABLE t_recharge_detail1 (
    id bigint,
    user_id bigint,
    recharge_money decimal(32,2),
    city varchar(20) not null,
    dt varchar(20) not null
)
DUPLICATE KEY(id)
PARTITION BY (dt,city)
DISTRIBUTED BY HASH(`id`);
List partitioning
Data is partitioned based on a list of enumerated values that you explicitly define. You need to explicitly list the enumerated values contained in each list partition, and the values do not need to be consecutive.
List partitioning is suitable for columns with a small number of enumerated values, where data is queried and managed by those values. For example, a column may indicate a geographical location, status, or category, and each value represents an independent category. Partitioning data by the enumerated column values improves query performance and simplifies data management. List partitioning is especially suitable when a partition needs to contain multiple values of a partition column. For example, if the city column in a table indicates the city an individual is from and you often query and manage data by state and city, you can use the city column as the partition column for list partitioning and store the data of multiple cities in the same state in one partition, for example, PARTITION pCalifornia VALUES IN ("Los Angeles","San Francisco","San Diego"). This accelerates queries and data management.
NOTE:
Partitions must be created during table creation. Partitions cannot be automatically created during data import. If the table does not contain the partitions corresponding to the data, an error is reported.
Syntax:

PARTITION BY LIST (partition_columns) (
    PARTITION <partition_name> VALUES IN (value_list) [, ...]
)

partition_columns ::= <column> [, <column> [, ...] ]
value_list ::= value_item [, value_item [, ...] ]
value_item ::= { <value> | ( <value> [, <value> [, ...] ] ) }
Table 6 Parameter description

- partition_columns (mandatory): Partition columns. The value can be a character (except BINARY), date (DATE or DATETIME), integer, or Boolean value. The value cannot be NULL.
- partition_name (mandatory): Partition name. You are advised to use descriptive partition names to distinguish the data categories in different partitions.
- value_list (mandatory): List of enumerated values of the partition columns in a partition.
Example 1: If you often query the equipment room billing details by state or city, you can specify the city column as the partition column and specify that the cities in each partition belong to the same state. In this way, you can quickly query data of a specific state or city and manage data by state or city.
CREATE TABLE t_recharge_detail2 (
    id bigint,
    user_id bigint,
    recharge_money decimal(32,2),
    city varchar(20) not null,
    dt varchar(20) not null
)
DUPLICATE KEY(id)
PARTITION BY LIST (city) (
    PARTITION pCalifornia VALUES IN ("Los Angeles","San Francisco","San Diego"), -- These cities belong to the same state.
    PARTITION pTexas VALUES IN ("Houston","Dallas","Austin")
)
DISTRIBUTED BY HASH(`id`);
Example 2: If you often query the equipment room billing details by date range and state or city, you can specify the date and city as the partition columns when creating a table. In this way, data of a specific date and a specific state or city is grouped into the same partition, to accelerate queries and data management.
CREATE TABLE t_recharge_detail4 (
    id bigint,
    user_id bigint,
    recharge_money decimal(32,2),
    city varchar(20) not null,
    dt varchar(20) not null
)
ENGINE=OLAP
DUPLICATE KEY(id)
PARTITION BY LIST (dt,city) (
    PARTITION p202204_California VALUES IN (
        ("2022-04-01", "Los Angeles"),
        ("2022-04-01", "San Francisco"),
        ("2022-04-02", "Los Angeles"),
        ("2022-04-02", "San Francisco")
    ),
    PARTITION p202204_Texas VALUES IN (
        ("2022-04-01", "Houston"),
        ("2022-04-01", "Dallas"),
        ("2022-04-02", "Houston"),
        ("2022-04-02", "Dallas")
    )
)
DISTRIBUTED BY HASH(`id`);
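Routing a row to a list partition amounts to finding the partition whose value list contains the row's partition-column value. An illustrative sketch for the single-column t_recharge_detail2 layout (the helper function is hypothetical):

```python
# Value lists of the list partitions defined for t_recharge_detail2 above.
PARTITIONS = {
    "pCalifornia": {"Los Angeles", "San Francisco", "San Diego"},
    "pTexas": {"Houston", "Dallas", "Austin"},
}

def route(city: str) -> str:
    """Return the partition whose value list contains the city. Raising on
    a miss mirrors the error described in the NOTE above: data without a
    matching partition cannot be imported."""
    for name, cities in PARTITIONS.items():
        if city in cities:
            return name
    raise ValueError(f"no partition contains value {city!r}")
```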
- After entering the statement for performing column operations on the table, click Verify on the right of the area.
- If you select Disabled, go to 6.
- After the settings are complete, click Create Synchronization Task.
- Confirm the settings and click Sync Now.
NOTE:
If you click Previous on the page or click the back icon in the upper left corner to return to the data synchronization page, a synchronization task is still generated. The task status is Synchronization Stage: Waiting for synchronization. To start the task, click Synchronize in the Operation column.
- Click Back to Synchronization List to return to the data synchronization page. View details about the task name, source database, destination database, status, and operations.
Figure 5 Viewing task status
NOTE:
To synchronize data from a TaurusDB instance to a standard HTAP instance, you need to go through two stages: full synchronization and incremental synchronization.
During full synchronization, there are certain constraints on the TaurusDB instance:
- Before full synchronization, estimate the time required for data synchronization and adjust the retention period of TaurusDB binlogs accordingly. This prevents task failures caused by binlog position deletion during incremental synchronization. You are advised to set the binlog retention period to at least one day.
- During full synchronization, do not perform DDL operations, especially time-consuming ones, on the source database. Otherwise, the task may be abnormal.
- Keep the primary node and read replicas of the TaurusDB instance stable and avoid operations such as primary/standby switchovers and reboots. Otherwise, the task may be abnormal, or OLTP workloads may be affected due to heavy query load on the new primary node.
- During full synchronization, keep the standard HTAP instance stable and do not reboot it. Otherwise, the full synchronization task may restart.
If the task status is Synchronization Stage: Incremental synchronization in progress, the full synchronization is complete and the incremental synchronization begins.
Step 3: Connect to the HTAP Instance for OLAP Queries
For details about how to connect to a standard HTAP instance and perform OLAP queries, see Connecting to a Standard HTAP Instance Through JDBC.