Cutting Partition Maintenance Costs for the E-commerce and IoT Industries by Leveraging GaussDB(DWS)'s Automatic Partition Management Feature
Scenarios
For partition tables whose partition columns are time, the automatic partition management function can be added to automatically create partitions and delete expired partitions, reducing partition table maintenance costs and improving query performance. To facilitate data query and maintenance, the time column is often used as the partition column of a partitioned table that stores time-related data, such as e-commerce order information and real-time IoT data. When the time-related data is imported to a partitioned table, the table should have partitions of the corresponding time ranges. Common partition tables do not automatically create new partitions or delete expired partitions. Therefore, maintenance personnel need to periodically create new partitions and delete expired partitions, leading to increased O&M costs.
GaussDB(DWS) has introduced an automatic partition management feature to address this issue. You can set the table-level parameters period and ttl to enable the automatic partition management function, which automatically creates partitions and deletes expired partitions, reducing partition table maintenance costs and improving query performance.
period: interval for automatically creating partitions. The default value is 1 day. The value range is 1 hour ~ 100 years.
ttl: time for automatically eliminate partitions. The value range is 1 hour ~ 100 years. Partition elimination occurs when nowtime - Partition boundary > ttl, resulting in the removal of qualifying partitions.
- Automatic partition creation
One or more partitions are automatically created at the interval specified by period to make the maximum partition boundary time greater than nowTime + 30 x period. As long as there is an automatically created partition, real-time data will not fail to be imported within the next 30 periods.
Figure 1 Automatic partition creation
- Automatically deleting expired partitions
Partitions whose boundary time is earlier than nowTime-ttl are considered expired partitions. The automatic partition management function traverses all partitions and deletes expired partitions after each period. If all partitions are expired partitions, the system retains one partition and truncates the table.
Constraints
- It cannot be used on midrange servers, acceleration clusters, or stand-alone clusters.
- It can be used in clusters of version 8.1.3 or later.
- It can only be used for row-store range partitioned tables, column-store range partitioned tables, time series tables, and cold and hot tables.
- The partition key must be unique and its type must be timestamp, timestamptz, or date.
- The maxvalue partition is not supported.
- The value of (nowTime - boundaryTime)/period must be less than the maximum number of partitions. nowTime indicates the current time, and boundaryTime indicates the earliest partition boundary time.
- The values of period and ttl range from 1 hour to 100 years. In addition, in a database compatible with Teradata or MySQL, if the partition key type is date, the value of period cannot be less than 1day.
- The table-level parameter ttl cannot exist independently. You must set period in advance or at the same time, and the value of ttl must be greater than or equal to that of period.
- During online cluster scale-out, partitions cannot be automatically added. Partitions reserved each time partitions are added will ensure that services are not affected.
Creating an ECS
For details, see Purchasing an ECS. After purchasing an ECS, log in to the ECS by referring to Logging In to a Linux ECS.
When creating an ECS, ensure that the ECS is in the same region, AZ, and VPC subnet as the stream data warehouse. Select the OS used by the gsql client (CentOS 7.6 is used as an example) as the ECS OS, and select using passwords to log in.
Creating a cluster
- Log in to the Huawei Cloud management console.
- Choose Service List > Analytics > Data Warehouse Service. On the page that is displayed, click Create Cluster in the upper right corner.
- Configure the parameters according to Table 1.
Table 1 Software configuration Parameter
Configuration
Region
Select the CN-Hong Kong region.
NOTE:CN-Hong Kong is used as an example. You can select other regions as required. Ensure that all operations are performed in the same region.
AZ
AZ2
Product
Standard data warehouse
CPU Architecture
x86
Node Flavor
dws2.m6.4xlarge.8 (16 vCPUs | 128 GB | 2000 GB SSD)
NOTE:If this flavor is sold out, select other AZs or flavors.
Nodes
3
Cluster Name
dws-demo
Administrator Account
dbadmin
Administrator Password
N/A
Confirm Password
N/A
Database Port
8000
VPC
vpc-default
Subnet
subnet-default(192.168.0.0/24)
Security Group
Automatic creation
EIP
Buy now
Bandwidth
1Mbit/s
Advanced Settings
Default
- Confirm the information, click Next, and then click Submit.
- Wait about 6 minutes. After the cluster is created, click next to the cluster name. On the displayed cluster information page, record the value of Public Network Address, for example, dws-demov.dws.huaweicloud.com.
Using the gsql CLI Client to Connect to a Cluster
- Remotely log in to the Linux server where gsql is to be installed as user root, and run the following command in the Linux command window to download the gsql client:
1
wget https://obs.ap-southeast-1.myhuaweicloud.com/dws/download/dws_client_8.1.x_redhat_x64.zip --no-check-certificate
- Decompress the client.
1
cd <Path_for_storing_the_client> unzip dws_client_8.1.x_redhat_x64.zip
Where,
- <Path_for_storing_the_client>: Replace it with the actual path.
- dws_client_8.1.x_redhat_x64.zip: This is the client tool package name of RedHat x64. Replace it with the actual name.
- Configure the GaussDB(DWS) client.
1
source gsql_env.sh
If the following information is displayed, the gsql client is successfully configured:
1
All things done.
- Use the gsql client to connect to a GaussDB(DWS) database (using the password you defined when creating the cluster).
1
gsql -d gaussdb -p 8000 -h 192.168.0.86 -U dbadmin -W password -r
If the following information is displayed, the connection succeeded:
1
gaussdb=>
Automatic partition management
The partition management function is bound to the table-level parameters period and ttl. Automatic partition creation is enabled with the enabling of period, and automatic partition deletion is enabled with the enabling of ttl. 30 seconds after period or ttl is set, the automatic partition creation or deletion works for the first time.
You can enable the partition management function in either of the following ways:
- Specify period and ttl when creating a table.
This way is applicable when you create a partition management table. There are two syntaxes for creating a partition management table. One specifies partitions, and the other does not.
If partitions are specified when a partition management table is created, the syntax rules are the same as those for creating a common partitioned table. The only difference is that the syntax specifies the table-level parameters period and ttl.
The following example shows how to create a partition management table CPU1 and specify partitions.
1 2 3 4 5 6 7 8 9 10
CREATE TABLE CPU1( id integer, IP text, time timestamp ) with (TTL='7 days',PERIOD='1 day') partition by range(time) ( PARTITION P1 VALUES LESS THAN('2023-02-13 16:32:45'), PARTITION P2 VALUES LESS THAN('2023-02-15 16:48:12') );
When creating a partition management table, you can specify only the partition key but not partitions. In this case, two default partitions will be created with period as the partition time range. The boundary time of the first default partition is the first hour, day, week, month, or year past the current time. The time unit is selected based on the maximum unit of PERIOD. The boundary time of the second default partition is the boundary time of the first partition plus PERIOD. Assume that the current time is 2023-02-17 16:32:45, and the boundary of the first default partition is described in the following table.
Table 2 Description of the period parameter period
Maximum PERIOD Unit
Boundary of First Default Partition
1hour
Hour
2023-02-17 17:00:00
1day
Day
2023-02-18 00:00:00
1month
Month
2023-03-01 00:00:00
13months
Year
2024-01-01 00:00:00
Run the following command to create the partition management table CPU2 with no partitions specified:
1 2 3 4 5 6
CREATE TABLE CPU2( id integer, IP text, time timestamp ) with (TTL='7 days',PERIOD='1 day') partition by range(time);
- Run the ALTER TABLE RESET command to set period and ttl.
This method is used to add the partition management function to an ordinary partitioned table that meets the partition management constraints.
- Run the following command to create an ordinary partition table CPU3:
1 2 3 4 5 6 7 8 9 10
CREATE TABLE CPU3( id integer, IP text, time timestamp ) partition by range(time) ( PARTITION P1 VALUES LESS THAN('2023-02-14 16:32:45'), PARTITION P2 VALUES LESS THAN('2023-02-15 16:56:12') );
- To enable the automatic partition creation and deletion functions, run the following command:
1
ALTER TABLE CPU3 SET (PERIOD='1 day',TTL='7 days');
- To enable only the automatic partition creation function, run the following command:
1
ALTER TABLE CPU3 SET (PERIOD='1 day');
- To enable only the automatic partition deletion function, run the following command (If automatic partition creation is not enabled in advance, the operation will fail):
1
ALTER TABLE CPU3 SET (TTL='7 days');
- Modify the period and ttl parameters to modify the partition management function.
1
ALTER TABLE CPU3 SET (TTL='10 days',PERIOD='2 days');
- Run the following command to create an ordinary partition table CPU3:
- Disabling the partition management function
You can run the ALTER TABLE RESET command to delete the table-level parameters period and ttl to disable the partition management function.
- The period cannot be deleted separately with TTL.
- The time series table does not support ALTER TABLE RESET.
- Run the following command to disable the automatic partition creation and deletion functions:
1
ALTER TABLE CPU1 RESET (PERIOD,TTL);
- To disable only the automatic partition deletion, run the following command:
1
ALTER TABLE CPU3 RESET (TTL);
- To disable only the automatic partition creation function, run the following command (If the table contains the ttl parameter, the operation will fail):
1
ALTER TABLE CPU3 RESET (PERIOD);
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot