Creating and Managing DWS Partitioned Tables
Partitioning refers to splitting what is logically one large table into smaller physical pieces based on specific schemes. The table based on the logic is called a partition cable, and a physical piece is called a partition. Data is stored on these smaller physical pieces, namely, partitions, instead of the larger logical partitioned table. During conditional query, the system scans only the partitions that meet the conditions rather than scanning the entire table improving query performance.
For details about the syntax for maintaining partitioned tables, see ALTER TABLE PARTITION.
Advantages of partitioned tables:
- Improved query performance. You can search in specific partitions, improving the search efficiency.
- Enhanced availability. If a partition is faulty, data in other partitions is still available.
- Improved maintainability. For expired historical data that needs to be periodically deleted, you can quickly delete it by dropping or truncate partitions.
Supported Table Partition Types
- Range partitioning: partitions are created based on a numeric range, for example, by date or price range.
- List partitioning: partitions are created based on a list of values, such as sales scope or product attribute. Only clusters of 8.1.3 and later versions support this function.
Choosing to Partition a Table
You can choose to partition a table when the table has the following characteristics:
- There are obvious ranges among the fields of the table.
A table is partitioned based on obvious rangeable fields. Generally, columns such as date, area, and value are used for partitioning. The time column is most commonly used.
- Queries to the table have obvious range characteristics.
If the queried data fall into specific ranges, its better tables are partitioned so that through partition pruning, only the queried partition needs to be scanned, improving data scanning efficiency and reducing the I/O overhead of data scanning.
- The table contains a large amount of data.
Scanning small tables does not take much time, therefore the performance benefits of partitioning are not significant. Therefore, you are advised to partition only large tables. In column-store table, each column is an independent file storage unit, and the minimum storage unit CU can store 60,000 rows of data. Therefore, for column-store partitioned tables, it is recommended that the data volume in each partition be greater than or equal to the number of DNs multiplied by 60,000.
Creating a Range Partitioned Table
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
CREATE TABLE web_returns_p1 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE (wr_returned_date_sk) ( PARTITION p2016 VALUES LESS THAN(20161231), PARTITION p2017 VALUES LESS THAN(20171231), PARTITION p2018 VALUES LESS THAN(20181231), PARTITION p2019 VALUES LESS THAN(20191231), PARTITION pxxxx VALUES LESS THAN(maxvalue) ); |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE web_returns_p2 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE(wr_returned_date_sk) ( PARTITION p2016 START(20161231) END(20191231) EVERY(10000), PARTITION p0 END(maxvalue) ); |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CREATE TABLE web_returns_p2 ( id integer, idle numeric, IO numeric, scope text, IP text, time timestamp ) WITH (TTL='7 days',PERIOD='1 day') PARTITION BY RANGE(time) ( PARTITION P1 VALUES LESS THAN('2022-01-05 16:32:45'), PARTITION P2 VALUES LESS THAN('2022-01-06 16:56:12') ); |
Creating a List Partitioned Table
A list partitioned table can use any column that allows value comparison as the partition key column. When creating a list partitioned table, you must declare the value partition for each partition.
Example: Create a list partitioned table sales_info.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE sales_info ( sale_time timestamptz, period int, city text, price numeric(10,2), remark varchar2(100) ) DISTRIBUTE BY HASH(sale_time) PARTITION BY LIST (period, city) ( PARTITION province1_202201 VALUES (('202201', 'city1'), ('202201', 'city2')), PARTITION province2_202201 VALUES (('202201', 'city3'), ('202201', 'city4'), ('202201', 'city5')) ); |
Partitioning an Existing Table
A table can be partitioned only when it is created. If you want to partition a table, you must create a partitioned table, load the data in the original table to the partitioned table, delete the original table, and rename the partitioned table as the name of the original table. You must also re-grant permissions on the table to users. For example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
CREATE TABLE web_returns_p2 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE(wr_returned_date_sk) ( PARTITION p2016 START(20161231) END(20191231) EVERY(10000), PARTITION p0 END(maxvalue) ); |
1 2 3 4 5 |
INSERT INTO web_returns_p2 SELECT * FROM web_returns_p1; DROP TABLE web_returns_p1; ALTER TABLE web_returns_p2 RENAME TO web_returns_p1; GRANT ALL PRIVILEGES ON web_returns_p1 TO dbadmin; GRANT SELECT ON web_returns_p1 TO jack; |
Adding a Partition
Run the ALTER TABLE statement to add a partition to a partitioned table. For example, to add partition P2020 to the web_returns_p1 table, run the following command:
1
|
ALTER TABLE web_returns_p1 ADD PARTITION P2020 VALUES LESS THAN (20201231); |
Add a partition to a list partitioned table. For example, add the province3_202201 partition to the sales_info table.
1
|
ALTER TABLE sales_info ADD PARTITION province3_202201 VALUES (('202201', 'city6'), ('202201', 'city7')); |
Splitting a Partition
The syntax for splitting a partition varies between a range partitioned table and a list partitioned table.
- Run the ALTER TABLE statement to split a partition in a range partitioned table. For example, the partition pxxxx of the table web_returns_p1 is split into two partitions p2020 and p20xx at the splitting point 20201231.
1
ALTER TABLE web_returns_p1 SPLIT PARTITION pxxxx AT(20201231) INTO (PARTITION p2020,PARTITION p20xx);
- Run the ALTER TABLE statement to split a partition in a list partitioned table. For example, split the partition province2_202201 of table sales_inf into two partitions province3_202201 and province4_202201.
1
ALTER TABLE sales_info SPLIT PARTITION province2_202201 VALUES(('202201', 'city5')) INTO (PARTITION province3_202201,PARTITION province4_202201);
Merging Partitions
Run the ALTER TABLE statement to merge two partitions in a partitioned table. For example, merge partitions p2016 and p2017 of table web_returns_p1 into one partition p20162017.
1
|
ALTER TABLE web_returns_p1 MERGE PARTITIONS p2016,p2017 INTO PARTITION p20162017; |
Deleting a Partition
1
|
ALTER TABLE web_returns_p1 DROP PARTITION P2020; |
Querying a Partition
- Query partition p2019.
1 2
SELECT * FROM web_returns_p1 PARTITION (p2019); SELECT * FROM web_returns_p1 PARTITION FOR (20201231);
- View partitioned tables using the system catalog dba_tab_partitions.
1
SELECT * FROM dba_tab_partitions where table_name='web_returns_p1';
Deleting a Partitioned Table
Run the DROP TABLE statement to delete a partitioned table.
1
|
DROP TABLE web_returns_p1; |
Setting Whether a Partitioned Index Is Available
Create the local index student_grade_index for the partitioned table customer_address and set partition index names.
1 2 3 4 5 6 |
CREATE INDEX customer_address_index ON customer_address(ca_address_id) LOCAL ( PARTITION P1_index, PARTITION P2_index, PARTITION P3_index ); |
Rebuild all indexes on partition P1 in the partitioned table customer_address.
1
|
ALTER TABLE customer_address MODIFY PARTITION P1 REBUILD UNUSABLE LOCAL INDEXES; |
Set all indexes in partition P3 of the partitioned table customer_address to be unusable.
1
|
ALTER TABLE customer_address MODIFY PARTITION P3 UNUSABLE LOCAL INDEXES; |
Rebuilding Partition Indexes
For a partitioned table that has been running for a long time, indexes may generate fragments. Rebuilding indexes can improve query efficiency. The following describes how to rebuild indexes after data is inserted into partitions.
- Creates a partitioned table.
1 2 3 4 5 6 7 8 9 10 11
DROP TABLE IF EXISTS sales; CREATE TABLE sales ( id INT, sale_date DATE, amount DECIMAL(10,2) ) PARTITION BY RANGE (sale_date) ( PARTITION p202310 VALUES LESS THAN ('2023-11-01'), PARTITION p202311 VALUES LESS THAN ('2023-12-01') );
- Create an index.
1
CREATE INDEX idx_sale_date ON sales (sale_date) LOCAL;
- Insert partition data of October and November.
1 2 3 4 5 6 7
INSERT INTO sales PARTITION (p202310) VALUES (1, '2023-10-05', 100.50), (2, '2023-10-10', 200.75); INSERT INTO sales PARTITION (p202311) VALUES (3, '2023-11-15', 300.00);
- Rebuild indexes of the specific partition p202310.
1
ALTER TABLE sales REBUILD PARTITION p202310 WITHOUT UNUSABLE;
Exchanging Partitions
The following example demonstrates how to migrate data from table math_grade to partition math in partitioned table student_grade.
- Create the partitioned table student_grade.
1 2 3 4 5 6 7 8 9 10 11 12 13
CREATE TABLE student_grade ( stu_name char(5), stu_no integer, grade integer, subject varchar(30) ) PARTITION BY LIST(subject) ( PARTITION gym VALUES('gymnastics'), PARTITION phys VALUES('physics'), PARTITION history VALUES('history'), PARTITION math VALUES('math') );
- Add data to the partitioned table student_grade.
1 2 3 4 5 6 7
INSERT INTO student_grade VALUES ('Ann', 20220101, 75, 'gymnastics'), ('Jeck', 20220103, 60, 'math'), ('Anna', 20220108, 56, 'history'), ('Jann', 20220107, 82, 'physics'), ('Molly', 20220104, 91, 'physics'), ('Sam', 20220105, 72, 'math');
- Query the records of partition math in student_grade.
1
SELECT * FROM student_grade PARTITION (math);
The query result is as follows:
stu_name | stu_no | grade | subject ----------+----------+-------+--------- Jeck | 20220103 | 60 | math Sam | 20220105 | 72 | math (2 rows)
- Create an ordinary table math_grade that matches the definition of the partitioned table student_grade.
1 2 3 4 5 6 7
CREATE TABLE math_grade ( stu_name char(5), stu_no integer, grade integer, subject varchar(30) );
- Insert data to the math_grade table. The data in the student_grade partitioned table conforms to the partition rule of partition math.
1 2 3 4 5
INSERT INTO math_grade VALUES ('Ann', 20220101, 75, 'math'), ('Jeck', 20220103, 60, 'math'), ('Anna', 20220108, 56, 'math'), ('Jann', 20220107, 82, 'math');
- Migrate data from the ordinary table math_grade to partition math in the partitioned table student_grade.
1
ALTER TABLE student_grade EXCHANGE PARTITION (math) WITH TABLE math_grade;
- Query the student_grade partitioned table. The results show that data in the math_grade table has exchanged with data in the math partition of the student_grade table.
1
SELECT * FROM student_grade PARTITION (math);
1 2 3 4 5 6 7
stu_name | stu_no | grade | subject ----------+----------+-------+--------- Anna | 20220108 | 56 | math Jeck | 20220103 | 60 | math Ann | 20220101 | 75 | math Jann | 20220107 | 82 | math (4 rows)
- Query the math_grade table. The result shows that the data stored in the math partition of the student_grade partitioned table has been exchanged to the math_grade table.
1
SELECT * FROM math_grade;
1 2 3 4 5
stu_name | stu_no | grade | subject ----------+----------+-------+--------- Jeck | 20220103 | 60 | math Sam | 20220105 | 72 | math (2 rows)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot