CREATE TABLE PARTITION
Description
Creates a partitioned table. Partitioning refers to splitting what is logically one large table into smaller physical pieces based on specific schemes. The table based on the logic is called a partitioned table, and each physical piece is called a partition. Data is stored on these physical partitions, instead of the logical partitioned table.
The common forms of partitioning include range partitioning, interval partitioning, hash partitioning, list partitioning, and value partitioning. Currently, row-store tables support range partitioning, hash partitioning, list partitioning.
In range partitioning, a table is partitioned based on ranges defined by one or more columns, with no overlap between the ranges of values assigned to different partitions. Each range has a dedicated partition for data storage.
The partitioning policy for range partitioning refers to how data is inserted into partitions.
In range partitioning, a table is partitioned based on partition key values. If a record can be mapped to a partition, it is inserted into the partition; if it cannot, an error message is returned. Range partitioning is the most commonly used partitioning policy.
In hash partitioning, a modulus and a remainder are specified for each partition based on a column in the table, and records to be inserted into the table are allocated to the corresponding partition, the rows in each partition must meet the following condition: The value of the partition key divided by the specified modulus generates the remainder specified for the partition key.
In hash partitioning, table is partitioned based on partition key values. If a record can be mapped to a partition, it is inserted into the partition; if it cannot, an error message is returned.
List partitioning is to allocate the records to be inserted into a table to the corresponding partition based on the key values in each partition. The key values do not overlap in different partitions. Create a partition for each group of key values to store corresponding data.
In list partitioning, table is partitioned based on partition key values. If a record can be mapped to a partition, it is inserted into the partition; if it cannot, an error message is returned.
Partitioning can provide several benefits:
- Query performance can be improved drastically in certain situations, particularly when most of the heavily accessed rows of the table are in a single partition or a small number of partitions. Partitioning narrows the range of data search and improves data access efficiency.
- In the case of an INSERT or UPDATE operation on most portions of a single partition, performance can be improved by taking advantage of continuous scan of that partition instead of partitions scattered across the whole table.
- Frequent loading or deletion operations on records in a separate partition can be accomplished by reading or removing that partition. It also avoids the VACUUM overload caused by bulk DELETE operations.
Precautions
- If the constraint key of the unique constraint and primary key constraint contains all partition keys, a local index is created for the constraints. Otherwise, a global index is created.
- Currently, hash partitioning supports only single-column partition keys, and does not support multi-column partition keys.
- In the PARTITION FOR (values) syntax for partitioned tables, values can only be constants.
- In the PARTITION FOR (values) syntax for partitioned tables, if data type conversion is required for values, you are advised to use forcible type conversion to prevent the implicit type conversion result from being inconsistent with the expected result.
- The maximum number of partitions is 1048575. Generally, it is impossible to create so many partitions, because too many partitions may cause insufficient memory. Create partitions based on the value of local_syscache_threshold. The memory used by the partitioned tables is about (number of partitions x 3/1024) MB. Theoretically, the memory occupied by the partitions cannot be greater than the value of local_syscache_threshold. In addition, some space must be reserved for other functions.
- If the memory is insufficient due to too many partitions, the performance deteriorates sharply.
- Currently, the statement specifying a partition cannot perform global index scan.
- Data of the XML type cannot be used as partition keys or level-2 partition keys.
- When UPDATE or DELETE is performed on a partitioned table, if the generated plan is not an FQS or Stream plan, the statement execution efficiency is low. You are advised to check statements and eliminate factors that cannot be pushed down to generate FQS or Stream plans.
Syntax
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
CREATE TABLE [ IF NOT EXISTS ] partition_table_name ( [ { column_name data_type [ COLLATE collation ] [ column_constraint [ ... ] ] | table_constraint | LIKE source_table [ like_option [...] ] } [, ... ] ] ) [ WITH ( {storage_parameter = value} [, ... ] ) ] [ COMPRESS | NOCOMPRESS ] [ TABLESPACE tablespace_name ] [ DISTRIBUTE BY { REPLICATION | HASH ( column_name [, ...] ) | RANGE ( column_name [, ...] ) { SLICE REFERENCES tablename | ( slice_less_than_item [, ...] ) | ( slice_start_end_item [, ...] ) } | LIST ( column_name [, ...] ) { SLICE REFERENCES tablename | ( slice_values_item [, ...] ) } } ] [ TO { GROUP groupname | NODE ( nodename [, ... ] ) } ] PARTITION BY { {RANGE [COLUMNS] (partition_key) [ PARTITIONS integer ] ( partition_less_than_item [, ... ] )} | {RANGE [COLUMNS] (partition_key) [ PARTITIONS integer ] ( partition_start_end_item [, ... ] )} | {LIST [COLUMNS] (partition_key) [ PARTITIONS integer ] ( PARTITION partition_name VALUES [IN] (list_values) [TABLESPACE [=] tablespace_name][, ... ])} | { HASH (partition_key) [ PARTITIONS integer ] ( PARTITION partition_name [TABLESPACE [=] tablespace_name][, ... ])} } [ { ENABLE | DISABLE } ROW MOVEMENT ]; |
- column_constraint is as follows:
1 2 3 4 5 6 7 8
[ CONSTRAINT constraint_name ] { NOT NULL | NULL | CHECK ( expression ) | DEFAULT default_expr | UNIQUE [KEY] [ index_parameters ] | PRIMARY KEY [ index_parameters] } [ DEFERRABLE | NOT DEFERRABLE ][ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
- table_constraint is as follows:
[ CONSTRAINT constraint_name ] { CHECK ( expression ) | UNIQUE ( column_name [, ... ] ) [ index_parameters ] | PRIMARY KEY ( column_name [, ... ] ) [ index_parameters]} [ DEFERRABLE | NOT DEFERRABLE ][ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]
- like_option is as follows:
1
{ INCLUDING | EXCLUDING } { DEFAULTS | CONSTRAINTS | INDEXES | STORAGE | COMMENTS | RELOPTIONS | DISTRIBUTION | ALL }
- index_parameters is as follows:
1 2
[ WITH ( {storage_parameter = value} [, ... ] ) ] [ USING INDEX TABLESPACE tablespace_name ]
- partition_less_than_item:
1
PARTITION partition_name VALUES LESS THAN ( { partition_value | MAXVALUE } ) [TABLESPACE tablespace_name]
- partition_start_end_item:
1 2 3 4 5 6
PARTITION partition_name { {START(partition_value) END (partition_value) EVERY (interval_value)} | {START(partition_value) END ({partition_value | MAXVALUE})} | {START(partition_value)} | {END({partition_value | MAXVALUE})} } [TABLESPACE tablespace_name]
Parameters
- IF NOT EXISTS
Sends a notice instead of throwing an error, if a table with the same name exists.
- partition_table_name
Specifies the name of a partitioned table.
Value range: a string. It must comply with the naming convention.
- column_name
Specifies the name of a column to be created in the new table.
Value range: a string. It must comply with the naming convention.
- data_type
Specifies the data type of the column.
- COLLATE collation
Assigns a collation to the column (which must be of a collatable data type). If no collation is specified, the default collation is used.
- CONSTRAINT constraint_name
Specifies the name of a column or table constraint. The optional constraint clauses specify constraints that new or updated rows must satisfy for an INSERT or UPDATE operation to succeed. You can run the SELECT * FROM pg_collation command to query collation rules from the pg_collation system catalog. The default collation rule is the row starting with default in the query result.
There are two ways to define constraints:
- A column constraint is defined as part of a column definition, and it is bound to a particular column.
- A table constraint is not bound to a particular column but can apply to more than one column.
- LIKE source_table [ like_option ... ]
Specifies a table from which the new table automatically copies all column names, their data types, and their not-null constraints.
Unlike INHERITS, the new table and original table are decoupled after creation is complete. Changes to the original table will not be applied to the new table, and it is not possible to include data of the new table in scans of the original table.
Default expressions for the copied column definitions will be copied only if INCLUDING DEFAULTS is specified. The default behavior is to exclude default expressions, resulting in the copied columns in the new table having default values NULL.
Not-null constraints are always copied to the new table. CHECK constraints will only be copied if INCLUDING CONSTRAINTS is specified; other types of constraints will never be copied. These rules also apply to column constraints and table constraints.
Unlike those of INHERITS, columns and constraints copied by LIKE are not merged with similarly named columns and constraints. If the same name is specified explicitly or in another LIKE clause, an error is reported.
- Any indexes on the original table will not be created on the new table, unless the INCLUDING INDEXES clause is specified.
- STORAGE settings for the original columns are copied only if INCLUDING STORAGE is specified. The default behavior is to exclude STORAGE settings.
- If INCLUDING COMMENTS is specified, comments for the copied original columns, constraints, and indexes are copied. The default behavior is to exclude comments.
- If INCLUDING RELOPTIONS is specified, the new table will copy the storage parameter (that is, WITH clause) of the original table. The default behavior is to exclude partition definition of the storage parameter of the original table.
- If INCLUDING DISTRIBUTION is specified, the new table will copy the distribution information of the original table, including distribution type and key, and the new table cannot use the DISTRIBUTE BY clause. The default behavior is to exclude distribution information of the original table.
- INCLUDING ALL contains the meaning of INCLUDING DEFAULTS, INCLUDING CONSTRAINTS, INCLUDING INDEXES, INCLUDING STORAGE, INCLUDING COMMENTS, INCLUDING RELOPTIONS, and INCLUDING DISTRIBUTION.
- WITH ( storage_parameter [= value] [, ... ] )
Specifies an optional storage parameter for a table or an index. Optional parameters are as follows:
- FILLFACTOR
The fill factor of a table is a percentage from 10 to 100. 100 (complete filling) is the default value. When a smaller fill factor is specified, INSERT operations pack table pages only to the indicated percentage. The remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page, which is more efficient than placing it on a different page. For a table whose entries are never updated, setting the fill factor to 100 (complete filling) is the best choice, but in heavily updated tables a smaller fill factor would be appropriate.
Value range: 10–100
- ORIENTATION
Determines the data storage mode of the table.
Value range:
- ROW (default value): The data will be stored in rows.
orientation cannot be modified.
- ROW (default value): The data will be stored in rows.
- COMPRESSION
- Row-store tables do not support compression.
- FILLFACTOR
- COMPRESS / NOCOMPRESS
Specifies keyword COMPRESS during the creation of a table, so that the compression feature is triggered in case of BULK INSERT operations. If this feature is enabled, a scan is performed for all tuple data within the page to generate a dictionary and then the tuple data is compressed and stored. If NOCOMPRESS is specified, the table is not compressed.
Default value: NOCOMPRESS, that is, tuple data is not compressed before storage. Row-store tables do not support compression.
- TABLESPACE tablespace_name
Specifies that the new table will be created in the tablespace_name tablespace. If not specified, the default tablespace is used.
- DISTRIBUTE BY
Specifies how the table is distributed or replicated between nodes.
For the value range and details, see DISTRIBUTE BY.
- TO { GROUP groupname | NODE ( nodename [, ... ] ) }
TO GROUP specifies the node group to which the table to be created belongs. TO NODE is used for internal scale-out tools.
- PARTITION BY RANGE [COLUMNS] (partition_key)
Creates a range partition. partition_key is the name of the partition key.
The COLUMNS keyword can be used only when sql_compatibility is set to 'MYSQL'. The semantics of PARTITION BY RANGE COLUMNS is the same as that of PARTITION BY RANGE.
(1) Assume that the VALUES LESS THAN syntax is used.
If the VALUE LESS THAN clause is used, a range partitioning policy supports a partition key with up to 16 columns.
Data types supported by the partition keys are as follows: SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, FLOAT4, FLOAT8, DOUBLE PRECISION, CHARACTER VARYING(n), VARCHAR(n), CHARACTER(n), CHAR(n), CHARACTER, CHAR, TEXT, NVARCHAR2, NAME, TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)] [WITH TIME ZONE], and DATE.
(2) Assume that the START END syntax is used.
In this case, only one partition key is supported.
Data types supported by the partition key are as follows: SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, FLOAT4, FLOAT8, DOUBLE PRECISION, TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)] [WITH TIME ZONE], and DATE.
- PARTITION partition_name VALUES LESS THAN {( { partition_value | MAXVALUE } [,...] ) | MAXVALUE }
Specifies the information of partitions. partition_name is the name of a range partition. partition_value is the upper limit of a range partition, and the value depends on the type of partition_key. MAXVALUE usually specifies the upper limit of the last range partition.
- Each partition requires an upper limit.
- The data type of the upper limit must be the same as that of the partition key.
- In a partition list, partitions are arranged in ascending order of upper limits. A partition with a smaller upper limit value is placed before another partition with a larger one.
- MAXVALUE that is not in parentheses can be used only when sql_compatibility is set to 'MYSQL' and can have only one partition key.
- PARTITION partition_name {START (partition_value) END (partition_value) EVERY (interval_value)} | {START (partition_value) END (partition_value|MAXVALUE)} | {START(partition_value)} | {END (partition_value | MAXVALUE)}
Specifies the information of partitions.
- partition_name: name or name prefix of a range partition. It is the name prefix only in the following cases (assuming that partition_name is p1):
- If START+END+EVERY is used, the names of partitions will be defined as p1_1, p1_2, and the like. For example, if PARTITION p1 START(1) END(4) EVERY(1) is defined, the generated partitions are [1, 2), [2, 3), and [3, 4), and their names are p1_1, p1_2, and p1_3. In this case, p1 is a name prefix.
- If the defined statement is in the first place and has START specified, the range (MINVALUE, START) will be automatically used as the first actual partition, and its name will be p1_0. The other partitions are then named p1_1, p1_2, and the like. For example, if PARTITION p1 START(1), PARTITION p2 START(2) is defined, generated partitions are (MINVALUE, 1), [1, 2), and [2, MAXVALUE), and their names will be p1_0, p1_1, and p2. In this case, p1 is a name prefix and p2 is a partition name. MINVALUE means the minimum value.
- partition_value: start value or end value of a range partition. The value depends on partition_key and cannot be MAXVALUE.
- interval_value: width of each partition for dividing the [START, END) range. It cannot be MAXVALUE. If the value of (END – START) divided by EVERY has a remainder, the width of only the last partition is less than the value of EVERY.
- MAXVALUE usually specifies the upper limit of the last range partition.
- If the defined statement is in the first place and has START specified, the range (MINVALUE, START) will be automatically used as the first actual partition.
- The START END syntax must comply with the following rules:
- The value of START (if any, same for the following situations) in each partition_start_end_item must be smaller than that of END.
- In two adjacent partition_start_end_item statements, the value of the first END must be equal to that of the second START.
- The value of EVERY in each partition_start_end_item must be a positive number (in ascending order) and must be smaller than END minus START.
- Each partition includes the start value (unless it is MINVALUE) and excludes the end value. The format is as follows: [START, END).
- Partitions created by the same partition_start_end_item belong to the same tablespace.
- If partition_name is a name prefix of a partition, the length must not exceed 57 bytes. If there are more than 57 bytes, the prefix will be automatically truncated.
- When creating or modifying a partitioned table, ensure that the total number of partitions in the table does not exceed the maximum value 1048575.
- In statements for creating partitioned tables, START END and LESS THAN cannot be used together.
- The START END syntax in a partitioned table creation SQL statement will be replaced by the VALUES LESS THAN syntax when gs_dump is executed.
- partition_name: name or name prefix of a range partition. It is the name prefix only in the following cases (assuming that partition_name is p1):
- PARTITION BY LIST [COLUMNS] (partition_key)
Creates a list partition. partition_key is the name of the partition key.
The COLUMNS keyword can be used only when sql_compatibility is set to 'MYSQL'. The semantics of PARTITION BY LIST COLUMNS is the same as that of PARTITION BY LIST.
- A list partitioning policy supports a partition key with up to 16 columns.
- For the clause syntax VALUES [IN] (list_values), if list_values contains the key values of the corresponding partition, it is recommended that the number of key values of each partition be less than or equal to 64.
- The clause VALUES IN can be used only when sql_compatibility is set to 'MYSQL'. The semantics is the same as that of VALUES.
Partition keys support the following data types: INT1, INT2, INT4, INT8, NUMERIC, VARCHAR(n), CHAR, BPCHAR, NVARCHAR2, TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)] [WITH TIME ZONE] and DATE. The number of partitions cannot exceed 1048575.
- PARTITION BY HASH(partition_key)
Create a hash partition. partition_key is the name of the partition key.
For partition_key, the hash partitioning policy supports only one column of partition keys.
Partition keys support the following data types: INT1, INT2, INT4, INT8, NUMERIC, VARCHAR(n), CHAR, BPCHAR, TEXT, NVARCHAR2, TIMESTAMP[(p)] [WITHOUT TIME ZONE], TIMESTAMP[(p)] [WITH TIME ZONE], and DATE. The number of partitions cannot exceed 1048575.
- PARTITIONS integer
Specifies the number of partitions.
integer indicates the number of partitions. The value must be an integer greater than 0 and cannot be greater than 1048575.
- When this clause is specified after the range and list partitions, each partition must be explicitly defined, and the number of defined partitions must be equal to the integer value. This clause can be specified after the range and list partitions only when sql_compatibility is set to 'MYSQL'.
- When this clause is specified after the hash and key partitions, if the definition of each partition is not listed, an integer number of partitions are automatically generated. The automatically generated partition name is "p+number", and the number ranges from 0 to integer minus 1. The tablespace of the partition is the tablespace of the table by default. If each partition definition is explicitly defined, the number of defined partitions must be the same as the value of integer. If neither the partition definition nor the number of partitions is specified, a unique partition is created.
- { ENABLE | DISABLE } ROW MOVEMENT
Sets row movement.
If the tuple value is updated on the partition key during the UPDATE operation, the partition where the tuple is located is altered. Setting this parameter enables error messages to be reported or movement of the tuple between partitions.
Value range:
- ENABLE: Row movement is enabled.
- DISABLE (default value): Row movement is disabled.
If the row movement is enabled, an error may be reported when UPDATE and DELETE operations are performed concurrently. The causes are as follows:
The old data is marked as deleted in the UPDATE and DELETE operations. If the row movement is enabled, the cross-partition update occurs when the partition key is updated. The kernel marks the old data in the old partition as deleted and adds a data to the new partition. As a result, the new data cannot be found by querying the old data.
If data in the same row is concurrently operated, the cross-partition and non-cross-partition data results have different behaviors in the following three concurrency scenarios: UPDATE and UPDATE concurrency, DELETE and DELETE concurrency, as well as UPDATE and DELETE concurrency.
- For non-cross-partition data, no error is reported for the second operation after the first operation is performed.
If the first operation is UPDATE, the latest data can be found and operated after the second operation is performed.
If the first operation is DELETE, the second operation is terminated if the current data is deleted and the latest data cannot be found.
- For the cross-partition data result, an error is reported for the second operation after the first operation is performed.
If the first operation is UPDATE, the second operation cannot find the latest data because the new data is in the new partition. Therefore, the second operation fails and an error is reported.
If the first operation is DELETE, performing the second operation can find that the current data is deleted and the latest data cannot be found, but cannot determine whether the operation of deleting the old data is UPDATE or DELETE. If the operation is UPDATE, an error is reported. If the operation is DELETE, the operation is terminated. To ensure the data correctness, an error is reported.
If the UPDATE and UPDATE concurrency, and UPDATE and DELETE concurrency are performed, the error can be solved only when the operations are performed serially. If the DELETE and DELETE concurrency are performed, the error can be solved by disabling the row movement.
- NOT NULL
The column is not allowed to contain null values. ENABLE can be omitted.
- NULL
Specifies that the column is allowed to contain null values. This is the default setting.
This clause is only provided for compatibility with non-standard SQL databases. It is not recommended.
- CHECK (condition) [ NO INHERIT ]
Specifies an expression producing a Boolean result where the INSERT or UPDATE operation of new or updated rows can succeed only when the expression result is TRUE or UNKNOWN; otherwise, an error is thrown and the database is not altered.
A check constraint specified as a column constraint should reference only the column's values, while an expression in a table constraint can reference multiple columns.
A constraint marked with NO INHERIT will not propagate to child tables.
ENABLE can be omitted.
- DEFAULT default_expr
Assigns a default data value to a column. The value can be any variable-free expressions. (Subqueries and cross-references to other columns in the current table are not allowed.) The data type of the default expression must match the data type of the column.
The default expression will be used in any insert operation that does not specify a value for the column. If there is no default value for a column, then the default value is null.
- UNIQUE [KEY] index_parameters
UNIQUE ( column_name [, ... ] ) index_parameters
Specifies that a group of one or more columns of a table can contain only unique values.
For the purpose of a unique constraint, null is not considered equal.
UNIQUE KEY can be used only when sql_compatibility is set to 'MYSQL', which has the same semantics as UNIQUE.
- PRIMARY KEY index_parameters
PRIMARY KEY ( column_name [, ... ] ) index_parameters
Specifies that a column or columns of a table can contain only unique (non-duplicate) and non-null values.
Only one primary key can be specified for a table.
- DEFERRABLE | NOT DEFERRABLE
Controls whether the constraint can be deferred. A constraint that is not deferrable will be checked immediately after every command. Checking of constraints that are deferrable can be postponed until the end of the transaction using the SET CONSTRAINTS command. NOT DEFERRABLE is the default value. Currently, only UNIQUE and PRIMARY KEY constraints accept this clause. All the other constraints are not deferrable.
- INITIALLY IMMEDIATE | INITIALLY DEFERRED
If a constraint is deferrable, this clause specifies the default time to check the constraint.
- If the constraint is INITIALLY IMMEDIATE (default value), it is checked after each statement.
- If the constraint is INITIALLY DEFERRED, it is checked only at the end of the transaction.
The constraint check time can be altered using the SET CONSTRAINTS statement.
- USING INDEX TABLESPACE tablespace_name
Allows selection of the tablespace in which the index associated with a UNIQUE or PRIMARY KEY constraint will be created. If not specified, the index is created in default_tablespace. If default_tablespace is empty, the default tablespace of the database is used.
Examples
- Example 1: Create a range-partitioned table tpcds.web_returns_p1. The table has eight partitions and their partition keys are of the integer type. The ranges of the partitions are: wr_returned_date_sk < 2450815, 2450815 ≤ wr_returned_date_sk < 2451179, 2451179 ≤ wr_returned_date_sk < 2451544, 2451544 ≤ wr_returned_date_sk < 2451910, 2451910 ≤ wr_returned_date_sk < 2452275, 2452275 ≤ wr_returned_date_sk < 2452640, 2452640 ≤ wr_returned_date_sk < 2453005, and wr_returned_date_sk ≥ 2453005.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
-- Create a temporary schema. gaussdb=# CREATE SCHEMA tpcds; gaussdb=# SET CURRENT_SCHEMA TO tpcds; -- Create the tpcds.web_returns table. gaussdb=# CREATE TABLE tpcds.web_returns ( W_WAREHOUSE_SK INTEGER NOT NULL, W_WAREHOUSE_ID CHAR(16) NOT NULL, W_WAREHOUSE_NAME VARCHAR(20) , W_WAREHOUSE_SQ_FT INTEGER , W_STREET_NUMBER CHAR(10) , W_STREET_NAME VARCHAR(60) , W_STREET_TYPE CHAR(15) , W_SUITE_NUMBER CHAR(10) , W_CITY VARCHAR(60) , W_COUNTY VARCHAR(30) , W_STATE CHAR(2) , W_ZIP CHAR(10) , W_COUNTRY VARCHAR(20) , W_GMT_OFFSET DECIMAL(5,2) ); -- Create a range-partitioned table tpcds.web_returns_p1. gaussdb=# CREATE TABLE tpcds.web_returns_p1 ( WR_RETURNED_DATE_SK INTEGER , WR_RETURNED_TIME_SK INTEGER , WR_ITEM_SK INTEGER NOT NULL, WR_REFUNDED_CUSTOMER_SK INTEGER , WR_REFUNDED_CDEMO_SK INTEGER , WR_REFUNDED_HDEMO_SK INTEGER , WR_REFUNDED_ADDR_SK INTEGER , WR_RETURNING_CUSTOMER_SK INTEGER , WR_RETURNING_CDEMO_SK INTEGER , WR_RETURNING_HDEMO_SK INTEGER , WR_RETURNING_ADDR_SK INTEGER , WR_WEB_PAGE_SK INTEGER , WR_REASON_SK INTEGER , WR_ORDER_NUMBER BIGINT NOT NULL, WR_RETURN_QUANTITY INTEGER , WR_RETURN_AMT DECIMAL(7,2) , WR_RETURN_TAX DECIMAL(7,2) , WR_RETURN_AMT_INC_TAX DECIMAL(7,2) , WR_FEE DECIMAL(7,2) , WR_RETURN_SHIP_COST DECIMAL(7,2) , WR_REFUNDED_CASH DECIMAL(7,2) , WR_REVERSED_CHARGE DECIMAL(7,2) , WR_ACCOUNT_CREDIT DECIMAL(7,2) , WR_NET_LOSS DECIMAL(7,2) ) DISTRIBUTE BY HASH (WR_ITEM_SK) PARTITION BY RANGE(WR_RETURNED_DATE_SK) ( PARTITION P1 VALUES LESS THAN(2450815), PARTITION P2 VALUES LESS THAN(2451179), PARTITION P3 VALUES LESS THAN(2451544), PARTITION P4 VALUES LESS THAN(2451910), PARTITION P5 VALUES LESS THAN(2452275), PARTITION P6 VALUES LESS THAN(2452640), PARTITION P7 VALUES LESS THAN(2453005), PARTITION P8 VALUES LESS THAN(MAXVALUE) ); -- Import data from the example data table. gaussdb=# INSERT INTO tpcds.web_returns_p1 SELECT * FROM tpcds.web_returns; -- Delete the P8 partition. gaussdb=# ALTER TABLE tpcds.web_returns_p1 DROP PARTITION P8; -- Add a partition WR_RETURNED_DATE_SK with values ranging from 2453005 to 2453105. gaussdb=# ALTER TABLE tpcds.web_returns_p1 ADD PARTITION P8 VALUES LESS THAN (2453105); -- Add a partition WR_RETURNED_DATE_SK with values ranging from 2453105 to MAXVALUE. gaussdb=# ALTER TABLE tpcds.web_returns_p1 ADD PARTITION P9 VALUES LESS THAN (MAXVALUE); -- Delete the P8 partition. gaussdb=# ALTER TABLE tpcds.web_returns_p1 DROP PARTITION FOR (2453005); -- Rename the P7 partition to P10. gaussdb=# ALTER TABLE tpcds.web_returns_p1 RENAME PARTITION P7 TO P10; -- Rename the P6 partition to P11. gaussdb=# ALTER TABLE tpcds.web_returns_p1 RENAME PARTITION FOR (2452639) TO P11; -- Query the number of rows in the P10 partition. gaussdb=# SELECT count(*) FROM tpcds.web_returns_p1 PARTITION (P10); count -------- 0 (1 row) -- Query the number of rows in the P1 partition. gaussdb=# SELECT COUNT(*) FROM tpcds.web_returns_p1 PARTITION FOR (2450815); count -------- 0 (1 row) -- Delete the tpcds.web_returns_p1 table. gaussdb=# DROP TABLE tpcds.web_returns_p1; -- Delete the tpcds.web_returns table. gaussdb=# DROP TABLE tpcds.web_returns; -- Delete the schema. gaussdb=# DROP SCHEMA tpcds CASCADE;
- Example 2: Create a range-partitioned table tpcds.web_returns_p2. The table has eight partitions and their partition keys are of the integer type. The upper limit of the eighth partition is MAXVALUE.
The ranges of the partitions are: wr_returned_date_sk < 2450815, 2450815 ≤ wr_returned_date_sk < 2451179, 2451179 ≤ wr_returned_date_sk < 2451544, 2451544 ≤ wr_returned_date_sk < 2451910, 2451910 ≤ wr_returned_date_sk < 2452275, 2452275 ≤ wr_returned_date_sk < 2452640, 2452640 ≤ wr_returned_date_sk < 2453005, and wr_returned_date_sk ≥ 2453005.
The tablespace of the tpcds.web_returns_p2 partitioned table is example1. Partitions P1 to P7 have no specified tablespaces, and use the example1 tablespace of the tpcds.web_returns_p2 partitioned table. The tablespace of the P8 partitioned table is example2.
Assume that CN and DN data directory/pg_location/mount1/path1, CN and DN data directory/pg_location/mount2/path2, CN and DN data directory/pg_location/mount3/path3, and CN and DN data directory/pg_location/mount4/path4 are empty directories for which user dwsadmin has read and write permissions.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82
gaussdb=# CREATE TABLESPACE example1 RELATIVE LOCATION 'tablespace1/tablespace_1'; gaussdb=# CREATE TABLESPACE example2 RELATIVE LOCATION 'tablespace2/tablespace_2'; gaussdb=# CREATE TABLESPACE example3 RELATIVE LOCATION 'tablespace3/tablespace_3'; gaussdb=# CREATE TABLESPACE example4 RELATIVE LOCATION 'tablespace4/tablespace_4'; -- Create a temporary schema. gaussdb=# CREATE SCHEMA tpcds; gaussdb=# SET CURRENT_SCHEMA TO tpcds; gaussdb=# CREATE TABLE tpcds.web_returns_p2 ( WR_RETURNED_DATE_SK INTEGER , WR_RETURNED_TIME_SK INTEGER , WR_ITEM_SK INTEGER NOT NULL, WR_REFUNDED_CUSTOMER_SK INTEGER , WR_REFUNDED_CDEMO_SK INTEGER , WR_REFUNDED_HDEMO_SK INTEGER , WR_REFUNDED_ADDR_SK INTEGER , WR_RETURNING_CUSTOMER_SK INTEGER , WR_RETURNING_CDEMO_SK INTEGER , WR_RETURNING_HDEMO_SK INTEGER , WR_RETURNING_ADDR_SK INTEGER , WR_WEB_PAGE_SK INTEGER , WR_REASON_SK INTEGER , WR_ORDER_NUMBER BIGINT NOT NULL, WR_RETURN_QUANTITY INTEGER , WR_RETURN_AMT DECIMAL(7,2) , WR_RETURN_TAX DECIMAL(7,2) , WR_RETURN_AMT_INC_TAX DECIMAL(7,2) , WR_FEE DECIMAL(7,2) , WR_RETURN_SHIP_COST DECIMAL(7,2) , WR_REFUNDED_CASH DECIMAL(7,2) , WR_REVERSED_CHARGE DECIMAL(7,2) , WR_ACCOUNT_CREDIT DECIMAL(7,2) , WR_NET_LOSS DECIMAL(7,2) ) TABLESPACE example1 DISTRIBUTE BY HASH (WR_ITEM_SK) PARTITION BY RANGE(WR_RETURNED_DATE_SK) ( PARTITION P1 VALUES LESS THAN(2450815), PARTITION P2 VALUES LESS THAN(2451179), PARTITION P3 VALUES LESS THAN(2451544), PARTITION P4 VALUES LESS THAN(2451910), PARTITION P5 VALUES LESS THAN(2452275), PARTITION P6 VALUES LESS THAN(2452640), PARTITION P7 VALUES LESS THAN(2453005), PARTITION P8 VALUES LESS THAN(MAXVALUE) TABLESPACE example2 ) ENABLE ROW MOVEMENT; -- Create a partitioned table using LIKE. gaussdb=# CREATE TABLE tpcds.web_returns_p3 (LIKE tpcds.web_returns_p2 INCLUDING PARTITION); -- Change the tablespace of the P1 partition to example2. gaussdb=# ALTER TABLE tpcds.web_returns_p2 MOVE PARTITION P1 TABLESPACE example2; -- Change the tablespace of the P2 partition to example3. gaussdb=# ALTER TABLE tpcds.web_returns_p2 MOVE PARTITION P2 TABLESPACE example3; -- Split the P8 partition at 2453010. gaussdb=# ALTER TABLE tpcds.web_returns_p2 SPLIT PARTITION P8 AT (2453010) INTO ( PARTITION P9, PARTITION P10 ); -- Merge the P6 and P7 partitions into one. gaussdb=# ALTER TABLE tpcds.web_returns_p2 MERGE PARTITIONS P6, P7 INTO PARTITION P8; -- Modify the migration attribute of the partitioned table. gaussdb=# ALTER TABLE tpcds.web_returns_p2 DISABLE ROW MOVEMENT; -- Delete tables and tablespaces. gaussdb=# DROP TABLE tpcds.web_returns_p1; gaussdb=# DROP TABLE tpcds.web_returns_p2; gaussdb=# DROP TABLE tpcds.web_returns_p3; gaussdb=# DROP SCHEMA tpcds CASCADE; gaussdb=# DROP TABLESPACE example1; gaussdb=# DROP TABLESPACE example2; gaussdb=# DROP TABLESPACE example3; gaussdb=# DROP TABLESPACE example4;
- Example 3: Use START END to create and modify a range-partitioned table.
Assume that /home/omm/startend_tbs1, /home/omm/startend_tbs2, /home/omm/startend_tbs3, and /home/omm/startend_tbs4 are empty directories for which user omm has the read and write permissions.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108
-- Create a tablespace. gaussdb=# CREATE TABLESPACE startend_tbs1 LOCATION '/home/omm/startend_tbs1'; gaussdb=# CREATE TABLESPACE startend_tbs2 LOCATION '/home/omm/startend_tbs2'; gaussdb=# CREATE TABLESPACE startend_tbs3 LOCATION '/home/omm/startend_tbs3'; gaussdb=# CREATE TABLESPACE startend_tbs4 LOCATION '/home/omm/startend_tbs4'; -- Create a temporary schema. gaussdb=# CREATE SCHEMA tpcds; gaussdb=# SET CURRENT_SCHEMA TO tpcds; -- Create a partitioned table with the partition key of the integer type. gaussdb=# CREATE TABLE tpcds.startend_pt (c1 INT, c2 INT) TABLESPACE startend_tbs1 DISTRIBUTE BY HASH (c1) PARTITION BY RANGE (c2) ( PARTITION p1 START(1) END(1000) EVERY(200) TABLESPACE startend_tbs2, PARTITION p2 END(2000), PARTITION p3 START(2000) END(2500) TABLESPACE startend_tbs3, PARTITION p4 START(2500), PARTITION p5 START(3000) END(5000) EVERY(1000) TABLESPACE startend_tbs4 ) ENABLE ROW MOVEMENT; -- View the information of the partitioned table. gaussdb=# SELECT relname, boundaries, spcname FROM pg_partition p JOIN pg_tablespace t ON p.reltablespace=t.oid and p.parentid='tpcds.startend_pt'::regclass ORDER BY 1; relname | boundaries | spcname -------------+------------+--------------- p1_0 | {1} | startend_tbs2 p1_1 | {201} | startend_tbs2 p1_2 | {401} | startend_tbs2 p1_3 | {601} | startend_tbs2 p1_4 | {801} | startend_tbs2 p1_5 | {1000} | startend_tbs2 p2 | {2000} | startend_tbs1 p3 | {2500} | startend_tbs3 p4 | {3000} | startend_tbs1 p5_1 | {4000} | startend_tbs4 p5_2 | {5000} | startend_tbs4 startend_pt | | startend_tbs1 (12 rows) -- Import data and check the data volume in a partition. gaussdb=# INSERT INTO tpcds.startend_pt VALUES (GENERATE_SERIES(0, 4999), GENERATE_SERIES(0, 4999)); gaussdb=# SELECT COUNT(*) FROM tpcds.startend_pt PARTITION FOR (0); count ------- 1 (1 row) gaussdb=# SELECT COUNT(*) FROM tpcds.startend_pt PARTITION (p3); count ------- 500 (1 row) -- Add partitions [5000, 5300), [5300, 5600), [5600, 5900), and [5900, 6000). gaussdb=# ALTER TABLE tpcds.startend_pt ADD PARTITION p6 START(5000) END(6000) EVERY(300) TABLESPACE startend_tbs4; -- Add the partition p7, specified by MAXVALUE. gaussdb=# ALTER TABLE tpcds.startend_pt ADD PARTITION p7 END(MAXVALUE); -- Rename the partition p7 to p8. gaussdb=# ALTER TABLE tpcds.startend_pt RENAME PARTITION p7 TO p8; -- Delete the partition p8. gaussdb=# ALTER TABLE tpcds.startend_pt DROP PARTITION p8; -- Rename the partition where 5950 is located to p71. gaussdb=# ALTER TABLE tpcds.startend_pt RENAME PARTITION FOR(5950) TO p71; -- Split the partition [4000, 5000) where 4500 is located. gaussdb=# ALTER TABLE tpcds.startend_pt SPLIT PARTITION FOR(4500) INTO(PARTITION q1 START(4000) END(5000) EVERY(250) TABLESPACE startend_tbs3); -- Change the tablespace of the partition p2 to startend_tbs4. gaussdb=# ALTER TABLE tpcds.startend_pt MOVE PARTITION p2 TABLESPACE startend_tbs4; -- View the partition status. gaussdb=# SELECT relname, boundaries, spcname FROM pg_partition p JOIN pg_tablespace t ON p.reltablespace=t.oid and p.parentid='tpcds.startend_pt'::regclass ORDER BY 1; relname | boundaries | spcname -------------+------------+--------------- p1_0 | {1} | startend_tbs2 p1_1 | {201} | startend_tbs2 p1_2 | {401} | startend_tbs2 p1_3 | {601} | startend_tbs2 p1_4 | {801} | startend_tbs2 p1_5 | {1000} | startend_tbs2 p2 | {2000} | startend_tbs4 p3 | {2500} | startend_tbs3 p4 | {3000} | startend_tbs1 p5_1 | {4000} | startend_tbs4 p6_1 | {5300} | startend_tbs4 p6_2 | {5600} | startend_tbs4 p6_3 | {5900} | startend_tbs4 p71 | {6000} | startend_tbs4 q1_1 | {4250} | startend_tbs3 q1_2 | {4500} | startend_tbs3 q1_3 | {4750} | startend_tbs3 q1_4 | {5000} | startend_tbs3 startend_pt | | startend_tbs1 (19 rows) -- Delete tables and tablespaces. gaussdb=# DROP TABLE tpcds.startend_pt; gaussdb=# DROP SCHEMA tpcds CASCADE; gaussdb=# DROP TABLESPACE startend_tbs1; gaussdb=# DROP TABLESPACE startend_tbs2; gaussdb=# DROP TABLESPACE startend_tbs3; gaussdb=# DROP TABLESPACE startend_tbs4;
- Example 4: Create list partitioned table test_list. The table initially contains four partitions and the partition key is of the INT type. The ranges of the four partitions are 2000, 3000, 4000, and 5000 respectively.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121
-- Create the test_list table. gaussdb=# CREATE TABLE test_list (col1 int, col2 int) partition by list(col1) ( partition p1 values (2000), partition p2 values (3000), partition p3 values (4000), partition p4 values (5000) ); -- Insert data. gaussdb=# INSERT INTO test_list VALUES(2000, 2000); INSERT 0 1 gaussdb=# INSERT INTO test_list VALUES(3000, 3000); INSERT 0 1 -- View the partition information. gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_list' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------ p1 | l | {2000} p2 | l | {3000} p3 | l | {4000} p4 | l | {5000} (4 rows) -- The inserted data does not match the partition, and an error is reported. gaussdb=# INSERT INTO test_list VALUES(6000, 6000); ERROR: inserted partition key does not map to any table partition -- Add a partition. gaussdb=# ALTER TABLE test_list add partition p5 values (6000); ALTER TABLE gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_list' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------ p5 | l | {6000} p4 | l | {5000} p1 | l | {2000} p2 | l | {3000} p3 | l | {4000} (5 rows) gaussdb=# INSERT INTO test_list VALUES(6000, 6000); INSERT 0 1 -- Exchange data between the partitioned table and ordinary table. gaussdb=# CREATE TABLE t1 (col1 int, col2 int); CREATE TABLE gaussdb=# SELECT * FROM test_list partition (p1); col1 | col2 ------+------ 2000 | 2000 (1 row) gaussdb=# ALTER TABLE test_list exchange partition (p1) with table t1; ALTER TABLE gaussdb=# SELECT * FROM test_list partition (p1); col1 | col2 ------+------ (0 rows) gaussdb=# SELECT * FROM t1; col1 | col2 ------+------ 2000 | 2000 (1 row) -- Truncate the partition. gaussdb=# SELECT * FROM test_list partition (p2); col1 | col2 ------+------ 3000 | 3000 (1 row) gaussdb=# ALTER TABLE test_list truncate partition p2; ALTER TABLE gaussdb=# SELECT * FROM test_list partition (p2); col1 | col2 ------+------ (0 rows) -- Delete the partition. gaussdb=# alter table test_list drop partition p5; ALTER TABLE gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_list' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------ p4 | l | {5000} p1 | l | {2000} p2 | l | {3000} p3 | l | {4000} (4 rows) gaussdb=# INSERT INTO test_list VALUES(6000, 6000); ERROR: inserted partition key does not map to any table partition -- Merge partitions. gaussdb=# alter table test_list merge partitions p1,p2 into partition p2; ALTER TABLE gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_list' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------- p2 | l | {2000,3000} p4 | l | {5000} p3 | l | {4000} (3 rows) -- Split partitions. gaussdb=# alter table test_list split partition p2 values(2000) into (partition p1, partition p2); ALTER TABLE gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_list' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------ p2 | l | {3000} p1 | l | {2000} p4 | l | {5000} p3 | l | {4000} (4 rows) -- Delete the ordinary table. gaussdb=# DROP TABLE t1; -- Drop the partitioned table. gaussdb=# drop table test_list;
- Example 5: Create a hash partitioned table test_hash. The table initially contains two partitions and the partition key is of the INT type.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
-- Create the test_hash table. gaussdb=# create table test_hash (col1 int, col2 int) partition by hash(col1) ( partition p1, partition p2 ); -- Insert data. gaussdb=# INSERT INTO test_hash VALUES(1, 1); INSERT 0 1 gaussdb=# INSERT INTO test_hash VALUES(2, 2); INSERT 0 1 gaussdb=# INSERT INTO test_hash VALUES(3, 3); INSERT 0 1 gaussdb=# INSERT INTO test_hash VALUES(4, 4); INSERT 0 1 -- View the partition information. gaussdb=# SELECT t1.relname, partstrategy, boundaries FROM pg_partition t1, pg_class t2 WHERE t1.parentid = t2.oid AND t2.relname = 'test_hash' AND t1.parttype = 'p'; relname | partstrategy | boundaries ---------+--------------+------------ p1 | h | {0} p2 | h | {1} (2 rows) -- View the data. gaussdb=# select * from test_hash partition (p1); col1 | col2 ------+------ 3 | 3 4 | 4 (2 rows) gaussdb=# select * from test_hash partition (p2); col1 | col2 ------+------ 1 | 1 2 | 2 (2 rows) -- Exchange data between the partitioned table and ordinary table. gaussdb=# create table t1 (col1 int, col2 int); CREATE TABLE gaussdb=# alter table test_hash exchange partition (p1) with table t1; ALTER TABLE gaussdb=# select * from test_hash partition (p1); col1 | col2 ------+------ (0 rows) gaussdb=# select * from t1; col1 | col2 ------+------ 3 | 3 4 | 4 (2 rows) -- Truncate the partition. gaussdb=# alter table test_hash truncate partition p2; ALTER TABLE gaussdb=# select * from test_hash partition (p2); col1 | col2 ------+------ (0 rows) -- Drop the partitioned table. gaussdb=# drop table test_hash;
- Example 6: Create a list partitioned table t_multi_keys_list. The table initially contains 5 partitions and the partition key is of the INT type and VARCHAR type.
1 2 3 4 5 6 7 8 9 10 11 12 13
-- Create the t_multi_keys_list table. gaussdb=# CREATE TABLE t_multi_keys_list (a int, b varchar(4), c int) PARTITION BY LIST (a,b) ( PARTITION p1 VALUES ( (0,NULL) ), PARTITION p2 VALUES ( (0,'1'), (0,'2'), (0,'3'), (1,'1'), (1,'2') ), PARTITION p3 VALUES ( (NULL,'0'), (2,'1') ), PARTITION p4 VALUES ( (3,'2'), (NULL,NULL) ), PARTITION pd VALUES ( DEFAULT ) ); -- Create the t_multi_keys_list table. gaussdb=# DROP TABLE t_multi_keys_list;
Helpful Links
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot