分区(Partition)
分区用于将数据划分成不同区间,逻辑上可以理解为将原始表划分成了多个子表。可以方便的按分区对数据进行管理。
- Partition列可以指定一列或多列,分区列必须为KEY列。多列分区的使用方式在后面多列分区小结介绍。
- 不论分区列是什么类型,在写分区值时,都需要加双引号。
- 分区数量理论上没有上限。
- 当不使用Partition建表时,系统会自动生成一个和表名同名的,全值范围的Partition。该Partition对用户不可见,并且不可删改。
- 创建分区时不可添加范围重叠的分区。
Range分区
分区列通常为时间列,以方便的管理新旧数据。
Partition支持通过VALUES LESS THAN (...) 仅指定上界,系统会将前一个分区的上界作为该分区的下界,生成一个左闭右开的区间。
- 通过VALUES[...) 同时指定上下界比较容易理解。这里举例说明,当使用VALUES LESS THAN (...) 语句进行分区的增删操作时,分区范围的变化情况。
CREATE TABLE IF NOT EXISTS example_db.expamle_range_tbl ( `user_id` LARGEINT NOT NULL COMMENT "用户id", `date` DATE NOT NULL COMMENT "数据灌入日期时间", `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳", `city` VARCHAR(20) COMMENT "用户所在城市", `age` SMALLINT COMMENT "用户年龄", `sex` TINYINT COMMENT "用户性别", `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间", `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费", `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间", `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间" ) ENGINE=OLAP AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`) PARTITION BY RANGE(`date`) ( PARTITION `p201701` VALUES LESS THAN ("2017-02-01"), PARTITION `p201702` VALUES LESS THAN ("2017-03-01"), PARTITION `p201703` VALUES LESS THAN ("2017-04-01") ) DISTRIBUTED BY HASH(`user_id`) BUCKETS 16 PROPERTIES ( "replication_num" = "3" );
- 查看分区。
mysql> show partitions from example_db.expamle_range_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16040 | p201701 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16041 | p201702 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [2017-02-01]; ..types: [DATE]; keys: [2017-03-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16042 | p201703 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [2017-03-01]; ..types: [DATE]; keys: [2017-04-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.01 sec)
- 增加一个分区。
mysql> alter table example_db.expamle_range_tbl add partition p201705 VALUES LESS THAN ("2017-06-01"); Query OK, 0 rows affected (0.02 sec)
查看分区。
mysql> show partitions from example_db.expamle_range_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16040 | p201701 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16041 | p201702 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [2017-02-01]; ..types: [DATE]; keys: [2017-03-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16042 | p201703 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [2017-03-01]; ..types: [DATE]; keys: [2017-04-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16237 | p201705 | 1 | 2023-04-11 07:45:18 | NORMAL | date | [types: [DATE]; keys: [2017-04-01]; ..types: [DATE]; keys: [2017-06-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 4 rows in set (0.00 sec)
- 删除分区。
mysql> alter table example_db.expamle_range_tbl drop partition p201703; Query OK, 0 rows affected (0.01 sec)
查看分区。
mysql> show partitions from example_db.expamle_range_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16040 | p201701 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16041 | p201702 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [2017-02-01]; ..types: [DATE]; keys: [2017-03-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16237 | p201705 | 1 | 2023-04-11 07:45:18 | NORMAL | date | [types: [DATE]; keys: [2017-04-01]; ..types: [DATE]; keys: [2017-06-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
p201702和p201705的分区范围并没有发生变化,而这两个分区之间,出现了一个空洞:[2017-03-01, 2017-04-01)。即如果导入的数据范围在这个空洞范围内,是无法导入的。
- 继续删除分区。
mysql> alter table example_db.expamle_range_tbl drop partition p201702; Query OK, 0 rows affected (0.00 sec)
查看分区。
mysql> show partitions from example_db.expamle_range_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16040 | p201701 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16237 | p201705 | 1 | 2023-04-11 07:45:18 | NORMAL | date | [types: [DATE]; keys: [2017-04-01]; ..types: [DATE]; keys: [2017-06-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 2 rows in set (0.00 sec)
空洞范围变为:[2017-02-01, 2017-04-01)。
- 增加新分区。
mysql> alter table example_db.expamle_range_tbl add partition p201702new VALUES LESS THAN ("2017-03-01"); Query OK, 0 rows affected (0.01 sec)
查看分区。
mysql> show partitions from example_db.expamle_range_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16040 | p201701 | 1 | 2023-04-11 07:35:02 | NORMAL | date | [types: [DATE]; keys: [0000-01-01]; ..types: [DATE]; keys: [2017-02-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16302 | p201702new | 1 | 2023-04-11 08:14:25 | NORMAL | date | [types: [DATE]; keys: [2017-02-01]; ..types: [DATE]; keys: [2017-03-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16237 | p201705 | 1 | 2023-04-11 07:45:18 | NORMAL | date | [types: [DATE]; keys: [2017-04-01]; ..types: [DATE]; keys: [2017-06-01]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+----------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
综上,分区的删除不会改变已存在分区的范围。删除分区可能出现空洞。通过VALUES LESS THAN语句增加分区时,分区的下界紧接上一个分区的上界。
- 多列分区。
CREATE TABLE IF NOT EXISTS example_db.expamle_range_multi_partiton_key_tbl ( `user_id` LARGEINT NOT NULL COMMENT "用户id", `date` DATE NOT NULL COMMENT "数据灌入日期时间", `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳", `city` VARCHAR(20) COMMENT "用户所在城市", `age` SMALLINT COMMENT "用户年龄", `sex` TINYINT COMMENT "用户性别", `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间", `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费", `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间", `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间" ) ENGINE=OLAP AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`) PARTITION BY RANGE(`date`, `user_id`) ( PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"), PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"), PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01") ) DISTRIBUTED BY HASH(`user_id`) BUCKETS 16 PROPERTIES ( "replication_num" = "3" );
在以上示例中,我们指定date(DATE 类型)和user_id(INT 类型)作为分区列。以上示例最终得到的分区如下:
mysql> show partitions from example_db.expamle_range_multi_partiton_key_tbl; +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16367 | p201701_1000 | 1 | 2023-04-11 08:28:12 | NORMAL | date, user_id | [types: [DATE, LARGEINT]; keys: [0000-01-01, -170141183460469231731687303715884105728]; ..types: [DATE, LARGEINT]; keys: [2017-02-01, 1000]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16368 | p201702_2000 | 1 | 2023-04-11 08:28:12 | NORMAL | date, user_id | [types: [DATE, LARGEINT]; keys: [2017-02-01, 1000]; ..types: [DATE, LARGEINT]; keys: [2017-03-01, 2000]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16369 | p201703_all | 1 | 2023-04-11 08:28:12 | NORMAL | date, user_id | [types: [DATE, LARGEINT]; keys: [2017-03-01, 2000]; ..types: [DATE, LARGEINT]; keys: [2017-04-01, -170141183460469231731687303715884105728]; ) | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
最后一个分区用户缺省只指定了date列的分区值,所以user_id列的分区值会默认填充MIN_VALUE。当用户插入数据时,分区列值会按照顺序依次比较,最终得到对应的分区。
示例:
* 数据 --> 分区 * 2017-01-01, 200 --> p201701_1000 * 2017-01-01, 2000 --> p201701_1000 * 2017-02-01, 100 --> p201701_1000 * 2017-02-01, 2000 --> p201702_2000 * 2017-02-15, 5000 --> p201702_2000 * 2017-03-01, 2000 --> p201703_all * 2017-03-10, 1 --> p201703_all * 2017-04-01, 1000 --> 无法导入 * 2017-05-01, 1000 --> 无法导入
验证方法:
插入一条数据并检查存入到哪个分区。分区字段VisibleVersionTime、VisibleVersion刚刚有更新的分区即为刚插入数据所在的分区。
insert into example_db.expamle_range_multi_partiton_key_tbl values (200, '2017-01-01', '2017-01-01 12:00:05', 'A', 25, 1, '2017-01-01 12:00:05', 100, 30, 10); insert into example_db.expamle_range_multi_partiton_key_tbl values (2000, '2017-01-01', '2017-01-01 16:10:05', 'B', 33, 1, '2017-01-01 16:10:05', 800, 50, 1); insert into example_db.expamle_range_multi_partiton_key_tbl values (200, '2017-02-01', '2017-01-01 16:10:05', 'C', 22, 0, '2017-02-01 16:10:05', 80, 200, 1); show partitions from example_db.expamle_range_multi_partiton_key_tbl\G
List分区
- 分区列支持BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, LARGEINT, DATE, DATETIME, CHAR, VARCHAR数据类型,分区值为枚举值。只有当数据为目标分区枚举值其中之一时,才可以命中分区。
- Partition支持通过VALUES IN (...) 来指定每个分区包含的枚举值。
- 下面通过示例说明,进行分区的增删操作时,分区的变化。
CREATE TABLE IF NOT EXISTS example_db.expamle_list_tbl ( `user_id` LARGEINT NOT NULL COMMENT "用户id", `date` DATE NOT NULL COMMENT "数据灌入日期时间", `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳", `city` VARCHAR(20) NOT NULL COMMENT "用户所在城市", `age` SMALLINT COMMENT "用户年龄", `sex` TINYINT COMMENT "用户性别", `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间", `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费", `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间", `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间" ) ENGINE=olap AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`) PARTITION BY LIST(`city`) ( PARTITION `p_cn` VALUES IN ("A", "B", "F"), PARTITION `p_usa` VALUES IN ("G", "H"), PARTITION `p_jp` VALUES IN ("I") ) DISTRIBUTED BY HASH(`user_id`) BUCKETS 16 PROPERTIES ( "replication_num" = "3" );
- 如上表所示,建表完成后,会自动生成3个分区。
mysql> show partitions from example_db.expamle_list_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16764 | p_cn | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [A]; , types: [VARCHAR]; keys: [B]; , types: [VARCHAR]; keys: [F]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16765 | p_usa | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [G]; , types: [VARCHAR]; keys: [H]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16766 | p_jp | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [I]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
- 增加一个分区。
mysql> alter table example_db.expamle_list_tbl add partition p_uk VALUES IN ("L"); Query OK, 0 rows affected (0.01 sec)
查询分区。
mysql> show partitions from example_db.expamle_list_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16764 | p_cn | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [A]; , types: [VARCHAR]; keys: [B]; , types: [VARCHAR]; keys: [F]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16765 | p_usa | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [G]; , types: [VARCHAR]; keys: [H]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16766 | p_jp | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [I]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16961 | p_uk | 1 | 2023-04-11 09:24:39 | NORMAL | city | [types: [VARCHAR]; keys: [L]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 4 rows in set (0.00 sec)
- 删除分区。
mysql> alter table example_db.expamle_list_tbl drop partition p_jp; Query OK, 0 rows affected (0.01 sec)
查看分区。
mysql> show partitions from example_db.expamle_list_tbl; +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 16764 | p_cn | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [A]; , types: [VARCHAR]; keys: [B]; , types: [VARCHAR]; keys: [F]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16765 | p_usa | 1 | 2023-04-11 09:21:34 | NORMAL | city | [types: [VARCHAR]; keys: [G]; , types: [VARCHAR]; keys: [H]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 16961 | p_uk | 1 | 2023-04-11 09:24:39 | NORMAL | city | [types: [VARCHAR]; keys: [L]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+--------------+-----------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
- 多列分区(List支持多列分区),如下示例。
CREATE TABLE IF NOT EXISTS example_db.expamle_list_multi_partiton_key_tbl ( `user_id` LARGEINT NOT NULL COMMENT "用户id", `date` DATE NOT NULL COMMENT "数据灌入日期时间", `timestamp` DATETIME NOT NULL COMMENT "数据灌入的时间戳", `city` VARCHAR(20) NOT NULL COMMENT "用户所在城市", `age` SMALLINT COMMENT "用户年龄", `sex` TINYINT COMMENT "用户性别", `last_visit_date` DATETIME REPLACE DEFAULT "1970-01-01 00:00:00" COMMENT "用户最后一次访问时间", `cost` BIGINT SUM DEFAULT "0" COMMENT "用户总消费", `max_dwell_time` INT MAX DEFAULT "0" COMMENT "用户最大停留时间", `min_dwell_time` INT MIN DEFAULT "99999" COMMENT "用户最小停留时间" ) ENGINE=olap AGGREGATE KEY(`user_id`, `date`, `timestamp`, `city`, `age`, `sex`) PARTITION BY LIST(`user_id`, `city`) ( PARTITION `p1_city` VALUES IN (("1", "A"), ("1", "B")), PARTITION `p2_city` VALUES IN (("2", "A"), ("2", "B")), PARTITION `p3_city` VALUES IN (("3", "A"), ("3", "B")) ) DISTRIBUTED BY HASH(`user_id`) BUCKETS 16 PROPERTIES ( "replication_num" = "3" );
在以上示例中,我们指定user_id(INT类型)和city(VARCHAR类型)作为分区列,最终分区如下。
mysql> show partitions from example_db.expamle_list_multi_partiton_key_tbl; +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | PartitionId | PartitionName | VisibleVersion | VisibleVersionTime | State | PartitionKey | Range | DistributionKey | Buckets | ReplicationNum | StorageMedium | CooldownTime | RemoteStoragePolicy | LastConsistencyCheckTime | DataSize | IsInMemory | ReplicaAllocation | +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ | 17026 | p1_city | 1 | 2023-04-11 09:31:33 | NORMAL | user_id, city | [types: [LARGEINT, VARCHAR]; keys: [1, A]; , types: [LARGEINT, VARCHAR]; keys: [1, B]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 17027 | p2_city | 1 | 2023-04-11 09:31:33 | NORMAL | user_id, city | [types: [LARGEINT, VARCHAR]; keys: [2, A]; , types: [LARGEINT, VARCHAR]; keys: [2, B]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | | 17028 | p3_city | 1 | 2023-04-11 09:31:33 | NORMAL | user_id, city | [types: [LARGEINT, VARCHAR]; keys: [3, A]; , types: [LARGEINT, VARCHAR]; keys: [3, B]; ] | user_id | 16 | 3 | HDD | 9999-12-31 15:59:59 | | NULL | 0.000 | false | tag.location.default: 3 | +-------------+---------------+----------------+---------------------+--------+---------------+------------------------------------------------------------------------------------------+-----------------+---------+----------------+---------------+---------------------+---------------------+--------------------------+----------+------------+-------------------------+ 3 rows in set (0.00 sec)
- 当用户插入数据时,分区列值会按照顺序依次比较,最终得到对应的分区。如下所示。
* 数据 ---> 分区 * 1, A ---> p1_city * 1, B ---> p1_city * 2, B ---> p2_city * 3, A ---> p3_city * 1, M ---> 无法导入 * 4, A ---> 无法导入
验证方法:
插入一条数据并检查存入到哪个分区。分区字段VisibleVersionTime、VisibleVersion刚刚有更新的分区即为刚插入数据所在的分区。
INSERT INTO example_db.expamle_list_multi_partiton_key_tbl values (1, '2017-01-01', '2017-01-01 12:00:05', 'A', 25, 1, '2017-01-01 12:00:05', 100, 30, 10);
查数据插入分区。
mysql> SHOW partitions from example_db.expamle_list_multi_partiton_key_tbl\G *************************** 1. row *************************** PartitionId: 17026 PartitionName: p1_city VisibleVersion: 3 VisibleVersionTime: 2023-04-11 09:42:34 State: NORMAL PartitionKey: user_id, city Range: [types: [LARGEINT, VARCHAR]; keys: [1, A]; , types: [LARGEINT, VARCHAR]; keys: [1, B]; ] DistributionKey: user_id Buckets: 16 ReplicationNum: 3 StorageMedium: HDD CooldownTime: 9999-12-31 15:59:59 RemoteStoragePolicy: LastConsistencyCheckTime: NULL DataSize: 9.340 KB IsInMemory: false ReplicaAllocation: tag.location.default: 3 *************************** 2. row *************************** PartitionId: 17027 PartitionName: p2_city VisibleVersion: 1 VisibleVersionTime: 2023-04-11 09:31:33 State: NORMAL PartitionKey: user_id, city Range: [types: [LARGEINT, VARCHAR]; keys: [2, A]; , types: [LARGEINT, VARCHAR]; keys: [2, B]; ] DistributionKey: user_id Buckets: 16 ReplicationNum: 3 StorageMedium: HDD CooldownTime: 9999-12-31 15:59:59 RemoteStoragePolicy: LastConsistencyCheckTime: NULL DataSize: 0.000 IsInMemory: false ReplicaAllocation: tag.location.default: 3 *************************** 3. row *************************** PartitionId: 17028 PartitionName: p3_city VisibleVersion: 1 VisibleVersionTime: 2023-04-11 09:31:33 State: NORMAL PartitionKey: user_id, city Range: [types: [LARGEINT, VARCHAR]; keys: [3, A]; , types: [LARGEINT, VARCHAR]; keys: [3, B]; ] DistributionKey: user_id Buckets: 16 ReplicationNum: 3 StorageMedium: HDD CooldownTime: 9999-12-31 15:59:59 RemoteStoragePolicy: LastConsistencyCheckTime: NULL DataSize: 0.000 IsInMemory: false ReplicaAllocation: tag.location.default: 3 3 rows in set (0.00 sec)