Updated on 2025-10-23 GMT+08:00

Set Types

SET is a constant string data type created only when a table is created. It is applicable to the scenario where a subset of a given set is stored. The constants must be a subset of the constant string set specified when SET is defined.

The syntax for creating SET is as follows:

SET('val1','val2',...) [CHARACTER SET charset_name] [COLLATE collation_name]

The bottom-layer storage mode of the SET type is bitmap, and each set member corresponds to one bit. Subsets of each set can be expressed as character strings or integers. The meanings of character strings or integers are the same during operations. Take SET('beijing', 'shanghai', 'nanjing', 'wuhan') as an example. Table 1 shows the mapping between character strings and integers.

Table 1 Mapping between strings and integers of subsets

String Value

Binary Value

Integer Value

'beijing'

0001

1

'shanghai'

0010

2

'nanjing'

0100

4

'wuhan'

1000

8

'beijing,nanjing'

0101

5

Table 2 lists the M-compatible SET types.

Table 2 Set types

Name

Description

Storage Space

SET

Set type, used to store a subset of a given set.

  • Input formats:
    • Character string, one of the subsets. Members are separated by commas (,).
    • Integer value corresponding to the subset.
  • Output formats:
    • Subset character string. Members are separated by commas (,).
  • Value ranges:
    • A maximum of 64 members can be created.
    • A subset of a given set during creation.

4 to 8 bytes

  • The input validity check of the SET type is affected by sql_mode. In loose mode, if an invalid value is inserted to a column of the SET type, 0 is inserted, indicating an empty string. In strict mode, if an invalid value is inserted, an error is reported.
  • Each member value of the SET type must be unique. If duplicate member values exist, an error is reported in strict mode. In loose mode, duplicate member values can be created, but the first value is used by default.
  • The SET data type supports the character set and collation features. Whether member values are duplicate is related to the collation feature of SET columns.
  • The SET data type cannot be used as a partition key.
  • An index of the SET type can be queried only by integer. The index cannot be queried by character string.
  • Compared with other types, the SET type does not support index scan. You need to create a function expression index on a column of the SET type based on the data type to be queried to support index scan.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
-- Create a table with the SET data type.
m_db=# CREATE TABLE test_set(c1 SET('abc','efg','Chinese','Chinese character'));
CREATE TABLE

-- View the table structure.
m_db=# SELECT pg_get_tabledef('test_set');
                                     pg_get_tabledef                                      
------------------------------------------------------------------------------------------
 SET search_path = public;                                                               +
 CREATE TABLE test_set (                                                                 +
     c1 SET('abc', 'efg', 'Chinese', 'Chinese character') CHARACTER SET `UTF8` COLLATE utf8mb4_general_ci+
 )                                                                                       +
 CHARACTER SET = "UTF8" COLLATE = "utf8mb4_general_ci"                                   +
 WITH (orientation=row, compression=no, storage_type=USTORE, segment=off);
(1 row)

-- Insert character string data.
m_db=# INSERT INTO test_set VALUES('abc,efg'),('efg,Chinese,Chinese character'), ('abc');
INSERT 0 3

-- Insert integer data.
m_db=# INSERT INTO test_set VALUES(5),(12);
INSERT 0 2

-- Query data.
m_db=# SELECT * FROM test_set;
      c1       
---------------
 abc,efg
 efg,Chinese,Chinese character
 abc
 abc,Chinese
 Chinese,Chinese character
(5 rows)

-- Query data in character string format.
m_db=# SELECT * FROM test_set WHERE c1 = 'abc,efg';
   c1    
---------
 abc,efg
(1 row)

-- Query data in integer format.
m_db=# SELECT * FROM test_set WHERE c1 = 12;
    c1     
-----------
 Chinese,Chinese character
(1 row)

-- Drop the table.
m_db=# DROP TABLE test_set;
DROP TABLE

-- Create a table with the SET data type.
m_db=# CREATE TABLE t1(id int, c1 set('a','bb','ccc'));
CREATE TABLE

-- Index scan is not supported when indexes of the SET type are directly created.
m_db=# CREATE INDEX idx_1 ON t1(c1);
CREATE INDEX

m_db=# EXPLAIN SELECT c1 FROM t1 WHERE c1 = 'a';
                     QUERY PLAN                      
-----------------------------------------------------
 Seq Scan on t1  (cost=0.00..59.00 rows=15 width=32)
   Filter: (cast_to_cstring(c1) = 'a'::text)
(2 rows)

-- Query strings. Create the cast_to_cstring function index to support index scan.
m_db=# CREATE INDEX id_t1 ON t1((cast_to_cstring(c1)));
CREATE INDEX

m_db=# EXPLAIN SELECT c1 FROM t1 WHERE c1 = 'a';
                             QUERY PLAN                             
--------------------------------------------------------------------
 Bitmap Heap Scan on t1  (cost=4.30..13.78 rows=6 width=32)
   Recheck Cond: (cast_to_cstring(c1) = 'a'::text)
   ->  Bitmap Index Scan on id_t1  (cost=0.00..4.30 rows=6 width=0)
         Index Cond: (cast_to_cstring(c1) = 'a'::text)
(4 rows)

-- Query integers. Create the cast_to_int8 function index to support index scan.
m_db=# CREATE INDEX id_t2 ON t1((cast_to_int8(c1)));
CREATE INDEX

m_db=#  EXPLAIN SELECT c1 FROM t1 WHERE c1 = 5;
                             QUERY PLAN                             
--------------------------------------------------------------------
 Bitmap Heap Scan on t1  (cost=4.30..13.78 rows=6 width=32)
   Recheck Cond: (cast_to_int8(c1) = 5)
   ->  Bitmap Index Scan on id_t2  (cost=0.00..4.30 rows=6 width=0)
         Index Cond: (cast_to_int8(c1) = 5)
(4 rows)