Updated on 2024-08-20 GMT+08:00

Creating and Managing Sequences

Context

A sequence is a database object that generates unique integers. Sequence numbers are generated according to a certain rule. Sequences are unique because they increase automatically. This is why they are often used as primary keys.

You can create a sequence for a column in either of the following methods:
  • Set the data type of a column to sequence integer. A sequence will be automatically created by the database for this column.
  • Use the CREATE SEQUENCE statement to create a sequence. Set the initial value of the nextval('sequence_name') function to the default value of a column.

Procedure

Method 1: Set the data type of a column to a sequence integer. For example:
1
2
3
4
5
gaussdb=# CREATE TABLE T1
(
    id    serial,
    name  text
);

If the following information is displayed, the creation is successful:

1
CREATE TABLE

Method 2: Create a sequence and set the initial value of the nextval('sequence_name') function to the default value of a column. You can cache a specific number of sequence values to reduce the requests to the GTM, improving the performance.

  1. Create a sequence.
    1
    gaussdb=# CREATE SEQUENCE seq1 cache 100;
    

    If the following information is displayed, the creation is successful:

    1
    CREATE SEQUENCE
    
  2. Set the default value of a column so that the column has a unique identification attribute.
    1
    2
    3
    4
    5
    gaussdb=# CREATE TABLE T2 
    ( 
        id   int not null default nextval('seq1'),
        name text
    );
    

    If the following information is displayed, the default value has been specified:

    1
    CREATE TABLE
    
  3. Associate a sequence with a column.

    Associate a sequence with a specified column included in a table. In this way, the sequence will be deleted when you delete its associated column or the table where the column belongs to.

    1
    gaussdb=# ALTER SEQUENCE seq1 OWNED BY T2.id;
    

    If the following information is displayed, the operation is successful:

    1
    ALTER SEQUENCE
    

The preceding methods are similar, except that the second method specifies cache for the sequence. A sequence having cache defined has inconsecutive values (such as 1, 4, and 5) and cannot maintain the order of its values. After the dependent column of a sequence has been specified, once the sequence is deleted, the sequence of the dependent will be deleted. A sequence shared by multiple columns is not forbidden in a database, but you are advised not to do that.

In the current version, you can specify the auto-increment column or set the default value of a column to nextval('seqname') when defining a table. You cannot add an auto-increment column or a column whose default value is nextval('seqname') to an existing table.

Precautions

Sequence values are generated by the GTM. By default, each request for a sequence value is sent to the GTM. The GTM calculates the result of the current value plus the step and then returns the result. The GTM is a globally unique node and is the performance bottleneck. Therefore, you are advised not to generate sequence values frequently and numerously, such as to use BulkLoad to import data. For example, the INSERT INTO SELECT FROM statement has poor performance in the following scenario:

1
2
3
4
5
6
7
gaussdb=# CREATE SEQUENCE newSeq1;
gaussdb=# CREATE TABLE newT1
           ( 
             id   int not null default nextval('newSeq1'), 
             name text
            );
gaussdb=# INSERT INTO newT1(name) SELECT name FROM T1;

Assume that data imported from table T1 to table newT1 has 10,000 rows. The following statements achieve better performance:

1
2
gaussdb=# INSERT INTO newT1(id, name) SELECT id,name FROM T1;
gaussdb=# SELECT SETVAL('newSeq1',10000);

Rollback is not supported by sequence functions, including nextval() and setval(). The value of the setval function immediately takes effect on nextval in the current session in any cases and take effect in other sessions only when no cache is specified for them. If cache is specified for a session, it takes effect only after all the cached values have been used. To avoid duplicate values, use setval only when necessary. Do not set it to an existing sequence value or a cached sequence value.

To generate the default sequence value using BulkLoad, set sufficient cache for newSeq1 and do not set Maxvalue or Minvalue. The database will push down the calling of nextval('sequence_name') to DNs to improve performance. Currently, the concurrent connection requests that can be processed by the GTM are limited. If there are too many DNs, a large number of concurrent connection requests will be sent to the GTM. In this case, you need to limit the bulk loading concurrency, so that DNs do not fully occupy GTM connections. If the target table is a replication table (DISTRIBUTE BY REPLICATION), pushdown cannot be performed. When the data volume is large, the performance is affected and the space may bloat sharply. After the import is complete, you need to run the vacuum full command to restore the data. The best way is not to use BulkLoad to generate the default sequence value.

After a sequence is created, one single-row table is maintained on each node to store the sequence definition and value, which is obtained from the last interaction with the GTM rather than updated in real time. The single-row table on a node does not update when other nodes request a new value from the GTM or when the sequence is modified using setval.