
CREATE TABLE AS

Function

CREATE TABLE AS creates a table from the results of a query.

It creates a table and fills it with data obtained using SELECT. The table columns have the names and data types associated with the output columns of SELECT (except that you can override the SELECT output column names by giving an explicit list of new column names).

Creating a table this way is similar to creating a view, but the behavior differs: CREATE TABLE AS queries the source table once and writes the result into the new table, so the contents of the new table do not change when the source table changes. A view, in contrast, re-evaluates its defining SELECT statement every time it is queried.
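
For example, the difference can be illustrated with a minimal sketch (the names t_snapshot, v_live, and t_source are hypothetical; a source table t_source is assumed to exist):

openGauss=# CREATE TABLE t_snapshot AS SELECT * FROM t_source;  -- evaluates the query once and stores the rows
openGauss=# CREATE VIEW v_live AS SELECT * FROM t_source;       -- stores only the query definition
-- Rows inserted into t_source afterwards are visible when querying v_live, but not in t_snapshot.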

Precautions

  • This statement cannot be used to create a partitioned table.
  • If an error occurs during table creation, the system may fail to delete the non-empty disk files that were created before the last automatic cleanup, even after the error is rectified. This problem seldom occurs and does not affect the running of the database.

Syntax

CREATE [ [ GLOBAL | LOCAL ] [ TEMPORARY | TEMP ] | UNLOGGED ] TABLE table_name
    [ (column_name [, ...] ) ]
    [ WITH ( {storage_parameter = value} [, ... ] ) ]
    [ COMPRESS | NOCOMPRESS ]
    [ TABLESPACE tablespace_name ]
    [ DISTRIBUTE BY { REPLICATION | { [HASH ] ( column_name ) } } ]
    [ TO { GROUP groupname | NODE ( nodename [, ... ] ) } ]
    AS query
    [ WITH [ NO ] DATA ];

Parameter Description

  • UNLOGGED

    Specifies that the table is created as an unlogged table. Data written to unlogged tables is not written to the write-ahead log (WAL), which makes them considerably faster than ordinary tables. However, they are not crash-safe: an unlogged table is automatically truncated after a crash or unclean shutdown. The contents of an unlogged table are also not replicated to standby servers. Any indexes created on an unlogged table are automatically unlogged as well.

    • Usage scenario: Unlogged tables cannot guarantee data safety. Back up the data before using unlogged tables; for example, back up the data before a system upgrade.
    • Troubleshooting: If index data of an unlogged table is lost due to an unexpected operation such as an unclean shutdown, re-create the affected indexes.
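
    For example, a minimal sketch (the names t_stage and t_source are hypothetical) of creating an unlogged table from a query result:

    openGauss=# CREATE UNLOGGED TABLE t_stage AS SELECT * FROM t_source;  -- not WAL-logged; truncated after a crash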
  • GLOBAL | LOCAL

    When creating a temporary table, you can specify the GLOBAL or LOCAL keyword before TEMP or TEMPORARY. Currently, the two keywords are provided only for compatibility with the SQL standard. GaussDB creates a local temporary table regardless of whether GLOBAL or LOCAL is specified.

  • TEMPORARY | TEMP

    If TEMP or TEMPORARY is specified, the created table is a temporary table. Temporary tables are classified into global temporary tables and local temporary tables. If the keyword GLOBAL is specified when a temporary table is created, the table is a global temporary table. Otherwise, the table is a local temporary table.

    The metadata of a global temporary table is visible to all sessions, and the metadata still exists after the sessions end. However, the user data, indexes, and statistics of one session are isolated from those of other sessions; each session can view and modify only the data it has committed. Global temporary tables have two modes: ON COMMIT PRESERVE ROWS and ON COMMIT DELETE ROWS. In the session-based ON COMMIT PRESERVE ROWS mode, user data is automatically cleared when the session ends. In the transaction-based ON COMMIT DELETE ROWS mode, user data is automatically cleared when a commit or rollback is performed. If the ON COMMIT option is not specified during table creation, the session level is used by default. Unlike local temporary tables, a global temporary table can be created in a schema that does not start with pg_temp_.

    A local temporary table is automatically dropped at the end of the current session. Temporary tables are created only in the current session; therefore, they can be created and used as long as the database node connected to the current session is operating normally. If a DDL statement involves operations on temporary tables, a DDL error is generated, so you are advised not to perform operations on temporary tables in DDL statements. TEMP is equivalent to TEMPORARY.

    • Local temporary tables are visible to the current session through schemas starting with pg_temp. Do not delete schemas starting with pg_temp or pg_toast_temp.
    • If TEMPORARY or TEMP is not specified when you create a table but the table's schema is set to a schema starting with pg_temp_ in the current session, the table is created as a temporary table.
    • Do not run ALTER or DROP on global temporary tables or their indexes while they are being used by other sessions.
    • DDL operations on a global temporary table affect only the user data and indexes of the current session. For example, TRUNCATE, REINDEX, and ANALYZE take effect only for the current session.
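
    For example, a minimal sketch (t_session_copy and t_source are hypothetical names) of creating a local temporary table from a query; the table is automatically dropped when the session ends:

    openGauss=# CREATE TEMPORARY TABLE t_session_copy AS SELECT * FROM t_source;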
  • table_name

    Specifies the name of the table to be created.

    Value range: a string. It must comply with the naming convention.

  • column_name

    Specifies the name of a column to be created in the new table.

    Value range: a string. It must comply with the naming convention.

  • WITH ( storage_parameter [= value] [, ... ] )

    Specifies an optional storage parameter for a table or an index. See details of parameters below.

    • FILLFACTOR

      The fill factor of a table is a percentage from 10 to 100. 100 (complete filling) is the default value. When a smaller fill factor is specified, INSERT operations pack table pages only to the indicated percentage. The remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page, which is more efficient than placing it on a different page. For a table whose entries are never updated, setting the fill factor to 100 (complete filling) is the best choice, but in heavily updated tables a smaller fill factor would be appropriate. The parameter is only valid for row-store tables.

      Value range: 10–100
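
      For example, a minimal sketch (hypothetical names) of reserving free space on each page of a row-store table that will be updated frequently:

      openGauss=# CREATE TABLE t_hot WITH (FILLFACTOR = 70) AS SELECT * FROM t_source;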

    • ORIENTATION

      Value range:

      COLUMN: The data will be stored in columns.

      ROW (default value): The data will be stored in rows.

    • COMPRESSION

      Specifies the compression level of table data. It determines the compression ratio and compression time. Generally, a higher compression level results in a higher compression ratio and a longer compression time, and a lower level results in a lower ratio and a shorter time. The actual compression ratio depends on the distribution characteristics of the data loaded into the table.

      Value range:

      The valid values for column-store tables are YES, NO, LOW, MIDDLE, and HIGH, and the default value is LOW.

      Row-store tables do not support compression.
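
      For example, a minimal sketch (hypothetical names) of creating a column-store copy of a table with a higher compression level:

      openGauss=# CREATE TABLE t_col_copy WITH (ORIENTATION = COLUMN, COMPRESSION = HIGH) AS SELECT * FROM t_source;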

    • MAX_BATCHROW

      Specifies the maximum number of rows in a storage unit during data loading. The parameter is only valid for column-store tables.

      Value range: 10000 to 60000
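
      For example, a minimal sketch (hypothetical names) of lowering the storage unit size of a column-store table:

      openGauss=# CREATE TABLE t_col_batch WITH (ORIENTATION = COLUMN, MAX_BATCHROW = 30000) AS SELECT * FROM t_source;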

    • hashbucket

      Creates a hash table that uses buckets. This parameter is supported only for row-store tables, including row-store range tables.

      Value range: on and off

      Default value: off

      In the current version, DDL operations on hash bucket tables are affected. Therefore, you are advised not to perform DDL operations on hash bucket tables frequently.
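
      For example, a minimal sketch (hypothetical names; it assumes the query output contains a column id whose type can serve as the distribution key and that the environment supports hash bucket tables for this statement):

      openGauss=# CREATE TABLE t_bucketed WITH (hashbucket = on) DISTRIBUTE BY HASH(id) AS SELECT id, val FROM t_source;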

  • COMPRESS / NOCOMPRESS

    If the keyword COMPRESS is specified when a table is created, the compression feature is triggered by bulk INSERT operations. If this feature is enabled, all tuple data within a page is scanned to generate a dictionary, and the tuple data is then compressed and stored. If NOCOMPRESS is specified, the table is not compressed. Row-store tables do not support compression.

    Default value: NOCOMPRESS, that is, tuple data is not compressed before storage.
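
    For example, a minimal sketch (hypothetical names; a column-store table is used because row-store tables do not support compression):

    openGauss=# CREATE TABLE t_compressed WITH (ORIENTATION = COLUMN) COMPRESS AS SELECT * FROM t_source;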

  • TABLESPACE tablespace_name

    Specifies that the new table will be created in the tablespace_name tablespace. If not specified, the default tablespace is used.

  • DISTRIBUTE BY

    Specifies how the table is distributed or replicated between DNs.

    • REPLICATION: Each row in the table exists on all DNs, that is, each DN has complete table data.
    • HASH (column_name): Each row of the table will be placed into specified DNs based on the hash value of the specified column.
    • When DISTRIBUTE BY HASH (column_name) is specified, the primary key and unique indexes of the table must contain the column_name column.
    • When DISTRIBUTE BY HASH (column_name) is specified for a referenced table, the foreign key of the referencing table must contain the column_name column.

    Default value: HASH(column_name), where column_name is the primary key column (if any) or the first column whose data type can be used as a distribution key.

    column_name supports the following data types:

    • Integer types: TINYINT, SMALLINT, INT, BIGINT, and NUMERIC/DECIMAL
    • Character types: CHAR, BPCHAR, VARCHAR, VARCHAR2, and NVARCHAR2
    • Date/time types: DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, INTERVAL, and SMALLDATETIME
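
    For example, a minimal sketch (hypothetical table and column names) of the two distribution choices:

    openGauss=# CREATE TABLE t_dim_copy DISTRIBUTE BY REPLICATION AS SELECT * FROM t_dim;        -- full copy on every DN
    openGauss=# CREATE TABLE t_fact_copy DISTRIBUTE BY HASH(id) AS SELECT id, val FROM t_fact;   -- rows spread by hash of id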
  • TO { GROUP groupname | NODE ( nodename [, ... ] ) }

    TO GROUP specifies the node group to which the table to be created belongs. TO NODE is used for internal scale-out tools.

  • AS query

    Specifies a SELECT or VALUES command, or an EXECUTE command that runs a prepared SELECT or VALUES query.

  • [ WITH [ NO ] DATA ]

    Specifies whether the data produced by the query should be copied to the new table. By default, the data will be copied. If the value NO is used, only the table structure will be copied.
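
    For example, a minimal sketch (hypothetical names) that copies only the column definitions of a query result, and one that materializes a VALUES list:

    openGauss=# CREATE TABLE t_shell AS SELECT * FROM t_source WITH NO DATA;  -- structure only, no rows
    openGauss=# CREATE TABLE t_pairs AS VALUES (1, 'red'), (2, 'blue');       -- the query can also be a VALUES list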

Examples

-- Create the tpcds.store_returns table.
openGauss=# CREATE TABLE tpcds.store_returns
(
    W_WAREHOUSE_SK            INTEGER               NOT NULL,
    W_WAREHOUSE_ID            CHAR(16)              NOT NULL,
    sr_item_sk                VARCHAR(20)                   ,
    W_WAREHOUSE_SQ_FT         INTEGER                       
);
-- Create the tpcds.store_returns_t1 table and insert the rows of the tpcds.store_returns table whose sr_item_sk values are greater than 4795.
openGauss=# CREATE TABLE tpcds.store_returns_t1 AS SELECT * FROM tpcds.store_returns WHERE sr_item_sk > '4795';

-- Copy tpcds.store_returns to create the tpcds.store_returns_t2 table.
openGauss=# CREATE TABLE tpcds.store_returns_t2 AS TABLE tpcds.store_returns;

-- Delete the table.
openGauss=# DROP TABLE tpcds.store_returns_t1;
openGauss=# DROP TABLE tpcds.store_returns_t2;
openGauss=# DROP TABLE tpcds.store_returns;

Helpful Links

CREATE TABLE and SELECT