Updated on 2025-10-23 GMT+08:00

COPY

Description

Copies data between tables and files.

COPY FROM copies data from a file to a table, and COPY TO copies data from a table to a file.

Precautions

  • When the enable_copy_server_files parameter is disabled, only the initial user is allowed to run the COPY FROM FILENAME or COPY TO FILENAME command. When the enable_copy_server_files parameter is enabled, users with the SYSADMIN permission or users who inherit the permissions of the built-in role gs_role_copy_files are allowed to run the COPY FROM FILENAME or COPY TO FILENAME command. By default, database configuration files and key files are not allowed, and you can run the COPY FROM FILENAME or COPY TO FILENAME command for certificate files and audit logs to prevent unauthorized users from viewing or modifying sensitive files. When enable_copy_server_files is enabled, an administrator can use the GUC parameter safe_data_path to set the path for common users to import and export to the subpath of the set path. If this GUC parameter is not set (by default), the path used by common users is not blocked. This parameter reports an error for ... in the path of the COPY statement.
  • COPY applies only to tables but not views.
  • COPY TO requires the SELECT permission on the table to be read, and COPY FROM requires the INSERT permission on the table to be inserted.
  • If a list of columns is specified, COPY copies only the data of the specified columns between the file and the table. If a table has any columns that are not in the column list, COPY FROM inserts default values into those columns.
  • If a data source file is specified, the server must be able to access the file. If STDIN is specified, data flows between the client and the server. When entering data, use the TAB key to separate the columns of the table and use a backslash and a period (\.) in a new row to indicate the end of the input.
  • COPY FROM throws an error if any row in the data file contains more or fewer columns than expected.
  • The end of the data can be represented by a line that contains only backslashes and periods (\.). If data is read from a file, the end flag is unnecessary. If data is copied between client applications, an end tag must be provided.
  • In COPY FROM, \N is an empty string. To enter the actual value \N, use \\N.
  • COPY FROM does not support data preprocessing during data import, such as expression operation and default value filling. If you need to preprocess data during the import, you need to import the data to a temporary table and then run SQL statements to insert the data into the table through operations. However, this method causes I/O expansion and reduces the import performance.
  • When a data format error occurs during COPY FROM execution, the transaction is rolled back. However, the error information is insufficient, making it difficult to locate the error data from a large amount of raw data.
  • COPY FROM and COPY TO apply to low concurrency and local import and export of a small amount of data.
  • When COPY is used in binary format, transcoding in distributed mode is not supported.
  • COPY is a server command and its operating environment is the same as that of the database server process. \COPY is a client meta-command and its operating environment is the same as that of gsql on the client. Note that when the database and gsql are used in the sandbox environment, the COPY and \COPY commands both use the paths in the sandbox. When the database is used in the sandbox environment and gsql is used outside the sandbox, the COPY command uses the path inside the sandbox, and the \COPY command uses the path outside the sandbox.
  • When executing COPY to import data to a base table with a GSI, the enable_stream_operator parameter must be enabled to achieve optimal data import performance.
  • When exporting data of the float type, you are advised to set extra_float_digits to 3 to avoid loss of significant digits and ensure consistency between the data before and after export and import.
  • In M-compatible mode, you are advised not to perform COPY FROM/TO between databases with different lower_case_table_names parameter settings.

Syntax

  • Copy data from a file to a table.
    COPY table_name [ ( column_name [, ...] ) ] [WITH OIDS]
        FROM { 'filename' | STDIN } [LOAD] [ LOAD_DISCARD 'discard_path'] [LOAD_BAD 'bad_path']
        [ [ WITH ] ( option [, ...] ) | copy_option ];

    In the preceding syntax, copy_option is incompatible with option.

  • Copy data from a table to a file.
    COPY table_name [ ( column_name [, ...] ) ] [WITH OIDS]
        TO { 'filename' | STDOUT } [LOAD]
      [ [ WITH ] ( option [, ...] ) | copy_option ];
    
    COPY query
        TO { 'filename' | STDOUT }
      [ [ WITH ] ( option [, ...] ) | copy_option ];

    copy_option is the native parameter, while option is the parameter imported by a compatible foreign table.

    The syntax of the optional parameter option is as follows:
    FORMAT 'format_name'
    | FORMAT binary
    | DELIMITER 'delimiter_character'
    | NULL  'null_string'
    | QUOTE 'quote_character'
    | HEADER [ boolean ]
    | USEEOF [ boolean ]
    | ESCAPE 'escape_character'
    | FORCE_NOT_NULL ( column_name [, ...] )
    | FORCE_QUOTE { column_name [, ...] | * }
    | DATE_FORMAT 'date_format_string'
    | SMALLDATETIME_FORMAT 'smalldatetime_format_string'
    The syntax of the optional parameter copy_option is as follows:
    NULL [ AS ] 'null_string' 
    | FORCE_NOT_NULL column_name [, ...]
    | BINARY 
    | CSV 
    | ESCAPE [ AS ] 'escape_character' 
    | ENCODING 'encoding_name'
    | DELIMITER [ AS ] 'delimiter_character'
    | ROWS 'string'
    | DATE_FORMAT 'date_format_string' 
    | SMALLDATETIME_FORMAT 'smalldatetime_format_string'
    | SKIP int_number
    | WHEN { ( start - end ) | column_name } { = | != } 'string'
    | SEQUENCE ( { column_name ( integer [, incr] ) [, ...] } )

Parameters

  • query

    Specifies that the results will be copied.

    Valid value: a SELECT or VALUES command in parentheses

  • table_name

    Specifies the name (possibly schema-qualified) of an existing table.

    Value range: an existing table name

  • column_name

    Specifies an optional list of columns to be copied.

    Value range: any columns. All columns will be copied if no column list is specified.

  • STDIN

    Specifies that input comes from the standard input. In the input, table columns are separated by tabs and each row ends with a backslash and a period (\.).

  • STDOUT

    Specifies that output goes to the standard output.

  • LOAD

    Specifies whether the COPY syntax is called by gs_loader or directly called by users. This parameter is not recommended because it is not used to actively call APIs.

  • LOAD_DISCARD 'discard_path'

    Specifies the path where the discard file is generated on the gs_loader client. This parameter is not recommended because it is not used to actively call APIs.

  • LOAD_BAD 'bad_path'

    Specifies the path where the bad file is generated on the gs_loader client. This parameter is not recommended because it is not used to actively call APIs.

  • OPTION { option_name ' value ' }

    Specifies all types of parameters of a compatible foreign table.

    • FORMAT

      Specifies the format of the source data file in the foreign table.

      Value range: CSV, TEXT, and BINARY

      • The CSV file can process newline characters efficiently, but cannot process certain special characters well.
      • The TEXT file can process certain special characters efficiently, but cannot process newline characters well.
      • All data in the BINARY file is stored/read as binary format rather than as text. It is faster than the text and CSV formats, but a binary-format file is less portable.

      Default value: TEXT

    • DELIMITER
      Specifies the character that separates columns within each row (line) of the file.
      • The value of delimiter cannot be \r or \n.
      • A delimiter cannot be the same as the null value. The delimiter for the CSV format cannot be same as the quote value.
      • The delimiter for the TEXT format data cannot contain lowercase letters, digits, or special characters (.\).
      • The data length of a single row should be less than 1 GB. A row that has many columns using long delimiters cannot contain much valid data.
      • You are advised to use multi-character delimiters or invisible delimiters. For example, you can use multi-characters (such as $^&) and invisible characters (such as 0x07, 0x08, and 0x1b).
      • To use a tab to isolate CSV data, set delimiter to E'\t'.

      Value range: a multi-character delimiter within 10 bytes

      Default value:

      • A tab character in text format
      • A comma (,) in CSV format
    • NULL

      Specifies the string that represents a null value.

      Value range:

      • A null value cannot be \r or \n. The maximum length is 100 characters.
      • A null value cannot be the same as the delimiter or quote value.

      Default value:

      • The default value for the CSV format is an empty string without quotation marks.
      • The default value for the TEXT format is \N.
    • HEADER

      Specifies whether a file contains a header with the names of each column in the file. header is available only for CSV files.

      When data is imported, if header is on, the first row of the data file will be identified as the header and ignored. If header is off, the first row will be identified as a data row.

      When data is exported, if header is on, fileheader must be specified. If header is off, an exported file does not contain a header.

      Value range: true/on or false/off.

      Default value: false

    • ESCAPE

      Specifies an escape character for a CSV file. The value must be a single-byte character.

      Default value: '' If the value is the same as that of quote, it will be replaced by '\0'.

    • FORCE_NOT_NULL ( column_name [, ...] )

      Assigns a value to a specified column in CSV COPY FROM mode.

      Value range: an existing column.

    • DATE_FORMAT

      Specifies the DATE format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.

      Value range: a valid DATE value For details, see Datetime Functions.

      You can use the TIMESTAMP_FORMAT parameter to set the DATE format to TIMESTAMP for data import. For details, see TIMESTAMP_FORMAT below.

  • COPY_OPTION { option_name ' value ' }

    Specifies all types of native parameters of COPY.

    • NULL null_string
      Specifies the string that represents a null value.

      When using COPY FROM, any data item that matches this string will be stored as a null value, so make sure that you use the same string as you used with COPY TO.

      Value range:

      • A null value cannot be \r or \n. The maximum length is 100 characters.
      • A null value cannot be the same as the delimiter or quote value.

      Default value:

      • The default value for the TEXT format is \N.
      • The default value for the CSV format is an empty string without quotation marks.
    • FORCE NOT NULL column_name [, ...]

      Assigns a value to a specified column in CSV COPY FROM mode. If the column is null, its value is regarded as a string of 0 characters.

      Value range: an existing column.

    • BINARY
      Specifies that data is stored and read in binary mode instead of text mode.
      • In binary mode, you cannot declare DELIMITER, NULL, or CSV.
      • When BINARY is specified, CSV and TEXT cannot be specified through option or copy_option.
      • If the GUC parameter support_binary_copy_version is set to 'header_encoding', the header of the binary file exported using COPY TO contains the server encoding information of the database. When the binary file is imported using COPY FROM, the consistency of the encoding information is checked to ensure that the encoding information in the file must be consistent with that on the server.
      • If the GUC parameter copy_special_character_version is set to 'no_error', invalid characters will not be checked during the import and will be displayed as garbled characters in query results. Exercise caution when enabling this parameter. You can use the LOG ERRORS or LOG ERRORS DATA parameter in the COPY statement to code errors to an error table.
      • In binary mode, copy_special_character_version is set to 'no_error', and it takes effect only for fields of the TEXT, CHAR, VARCHAR, and NVARCHAR2 types.
    • CSV

      Enables the CSV mode. When CSV is specified, BINARY and TEXT cannot be specified through option or copy_option.

    • ESCAPE [AS] 'escape_character'

      Specifies an escape character for a CSV file. The value must be a single-byte character.

      Default value: ''. If the value is the same as that of quote, it will be replaced by '\0'.

    • ENCODING 'encoding_name'

      Specifies the name of a file encoding format.

      Value range: a valid encoding format

      Default value: current encoding format

    • DATE_FORMAT 'date_format_string'

      Specifies the DATE format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.

      Value range: a valid DATE value For details, see Datetime Functions.

      You can use the TIMESTAMP_FORMAT parameter to set the DATE format to TIMESTAMP for data import. For details, see TIMESTAMP_FORMAT below.

    • SKIP int_number

      Specifies that the first int_number rows of the data file are skipped during data import.

    • WHEN { ( start - end ) | column_name } { = | != } 'string'

      When data is imported, each row of data is checked. Only the rows that meet the WHEN condition are imported to the table.

    • SEQUENCE ( { column_name ( integer [, incr] ) [, ...] } )

      During data import, columns modified by SEQUENCE do not read data from the data file. The values are incremented by the value of incr based on the specified integer. If incr is not specified, the values are incremented from 1 by default.

    COPY FROM does not support expression conversion for distribution keys.

    The following special backslash sequences are recognized by COPY FROM:
    • \b: Backslash (ASCII 8)
    • \f: Form feed (ASCII 12)
    • \n: Newline character (ASCII 10)
    • \r: Carriage return character (ASCII 13)
    • \t: Tab (ASCII 9)
    • \v: Vertical tab (ASCII 11)
    • \digits: Backslash followed by one to three octal digits specifies that the ASCII value is the character with that numeric code.
    • \xdigits: Backslash followed by an x and one or two hex digits specifies the character with that numeric code.

Permission Control Examples

m_db=# copy t1 from '/home/xy/t1.csv';
ERROR:  COPY to or from a file is prohibited for security concerns
HINT:  Anyone can COPY to stdout or from stdin. gsql's \copy command also works for anyone.
m_db=# grant gs_role_copy_files to xxx;

This error occurs because a non-initial user does not have the COPY permission. To solve this problem, enable the enable_copy_server_files parameter. Then, the administrator can use the COPY function, and common users need to join the gs_role_copy_files group.

Examples

-- Create a schema.
m_db=# CREATE SCHEMA tpcds;

-- Create the tpcds.ship_mode table.
m_db=# CREATE TABLE tpcds.ship_mode
(
    SM_SHIP_MODE_SK           INTEGER               NOT NULL,
    SM_SHIP_MODE_ID           CHAR(16)              NOT NULL,
    SM_TYPE                   CHAR(30)                      ,
    SM_CODE                   CHAR(10)                      ,
    SM_CARRIER                CHAR(20)                      ,
    SM_CONTRACT               CHAR(20)
) DISTRIBUTE BY HASH(SM_SHIP_MODE_SK);

-- Insert a single data record into the tpcds.ship_mode table.
m_db=# INSERT INTO tpcds.ship_mode VALUES (1,'a','b','c','d','e');

-- Copy data from the tpcds.ship_mode file to the /home/omm/ds_ship_mode.dat file.
m_db=# COPY tpcds.ship_mode TO '/home/omm/ds_ship_mode.dat';

-- Output tpcds.ship_mode to STDOUT.
m_db=# COPY tpcds.ship_mode TO STDOUT;

-- Output the data of tpcds.ship_mode to STDOUT. The parameters are as follows: The delimiter is ',' (delimiter',') and the encoding format is UTF8 (encoding'utf8').
m_db=# COPY tpcds.ship_mode TO STDOUT WITH (delimiter ',', encoding 'utf8');

-- Output the data of tpcds.ship_mode to STDOUT. The parameters are as follows: The import format is CSV (format'CSV'), and the exported content of the SM_SHIP_MODE_SK column is enclosed in quotation marks (force_quote(SM_SHIP_MODE_SK)).
m_db=# COPY tpcds.ship_mode TO STDOUT WITH (format 'CSV', force_quote(SM_SHIP_MODE_SK));

-- Create the tpcds.ship_mode_t1 table.
m_db=# CREATE TABLE tpcds.ship_mode_t1
(
    SM_SHIP_MODE_SK           INTEGER               NOT NULL,
    SM_SHIP_MODE_ID           CHAR(16)              NOT NULL,
    SM_TYPE                   CHAR(30)                      ,
    SM_CODE                   CHAR(10)                      ,
    SM_CARRIER                CHAR(20)                      ,
    SM_CONTRACT               CHAR(20)
) DISTRIBUTE BY HASH(SM_SHIP_MODE_SK );

-- Copy data from STDIN to the tpcds.ship_mode_t1 table.
m_db=# COPY tpcds.ship_mode_t1 FROM STDIN;

-- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table.
m_db=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat';

-- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, with the import format set to TEXT (format 'text'), the delimiter set to '\t' (delimiter E'\t'), excessive columns ignored (ignore_extra_data 'true'), and characters not escaped (noescaping 'true').
m_db=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' WITH(format 'text', delimiter E'\t', ignore_extra_data 'true', noescaping 'true');

-- Delete tables and the schema.
m_db=# DROP TABLE tpcds.ship_mode;
m_db=# DROP TABLE tpcds.ship_mode_t1;
m_db=# DROP SCHEMA tpcds;