COPY
Description
Copies data between tables and files.
COPY FROM copies data from a file to a table, and COPY TO copies data from a table to a file.
Precautions
- When the enable_copy_server_files parameter is disabled, only the initial user is allowed to run the COPY FROM FILENAME or COPY TO FILENAME statement. When the enable_copy_server_files parameter is enabled, users with the SYSADMIN permission or users who inherit permissions of the built-in role gs_role_copy_files are allowed to run the COPY FROM FILENAME or COPY TO FILENAME statement. By default, database configuration files and key files are not allowed, and you can run COPY FROM FILENAME or COPY TO FILENAME for certificate files and audit logs to prevent unauthorized users from viewing or modifying sensitive files. When enable_copy_server_files is enabled, the administrator can use the GUC parameter safe_data_path to set the path for common users to import and export to the subpath of the set path. If this GUC parameter is not set (by default), the path used by common users is not blocked. This parameter reports an error for ... in the path of the COPY statement.
- COPY applies only to tables but not views.
- COPY TO requires the SELECT permission on the table to be read, and COPY FROM requires the INSERT permission on the table to be inserted.
- If a list of columns is specified, COPY copies only the data of the specified columns between the file and the table. If a table has any columns that are not in the column list, COPY FROM inserts default values for those columns.
- If a data source file is specified, the server must be able to access the file. If STDIN is specified, data flows between the client and the server. When entering data, use the TAB key to separate the columns of the table and use a backslash and a period (\.) in a new row to indicate the end of the input.
- COPY FROM throws an error if any row in the data file contains more or fewer columns than expected.
- The end of the data can be represented by a line that contains only backslashes and periods (\.). If data is read from a file, the end flag is unnecessary. If data is copied between client applications, an end tag must be provided.
- In COPY FROM, \N is an empty string. To enter the actual value \N, use \\N.
- COPY FROM can preprocess data using column expressions, but column expressions do not support subqueries.
- When a data format error occurs during COPY FROM execution, the transaction is rolled back. However, the error information is insufficient, making it difficult to locate the error data from a large amount of raw data.
- COPY FROM and COPY TO apply to low concurrency and local import and export of a small amount of data.
- When COPY is used in binary format, transcoding in distributed mode is not supported.
- COPY is a server command and its operating environment is the same as that of the database server process. \COPY is a client meta-command and its operating environment is the same as that of gsql on the client. Note that when the database and gsql are used in the sandbox environment, the COPY and \COPY commands both use the paths in the sandbox. When the database is used in the sandbox environment and gsql is used outside the sandbox, the COPY command uses the path inside the sandbox, and the \COPY command uses the path outside the sandbox.
- When executing COPY to import data to a base table with a GSI, the enable_stream_operator parameter must be enabled to achieve optimal data import performance.
- During the export using COPY TO, if the column data in the table contains the '\0' character, the column data will be truncated during the export. Only the data before '\0' will be exported.
- The import and export statistics and error information are stored in the gs_copy_summary and pgxc_copy_error_log system catalogs, respectively. Objects that may cause privilege escalation, such as RULE, TRIGGER, index functions, row-level security, CHECK constraints, GENERATED columns, DEFAULT columns, and ON UPDATE columns, cannot be contained in the system catalogs. Otherwise, the system considers that the objects are created by unauthorized users, reports an error, and exits.
Syntax
- Copy data from a file to a table.
1 2 3 4 5 6 7 8 9 10 11 12 13
COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | STDIN } [ [ USING ] DELIMITERS 'delimiters' ] [ WITHOUT ESCAPING ] [ LOG ERRORS ] [ LOG ERRORS DATA ] [ REJECT LIMIT 'limit' ] [ [ WITH ] ( option [, ...] ) | ( copy_option [, ...] ) | [ TRANSFORM ( { column_name [ data_type ] [ AS transform_expr ] } [, ...] ) ] | [ FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) ] ]
The fixed formatter syntax is compatible with copy_option but incompatible with option. copy_option is incompatible with option. transform is compatible with copy_option and fixed formatter.
- Copy data from a table to a file.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
COPY table_name [ ( column_name [, ...] ) ] TO { 'filename' | STDOUT } [ [ USING ] DELIMITERS 'delimiters' ] [ WITHOUT ESCAPING ] [ [ WITH ] ( option [, ...] ) | ( copy_option [, ...] ) | [ FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) ] ] COPY query TO { 'filename' | STDOUT } [ WITHOUT ESCAPING ] [ [ WITH ] ( option [, ...] ) | ( copy_option [, ...] ) | [ FIXED FORMATTER ( { column_name( offset, length ) } [, ...] ) ] ]
- The syntax constraints of COPY TO are as follows:
(query) is incompatible with [USING] DELIMITERS. If the data comes from a query result, COPY TO cannot specify [USING] DELIMITERS.
- Use spaces to separate copy_option following FIXED FORMATTTER.
- copy_option is the native parameter, while option is the parameter imported by a compatible foreign table.
The syntax of the optional parameter option is as follows:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
FORMAT 'format_name' | FORMAT binary | OIDS [ boolean ] | DELIMITER 'delimiter_character' | NULL 'null_string' | HEADER [ boolean ] | USEEOF [ boolean ] | FILEHEADER 'header_file_string' | FREEZE [ boolean ] | QUOTE 'quote_character' | ESCAPE 'escape_character' | EOL 'newline_character' | NOESCAPING [ boolean ] | FORCE_QUOTE { ( column_name [, ...] ) | * } | FORCE_NOT_NULL ( column_name [, ...] ) | ENCODING 'encoding_name' | IGNORE_EXTRA_DATA [ boolean ] | FILL_MISSING_FIELDS [ boolean ] | COMPATIBLE_ILLEGAL_CHARS [ boolean ] | DATE_FORMAT 'date_format_string' | TIME_FORMAT 'time_format_string' | TIMESTAMP_FORMAT 'timestamp_format_string' | SMALLDATETIME_FORMAT 'smalldatetime_format_string'
The syntax of the optional parameter copy_option is as follows:1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
OIDS | NULL 'null_string' | HEADER | USEEOF | FILEHEADER 'header_file_string' | FREEZE | FORCE_NOT_NULL column_name [, ...] | FORCE_QUOTE { column_name [, ...] | * } | BINARY | CSV | QUOTE [ AS ] 'quote_character' | ESCAPE [ AS ] 'escape_character' | EOL 'newline_character' | ENCODING 'encoding_name' | IGNORE_EXTRA_DATA | FILL_MISSING_FIELDS | COMPATIBLE_ILLEGAL_CHARS | DATE_FORMAT 'date_format_string' | TIME_FORMAT 'time_format_string' | TIMESTAMP_FORMAT 'timestamp_format_string' | SMALLDATETIME_FORMAT 'smalldatetime_format_string'
- The syntax constraints of COPY TO are as follows:
Parameters
- query
Specifies that the results will be copied.
Valid value: a SELECT or VALUES command in parentheses
- table_name
Specifies the name (possibly schema-qualified) of an existing table.
Value range: an existing table name
- column_name
Specifies an optional list of columns to be copied.
Value range: any columns. All columns will be copied if no column list is specified.
- STDIN
Specifies that input comes from the standard input. In the input, table columns are separated by tabs and each row ends with a backslash and a period (\.).
- STDOUT
Specifies that output goes to the standard output.
- FIXED
Fixes column length. When the column length is fixed, DELIMITER, NULL, and CSV cannot be specified. When FIXED is specified, BINARY, CSV, and TEXT cannot be specified by option or copy_option.
The definition of fixed length is as follows:
- The column length of each record is the same.
- Spaces are used for column padding. Columns of the numeric type are left-aligned and columns of the string type are right-aligned.
- No delimiters are used between columns.
- [USING] DELIMITERS 'delimiters'
String that separates columns within each row (line) of the file. It cannot be larger than 10 bytes.
Value range: The delimiter in text format cannot include any of the following characters: \.abcdefghijklmnopqrstuvwxyz0123456789, but has no restriction for the CSV format.
Value range: The default value is a tab character in text format and a comma in CSV format.
Both DELIMITER and DELIMITERS can specify delimiters. DELIMITERS can be followed by brackets, but DELIMITER cannot be directly followed by brackets. Otherwise, a syntax error is reported.
- WITHOUT ESCAPING
Specifies, in the TEXT format, whether to escape the backslash (\) and its following characters.
Value range: text only
- LOG ERRORS
If this parameter is specified, the error tolerance mechanism for data type errors in the COPY FROM statement is enabled. Row errors are recorded in the public.pgxc_copy_error_log table in the database for future reference.
Value range: a value set while data is imported using COPY FROM.
The restrictions of this error tolerance parameter are as follows:
- This error tolerance mechanism captures only the data type errors (DATA_EXCEPTION) that occur during data parsing of COPY FROM on a CN. Other errors, such as network errors between CNs and DNs or expression conversion errors on DNs, are not captured.
- Before enabling error tolerance for COPY FROM for the first time in a database, check whether the public.pgxc_copy_error_log table exists. If it does not, call the copy_error_log_create() function to create it. If it does, copy its data elsewhere, delete it, and call the copy_error_log_create() function to create the table. For details about columns in the public.pgxc_copy_error_log table, see Table 1.
- While a COPY FROM statement with specified LOG ERRORS is being executed, if public.pgxc_copy_error_log does not exist or does not have the table definitions compliant with those predefined in copy_error_log_create(), an error will be reported. Ensure that the error table is created using the copy_error_log_create() function. Otherwise, COPY FROM statements with error tolerance may fail to be run.
- If existing error tolerance parameters (for example, IGNORE_EXTRA_DATA) of the COPY statement are enabled, the error of the corresponding type will be processed as specified by the parameters and no error will be reported. Therefore, the error table does not contain such error data.
- LOG ERRORS DATA
The differences between LOG ERRORS DATA and LOG ERRORS are as follows:
- LOG ERRORS DATA fills the rawrecord column in the error tolerance table.
- Only users with the super permission can use the parameter options of LOG ERRORS DATA.
- If error content is too complex, it may fail to be written to the error tolerance table by using LOG ERRORS DATA, causing the task failure.
- For errors that cannot be read in certain code, the error codes are ERRCODE_CHARACTER_NOT_IN_REPERTOIRE and ERRCODE_UNTRANSLATABLE_CHARACTER. The rawrecord column is not recorded.
- REJECT LIMIT 'limit'
Used with the LOG ERROR options to set the upper limit of the tolerated errors in the COPY FROM statement. If the number of errors exceeds the limit, later errors will be reported based on the original mechanism.
Value range: a positive integer (1 to 2147483647) or 'unlimited'
Default value: If LOG ERRORS is not specified, an error will be reported. If LOG ERRORS is specified, the default value is 0.
Different from the GDS error tolerance mechanism, in the error tolerance mechanism described in LOG ERRORS, the count of REJECT LIMIT is calculated based on the number of data parsing errors on the CN where the COPY FROM statement is run, not based on the number of errors on each DN.
- FORMATTER
Defines the place of each column in the data file in fixed length mode. Defines the place of each column in the data file in the column(offset,length) format.
Value range:
- The value of offset must be larger than 0. The unit is byte.
- The value of length must be larger than 0. The unit is byte.
The total length of all columns must be less than 1 GB.
Replace columns that are not in the file with null.
- OPTION { option_name ' value ' }
Specifies all types of parameters of a compatible foreign table.
- FORMAT
Specifies the format of the source data file in the foreign table.
Value range: CSV, TEXT, FIXED, and BINARY
- The CSV file can process newline characters efficiently, but cannot process certain special characters well.
- The TEXT file can process certain special characters efficiently, but cannot process newline characters well.
- In FIXED files, the column length of each record is the same. Spaces are used for padding, and the excessive part will be truncated.
- All data in the BINARY file is stored/read as binary format rather than as text. It is faster than the text and CSV formats, but a binary-format file is less portable.
Default value: TEXT
- OIDS
Specifies the internal OID to be copied for each row.
An error is raised if OIDs are specified for a table that does not have OIDs, or in the case of copying a query.
Value range: true/on or false/off
Default value: false
- DELIMITER
Specifies the character that separates columns within each row (line) of the file.
- The value of delimiter cannot be \r or \n.
- A delimiter cannot be the same as the null value. The delimiter for the CSV format cannot be same as the quote value.
- The delimiter for the TEXT format data cannot contain lowercase letters, digits, or special characters (.\).
- The data length of a single row should be less than 1 GB. A row that has many columns using long delimiters cannot contain much valid data.
- You are advised to use multi-character delimiters or invisible delimiters. For example, you can use multi-characters (such as $^&) and invisible characters (such as 0x07, 0x08, and 0x1b).
- To use a tab to isolate CSV data, set delimiter to E'\t'.
Value range: a multi-character delimiter within 10 bytes
Default value:
- A tab character in text format
- A comma (,) in CSV format
- No delimiter in FIXED format
- NULL
Specifies the string that represents a null value.
Value range:
- A null value cannot be \r or \n. The maximum length is 100 characters.
- A null value cannot be the same as the delimiter or quote value.
Default value:
- The default value for the CSV format is an empty string without quotation marks.
- The default value for the TEXT format is \N.
- HEADER
Specifies whether a file contains a header with the names of each column in the file. header is available only for CSV and FIXED files.
When data is imported, if header is on, the first row of the data file will be identified as the header and ignored. If header is off, the first row will be identified as a data row.
When data is exported, if header is on, fileheader must be specified. If header is off, an exported file does not contain a header.
Value range: true/on or false/off
Default value: false
- USEEOF
The system does not report an error for "\." in the imported data.
Value range: true/on or false/off
Default value: false
- QUOTE
Specifies a quoted character string for a CSV file.
Default value: double quotation marks ("")
- The value of quote cannot be the same as that of the delimiter or null parameter.
- The value of quote must be a single-byte character.
- You are advised to set quote to an invisible character, such as 0x07, 0x08, or 0x1b.
- ESCAPE
Specifies an escape character for a CSV file. The value must be a single-byte character.
Default value: double quotation marks (""). If the value is the same as that of quote, it will be replaced by '\0'.
- EOL 'newline_character'
Specifies the newline character style of the imported or exported data file.
Value range: multi-character newline characters within 10 bytes. Common newline characters include \r (0x0D), \n (0x0A), and \r\n (0x0D0A). Special newline characters include $ and #.
- The EOL parameter supports only the TEXT format for data import and export and does not support the CSV or FIXED format for data import. For forward compatibility, the EOL parameter can be set to 0x0D or 0x0D0A for data export in the CSV or FIXED format.
- The value of EOL cannot be the same as that of the delimiter or null parameter.
- The EOL parameter value cannot contain the following characters: .abcdefghijklmnopqrstuvwxyz0123456789.
- FORCE_QUOTE { ( column_name [, ...] ) | * }
In CSV COPY TO mode, forces quotation marks to be used for all non-null values in each specified column. The asterisk (*) indicates all columns. Null values are not quoted.
Value range: an existing column name
- FORCE_NOT_NULL ( column_name [, ...] )
Assigns a value to a specified column in CSV COPY FROM mode.
Value range: an existing column name
- ENCODING
Specifies the encoding format of a data file. The default value is the current database encoding format.
- IGNORE_EXTRA_DATA
Specifies whether to ignore excessive columns when the number of data source files exceeds the number of foreign table columns. This parameter is used only during data import.
Value range: true/on or false/off
- true/on: If the number of columns in a data source file is greater than that defined by the foreign table, the extra columns at the end of a row are ignored.
- false/off: If the number of columns in a data source file is greater than that defined by the foreign table, the following error message is reported:
1
extra data after last expected column
Default value: false
If a newline character at the end of a row is missing and the row and another row are integrated into one, data in another row is ignored after the parameter is set to true.
- COMPATIBLE_ILLEGAL_CHARS
Specifies whether to tolerate invalid characters during data import. The parameter is valid only for data import using COPY FROM.
Value range: true/on or false/off
- true or on: No error message is reported and data import is not interrupted when there are invalid characters. Invalid characters are converted into valid ones, and then imported to the database.
- false/off: An error occurs when there are invalid characters, and the import stops.
Default value: false or off
The rules for converting invalid characters are as follows:
1. '\0' is converted to a space.
2. Other invalid characters are converted to question marks.
3. When compatible_illegal_chars is set to true/on, after invalid characters such as NULL, DELIMITER, QUOTE, and ESCAPE are converted to spaces or question marks, an error message stating "illegal chars conversion may confuse COPY escape 0x20" will be displayed to remind you of possible parameter confusion caused by the conversion.
4. When the GUC parameter copy_special_character_version is set to 'no_error', compatible_illegal_chars cannot be set to true or on.
5. When the GUC parameter copy_special_character_version is set to 'no_error', invalid characters will not be checked during the import and will be displayed as garbled characters in query results. Exercise caution when enabling this parameter. You can use the LOG ERRORS or LOG ERRORS DATA parameter in the COPY statement to code errors to an error table.
- FILL_MISSING_FIELDS
Specifies how to handle the problem that the last column of a row in a source data file is lost during data import.
Value range: true/on or false/off.
Default value: false or off
- DATE_FORMAT
Specifies the DATE format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid DATE value. For details, see Date and Time Processing Functions and Operators.
If Oracle is specified as the compatible database, the DATE format is TIMESTAMP. For details, see timestamp_format below.
- TIME_FORMAT
Specifies the TIME format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid TIME value. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
- TIMESTAMP_FORMAT
Specifies the TIMESTAMP format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid TIMESTAMP value. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
- SMALLDATETIME_FORMAT
Specifies the SMALLDATETIME format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid SMALLDATETIME value. For details, see Date and Time Processing Functions and Operators.
- FORMAT
- COPY_OPTION { option_name ' value ' }
Specifies all types of native parameters of COPY.
- OIDS
- NULL null_string
Specifies the string that represents a null value.
When using COPY FROM, any data item that matches this string will be stored as a null value, so make sure that you use the same string as you used with COPY TO.
Value range:
- A null value cannot be \r or \n. The maximum length is 100 characters.
- A null value cannot be the same as the delimiter or quote value.
Default value:
- The default value for the TEXT format is \N.
- The default value for the CSV format is an empty string without quotation marks.
- HEADER
Specifies whether a file contains a header with the names of each column in the file. header is available only for CSV and FIXED files.
When data is imported, if header is on, the first row of the data file will be identified as the header and ignored. If header is off, the first row will be identified as a data row.
When data is exported, if header is on, fileheader must be specified. If header is off, an exported file does not contain a header.
- USEEOF
The system does not report an error for "\." in the imported data.
- FILEHEADER
Specifies a file that defines the content in the header for exported data. The file contains data description of each column.
- This parameter is available only when header is on or true.
- fileheader specifies an absolute path.
- The file can contain only one row of header information, and ends with a newline character. Excess rows will be discarded. (Header information cannot contain newline characters.)
- The length of the file including the newline character cannot exceed 1 MB.
- FREEZE
Sets the COPY loaded data row as frozen, like these data have executed VACUUM FREEZE.
This is a performance option of initial data loading. The data will be frozen only when the following three requirements are met:
- The table being loaded has been created or truncated in the same transaction before copying.
- There are no cursors open in the current transaction.
- There are no original snapshots in the current transaction.
When COPY is completed, all the other sessions will see the data immediately. However, this violates the general principle of MVCC visibility, and users should understand that this may cause potential risks.
- FORCE NOT NULL column_name [, ...]
Assigns a value to a specified column in CSV COPY FROM mode.
Value range: an existing column
- FORCE QUOTE { column_name [, ...] | * }
In CSV COPY TO mode, forces quotation marks to be used for all non-null values in each specified column. The asterisk (*) indicates all columns. Null values are not quoted.
Value range: an existing column name
- BINARY
Specifies that data is stored and read in binary mode instead of text mode.
- In binary mode, you cannot declare DELIMITER, NULL, or CSV.
- When BINARY is specified, CSV, FIXED, and TEXT cannot be specified through option or copy_option.
- If the GUC parameter copy_special_character_version is set to 'no_error', invalid characters will not be checked during the import and will be displayed as garbled characters in query results. The database server code must be the same as the file code. Exercise caution when enabling this parameter. You can use the LOG ERRORS or LOG ERRORS DATA parameter in the COPY statement to code errors to an error table.
- In binary mode, copy_special_character_version is set to 'no_error', and it takes effect only for columns of the TEXT, CHAR, VARCHAR, NVARCHAR2, and CLOB types.
- CSV
Enables the CSV mode. When CSV is specified, BINARY, FIXED, and TEXT cannot be specified through option or copy_option.
- QUOTE [AS] 'quote_character'
Specifies a quoted character string for a CSV file.
Default value: double quotation marks ("").
- The value of quote cannot be the same as that of the delimiter or null parameter.
- The value of quote must be a single-byte character.
- You are advised to set quote to an invisible character, such as 0x07, 0x08, or 0x1b.
- ESCAPE [AS] 'escape_character'
Specifies an escape character for a CSV file. The value must be a single-byte character.
Default value: double quotation marks (""). If the value is the same as that of quote, it will be replaced by '\0'.
- EOL 'newline_character'
Specifies the newline character style of the imported or exported data file.
Value range: multi-character newline characters within 10 bytes. Common newline characters include \r (0x0D), \n (0x0A), and \r\n (0x0D0A). Special newline characters include $ and #.
- The EOL parameter supports only the TEXT format for data import and export and does not support the CSV or FIXED format. For forward compatibility, the EOL parameter can be set to 0x0D or 0x0D0A for data export in the CSV or FIXED format.
- The value of EOL cannot be the same as that of the delimiter or null parameter.
- The EOL parameter value cannot contain the following characters: .abcdefghijklmnopqrstuvwxyz0123456789.
- ENCODING 'encoding_name'
Specifies the name of a file encoding format.
Value range: a valid encoding format
Default value: current encoding format
- IGNORE_EXTRA_DATA
Specifies that when the number of data source files exceeds the number of foreign table columns, excess columns at the end of the row are ignored. This parameter is used only during data import.
If this parameter is not used and the number of columns in the data source file is greater than that defined in the foreign table, the following error information is displayed:1
extra data after last expected column
- COMPATIBLE_ILLEGAL_CHARS
Specifies that invalid characters are tolerated during data import. Invalid characters are converted and then imported to the database. No error is reported and the import is not interrupted. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
If this parameter is not used, an error is reported when invalid characters are encountered during the import, and the import is interrupted.
The rules for converting invalid characters are as follows:
1. '\0' is converted to a space.
2. Other invalid characters are converted to question marks.
3. When compatible_illegal_chars is set to true/on, after invalid characters such as NULL, DELIMITER, QUOTE, and ESCAPE are converted to spaces or question marks, an error message stating "illegal chars conversion may confuse COPY escape 0x20" will be displayed to remind you of possible parameter confusion caused by the conversion.
4. When the GUC parameter copy_special_character_version is set to 'no_error', compatible_illegal_chars cannot be set to true or on.
5. When the GUC parameter copy_special_character_version is set to 'no_error', invalid characters will not be checked during the import and will be displayed as garbled characters in query results. Exercise caution when enabling this parameter. You can use the LOG ERRORS or LOG ERRORS DATA parameter in the COPY statement to code errors to an error table.
- FILL_MISSING_FIELDS
Specifies how to handle the problem that the last column of a row in a source data file is lost during data import.
Value range: true/on or false/off
Default value: false or off
Do not specify this option. Currently, it does not enable error tolerance, but will make the parser ignore the said errors during data parsing on the CN. Such errors will not be recorded in the COPY error table (enabled using LOG ERRORS REJECT LIMIT) but will be reported later by DNs. Therefore, do not specify this option.
- DATE_FORMAT 'date_format_string'
Specifies the DATE format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid DATE value For details, see Date and Time Processing Functions and Operators.
If Oracle is specified as the compatible database, the DATE format is TIMESTAMP. For details, see timestamp_format below.
- TIME_FORMAT 'time_format_string'
Specifies the TIME format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid TIME value. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
- TIMESTAMP_FORMAT 'timestamp_format_string'
Specifies the TIMESTAMP format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid TIMESTAMP value. Time zones cannot be used. For details, see Date and Time Processing Functions and Operators.
- SMALLDATETIME_FORMAT 'smalldatetime_format_string'
Specifies the SMALLDATETIME format for data import. The BINARY format is not supported. When data of such format is imported, error "cannot specify bulkload compatibility options in BINARY mode" will occur. The parameter is valid only for data import using COPY FROM.
Value range: a valid SMALLDATETIME value. For details, see Date and Time Processing Functions and Operators.
- TRANSFORM ( { column_name [ data_type ] [ AS transform_expr ] } [, ...] )
Specify the conversion expression of each column in the table. data_type specifies the data type of the column in the expression parameter. transform_expr is the target expression that returns the result value whose data type is the same as that of the target column in the table. For details about the expression, see Expressions.
COPY FROM does not support expression conversion for distribution keys.
The following special backslash sequences are recognized by COPY FROM:- \b: Backslash (ASCII 8)
- \f: Form feed (ASCII 12)
- \n: Newline character (ASCII 10)
- \r: Carriage return character (ASCII 13)
- \t: Tab (ASCII 9)
- \v: Vertical tab (ASCII 11)
- \digits: Backslash followed by one to three octal digits specifies that the ASCII value is the character with that numeric code.
- \xdigits: Backslash followed by an x and one or two hex digits specifies the character with that numeric code.
Permission Control Examples
gaussdb=> copy t1 from '/home/xy/t1.csv'; ERROR: COPY to or from a file is prohibited for security concerns HINT: Anyone can COPY to stdout or from stdin. gsql's \copy command also works for anyone. gaussdb=> grant gs_role_copy_files to xxx;
This error occurs because a non-initial user does not have the COPY permission. To solve this problem, enable the enable_copy_server_files parameter. Then, the administrator can use the COPY function, and common users need to join the gs_role_copy_files group.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
-- Create a schema. gaussdb=# CREATE SCHEMA tpcds; -- Create the tpcds.ship_mode table. gaussdb=# CREATE TABLE tpcds.ship_mode ( SM_SHIP_MODE_SK INTEGER NOT NULL, SM_SHIP_MODE_ID CHAR(16) NOT NULL, SM_TYPE CHAR(30) , SM_CODE CHAR(10) , SM_CARRIER CHAR(20) , SM_CONTRACT CHAR(20) ) DISTRIBUTE BY HASH(SM_SHIP_MODE_SK); -- Insert a single data record into the tpcds.ship_mode table. gaussdb=# INSERT INTO tpcds.ship_mode VALUES (1,'a','b','c','d','e'); -- Copy data from tpcds.ship_mode to the /home/omm/ds_ship_mode.dat file. gaussdb=# COPY tpcds.ship_mode TO '/home/omm/ds_ship_mode.dat'; -- Output tpcds.ship_mode to STDOUT. gaussdb=# COPY tpcds.ship_mode TO STDOUT; -- Output the data of tpcds.ship_mode to STDOUT. The parameters are as follows: The delimiter is ',' (delimiter',') and the encoding format is UTF8 (encoding'utf8'). gaussdb=# COPY tpcds.ship_mode TO STDOUT WITH (delimiter ',', encoding 'utf8'); -- Output the data of tpcds.ship_mode to STDOUT. The parameters are as follows: The import format is CSV (format'CSV'), and the exported content of the SM_SHIP_MODE_SK column is enclosed in quotation marks (force_quote(SM_SHIP_MODE_SK)). gaussdb=# COPY tpcds.ship_mode TO STDOUT WITH (format 'CSV', force_quote(SM_SHIP_MODE_SK)); -- Create the tpcds.ship_mode_t1 table. gaussdb=# CREATE TABLE tpcds.ship_mode_t1 ( SM_SHIP_MODE_SK INTEGER NOT NULL, SM_SHIP_MODE_ID CHAR(16) NOT NULL, SM_TYPE CHAR(30) , SM_CODE CHAR(10) , SM_CARRIER CHAR(20) , SM_CONTRACT CHAR(20) ) DISTRIBUTE BY HASH(SM_SHIP_MODE_SK ); -- Copy data from STDIN to the tpcds.ship_mode_t1 table. gaussdb=# COPY tpcds.ship_mode_t1 FROM STDIN; -- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table. gaussdb=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat'; -- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, convert the data using the TRANSFORM expression, and insert the 10 characters on the left of the SM_TYPE column into the table. gaussdb=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' TRANSFORM (SM_TYPE AS LEFT(SM_TYPE, 10)); -- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, with the import format set to TEXT (format 'text'), the delimiter set to \t' (delimiter E'\t'), excessive columns ignored (ignore_extra_data 'true'), and characters not escaped (noescaping 'true'). gaussdb=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' WITH(format 'text', delimiter E'\t', ignore_extra_data 'true', noescaping 'true'); -- Copy data from the /home/omm/ds_ship_mode.dat file to the tpcds.ship_mode_t1 table, with the import format set to FIXED, fixed-length format specified (FORMATTER(SM_SHIP_MODE_SK(0, 2), SM_SHIP_MODE_ID(2,16), SM_TYPE(18,30), SM_CODE(50,10), SM_CARRIER(61,20), SM_CONTRACT(82,20))), excessive columns ignored (ignore_extra_data), and headers included (header). gaussdb=# COPY tpcds.ship_mode_t1 FROM '/home/omm/ds_ship_mode.dat' FIXED FORMATTER(SM_SHIP_MODE_SK(0, 2), SM_SHIP_MODE_ID(2,16), SM_TYPE(18,30), SM_CODE(50,10), SM_CARRIER(61,20), SM_CONTRACT(82,20)) header ignore_extra_data; -- Delete tables and the schema. gaussdb=# DROP TABLE tpcds.ship_mode; gaussdb=# DROP TABLE tpcds.ship_mode_t1; gaussdb=# DROP SCHEMA tpcds; |
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot