Parallel Data Import
GaussDB provides a parallel data import function that enables a large amount of data to be imported in a fast and efficient manner. This section describes parameters for importing data to GaussDB in parallel.
raise_errors_if_no_files
Parameter description: Specifies whether to distinguish between the cases "the imported file contains no records" and "the imported file does not exist." If this parameter is set to TRUE and the imported file does not exist, GaussDB reports the error message "file does not exist."
This is a SUSET parameter. Set it based on instructions provided in Table 1.
Value range: Boolean
- TRUE indicates that the cases "the imported file contains no records" and "the imported file does not exist" are distinguished when files are imported.
- FALSE indicates that the cases "the imported file contains no records" and "the imported file does not exist" are not distinguished when files are imported.
Default value: FALSE
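The following Python sketch models the distinction described for raise_errors_if_no_files. It is an illustration only, not GaussDB code: the function name import_records is hypothetical, and the assumption that a missing file yields zero records when the parameter is FALSE is an interpretation of "not distinguished" rather than documented behavior.

```python
import os


def import_records(path: str, raise_errors_if_no_files: bool = False) -> list:
    """Return the records in `path`, one per non-blank line (hypothetical model)."""
    if not os.path.exists(path):
        if raise_errors_if_no_files:
            # TRUE: a missing file is reported as its own error.
            raise FileNotFoundError("file does not exist")
        # FALSE (assumption): a missing file looks the same as an empty one.
        return []
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

With the flag off, a caller cannot tell a missing file from an empty one because both return an empty list. With the flag on, only the missing file raises an error.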
gds_debug_mod
Parameter description: Specifies whether to enable the debug function of Gauss Data Service (GDS), which helps locate and analyze GDS faults. When the debug function is enabled, the types of packets that GDS receives or sends, the peer end of GDS during command interaction, and other GDS interaction information are written to the logs of the corresponding nodes in the cluster. This records the state transitions of the GaussDB state machine and its current state. Enabling this function consumes additional log I/O resources, which affects log performance and validity. Enable it only when locating GDS faults.
This is a USERSET parameter. Set it based on instructions provided in Table 1.
Value range:
- on indicates that the GDS debug function is enabled.
- off indicates that the GDS debug function is disabled.
Default value: off
safe_data_path
Parameter description: Specifies the path prefix to which all users except the initial user are restricted. Currently, the path prefix restriction applies to the COPY operation and advanced packages.
This is a SIGHUP parameter. Set it based on instructions provided in Table 1.
Value range: a string of fewer than 4096 characters
Default value: NULL
- If a soft link file exists in the safe_data_path directory, the system processes the file based on the actual file path to which the soft link points. If the actual path is not in the safe_data_path directory, an error is reported.
- If a hard link file exists in the safe_data_path directory, it can be used properly. For security purposes, exercise caution when using hard link files. Do not create, in the safe_data_path directory, hard links to files that reside in other directories. Ensure that the permission on the safe_data_path directory is minimized.
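The soft link and hard link rules above can be illustrated with a short Python sketch. It is a minimal model under stated assumptions, not GaussDB's implementation: the helper names is_within_safe_path and demo and all file names are hypothetical. The check resolves soft links to their real path before comparing prefixes.

```python
import os
import tempfile


def is_within_safe_path(candidate: str, safe_data_path: str) -> bool:
    """Return True if `candidate`, after resolving soft links, lies under `safe_data_path`."""
    real_candidate = os.path.realpath(candidate)
    real_prefix = os.path.realpath(safe_data_path)
    return os.path.commonpath([real_candidate, real_prefix]) == real_prefix


def demo() -> dict:
    """Build a throwaway directory tree and check four kinds of path."""
    with tempfile.TemporaryDirectory() as root:
        safe_dir = os.path.join(root, "safe")
        other_dir = os.path.join(root, "other")
        os.mkdir(safe_dir)
        os.mkdir(other_dir)

        inside = os.path.join(safe_dir, "data.csv")
        outside = os.path.join(other_dir, "secret.csv")
        for path in (inside, outside):
            with open(path, "w") as handle:
                handle.write("1,2,3\n")

        # Soft link stored in safe_dir but pointing outside it.
        soft = os.path.join(safe_dir, "soft.csv")
        os.symlink(outside, soft)

        # Hard link stored in safe_dir to a file created outside it.
        hard = os.path.join(safe_dir, "hard.csv")
        os.link(outside, hard)

        return {
            "inside": is_within_safe_path(inside, safe_dir),
            "outside": is_within_safe_path(outside, safe_dir),
            "soft_link_to_outside": is_within_safe_path(soft, safe_dir),
            "hard_link_to_outside": is_within_safe_path(hard, safe_dir),
        }
```

In this model the soft link is rejected because it resolves to a path outside the prefix. The hard link passes because a hard link is an ordinary directory entry with no separate target path to resolve, which is why the note above advises caution with hard links.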
enable_copy_server_files
Parameter description: Specifies whether to enable the permission to copy server files.
This is a SIGHUP parameter. Set it based on instructions provided in Table 1.
Value range: Boolean
- on indicates that the permission to copy server files is enabled.
- off indicates that the permission to copy server files is disabled.
Default value: off
When the enable_copy_server_files parameter is disabled, only the initial user is allowed to run the COPY FROM FILENAME or COPY TO FILENAME statement. When it is enabled, users with the SYSADMIN permission and users who inherit the permissions of the built-in role gs_role_copy_files are also allowed to run these statements.
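The access rule above can be summarized as a small decision function. The Python sketch below is only a model of the rule as described. The function and argument names are hypothetical, and it assumes that the initial user remains allowed when the parameter is enabled.

```python
def can_copy_server_files(
    enable_copy_server_files: bool,
    is_initial_user: bool,
    is_sysadmin: bool = False,
    has_gs_role_copy_files: bool = False,
) -> bool:
    """Model whether a user may run COPY FROM FILENAME or COPY TO FILENAME."""
    if is_initial_user:
        # Assumption: the initial user is allowed regardless of the setting.
        return True
    if not enable_copy_server_files:
        # Parameter off: nobody except the initial user.
        return False
    # Parameter on: SYSADMIN users and inheritors of gs_role_copy_files.
    return is_sysadmin or has_gs_role_copy_files
```

For example, can_copy_server_files(False, is_initial_user=False, is_sysadmin=True) evaluates to False, because SYSADMIN users gain access only after the parameter is enabled.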
support_binary_copy_version
Parameter description: Specifies whether the encoding information of the current database server is included when data is exported in BINARY mode using COPY TO.
Parameter type: string
Unit: none
Value range: '' and header_encoding.
Default value: header_encoding
Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.
Setting suggestion: Retain the default value. If forward compatibility is required, leave this parameter empty.
Configuration Item | Behavior |
---|---|
header_encoding | When the binary mode of COPY TO is used for export, the binary file header contains the encoding information of the current database server. |
Empty | Forward compatibility configuration is performed and data is exported in the original binary format. |
copy_special_character_version
Parameter description: Determines the processing of invalid characters during data import and export using COPY.
Parameter type: string
Unit: none
Value range: '', no_error, and per_byte.
Default value: ''
Setting method: This is a USERSET parameter. Set it based on instructions provided in Table 1.
If you connect to the database using gsql and set the parameter with the SET command, the value is case-insensitive. If you set it with gs_guc, the value must be lowercase.
Setting suggestion: none
Configuration Item | Behavior |
---|---|
no_error | When COPY is used to import a data file with the same encoding as that on the server, fault tolerance is performed on the data that does not meet the encoding requirements in the data file. The data with the original codes is inserted into the table. |
per_byte | Determines how to process files encoded in GBK or ZHS16GBK when COPY is used to export text files. After the parameter is set to per_byte, one byte of data is exported at a time. Otherwise, two bytes of data are exported at a time. (One character occupies two bytes if data is encoded in GBK.) |
Empty | The default value, which does not affect any function. Forward compatibility is supported. That is, an error is reported when invalid characters are found during COPY. |
- To ensure that the data to be imported is valid, its encoding must be validated when it is being copied. If this parameter is enabled, verification against invalid encoding is masked, which can leave invalid characters in the field. Therefore, exercise caution before enabling this parameter.
- Currently, encoding verification is masked only when the server encoding is the same as the data encoding. If the data encoding is not specified, the client encoding is used by default.
- To record fields that contain invalid encoding, you are advised to use the LOG ERRORS or LOG ERRORS DATA parameter in the COPY syntax.
- In binary mode, setting copy_special_character_version to 'no_error' takes effect only for fields of the TEXT, CHAR, VARCHAR, NVARCHAR2, or CLOB type.
- This parameter is valid only in the database with character sets encoded in UTF-8, GB18030, GB18030_2022, ZHS16GBK, or LATIN1.
- When the encoding of both the client and server is GBK or ZHS16GBK and the database contains data encoded in an invalid format, if copy_special_character_version is not set to per_byte, the exported data file may contain unexpected data.
- If copy_special_character_version is set to no_error, it cannot be used together with the COMPATIBLE_ILLEGAL_CHARS parameter in COPY.
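The difference between the empty value and no_error during import can be sketched in Python. This is an illustrative model, not GaussDB's implementation. The function import_text_field and its arguments are hypothetical, and Python codecs stand in for the database encodings. It reflects two points stated above: invalid characters cause an error by default, and verification is masked only when the data encoding matches the server encoding.

```python
import codecs


def import_text_field(
    raw: bytes,
    file_encoding: str,
    server_encoding: str,
    copy_special_character_version: str = "",
) -> bytes:
    """Return the bytes stored for one text field (hypothetical model)."""
    same_encoding = (
        codecs.lookup(file_encoding).name == codecs.lookup(server_encoding).name
    )
    if copy_special_character_version == "no_error" and same_encoding:
        # Verification is masked: the original bytes are stored unchanged,
        # even if they are not valid in the server encoding.
        return raw
    # Otherwise the data is validated; invalid bytes raise UnicodeDecodeError.
    text = raw.decode(file_encoding)
    return text.encode(server_encoding)
```

Calling import_text_field(b"abc\xff", "utf-8", "utf-8") raises UnicodeDecodeError, whereas passing "no_error" as the last argument returns the original bytes, which is why the notes above recommend caution before enabling the parameter.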