Importing Data Using LOAD DATA or LOAD XML
Scenarios
You can use the statement LOAD DATA FROM OBS or LOAD XML FROM OBS to load data from files stored in OBS buckets. The files are downloaded to local temporary storage in segments and then imported to the database. Once the import is complete, the local files are deleted.
Text files in OBS can be imported to TaurusDB single-node, cluster, and serverless instances.
Use LOAD DATA FROM OBS to load data from text files or a manifest file in an OBS bucket. Use SELECT INTO OUTFILE OBS to export and save data to an OBS bucket.
Use LOAD XML FROM OBS to load XML files that comply with the community-standard syntax and have been uploaded to OBS.
Supported Versions
The kernel version must be 2.0.75.260300 or later.
For details about how to check the kernel version, see How Can I Check the Version of a TaurusDB Instance?
Prerequisites
- The parameter rds_load_from_obs_enable must be set to ON. For details about how to set it, see Modifying Parameters of a DB Instance.
- The user that issues the load statement must have INSERT privileges on the database table into which data will be loaded. If the REPLACE parameter will be used, DELETE privileges on that database table are also required.
- You must grant the TaurusDB cluster permission to access OBS first.
- If you enable the function of loading data from OBS buckets using a Huawei Cloud account, no additional configuration is required. If you enable this function as an IAM user for the first time, you must have permission to create an agency.
- To load data from OBS buckets, you must have the following IAM permissions:
Table 1 IAM permissions and agencies IAM Policy
IAM Permissions
Agency
Role/Policy-based policy
- iam:agencies:listAgencies
- iam:agencies:createAgency
- iam:permissions:listRolesForAgencyOnProject
- iam:permissions:grantRoleToGroupOnProject
- iam:permissions:grantRoleToAgencyOnProject
- iam:roles:listRoles
- iam:roles:createRole
If you do not have these permissions, create a custom policy.
The system automatically creates a DbsObsLoadAgency agency. This agency is visible to you. Deleting the agency will result in failures to load data from OBS buckets.
Identity policy-based policy
iam:agencies:list
If you do not have these permissions, create a custom identity policy.
Create a DbsObsLoadAgency agency. If you have already created the agency, skip this step.
Constraints
| Item | Constraints |
|---|---|
| LOAD DATA |
|
| LOAD XML |
|
| Impact on performance | Importing and exporting data can affect performance of TaurusDB instances. Import data during off-peak hours. |
Creating a DbsObsLoadAgency Agency
- Log in to the IAM console.
- In the navigation pane, choose Agencies. On the displayed page, search for DbsObsLoadAgency.
- On the displayed page, set agency parameters. Figure 1 Creating an agency
- Agency Name: Enter DbsObsLoadAgency.
- Agency Type: Select Cloud service.
- Cloud Service: Search for op_svc_rds and select Database Service (DBS).
- Validity Period: Select Unlimited.
- Click Done.
- In the displayed dialog box, click Authorize.
- Search for DbsObsLoad, select it in the list, and click Next.

- Select Region-specific projects and the target region, and click OK.

LOAD DATA FROM OBS
- Syntax
LOAD DATA [FROM] OBS [FILE | MANIFEST] 'OBS-URI' [REPLACE | IGNORE] INTO TABLE tbl_name [PARTITION (partition_name,...)] [CHARACTER SET charset_name] [{FIELDS | COLUMNS} [TERMINATED BY 'string'] [[OPTIONALLY] ENCLOSED BY 'char'] [ESCAPED BY 'char'] ] [LINES [STARTING BY 'string'] [TERMINATED BY 'string'] ] [IGNORE number {LINES | ROWS}] [(col_name_or_user_var,...)] [SET col_name = expr,...]
The LOAD DATA FROM OBS statement uses the following required and optional parameters. You can find more details in the official MySQL documentation.
- FROM: optional.
- OBS: specifies that data will be loaded from OBS. If [FROM] OBS is used to load data from an OBS bucket, the LOCAL INFILE keyword cannot be used.
- FILE | MANIFEST: optional. The default value is FILE.
- FILE: loads data from a single file in OBS.
- MANIFEST: loads data from all files listed in a specified OBS manifest file.
A manifest file is a JSON file that lists the text files to be loaded to the database. Its format and content are as follows:
{ "entries": [ { "url": "obs://bucket/customermeta", "mandatory": true }, { "url": "obs://bucket/customerdata", "mandatory": true } ] }Each URL in the manifest file must be a complete object path with a bucket name and file name. The format is fixed to obs://bucket/object. mandatory indicates whether an error message is returned if any specified file cannot be found. The default value is true. Do not change it.
- OBS_URI
Specifies the URI of the text or manifest file to be imported. The format is fixed to obs://bucket/object. You can copy the URI of the object to be imported on the OBS console. For details, see Downloading an Object.
- bucket: the name of the OBS bucket where the specified text or manifest file is located.
- object: the name of the text, manifest, or XML file to be imported. If a manifest file is to be imported, the object name is object.manifest.
- REPLACE | IGNORE: determines what action to take for a new row if a primary key or unique key conflict occurs. This parameter is optional. The default value is IGNORE.
- REPLACE: The new row will replace the existing row if a conflict occurs.
- IGNORE: The new row will be discarded if a conflict occurs.
- PARTITION: optional. It specifies the partitions or subpartitions to which data is imported.
- CHARACTER SET: optional. It specifies the character set of the file to be imported. If this parameter is not specified, the character set specified by the system variable character_set_filesystem is used by default.
- FIELDS | COLUMNS: optional. It specifies how the fields in the file are delimited.
- TERMINATED BY: defines the delimiter between fields. Fields are tab-delimited (\t) by default.
- ENCLOSED BY: defines the character used to enclose field values. If OPTIONALLY is used, only character-type fields are enclosed. Otherwise, all fields are enclosed.
- ESCAPED BY: defines the escape character. The default value is a backslash (\).
- LINES: optional. It determines how the lines in the file are delimited.
- STARTING BY: defines the prefix string to be ignored in each line.
- TERMINATED BY: defines the line terminator. Lines are delimited by a newline character (\n) by default.
- IGNORE: optional. It specifies the number of lines to be ignored at the beginning of the file.
- col_name_or_user_var: optional. It specifies the mapping between a column in the file and a column or variable in the target table.
- SET: optional. It specifies the conversion or calculation logic to be applied to the imported column values.
- Examples
- Importing a single file
Load the result.txt file from an OBS bucket into table t in a TaurusDB database.
LOAD DATA FROM OBS 'obs://bucket/result.txt' INTO TABLE t;
- Importing a manifest file
Load the t.manifest file from an OBS bucket into table t in a TaurusDB database. This statement imports all data files listed in the manifest file to the table.
LOAD DATA FROM OBS MANIFEST 'obs://bucket/t.manifest' INTO TABLE t;
- Importing a single file with the specified character set, column delimiter, and line delimiter
- Importing a manifest file with the specified character set, column delimiter, escape character, line delimiter, and number of lines to ignore This statement replaces existing data with new data when a unique key conflict occurs. It also specifies utf8mb4 as the character set, a comma (,) as the field delimiter, and a backslash (\) as the escape character. Furthermore, it sets the line delimiter to a newline character (\n), ignores the first line of the file, and imports data only into the specified columns i1, t2, and c3.
LOAD DATA FROM OBS MANIFEST 'obs://bucket/tt.manifest' REPLACE INTO TABLE tt CHARACTER SET utf8mb4 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '""' ESCAPED BY '\\' LINES TERMINATED BY '\n' IGNORE 1 LINES (i1, t2, c3);
- Importing a single file
LOAD XML FROM OBS
- Syntax
LOAD XML FROM OBS 'OBS-URI' [REPLACE | IGNORE] INTO TABLE tbl_name [PARTITION (partition_name,...)] [CHARACTER SET charset_name] [ROWS IDENTIFIED BY '<element-name>'] [IGNORE number {LINES | ROWS}] [(field_name_or_user_var,...)] [SET col_name = expr,...]
LOAD XML FROM OBS uses the following mandatory and optional parameters. For more information, see the MySQL official documentation.
- OBS_URI: specifies the URI of the text file to be imported. The format is fixed to obs://bucket/object. You can copy the URI of the object to be imported on the OBS console. For details, see Downloading an Object.
- bucket: the name of the OBS bucket from which target text files are loaded.
- object: the name of the text or XML file to be imported. Manifest files are not supported.
- REPLACE | IGNORE: determines what action to take for a new row if a primary key or unique key conflict occurs. This parameter is optional. The default value is IGNORE.
- REPLACE: The new row will replace the existing row if a conflict occurs.
- IGNORE: The new row will be discarded if a conflict occurs.
- PARTITION: optional. It specifies the partitions or subpartitions to which data is imported.
- CHARACTER SET: optional. It specifies the character set of the file to be imported. If this parameter is not specified, the character set specified by the system variable character_set_filesystem is used by default.
- ROWS IDENTIFIED BY: optional. It specifies the row identifier. A tag and its subelements form a table record.
- IGNORE: optional. It specifies the number of lines to be ignored at the beginning of the file.
- field_name_or_user_var: optional. It specifies the mapping between a column in the file and a column or variable in the target table.
- SET: optional. It specifies the conversion or assignment logic to be applied to column values before they are written to the table.
- OBS_URI: specifies the URI of the text file to be imported. The format is fixed to obs://bucket/object. You can copy the URI of the object to be imported on the OBS console. For details, see Downloading an Object.
- Examples
- Importing an XML file
Load the test.xml file from an OBS bucket into table t in a TaurusDB database.
LOAD XML FROM OBS 'obs://bucket/test.xml' INTO TABLE t;
- Importing an XML file with the specified character set, field delimiter, and line delimiter
LOAD XML FROM OBS FILE 'obs://bucket/test//tt.csv' INTO TABLE tt CHARACTER SET utf8mb4 FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
- Importing an XML file
Common Errors
The following table lists OBS error codes that may appear in error messages when you import a text file.
| Error Code | Error Message | Measures |
|---|---|---|
| HttpErrorNotFound | Object not found. | Check whether the object exists in the OBS bucket. |
| InvalidAccessKeyId | Invalid AK. | The SQL statement execution takes an extended period, causing the temporary AK to expire. Avoid exporting a large amount of data at a time. |
For more information, see OBS Error Codes.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot