Help Center/ TaurusDB/ User Guide/ Data Migration/ Importing Data Using LOAD DATA or LOAD XML
Updated on 2026-06-04 GMT+08:00

Importing Data Using LOAD DATA or LOAD XML

Scenarios

You can use the statement LOAD DATA FROM OBS or LOAD XML FROM OBS to load data from files stored in OBS buckets. The files are downloaded to local temporary storage in segments and then imported to the database. Once the import is complete, the local files are deleted.

Text files in OBS can be imported to TaurusDB single-node, cluster, and serverless instances.

Use LOAD DATA FROM OBS to load data from text files or a manifest file in an OBS bucket. Use SELECT INTO OUTFILE OBS to export and save data to an OBS bucket.

Use LOAD XML FROM OBS to load XML files that comply with the community-standard syntax and have been uploaded to OBS.

Supported Versions

The kernel version must be 2.0.75.260300 or later.

For details about how to check the kernel version, see How Can I Check the Version of a TaurusDB Instance?

Prerequisites

  • The parameter rds_load_from_obs_enable must be set to ON. For details about how to set it, see Modifying Parameters of a DB Instance.
  • The user that issues the load statement must have INSERT privileges on the database table into which data will be loaded. If the REPLACE parameter will be used, DELETE privileges on that database table are also required.
  • You must grant the TaurusDB cluster permission to access OBS first.
    • If you enable the function of loading data from OBS buckets using a Huawei Cloud account, no additional configuration is required. If you enable this function as an IAM user for the first time, you must have permission to create an agency.
    • To load data from OBS buckets, you must have the following IAM permissions:
      Table 1 IAM permissions and agencies

      IAM Policy

      IAM Permissions

      Agency

      Role/Policy-based policy

      • iam:agencies:listAgencies
      • iam:agencies:createAgency
      • iam:permissions:listRolesForAgencyOnProject
      • iam:permissions:grantRoleToGroupOnProject
      • iam:permissions:grantRoleToAgencyOnProject
      • iam:roles:listRoles
      • iam:roles:createRole

      If you do not have these permissions, create a custom policy.

      The system automatically creates a DbsObsLoadAgency agency. This agency is visible to you. Deleting the agency will result in failures to load data from OBS buckets.

      Identity policy-based policy

      iam:agencies:list

      If you do not have these permissions, create a custom identity policy.

      Create a DbsObsLoadAgency agency. If you have already created the agency, skip this step.

Constraints

Table 2 Constraints

Item

Constraints

LOAD DATA

  • Do not initiate multiple LOAD DATA requests at the same time. When multiple LOAD DATA requests are initiated, SQL transactions may time out due to highly concurrent write operations, table locking, and system I/O occupation, causing all the LOAD DATA requests to fail.
  • When importing a table containing virtual columns using LOAD DATA, enable strict SQL mode by setting sql_mode to STRICT_TRANS_TABLES. Otherwise, data precision loss or overflow may occur in the virtual columns. For details about how to modify parameters, see Modifying Parameters of a DB Instance.
  • Do not import XML files using LOAD DATA. Otherwise, data may be disordered.

LOAD XML

  • Only parameters in the LOAD XML syntax tree are supported.
  • When importing a table containing virtual columns using LOAD XML, set sql_mode to enable strict SQL mode. Otherwise, data precision loss or overflow may occur in the virtual columns.
  • LOAD XML cannot be used to import data to views based on partitioned tables.
  • Only XML files can be imported.

Impact on performance

Importing and exporting data can affect performance of TaurusDB instances. Import data during off-peak hours.

Creating a DbsObsLoadAgency Agency

  1. Log in to the IAM console.
  2. In the navigation pane, choose Agencies. On the displayed page, search for DbsObsLoadAgency.

    • If the agency does not exist, go to 3.
    • If the agency already exists, click Authorize in the Operation column and go to 6.

  3. On the displayed page, set agency parameters.

    Figure 1 Creating an agency

    • Agency Name: Enter DbsObsLoadAgency.
    • Agency Type: Select Cloud service.
    • Cloud Service: Search for op_svc_rds and select Database Service (DBS).
    • Validity Period: Select Unlimited.

  4. Click Done.
  5. In the displayed dialog box, click Authorize.
  6. Search for DbsObsLoad, select it in the list, and click Next.

  7. Select Region-specific projects and the target region, and click OK.

LOAD DATA FROM OBS

  • Syntax
    LOAD DATA [FROM] OBS [FILE | MANIFEST] 'OBS-URI'
        [REPLACE | IGNORE]
        INTO TABLE tbl_name
        [PARTITION (partition_name,...)]
        [CHARACTER SET charset_name]
        [{FIELDS | COLUMNS}
            [TERMINATED BY 'string']
            [[OPTIONALLY] ENCLOSED BY 'char']
            [ESCAPED BY 'char']
        ]
        [LINES
            [STARTING BY 'string']
            [TERMINATED BY 'string']
        ]
        [IGNORE number {LINES | ROWS}]
        [(col_name_or_user_var,...)]
        [SET col_name = expr,...]

    The LOAD DATA FROM OBS statement uses the following required and optional parameters. You can find more details in the official MySQL documentation.

    • FROM: optional.
    • OBS: specifies that data will be loaded from OBS. If [FROM] OBS is used to load data from an OBS bucket, the LOCAL INFILE keyword cannot be used.
    • FILE | MANIFEST: optional. The default value is FILE.
      • FILE: loads data from a single file in OBS.
      • MANIFEST: loads data from all files listed in a specified OBS manifest file.

        A manifest file is a JSON file that lists the text files to be loaded to the database. Its format and content are as follows:

        {
            "entries": [
                {
                    "url": "obs://bucket/customermeta",
                    "mandatory": true
                },
                {
                    "url": "obs://bucket/customerdata",
                    "mandatory": true
                }
            ]
        }

        Each URL in the manifest file must be a complete object path with a bucket name and file name. The format is fixed to obs://bucket/object. mandatory indicates whether an error message is returned if any specified file cannot be found. The default value is true. Do not change it.

      • OBS_URI

        Specifies the URI of the text or manifest file to be imported. The format is fixed to obs://bucket/object. You can copy the URI of the object to be imported on the OBS console. For details, see Downloading an Object.

        • bucket: the name of the OBS bucket where the specified text or manifest file is located.
        • object: the name of the text, manifest, or XML file to be imported. If a manifest file is to be imported, the object name is object.manifest.
      • REPLACE | IGNORE: determines what action to take for a new row if a primary key or unique key conflict occurs. This parameter is optional. The default value is IGNORE.
        • REPLACE: The new row will replace the existing row if a conflict occurs.
        • IGNORE: The new row will be discarded if a conflict occurs.
      • PARTITION: optional. It specifies the partitions or subpartitions to which data is imported.
      • CHARACTER SET: optional. It specifies the character set of the file to be imported. If this parameter is not specified, the character set specified by the system variable character_set_filesystem is used by default.
      • FIELDS | COLUMNS: optional. It specifies how the fields in the file are delimited.
        • TERMINATED BY: defines the delimiter between fields. Fields are tab-delimited (\t) by default.
        • ENCLOSED BY: defines the character used to enclose field values. If OPTIONALLY is used, only character-type fields are enclosed. Otherwise, all fields are enclosed.
        • ESCAPED BY: defines the escape character. The default value is a backslash (\).
      • LINES: optional. It determines how the lines in the file are delimited.
        • STARTING BY: defines the prefix string to be ignored in each line.
        • TERMINATED BY: defines the line terminator. Lines are delimited by a newline character (\n) by default.
      • IGNORE: optional. It specifies the number of lines to be ignored at the beginning of the file.
      • col_name_or_user_var: optional. It specifies the mapping between a column in the file and a column or variable in the target table.
      • SET: optional. It specifies the conversion or calculation logic to be applied to the imported column values.
  • Examples
    • Importing a single file

      Load the result.txt file from an OBS bucket into table t in a TaurusDB database.

      LOAD DATA FROM OBS 'obs://bucket/result.txt' INTO TABLE t;
    • Importing a manifest file

      Load the t.manifest file from an OBS bucket into table t in a TaurusDB database. This statement imports all data files listed in the manifest file to the table.

      LOAD DATA FROM OBS MANIFEST 'obs://bucket/t.manifest' INTO TABLE t;
    • Importing a single file with the specified character set, column delimiter, and line delimiter
      In this statement, fields are separated by commas (,) and lines are separated by newline characters.
      LOAD DATA FROM OBS FILE 'obs://bucket/test/tt.csv' INTO TABLE tt CHARACTER SET utf8mb4 FIELDS TERMINATED BY ','  LINES TERMINATED BY '\n';
    • Importing a manifest file with the specified character set, column delimiter, escape character, line delimiter, and number of lines to ignore
      This statement replaces existing data with new data when a unique key conflict occurs. It also specifies utf8mb4 as the character set, a comma (,) as the field delimiter, and a backslash (\) as the escape character. Furthermore, it sets the line delimiter to a newline character (\n), ignores the first line of the file, and imports data only into the specified columns i1, t2, and c3.
      LOAD DATA FROM OBS MANIFEST 'obs://bucket/tt.manifest'
      REPLACE INTO TABLE tt
      CHARACTER SET utf8mb4
      FIELDS TERMINATED BY ',' 
      OPTIONALLY ENCLOSED BY '""'
      ESCAPED BY '\\'
      LINES TERMINATED BY '\n'
      IGNORE 1 LINES
      (i1, t2, c3);

LOAD XML FROM OBS

  • Syntax
    LOAD XML FROM OBS 'OBS-URI'
        [REPLACE | IGNORE]
        INTO TABLE tbl_name
        [PARTITION (partition_name,...)]
        [CHARACTER SET charset_name]
        [ROWS IDENTIFIED BY '<element-name>']
        [IGNORE number {LINES | ROWS}]
        [(field_name_or_user_var,...)]
        [SET col_name = expr,...]

    LOAD XML FROM OBS uses the following mandatory and optional parameters. For more information, see the MySQL official documentation.

    • OBS_URI: specifies the URI of the text file to be imported. The format is fixed to obs://bucket/object. You can copy the URI of the object to be imported on the OBS console. For details, see Downloading an Object.
      • bucket: the name of the OBS bucket from which target text files are loaded.
      • object: the name of the text or XML file to be imported. Manifest files are not supported.
    • REPLACE | IGNORE: determines what action to take for a new row if a primary key or unique key conflict occurs. This parameter is optional. The default value is IGNORE.
      • REPLACE: The new row will replace the existing row if a conflict occurs.
      • IGNORE: The new row will be discarded if a conflict occurs.
    • PARTITION: optional. It specifies the partitions or subpartitions to which data is imported.
    • CHARACTER SET: optional. It specifies the character set of the file to be imported. If this parameter is not specified, the character set specified by the system variable character_set_filesystem is used by default.
    • ROWS IDENTIFIED BY: optional. It specifies the row identifier. A tag and its subelements form a table record.
    • IGNORE: optional. It specifies the number of lines to be ignored at the beginning of the file.
    • field_name_or_user_var: optional. It specifies the mapping between a column in the file and a column or variable in the target table.
    • SET: optional. It specifies the conversion or assignment logic to be applied to column values before they are written to the table.
  • Examples
    • Importing an XML file

      Load the test.xml file from an OBS bucket into table t in a TaurusDB database.

      LOAD XML FROM OBS 'obs://bucket/test.xml' INTO TABLE t;
    • Importing an XML file with the specified character set, field delimiter, and line delimiter
      LOAD XML FROM OBS FILE 'obs://bucket/test//tt.csv' INTO TABLE tt CHARACTER SET utf8mb4 FIELDS TERMINATED BY ','  LINES TERMINATED BY '\n';

Common Errors

The following table lists OBS error codes that may appear in error messages when you import a text file.

Table 3 OBS error codes

Error Code

Error Message

Measures

HttpErrorNotFound

Object not found.

Check whether the object exists in the OBS bucket.

InvalidAccessKeyId

Invalid AK.

The SQL statement execution takes an extended period, causing the temporary AK to expire. Avoid exporting a large amount of data at a time.

For more information, see OBS Error Codes.