Updated on 2024-02-02 GMT+08:00

Migrating Metadata

Scenario

Migrate external metadata to LakeFormation and store the data in OBS for unified management.

Prerequisites

  • The current instance has been interconnected with the service whose metadata is to be migrated.
  • A catalog for storing migration metadata has been created for the current instance.
  • The target user has the permission to perform operations on OBS and the catalog for storing migration metadata.
  • You have created an OBS parallel file system for storing migrated data.
  • The name of the table owner can contain 1 to 49 characters, including only letters, digits, and underscores (_). The value cannot contain other characters such as hyphens (-).
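
The owner naming rule above can be checked locally before a task is created. The following is a minimal sketch, not part of LakeFormation: the function name is_valid_owner_name is hypothetical, and "letters" is assumed to mean ASCII letters.

```python
import re

# Pattern derived from the stated rule: 1 to 49 characters,
# letters, digits, and underscores only.
# Assumption: "letters" means ASCII letters (A-Z, a-z).
OWNER_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,49}")


def is_valid_owner_name(name: str) -> bool:
    """Return True if name satisfies the table owner naming rule."""
    return OWNER_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical owner names used only for illustration.
    for candidate in ("etl_user_01", "etl-user", "a" * 50):
        print(len(candidate), is_valid_owner_name(candidate))
```

Running such a check against the owner names in the source metadata before the migration can help identify names containing hyphens or other unsupported characters early.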

Procedure

  1. Log in to the management console.
  2. In the upper left corner, click the service list icon and choose Analytics > LakeFormation to access the LakeFormation console.
  3. Select the target LakeFormation instance from the drop-down list on the left and choose Tasks > Metadata Migration in the navigation pane.
  4. Click Create Migration Task, set related parameters, and click Submit.

    Table 1 Creating a metadata migration task

    Parameter

    Description

    Task Name

    Name of the metadata migration task.

    Description

    Description of the created migration task.

    Data Source

    Type of the data to be migrated. The options are as follows:

    • DLF
    • MRS RDS for MySQL
    • OpenSource HiveMetastore for MySQL
    • MRS RDS for PostgreSQL
    • MRS LOCAL GaussDB

    JDBC URL

    JDBC URL of the metadata database to be migrated, with ?useSSL=false&permitMysqlScheme appended.

    NOTE:

    Some examples are as follows:

    • JDBC URL of the MySQL data source type: jdbc:mysql://IP address:Port number/Database name?useSSL=false&permitMysqlScheme
    • JDBC URL of the PostgreSQL data source type: jdbc:postgresql://IP address:Port number/Database name?useSSL=false&permitMysqlScheme

    Set this parameter when Data Source is not set to DLF. In that case, also set the following parameters:

    • Username: username for accessing the data source.
    • Password: password for accessing the data source.

      If the user has a password, this parameter is mandatory. Otherwise, leave this parameter blank.

    Access Point

    Access point of the metadata service to be migrated.

    This parameter is displayed when Data Source is set to DLF. In addition, you need to set the following parameters:

    • Access Key: Obtain the AK from DLF O&M personnel.
    • Secret Key: Obtain the SK from DLF O&M personnel.

    Source Catalog

    Name of the catalog to which the metadata to be migrated belongs.

    Target Catalog

    Name of the catalog to which metadata is migrated in LakeFormation.

    Conflict Resolution

    Policy for resolving conflicts during migration.

    Currently, only Update old metadata is supported.

    Default Owner

    Default owner of metadata after migration.

    • If the configured default owner does not have the corresponding metadata operation permission, the migrated metadata cannot be added, deleted, modified, or queried. In this case, you can grant permissions to the owner or migrate permissions.
    • If all metadata can be used properly before the migration, you do not need to set this parameter.

    Log Path

    Storage location of logs generated during migration. Click to select a path.

    The path must already exist in OBS. If you enter a custom path that does not exist, the migration task will fail.

    Force Table Creation

    Selecting this option will bypass OBS path restrictions when creating an internal table.

    Metadata Objects to Migrate

    Select the metadata objects to be migrated.

    • All: Migrate databases, functions, data tables, and partitions.
    • Database: Migrate databases.
    • Function: Migrate functions.
    • Table: Migrate tables.
    • Partition: Migrate partitions.
      NOTE:
      • Select All to migrate all metadata for the first migration task.
      • If All is not selected, ensure that the parent object of the selected metadata already exists in the target catalog. For example, before setting this parameter to Table, ensure that the target catalog contains the database (for example, DB_1) where the tables are located. Otherwise, the table migration will fail.
      • If you set this parameter to Function, ensure that the function class name exists. Otherwise, the function migration may fail.

    Add Location Rule

    • If the prefix of the metadata storage path is not obs://, click Add Location Rule to replace the prefix with obs:// and ensure that the corresponding OBS storage path exists.

      For example, if the current metadata storage path is file:/a/b, set Original Path to file:/ and New Path to obs://. The new metadata storage path is then obs://a/b. Ensure that the obs://a/b path exists in the OBS parallel file system.

    • You can create multiple rules at the same time. If rules conflict, the rule nearest the top of the page takes precedence.
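
The effect of location rules can be previewed locally before a task is submitted. The sketch below is an illustration only and is not the service's implementation: it assumes that a rule is a plain prefix replacement and that rules are evaluated top to bottom with the first match winning. The function names are hypothetical. The validation pattern is quoted from the storageDescriptor.location error message that appears in the task logs.

```python
import re
from typing import List, Tuple

# Pattern quoted from the 'storageDescriptor.location' error message in the logs.
LOCATION_PATTERN = re.compile(r"^(obs|har)://.+/.+$")


def apply_location_rules(path: str, rules: List[Tuple[str, str]]) -> str:
    """Rewrite path with the first rule whose Original Path prefix matches.

    rules is an ordered list of (original_prefix, new_prefix) pairs,
    listed in the same order as on the page (top rule first).
    Assumption: a rule is a simple prefix replacement.
    """
    for original_prefix, new_prefix in rules:
        if path.startswith(original_prefix):
            return new_prefix + path[len(original_prefix):]
    # No rule matched: the path is left unchanged.
    return path


def is_accepted_location(path: str) -> bool:
    """Return True if path matches the pattern quoted in the error message."""
    return LOCATION_PATTERN.match(path) is not None


if __name__ == "__main__":
    # Example from the table: Original Path file:/ and New Path obs://
    rules = [("file:/", "obs://")]
    new_path = apply_location_rules("file:/a/b", rules)
    print(new_path, is_accepted_location(new_path))
```

The check only confirms that the rewritten path has the expected form. Whether the path actually exists in the OBS parallel file system still has to be verified separately.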

  5. Click Start in the Operation column to run the migration task.

    • Before running a migration task, you need to authorize the task by referring to Granting the Job Management Permission.
    • After the migration task starts, if new metadata is added to the source database, the new metadata will not be migrated. You need to run the migration task again to migrate the new metadata.
    • If the task fails to be executed, you can click Start in the Operation column to retry after rectifying the fault.

    To view a metadata object after the migration, click Metadata in the navigation pane and click the name of the target metadata object. For example, choose Metadata > Database to view the migrated database.

    Click Edit or Delete in the Operation column to modify or delete a task.

  6. Click View Log in the Operation column to view the logs generated during task running.

    By default, the latest 50 lines of logs are displayed. You can click the hyperlink at the bottom of the log to view the complete log.

    The following table lists some error messages in logs and their causes.

    Error Message

    Cause

    field 'storageDescriptor.location' must match '^(obs|har)://.+/.+$'

    An incorrect location rule is configured. (The metadata storage path should start with obs://.)

    Invalid input parameter

    The input parameter of the metadata is invalid or LakeFormation does not support such metadata.

    Incorrect type of column xxx.

    The column type is invalid or LakeFormation is incompatible with the column type.

    No permission to perform this operation on resources.

    The default owner is incorrectly configured or the owner does not have the metadata operation permission.
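
When a downloaded log is long, the error messages listed above can be matched against it with a small script. The following is a rough sketch under stated assumptions: the log is plain text with one entry per line, each error message appears verbatim within a line, and the names KNOWN_ERRORS and diagnose are hypothetical. The actual log format may differ.

```python
from typing import Iterable, List, Tuple

# (message fragment, cause) pairs taken from the error table above.
KNOWN_ERRORS: List[Tuple[str, str]] = [
    ("field 'storageDescriptor.location' must match",
     "An incorrect location rule is configured. "
     "The metadata storage path should start with obs://."),
    ("Invalid input parameter",
     "The input parameter of the metadata is invalid or "
     "LakeFormation does not support such metadata."),
    ("Incorrect type of column",
     "The column type is invalid or LakeFormation is incompatible "
     "with the column type."),
    ("No permission to perform this operation on resources",
     "The default owner is incorrectly configured or the owner does "
     "not have the metadata operation permission."),
]


def diagnose(log_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (log line, cause) pairs for lines containing a known error."""
    findings = []
    for line in log_lines:
        for fragment, cause in KNOWN_ERRORS:
            if fragment in line:
                findings.append((line.strip(), cause))
                break  # report at most one cause per line
    return findings


if __name__ == "__main__":
    # Hypothetical log lines used only for illustration.
    sample = [
        "ERROR Incorrect type of column c1.",
        "INFO table t1 migrated",
    ]
    for line, cause in diagnose(sample):
        print(line, "->", cause)
```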