Scanning for Incremental Metadata

Identify the metadata that has changed in the source databases since the previous migration was completed, and synchronize the incremental metadata to Huawei Cloud DLI.

Prerequisites

A source connection has been created.
Target connections have been created.
At least one full metadata migration has been completed.

Preparations

Gain whitelist access to the Spark 3.3.1 feature.
Contact technical support to be added to the whitelist for the Spark 3.3.1 feature.
Configure a DLI job bucket.
You need to purchase a bucket or parallel file system on OBS. The bucket is used to store temporary data generated by DLI. For details, see Configuring a DLI Job Bucket.

Procedure

Sign in to the MgC console. In the navigation pane, under Project, select your big data migration project from the drop-down list.
In the navigation pane on the left, choose Migrate > Big Data Migration.
In the upper right corner of the page, click Create Migration Task.
Select MaxCompute for Source Component, Data Lake Insight (DLI) for Target Component, Incremental metadata scan for Task Type, and click Next.

Configure parameters required for creating an incremental metadata scan based on Table 1.

**Table 1** Parameters required for creating an incremental metadata scan
Area	Parameter	Configuration
Basic Settings	Task Name	The default name is Incremental-data-scan-of-MaxCompute-and-DLI-4 random characters (including letters and numbers). You can also customize a name.
Basic Settings	MgC Agent	Select the MgC Agent you connected to MgC in Preparations.
Source Settings	Source Connection	Select the source connection you created.
Source Settings	Time Range	If you select All, MgC will scan for the incremental metadata generated in the source databases since the last metadata migration. If you select Custom, select a T-N option to limit the scan scope to the incremental metadata generated within a specific time period (24 × N hours) before the task start time (T). Assume that you select T-1 and the task was executed at 14:50 on June 6, 2024. The system scans for incremental metadata generated from 14:50 on June 5, 2024 to 14:50 on June 6, 2024.
Source Settings	MaxCompute Parameters (Optional)	The parameters are optional and usually left blank. If needed, you can configure the parameters by referring to MaxCompute Documentation.
Data Scope	By database	Enter the names of databases to be scanned in the Include Databases text box. Click Add to add more entries. A maximum of 10 databases can be added. If there are tables you do not want to migrate, download the template in CSV format, add information about these tables to the template, and upload the template to MgC. For details, see steps 2 to 5.
Data Scope	By table	1. Download the template in CSV format. 2. Open the downloaded CSV template file with Notepad. CAUTION: Do not use Excel to edit the CSV template file. The template file edited and saved in Excel cannot be identified by MgC. 3. Retain the first line in the CSV template file. From the second line onwards, enter the information about tables to be migrated in the format of {MaxCompute project name},{Table name}. MaxCompute project name refers to the name of the MaxCompute project to be migrated. Table name refers to the data table to be migrated. NOTICE: Use commas (,) to separate the MaxCompute project name and the table name in each line. Do not use spaces or other separators. After adding the information about a table, press Enter to start a new line. 4. After all table information is added, save the changes to the CSV file. 5. Upload the edited and saved CSV file to MgC.
Target Settings	Target Connection	Select the DLI connection with a SQL queue created in Creating a Target Connection. CAUTION: Do not select a connection configured with a general queue.
Target Settings	Custom Parameters (Optional)	Configure the parameters as needed. For details, see Configuration parameter description and Custom Parameters.
Migration Settings	Concurrency	Set the number of concurrent migration subtasks. The default value is 3. The value ranges from 1 to 10.
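
The T-N arithmetic described for Time Range in Table 1 can be checked with a short calculation. The following is a minimal sketch, not part of MgC: the function name scan_window is hypothetical, and it only reproduces the "24 × N hours before the task start time" rule stated in the table.

```python
from datetime import datetime, timedelta
from typing import Tuple


def scan_window(task_start: datetime, n: int) -> Tuple[datetime, datetime]:
    """Return (window_start, window_end) for a T-N option.

    The window covers the 24 x N hours that end at the task start time (T).
    This is a hypothetical helper for illustration, not an MgC API.
    """
    if n < 1:
        raise ValueError("N must be a positive integer")
    return task_start - timedelta(hours=24 * n), task_start


# Example from Table 1: T-1 with a task executed at 14:50 on June 6, 2024.
start, end = scan_window(datetime(2024, 6, 6, 14, 50), 1)
print(start, "to", end)  # 2024-06-05 14:50:00 to 2024-06-06 14:50:00
```

With T-3 and the same start time, the same rule would give a window beginning at 14:50 on June 3, 2024.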

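Because the By table template in Table 1 must not be edited in Excel, one option is to generate its content with a script. The sketch below is an illustration under stated assumptions: build_table_list is a hypothetical helper, the header string and the project and table names are placeholders, and the real first line must be copied from the template downloaded from MgC. It only enforces the rules stated in the table: keep the first line, write one {MaxCompute project name},{Table name} pair per line, and use a comma with no spaces.

```python
from typing import Iterable, List, Tuple


def build_table_list(header_line: str, tables: Iterable[Tuple[str, str]]) -> str:
    """Return CSV text: the template's first line, then one 'project,table' line per table.

    Hypothetical helper for illustration; it is not provided by MgC.
    """
    lines: List[str] = [header_line.rstrip("\r\n")]  # retain the first line as-is
    for project, table in tables:
        project, table = project.strip(), table.strip()
        # Reject values that would break the "comma only, no spaces" rule.
        for value in (project, table):
            if not value or "," in value or any(ch.isspace() for ch in value):
                raise ValueError(f"invalid name: {value!r}")
        lines.append(f"{project},{table}")
    # End every line, including the last one, with a newline.
    return "\n".join(lines) + "\n"


# Placeholder header and names; copy the real first line from the downloaded template.
header = "project_name,table_name"
print(build_table_list(header, [("sales_prj", "orders"), ("sales_prj", "customers")]), end="")
```

The returned text can then be saved over the downloaded template in a plain text editor or with open(path, "w", newline=""), and the saved file uploaded to MgC as described in Table 1.
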
After the configuration is complete, execute the task.
- A migration task can be executed repeatedly. Each time a migration task is executed, a task execution is generated.
- You can click the task name to modify the task configuration.
- You can select Run immediately and click Save to create the task and execute it immediately. You can view the created task on the Tasks page.
- You can also click Save to just create the task. You can view the created task on the Tasks page. To execute the task, click Execute in the Operation column.
After the migration task is executed, click View Executions in the Operation column. On the Task Executions tab, you can view the details of the running task execution and all historical executions.

Click View in the Progress column. On the displayed Progress Details page, view and export the incremental metadata scan results.
In the upper right corner of the progress details page, click Open DDL Editor to compare and edit the structures of incremental tables.

