Updated on 2025-02-17 GMT+08:00

Migrating Data from Multiple Source Buckets by Prefix

This section describes how to filter objects for migration in source buckets using prefixes and migrate the objects to Huawei Cloud OBS buckets.

Preparations

  • Prepare a HUAWEI ID or an IAM user that can access MgC. For details, see Preparations.
  • Create an application migration project on the MgC console.
  • Add the AK/SK pair used for accessing the source cloud platform to MgC. The AK/SK pair will be used to collect details about source buckets. For more information, see Adding Resource Credentials.
  • Ensure that the source and target accounts have the permissions required for the migration. For details, see How Do I Obtain Required Permissions for the Source and Target Accounts?
  • On Huawei Cloud, create an OBS bucket for receiving migrated data. For details, see Creating a Bucket. You can also use an existing bucket.
  • Create a prefix list for each source bucket to be migrated. A prefix list must meet the following requirements:
    • The list must be in .txt format and the file size cannot exceed 2 MB.
    • Each line in the file can only contain one prefix, and each prefix cannot be longer than 1,024 characters.
    • A maximum of 1,000 prefixes can be contained in a file.
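
The prefix list requirements above can be checked locally before you upload a file. The following is a minimal sketch, not part of MgC: the function name is hypothetical, and it assumes UTF-8 encoding, that 2 MB means 2 × 1024 × 1024 bytes, and that blank lines are ignored.

```python
import os

MAX_FILE_BYTES = 2 * 1024 * 1024   # 2 MB file size limit (binary megabytes assumed)
MAX_PREFIX_CHARS = 1024            # maximum length of one prefix
MAX_PREFIXES = 1000                # maximum number of prefixes per file


def validate_prefix_list(path):
    """Return a list of problems found in a prefix list file (empty if none)."""
    problems = []
    if not path.lower().endswith(".txt"):
        problems.append("file must be in .txt format")
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 2 MB")
    with open(path, encoding="utf-8") as f:
        # One prefix per line; line endings are stripped.
        prefixes = [line.rstrip("\r\n") for line in f]
    # Assumption: blank lines do not count as prefixes.
    prefixes = [p for p in prefixes if p]
    if len(prefixes) > MAX_PREFIXES:
        problems.append(f"{len(prefixes)} prefixes found, at most {MAX_PREFIXES} allowed")
    for number, prefix in enumerate(prefixes, start=1):
        if len(prefix) > MAX_PREFIX_CHARS:
            problems.append(f"prefix {number} is longer than {MAX_PREFIX_CHARS} characters")
    return problems
```

Calling validate_prefix_list on a file that meets all three requirements returns an empty list. Otherwise, each entry in the returned list names one violated requirement.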

Precautions

  • Supported regions
    You can use MgC to migrate object data in batches to the following regions. To migrate to other regions, use RDA or other solutions.
    • LA-Santiago
    • LA-Sao Paulo
    • TR-Istanbul
    • AP-Bangkok
    • AP-Singapore
    • AP-Jakarta
    • ME-Riyadh
    • CN North-Beijing4
    • CN East-Shanghai1
  • Migration over intranets

    Data can be migrated between buckets in the same region over the intranet.

  • Symbolic link processing
    MgC cannot migrate symbolic links. If the path you want to migrate is the target of a symbolic link, you need to:
    • Enter the actual path to be migrated when creating a migration workflow.
    • After the migration is complete, manually create a symbolic link to the path at the target.

Step 1: Discovering Source Buckets

  1. Sign in to the MgC console. In the navigation pane, under Project, select your application migration project from the drop-down list.
  2. In the navigation pane, choose Discover > Source Resources.
  3. Under Online Discovery, click Cloud Discovery.

    Figure 1 Cloud platform discovery

  4. Set parameters in the Basic Settings and Task Settings areas based on Table 1.

    Table 1 Parameters in the Basic Settings and Task Settings areas

    | Area | Parameter | Description | Mandatory |
    | --- | --- | --- | --- |
    | Basic Settings | Task Name | Enter a task name. | Yes |
    | Basic Settings | Task Description | Describe the task. | No |
    | Task Settings | Source Platform | Select the source cloud platform. In this example, select Huawei Cloud. | Yes |
    | Task Settings | Credential | Select the source credential added in Preparations. If you did not add the credential, click Create. In the displayed area, set Authentication to AK/SK, enter the AK/SK pair of the source account, and click Verify and Save. | Yes |
    | Task Settings | Region | Select the regions where your source resources are located. | Yes |
    | Task Settings | Resource Type | Select Object Storage from the drop-down list. | Yes |
    | Task Settings | Application | Select the application that you want to group the discovered resources into. If no applications are available, create one: (1) Click Create Application, enter an application name and description, select a business scenario and running environment, and select the region where the application resources will be deployed on the target cloud. (2) Click OK. | No |

  5. Click Confirm. After the task for discovering object storage resources over the Internet is created, the system automatically starts collecting resource details.

    Wait until the task status changes to Succeeded, which indicates that the collection is complete.

Step 2: Creating a Migration Cluster

Migration clusters incur additional charges. For details, see Billing.

To ensure migration stability and data security, you are not allowed to log in to nodes in migration clusters. If you need to log in to a node, contact technical support.

  1. Sign in to the MgC console. In the navigation pane, under Project, select an application migration project from the drop-down list.
  2. In the navigation pane, choose Deploy > Migration Clusters.
  3. Click Create Cluster in the upper right corner of the page.

    If this is the first time you have created a migration cluster, you need to delegate the required permissions to MgC.

  4. Configure the parameters listed in Table 2.

    Table 2 Parameters for creating a cluster

    Area

    Parameter

    Configuration

    Constraints

    Basic Settings

    Cluster Name

    Enter a name.

    The cluster name must be unique in the same account.

    Region

    Select the region to provision the cluster.

    The cluster must be provisioned in the target region you are migrating to.

    Cluster Type

    Select what the cluster will be used for.

    Currently, only storage migration is supported.

    Node Settings

    Master Node

    The master node manages migration nodes and list nodes.

    A cluster can only have one master node.

    Migration Node

    Migration nodes are used for executing migration and verification tasks. The recommended specifications are 8 vCPUs and 16 GB of memory.

    • The node specifications cannot be modified after the cluster is created.
    • The number of nodes must meet the following requirements:
      • Number of migration nodes + Number of list nodes + 1 ≤ 100
      • Number of migration nodes + Number of list nodes + 1 ≤ Number of unused IP addresses in the subnet

    List Node

    List nodes are used for listing tasks. The recommended specifications are 8 vCPUs and 16 GB of memory.

    Network Settings

    VPC

    Select a VPC from the drop-down list.

    -

    Subnet

    Make sure that there are enough unused IP addresses for the migration and list nodes in this cluster.

    Number of unused IP addresses in the subnet ≥ Number of migration nodes + Number of list nodes + 1

    Network Type

    • Internet: You need to select a public NAT gateway. If there is no gateway available, choose Buy Gateway from the drop-down list and select the gateway specifications and EIPs you want to associate with the gateway. A maximum of 20 EIPs can be selected at a time.
    • Intranet: This option is suitable for data migration within a region.
    • Private line: Source data is directly accessed through the private line. For details about Direct Connect, see Direct Connect.

    -

    Advanced Settings

    DNS Configuration (Optional)

    Enter the IP address of the DNS server to update the value of nameserver in the /etc/resolv.conf file. Use commas (,) to separate multiple DNS server addresses, for example, 192.0.2.1,192.0.2.2.

    A maximum of three DNS IP addresses can be specified.

    Domain Mapping (Optional)

    Add mappings between domain names and IP addresses to update the /etc/hosts file.

    A maximum of 500 mappings can be added.

    -

    Traffic Limiting

    Allocate the maximum bandwidth to be used by the workflow during a specified period.

    • If you do not select this option, migration traffic is not limited.
    • If you select this option, limit the migration traffic by setting the start time, end time, and bandwidth limit.
      NOTICE:

      For example, if you set Start Time to 08:00, End Time to 12:00, and Maximum Bandwidth to 20 MB/s, the maximum migration speed is limited to 20 MB/s when the migration task is running in the period from 08:00 to 12:00. The migration speed is not limited beyond this period.

    • A maximum of five traffic limiting rules can be added.
    • The time is the local standard time of the region you are migrating to.

    Log Collection

    • If this option is enabled, logs generated during the migration are collected for possible troubleshooting later.
    • If this option is disabled, logs generated during storage migrations are not collected.

    -

  5. Click Confirm. Then you can view the cluster in the list. For details about cluster statuses, see Cluster Statuses. If the cluster status is Creation failed, move the cursor to the status to view the failure cause. After the fault is rectified, choose More > Retry to try to create the cluster again.
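
The node count constraints and the traffic limiting example in Table 2 can be sketched as two small checks. This is an illustration only, not MgC code: the function names and the rule tuple layout are hypothetical. It assumes that a rule applies from its start time up to (but not including) its end time, that rules do not span midnight, and that the lowest limit wins if rules overlap.

```python
from datetime import time

MAX_CLUSTER_NODES = 100  # upper bound on migration nodes + list nodes + 1


def check_node_counts(migration_nodes, list_nodes, unused_subnet_ips):
    """Return a list of violated sizing constraints (empty if the cluster fits)."""
    # The +1 comes from Table 2; presumably it accounts for the single master node.
    total = migration_nodes + list_nodes + 1
    problems = []
    if total > MAX_CLUSTER_NODES:
        problems.append(f"{total} nodes exceed the limit of {MAX_CLUSTER_NODES}")
    if total > unused_subnet_ips:
        problems.append(f"{total} nodes need more than {unused_subnet_ips} unused IP addresses")
    return problems


def bandwidth_limit_at(rules, now):
    """Return the limit in MB/s that applies at `now`, or None if traffic is unlimited.

    `rules` is a list of (start, end, max_mb_per_s) tuples using datetime.time values.
    """
    # Assumption: the end time is exclusive and the lowest overlapping limit applies.
    active = [limit for start, end, limit in rules if start <= now < end]
    return min(active) if active else None
```

With the rule from the table, (time(8, 0), time(12, 0), 20), bandwidth_limit_at returns 20 for time(9, 30) and None for time(13, 0), which matches the statement that the speed is not limited outside the configured period.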

Step 3: Creating a Migration Plan

  1. Sign in to the MgC console.
  2. In the navigation pane, choose Design > Migration Plans. Click Create Migration Plan in the upper right corner of the page.

  3. In the Batch Object Storage Migration card, click Configure Migration Plan.

  4. In the Basic Settings area, set parameters listed in Table 3.

    Table 3 Basic parameters

    | Parameter | Configuration |
    | --- | --- |
    | Migration Plan Name | Enter a name. |
    | Description (Optional) | Enter a description. |
    | Source Platform | Select the source platform you selected in Step 1. In this example, select Huawei Cloud. |
    | Target Region | Select the region you want to migrate to. |

  5. Above the source bucket list, click Add.

  6. Select the buckets to be migrated, click Modify in the Operation column, set Migration Method to Prefix migration, and click Save and then Confirm.

    • The selected resources must come from the source platform selected in Basic Settings.
    • A maximum of 100 buckets can be added.

  7. Associate source credentials.

    • To associate a source bucket with a credential, locate the source bucket in the list and click Modify in the Operation column. In the Modify Migration Settings page, select a source credential.
    • To associate multiple source buckets with a credential, select these buckets from the list and click Associate Credentials above the list.

  8. Import the prefix lists.

    Locate a source bucket in the list, and click Import Prefixes in the Operation column. Upload the prefix list file prepared for the bucket and click Confirm.

  9. Confirm that the source buckets have been associated with their credentials and the prefix import is complete for all the buckets. Then click Next to configure the target buckets.
  10. For each bucket, click Modify in the Operation column. Then select the credential used for accessing the target bucket, specify the target bucket, enter a prefix to rename or relocate migrated objects, and click Save.

  11. After you configure a target bucket for each source bucket, click Next. Assess how large a migration cluster the migration requires, and create a migration cluster of the recommended size. Alternatively, you can skip this step and use an existing migration cluster. For details, see Managing a Migration Cluster.
  12. Click Next. On the displayed page, click Select Cluster to choose an existing migration cluster.
  13. In the displayed cluster list, select the cluster created in Step 2 and click Confirm. The source resources in the migration plan will be migrated using the selected cluster.

    Only healthy or subhealthy migration clusters can be selected.

  14. Click OK. After the migration plan is created, you can see it in the list.

    • If you need to modify the plan settings, click Design in the Operation column.
    • After Completed appears in the Progress column, click Create Workflow in the Operation column to create a migration workflow to migrate all buckets in the plan in a batch.
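
The conditions that the steps above place on a migration plan (at most 100 buckets, and a source credential, an imported prefix list, and a target bucket for every source bucket) can be summarized as a checklist. The sketch below is only an illustration of that checklist. The BucketEntry structure and the function name are hypothetical and do not correspond to any MgC API.

```python
from dataclasses import dataclass

MAX_BUCKETS_PER_PLAN = 100  # a plan can hold at most 100 source buckets


@dataclass
class BucketEntry:
    """One source bucket in a migration plan (hypothetical structure)."""
    name: str
    source_credential: str = ""      # empty until a credential is associated
    prefixes_imported: bool = False  # True after the prefix list is imported
    target_bucket: str = ""          # empty until a target bucket is configured


def plan_problems(entries):
    """List what is still missing before the plan can be completed."""
    problems = []
    if len(entries) > MAX_BUCKETS_PER_PLAN:
        problems.append(f"{len(entries)} buckets added, at most {MAX_BUCKETS_PER_PLAN} allowed")
    for entry in entries:
        if not entry.source_credential:
            problems.append(f"{entry.name}: no source credential associated")
        if not entry.prefixes_imported:
            problems.append(f"{entry.name}: prefix list not imported")
        if not entry.target_bucket:
            problems.append(f"{entry.name}: no target bucket configured")
    return problems
```

An empty result means that every bucket in the list satisfies the conditions named in the steps.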

Step 4: Creating a Batch Object Storage Migration Workflow

  • A single object cannot be larger than 4.76837158203125 TB (500 MB × 10,000). Otherwise, the migration may fail.
  • During the migration, the system automatically creates a temporary folder named oms in each target bucket. Do not perform any operations on this folder, including but not limited to modifying, deleting, or adding data in the folder. Otherwise, the migration will be interrupted or fail.
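
The size limit above follows from simple arithmetic. The sketch below reproduces it, assuming binary units (1 TB = 1024 × 1024 MB), which is the interpretation that yields the stated figure. Reading the two factors as a 500 MB part size and 10,000 parts is also an assumption, and the constant and function names are hypothetical.

```python
PART_SIZE_MB = 500   # first factor of the stated limit (assumed to be a part size)
MAX_PARTS = 10_000   # second factor of the stated limit (assumed to be a part count)

# Convert to bytes using binary units: 1 MB = 1024 * 1024 bytes.
MAX_OBJECT_BYTES = PART_SIZE_MB * 1024 * 1024 * MAX_PARTS

# Convert back to TB using binary units: 1 TB = 1024 ** 4 bytes.
MAX_OBJECT_TB = MAX_OBJECT_BYTES / 1024 ** 4


def is_within_size_limit(size_bytes):
    """Return True if an object of this size does not exceed the stated limit."""
    return size_bytes <= MAX_OBJECT_BYTES
```

Here MAX_OBJECT_TB evaluates to 4.76837158203125, the value given in the note, so a quick pre-check of source object sizes against MAX_OBJECT_BYTES can flag objects that may fail to migrate.
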
  1. Sign in to the MgC console.
  2. In the navigation pane, choose Migrate > Workflows.
  3. Click Create Workflow in the upper right corner of the page.

  4. Select Batch Object Storage Migration and click Configure Workflow.
  5. In the Basic Information area, enter a name and description for the workflow.
  6. In the Migration Plan area, select the migration plan created in Step 3. You can then view an overview of the migration plan. Click View Details to view more information about the plan.

  7. In the Migration Cluster area, select the cluster used for the migration. The cluster specified in the migration plan is preselected by default, but you can select another one if needed. The modification is applied to the current workflow but not to the migration plan.
  8. Configure the migration settings based on Table 4.

    Table 4 Parameters for configuring a migration task

    Parameter

    Value

    Description

    Concurrent Subtasks

    -

    Set the maximum number of concurrent subtasks. Each online migration node supports at most 10 concurrent subtasks. For example, if there are 2 online migration nodes, you can set this parameter to 20 or less.

    Overwrite Existing

    Never

    Files existing at the migration target are never overwritten.

    WARNING:
    • If you choose Never for the initial migration, the attributes of involved parent folders at the source will not be migrated to the target. As a result, the folder attributes may be incomplete at the target. To avoid this issue, use the Never option with caution for the initial migration.
    • If a migration task is paused or interrupted and then restarted or resumed, the Never option will cause the system to skip files that were not completely migrated earlier, but the task may still be marked as successful. This affects data integrity. To avoid this issue, use the Never option with caution.

    Always

    Files existing at the migration target are always overwritten.

    If older or different size

    • Files that already exist at the target will be overwritten if they are older than or have different sizes from the paired files at the source.
    • Verification will be performed for folders after their contents are migrated. Folders that already exist at the target will be overwritten if they have different last modification times, sizes, or permissions from the paired folders at the source.
      NOTE:

      The same overwriting policy is applied to empty folders as files.

    If different CRC64 checksum

    • If a source object has a CRC64 checksum different from the paired target object, the source object will overwrite the target one. Otherwise, the source object will be skipped during the migration. If either of them does not have a CRC64 checksum, their sizes and last modification times are checked.
      NOTE:
      • This option is only available for migration on Huawei Cloud or from Alibaba Cloud or Tencent Cloud.
      • Using this option requires that the target OBS bucket be added to the CRC64 feature whitelist.

    Consistency Check

    Size and last modified

    With this default method, the system checks data consistency by comparing object size and last modification time.

    CRC64 checksum

    The system verifies data consistency by comparing CRC64 values in the metadata. If a source object and the paired destination object have CRC64 checksums, the checksums are checked. Otherwise, their sizes and last modification times are checked.
    NOTE:
    • This option is only available for migration on Huawei Cloud or from Alibaba Cloud or Tencent Cloud.
    • Using this option requires that the target OBS bucket be added to the CRC64 feature whitelist.

    Migrate Metadata

    -

    Decide whether to migrate metadata.

    • If you select this option, object metadata will be migrated.
    • If you do not select this option, only the Content-Type and Content-Encoding metadata will be migrated.

  9. (Optional) Configure advanced options based on Table 5.

    Table 5 Advanced settings

    Parameter

    Description

    Record Migration Results

    Determine the migration results you want to record. After the migration is complete, records are automatically generated and saved to the /oms directory in the target storage buckets. Multiple options can be selected.

    For example, if you select Migrated objects, all migrated objects will be recorded in a file, and the file will be saved to the /oms directory in the target storage buckets.

    Migrate Incremental Data

    If you select No, incremental migration will not be performed.

    If you select Yes, configure the overwriting policy and specify how to execute incremental migration. For details, see Configuring Incremental Migration Settings.

    Target Storage Class

    Choose the storage class that your data will be migrated to. For details about storage classes, see Introduction to Storage Classes.

    Enable KMS Encryption

    • If you do not select this option, whether migrated data will be encrypted in the target bucket depends on the server-side encryption setting of the bucket.
    • If you select this option, all migrated data will be encrypted before it is stored to the target buckets.
    NOTE:
    • Using KMS to encrypt migrated data may slow down the migration speed by about 10%.
    • This option is only available when KMS is supported in the region you are migrating to.

    Restore Archive Data

    • If you do not select this option, the system records archived objects in the list of objects that failed to be migrated and continues to migrate other objects in the migration task.
    • If you select this option, the system automatically restores and migrates archived objects in the migration task. If an archived object fails to be restored, the system skips it and records it in the list of objects that failed to be migrated and continues to migrate other objects in the migration task.
    NOTE:

    The system will restore archived data before migrating it, and you pay the source cloud platform for the API requests and storage space generated accordingly.

    Filter Source Data

    Filter files and directories to be migrated by applying filters. For details about filters, see Source Data Filters.

    Download Data from CDN

    If the default domain name cannot meet your migration requirements and the source cloud service provider supports custom domain names, you can bind a custom domain name to the source bucket and enable the CDN service on the source platform to reduce data download fees. Enter the custom domain name in the Domain Name text box and select a transmission protocol. HTTPS is more secure than HTTP and is recommended.

    If the migration source is the Alibaba Cloud OSS or Tencent Cloud COS, you also need to select an authentication type and enter an authentication key.

    Send SMN Notification

    Determine whether to use SMN to get notifications about migration results.

    • If you do not select this option, no SMN messages will be sent after the migration is complete.
    • If you select this option, after the migration is complete, SMN messages will be sent to the subscribers of the selected topic. You can select the language and trigger conditions for sending messages.

    Limit Traffic

    Allocate the maximum bandwidth to be used by the workflow during a specified period.

    • If you do not select this option, migration traffic is not limited.
    • If you select this option, limit the migration traffic by setting Start Time, End Time, and Bandwidth Limit.
      For example, if you set Start Time to 08:00, End Time to 12:00, and Bandwidth Limit to 20 MB/s, the maximum migration speed is limited to 20 MB/s from 08:00 to 12:00. The migration speed is not limited beyond this period.
      NOTE:
      • The rate limit ranges from 0 MB/s to 1,048,576 MB/s.
      • A maximum of five rules can be added.
      • The time is the local standard time of the region you are migrating to.

  10. Click Next: Confirm.
  11. Confirm the workflow settings and click Confirm. The Run Workflow dialog box is displayed, which indicates that the workflow has been created.

    • If you want to start the migration immediately, click Confirm to run the workflow.
    • If you want to add a stage or step to the workflow, click Cancel. The workflow enters the Waiting state, and the migration does not start. To start the migration, click Run in the Operation column.

  12. On the migration workflow details page, view the workflow settings and the migration progress. You can also perform the following operations:

    • Move the cursor to the migration progress bar of a resource. In the displayed window, view the migration details about the resource.
    • When a migration reaches a step that requires manual confirmation, place the cursor on the progress bar and click Confirm next to the step status in the displayed window. The migration can continue only after you confirm.
    • In the Basic Information area, click Manage next to the cluster name. The cluster details page is displayed on the right. On the displayed page, you can:
      • Add, edit, or delete traffic limiting rules to control cluster traffic based on your requirements.
      • Add or delete migration nodes or list nodes, or upgrade plug-ins for existing nodes as required.
    • In the Basic Information area, expand Advanced Settings. Review the incremental migration settings. If Incremental Migration Method is set to Automated, you can modify the number of incremental migrations.
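
The Overwrite Existing options in Table 4 amount to a per-object decision. The sketch below models that decision for files and objects only (folder handling is omitted). It is not MgC's implementation: the policy strings and the ObjectInfo structure are hypothetical, and it assumes that when a CRC64 checksum is missing, the size and last modification time are compared in the same way as for the "If older or different size" option.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectInfo:
    """Metadata compared by the policies (hypothetical structure)."""
    size: int
    last_modified: float            # seconds since the epoch
    crc64: Optional[str] = None     # None if no CRC64 checksum is recorded


def differs_by_size_or_age(source, target):
    """True if the target is older than, or a different size from, the source."""
    return target.last_modified < source.last_modified or target.size != source.size


def should_overwrite(policy, source, target):
    """Decide whether `source` replaces `target` under an Overwrite Existing policy."""
    if target is None:
        return True   # nothing exists at the target, so the object is simply migrated
    if policy == "never":
        return False
    if policy == "always":
        return True
    if policy == "older_or_different_size":
        return differs_by_size_or_age(source, target)
    if policy == "different_crc64":
        if source.crc64 is not None and target.crc64 is not None:
            return source.crc64 != target.crc64
        # Assumption: without both checksums, fall back to size and modification time.
        return differs_by_size_or_age(source, target)
    raise ValueError(f"unknown policy: {policy}")
```

For example, a target object with the same size and checksum as its source is skipped under "different_crc64", while an older target of the same size is overwritten under "older_or_different_size".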

(Optional) Step 5: Clearing the Migration Cluster

If the migration cluster is no longer needed after your data migration is complete, you can delete the cluster and the associated resources. For details, see Deleting a Migration Cluster.