Updated on 2024-11-18 GMT+08:00

Migrating Data from Multiple Source Buckets by Prefix

This section describes how to filter objects to be migrated in source buckets using prefixes and migrate the objects to Huawei Cloud OBS buckets.

Preparations

  • Prepare a Huawei account or an IAM user that can access MgC. For details, see Preparations.
  • Create a migration project on the MgC console.
  • Add the AK/SK pair used for accessing the source cloud platform to MgC. The AK/SK pair will be used to collect details about source buckets. For more information, see Adding Resource Credentials.
  • Ensure that the source and target accounts have the permissions required for the migration. For details, see How Do I Obtain Required Permissions for the Source and Target Accounts?
  • On Huawei Cloud, create an OBS bucket for receiving migrated data. For details, see Creating a Bucket. You can also use an existing bucket.
  • Create a prefix list for each source bucket to be migrated. A prefix list must meet the following requirements:
    • The list must be in .txt format and the file size cannot exceed 2 MB.
    • Each line in the file can only contain one prefix, and a prefix cannot be longer than 1,024 characters.
    • A maximum of 1,000 prefixes can be contained in a file.
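
The prefix list requirements above can be checked locally before you upload a file. The following Python sketch is illustrative only and is not part of MgC. The function names are hypothetical, and it assumes that the file is UTF-8 encoded, that 2 MB means 2 x 1024 x 1024 bytes, and that blank lines are ignored.

```python
from pathlib import Path

MAX_FILE_BYTES = 2 * 1024 * 1024  # 2 MB limit (assumes 1 MB = 1024 * 1024 bytes)
MAX_PREFIXES = 1000               # maximum number of prefixes in one file
MAX_PREFIX_CHARS = 1024           # maximum length of a single prefix


def check_prefixes(lines):
    """Return a list of problems found in the given lines (one prefix per line)."""
    problems = []
    # Assumption: blank lines do not count as prefixes.
    prefixes = [line for line in lines if line.strip()]
    if len(prefixes) > MAX_PREFIXES:
        problems.append(f"{len(prefixes)} prefixes found; the limit is {MAX_PREFIXES}")
    for index, prefix in enumerate(prefixes, start=1):
        if len(prefix) > MAX_PREFIX_CHARS:
            problems.append(
                f"prefix {index} has {len(prefix)} characters; the limit is {MAX_PREFIX_CHARS}"
            )
    return problems


def validate_prefix_list(path):
    """Check the file extension, file size, and contents of a prefix list file."""
    path = Path(path)
    problems = []
    if path.suffix.lower() != ".txt":
        problems.append("the file must be in .txt format")
    if path.stat().st_size > MAX_FILE_BYTES:
        problems.append("the file is larger than 2 MB")
    # Assumption: the file is UTF-8 encoded.
    lines = path.read_text(encoding="utf-8").splitlines()
    return problems + check_prefixes(lines)
```

An empty list returned by validate_prefix_list means that the file meets the requirements listed above.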

Precautions

  • Supported regions
    You can migrate object storage data in batches to the following regions. To migrate to other regions, use RDA or other solutions.
    • LA-Santiago
    • LA-Sao Paulo
    • TR-Istanbul
    • AP-Bangkok
    • AP-Singapore
    • AP-Jakarta
    • ME-Riyadh
  • Intranet migration

    Data can be migrated between buckets in the same region over the intranet.

  • Soft link processing
    MgC does not support migration through symbolic links. To migrate a path pointed to by a symbolic link, you need to:
    • Enter the actual path to be migrated when creating a migration workflow.
    • After the migration is complete, manually create a symbolic link to the path at the target.

Step 1: Discovering Source Buckets

  1. Sign in to the MgC console.
  2. In the navigation pane on the left, choose Research > Application Discovery. In the upper left corner of the page, select the migration project created in Preparations.
  3. Click Discover Over Internet in the Cloud Discovery area.

  4. Set parameters in the Basic Settings and Task Settings areas based on Table 1.

    Table 1 Parameters in the Basic Settings and Task Settings areas

    | Area | Parameter | Description | Mandatory |
    |---|---|---|---|
    | Basic Settings | Task Name | Enter a task name. | Yes |
    | Basic Settings | Task Description | Describe the task. | No |
    | Task Settings | Source Platform | Select the source cloud platform. In this example, select Huawei Cloud. | Yes |
    | Task Settings | Credential | Select the source credential added in Preparations. If you did not add the credential, click Create. In the displayed area, set Authentication to AK/SK, enter the AK/SK pair of the source account, and click Verify and Save. | Yes |
    | Task Settings | Region | Select the regions where your source resources are located. | Yes |

  5. In the Resource Discovery area, select Object Storage from the Resource Type drop-down list. If the source platform is Alibaba Cloud or Tencent Cloud, you need to enable Cloud Platform Discovery before selecting a resource type.

  6. Associate the collected object storage resources with an application.

    • If an application is available, select the application from the Application drop-down list.
    • If no applications are available, click Create Application. In the displayed dialog box, enter an application name and description; select the business scenario, environment, and region; and click OK.

  7. Click Confirm. The task for discovering object storage resources over the Internet is created, and the system automatically starts collecting resource details.
  8. On the Application Discovery page, in the Discovery Task card, click View next to Total tasks.

    Wait until the task status changes to Succeeded, which indicates that the collection is complete.

Step 2: Creating a Migration Cluster

Migration clusters incur additional charges. For details, see Billing.

To ensure migration stability and data security, you are not allowed to log in to nodes in migration clusters. If you need to log in to the nodes, contact technical support.

  1. Sign in to the MgC console.
  2. In the left navigation pane, choose Deploy > Migration Clusters.
  3. In the upper right corner of the page, click Create Cluster. If this is your first time creating a cluster, you must agree to delegate the required permissions to MgC before you can access the Create Cluster page.

  4. Configure the parameters listed in Table 2.

    Table 2 Parameters for creating a cluster

    Area

    Parameter

    Configuration

    Constraints

    Basic Settings

    Cluster Name

    Enter a name.

    The cluster name must be unique in the same account.

    Region

    Select the region to provision the cluster.

    The cluster must be provisioned in the target region you are migrating to.

    Cluster Type

    Select what the cluster will be used for.

    Currently, only storage migration is supported.

    Node Settings

    Master Node

    It is used to manage migration nodes and list nodes.

    A cluster can only have one master node.

    Migration Node

    Migration nodes are used for executing migration and verification tasks. The recommended specifications are 8 vCPUs and 16 GB of memory.

    • The node specifications cannot be modified after the cluster is created.
    • The number of nodes must meet the following requirements:
      • Number of migration nodes + Number of list nodes + 1 ≤ 100
      • Number of migration nodes + Number of list nodes + 1 ≤ Number of unused IP addresses in the subnet

    List Node

    List nodes are used for listing tasks. The recommended specifications are 8 vCPUs and 16 GB of memory.

    Network Settings

    VPC

    Select a VPC from the drop-down list.

    -

    Subnet

    Make sure that there are enough unused IP addresses for the migration and list nodes in this cluster.

    Number of unused IP addresses in the subnet ≥ Number of migration nodes + Number of list nodes + 1

    Network Type

    • Internet: You need to select a public NAT gateway. If there is no gateway available, choose Buy Gateway from the drop-down list and select the gateway specifications and EIPs you want to associate with the gateway. A maximum of 20 EIPs can be selected at a time.
    • Intranet: This option is suitable for data migration within a region.
    • Private line: Source data is directly accessed through the private line. For details about Direct Connect, see Direct Connect.

    -

    Advanced Settings

    DNS Configuration (Optional)

    Enter the IP address of the DNS server to update the value of nameserver in the /etc/resolv.conf file. Use commas (,) to separate multiple DNS server addresses, for example, 192.0.2.1,192.0.2.2.

    A maximum of three DNS IP addresses can be specified.

    Domain Mapping (Optional)

    Add mappings between domain names and IP addresses to update the /etc/hosts file.

    A maximum of 500 mappings can be added.

    -

    Traffic Limiting

    Allocate the maximum bandwidth to be used by the workflow during a specified period.

    • If you do not select this option, migration traffic is not limited.
    • If you select this option, limit the migration traffic by setting the start time, end time, and bandwidth limit.
      NOTICE:

      For example, if you set Start Time to 08:00, End Time to 12:00, and Maximum Bandwidth to 20 MB/s, the maximum migration speed is limited to 20 MB/s when the migration task is running in the period from 08:00 to 12:00. The migration speed is not limited beyond this period.

    • A maximum of five traffic limiting rules can be added.
    • The time is the local standard time of the region you are migrating to.

    Log Collection

    • If this option is enabled, logs generated during the migration are collected for possible troubleshooting later.
    • If this option is disabled, logs generated during storage migrations are not collected.

    -

  5. Click Confirm. Then you can view the cluster in the list. For details about cluster statuses, see Cluster Statuses. If the cluster status is Creation failed, move the cursor to the status to view the failure cause. After the fault is rectified, choose More > Retry to try to create the cluster again.
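
The node count and DNS constraints in Table 2 can be checked before you fill in the cluster form. The following Python sketch only restates those constraints. The function names are hypothetical, and the "+ 1" in the formulas is assumed to stand for the single master node.

```python
import ipaddress

MAX_NODES_PER_CLUSTER = 100  # migration nodes + list nodes + 1 must not exceed 100
MAX_DNS_SERVERS = 3          # a maximum of three DNS IP addresses can be specified


def check_node_counts(migration_nodes, list_nodes, unused_subnet_ips):
    """Return a list of violated node count constraints (empty if none)."""
    # Assumption: the "+ 1" in Table 2 accounts for the single master node.
    total = migration_nodes + list_nodes + 1
    problems = []
    if total > MAX_NODES_PER_CLUSTER:
        problems.append(f"{total} nodes requested; the limit is {MAX_NODES_PER_CLUSTER}")
    if total > unused_subnet_ips:
        problems.append(
            f"{total} nodes requested, but the subnet has only {unused_subnet_ips} unused IP addresses"
        )
    return problems


def parse_dns_servers(value):
    """Split a comma-separated DNS server string and validate each address."""
    servers = [item.strip() for item in value.split(",") if item.strip()]
    if len(servers) > MAX_DNS_SERVERS:
        raise ValueError(f"at most {MAX_DNS_SERVERS} DNS servers can be specified")
    for server in servers:
        ipaddress.ip_address(server)  # raises ValueError for a malformed address
    return servers
```

For example, check_node_counts(5, 5, 8) reports one problem because 11 nodes need 11 unused IP addresses, and parse_dns_servers("192.0.2.1,192.0.2.2") returns both addresses as a list.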

Step 3: Creating a Migration Plan

  1. Sign in to the MgC console.
  2. In the navigation pane, choose Design > Migration Plans. Click Create Migration Plan in the upper right corner of the page.

  3. In the Batch Object Storage Migration card, click Configure Migration Plan.

  4. In the Basic Settings area, set parameters listed in Table 3.

    Table 3 Basic parameters

    | Parameter | Configuration |
    |---|---|
    | Migration Plan | Enter a name. |
    | Description (Optional) | Enter a description. |
    | Source Platform | Select the source platform you selected in Step 1. In this example, select Huawei Cloud. |
    | Target Region | Select the region you want to migrate to. |

  5. Above the source bucket list, click Add.

  6. Select the buckets to be migrated, click Modify in the Operation column, set Migration Method to Prefix migration, and click Save and then Confirm.

    • The selected resources must come from the source platform selected in Basic Settings.
    • A maximum of 100 buckets can be added.

  7. Associate source credentials.

    • To associate a source bucket with a credential, locate the source bucket in the list and click Modify in the Operation column. In the Modify Migration Settings dialog box that is displayed, select a source credential.
    • To associate multiple source buckets with a credential, select these buckets from the list and click Associate Credentials above the list.

  8. Import the prefix lists.

    Locate a source bucket in the list, and click Import Prefixes in the Operation column. Upload the prefix list file prepared for the bucket and click Confirm.

  9. Confirm that the source buckets have been associated with their credentials and the prefix import is complete for all the buckets, and click Next to configure the target buckets.
  10. Locate a source bucket, click Modify in the Operation column, select the credential used for accessing the target bucket, enter a prefix to rename migrated objects, and click Save.

  11. After you configure the migration settings for all buckets to be migrated, click Next. Assess how large a migration cluster is required for the migration and create a migration cluster of the recommended size. Alternatively, you can skip this step and use an existing migration cluster. For details, see Managing a Migration Cluster.
  12. Click Next. On the displayed page, click Select Cluster to choose an existing migration cluster.
  13. In the displayed cluster list, select the cluster created in Step 2 and click Confirm. The source resources in the migration plan will be migrated using the selected cluster.

    Only healthy or subhealthy migration clusters can be selected.

  14. Click OK. After the migration plan is created, you can see it in the list.

    • If you need to modify the plan settings, click Design in the Operation column.
    • When the design progress of the plan is Completed, click Create Workflow in the Operation column to create a migration workflow to migrate all buckets in the plan in a batch.

Step 4: Creating a Batch Object Storage Migration Workflow

  • A single object cannot be larger than 4.76837158203125 TB (500 MB x 10,000). Otherwise, the migration may fail.
  • During the migration, the system automatically creates a temporary folder named oms in the target bucket. Do not perform any operations on this folder, including but not limited to modifying, deleting, or adding data in the folder. Otherwise, the migration will be interrupted or fail.
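
The object size limit above is simple arithmetic: 500 MB multiplied by 10,000, expressed in TB with binary units. The following Python sketch reproduces the figure and flags oversized objects in an inventory. The oversized_objects function and the (key, size) inventory format are hypothetical and are not an MgC or OBS API.

```python
UNIT_BYTES = 500 * 1024 * 1024  # 500 MB, using binary units (1 MB = 1024 * 1024 bytes)
UNIT_COUNT = 10_000
MAX_OBJECT_BYTES = UNIT_BYTES * UNIT_COUNT


def oversized_objects(objects):
    """Return the keys of objects larger than the limit.

    objects is an iterable of (key, size_in_bytes) pairs, for example taken
    from a bucket inventory that you exported yourself.
    """
    return [key for key, size in objects if size > MAX_OBJECT_BYTES]


# The limit expressed in TB (1 TB = 1024 ** 4 bytes).
print(MAX_OBJECT_BYTES / 1024 ** 4)  # 4.76837158203125
```

Running such a check against the source inventory before the migration shows which objects, if any, exceed the limit.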
  1. Sign in to the MgC console.
  2. In the navigation pane on the left, choose Migrate > Workflows.
  3. Click Create Workflow in the upper right corner of the page.

  4. Select Batch Object Storage Migration and click Configure Workflow.
  5. In the Basic Information area, enter a name and description for the workflow.
  6. In the Migration Plan area, select the migration plan created in Step 3. You can then view an overview of the migration plan. Click View Details to view more information about the plan.

  7. In the Migration Cluster area, select the cluster used for the migration. The cluster specified in the migration plan is preselected by default, but you can select another one if needed. The modification is applied to the current workflow but not to the migration plan.
  8. Configure the migration settings based on Table 4.

    Table 4 Migration settings

    Parameter

    Option

    Description

    Concurrent Subtasks

    -

    This parameter is user-defined. There cannot be more than 10 concurrent subtasks for each online migration node. For example, if there are 2 online migration nodes, you can set at most 20 concurrent subtasks.

    Overwrite Existing

    Never

    Files existing at the migration target are never overwritten.

    WARNING:
    • If you choose Never for the initial migration, the attributes of involved parent folders at the source will not be migrated to the target. As a result, the folder attributes may be incomplete at the target. To avoid this issue, use the Never option with caution for the initial migration.
    • If a migration task is paused or interrupted and then restarted or resumed, the Never option will cause the system to skip files that were not completely migrated earlier, but the task may still be marked as successful. This affects data integrity. To avoid this issue, use the Never option with caution.

    Always

    Files existing at the migration target are always overwritten.

    If older or different size

    • Files that already exist at the target will be overwritten if they are older than or have different sizes from the paired files at the source.
    • Verification will be performed for folders after their contents are migrated. Folders that already exist at the target will be overwritten if they have different last modification times, sizes, or permissions from the paired folders at the source.
      NOTE:

      For empty folders, the overwrite policy is the same as that for files.

    Migrate Metadata

    -

    Determine whether to migrate metadata.

    • If you select this option, object metadata will be migrated.
    • If you do not select this option, only the ContentType metadata will be migrated.

  9. (Optional) Configure advanced options based on Table 5.

    Table 5 Advanced settings

    Parameter

    Description

    Target Storage Class

    Choose the storage class that your data will be migrated to in the target bucket. For details about storage classes, see Introduction to Storage Classes.

    Enable KMS Encryption

    • If you do not select this option, objects are in the same encryption status before and after the migration.
    • If you select this option, all migrated objects will be encrypted before they are stored in the target bucket.
    NOTE:
    • Using KMS to encrypt migrated data may reduce the migration speed by about 10%.
    • This option is only available when KMS is supported in the region you are migrating to.

    Restore Archive Data

    • If you do not select this option, the system records archived objects in the list of objects that failed to be migrated and continues to migrate other objects in the migration task.
    • If you select this option, the system automatically restores and migrates archived objects in the migration task. If an archived object fails to be restored, the system skips it and records it in the list of objects that failed to be migrated and continues to migrate other objects in the migration task.
    NOTE:

    The system will restore archived data before migrating it, and you pay the source cloud platform for the API requests and storage space generated accordingly.

    Filter Source Data

    Filter files to be migrated by applying filters. For details about filters, see Source Data Filters.

    Download Data from CDN

    If the default domain name cannot meet your migration requirements and the source cloud service provider supports custom domain names, you can associate a custom domain name with the source bucket and enable the CDN service on the source platform to reduce data download fees. Enter a custom domain name in the Domain Name text box and select a transmission protocol. HTTPS is more secure than HTTP and is recommended.

    If the migration source is Alibaba Cloud OSS or Tencent Cloud COS, you also need to select an authentication type and enter an authentication key.

    Send SMN Notification

    Determine whether to use SMN to get notifications about migration results.

    • If you do not select this option, no SMN messages are sent after the migration is complete.
    • If you select this option, after the migration is complete, SMN messages are sent to the subscribers of the selected topic. You can select the language and trigger conditions for sending messages.

    Limit Traffic

    Allocate the maximum bandwidth to be used by the workflow during a specified period.

    • If you do not select this option, migration traffic is not limited.
    • If you select this option, limit the migration traffic by setting Start Time, End Time, and Bandwidth Limit.
      For example, if you set Start Time to 08:00, End Time to 12:00, and Bandwidth Limit to 20 MB/s, the maximum migration speed is limited to 20 MB/s when the migration task runs from 08:00 to 12:00. The migration speed is not limited beyond this period.
      NOTE:
      • The rate limit ranges from 0 MB/s to 1,048,576 MB/s.
      • A maximum of five rules can be added.
      • The time is the local standard time of the region you are migrating to.

  10. Click Next: Confirm.
  11. Confirm the workflow settings, and click Confirm. The Run Workflow dialog box is displayed, which indicates that the workflow has been created.

    • If you want to start the migration immediately, click Confirm to run the workflow.
    • If you want to add a stage or step to the workflow, click Cancel. The workflow enters the Waiting state, and the migration is not started. To start the migration, click Run in the Operation column.

  12. On the migration workflow details page, view the workflow settings and the migration progress. You can also perform the following operations:

    • Move the cursor to the migration progress bar of a resource. In the displayed window, view the migration details about the resource.
    • When a migration reaches a step that requires manual confirmation, place the cursor on the progress bar and click Confirm next to the step status in the displayed window. The migration can continue only after you confirm.
    • In the Basic Information area, click Manage next to the migration cluster name. The cluster details page is displayed on the right. On the displayed page, you can:
      • Add, edit, or delete traffic limiting rules to control cluster traffic based on your requirements.
      • Add or delete migration nodes or list nodes, or upgrade plug-ins for existing nodes as required.
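
The concurrency cap in Table 4 and the traffic limiting rules in Table 5 can be modeled in a few lines. The following Python sketch is illustrative only and does not describe how MgC evaluates rules internally. The function names and the (start, end, limit) rule format are hypothetical. It assumes that the end time is exclusive and that a rule does not cross midnight.

```python
from datetime import time

MAX_SUBTASKS_PER_NODE = 10   # concurrent subtasks allowed per online migration node
MAX_RULES = 5                # a maximum of five traffic limiting rules
MAX_LIMIT_MB_S = 1_048_576   # upper bound of the rate limit, in MB/s


def max_concurrent_subtasks(online_migration_nodes):
    """Return the largest allowed value for Concurrent Subtasks."""
    return online_migration_nodes * MAX_SUBTASKS_PER_NODE


def bandwidth_limit_at(rules, now):
    """Return the limit in MB/s that applies at the given time, or None if unlimited.

    rules is a list of (start, end, limit_mb_s) tuples that use datetime.time
    values in the local standard time of the target region.
    """
    if len(rules) > MAX_RULES:
        raise ValueError(f"at most {MAX_RULES} rules can be added")
    for start, end, limit in rules:
        if not 0 <= limit <= MAX_LIMIT_MB_S:
            raise ValueError(f"the rate limit must be between 0 and {MAX_LIMIT_MB_S} MB/s")
        # Assumption: the end time is exclusive and the window does not cross midnight.
        if start <= now < end:
            return limit
    return None
```

With the rule from the example in Table 5, (time(8, 0), time(12, 0), 20), the function returns 20 for 09:30 and None for 13:00, which matches the described behavior that the speed is not limited outside the period.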

(Optional) Step 5: Clearing the Migration Cluster

If the migration cluster is no longer needed after your data migration is complete, you can delete the cluster and the associated resources. For details, see Deleting a Migration Cluster.