
Migrating Data from Ceph to Huawei Cloud OBS Using HTTP

Use MgC storage migration workflows to migrate data from Ceph to Huawei Cloud OBS using HTTP.

Supported Regions

The following regions are supported:

  • LA-Santiago
  • LA-Sao Paulo
  • TR-Istanbul
  • AP-Bangkok
  • AP-Singapore
  • AP-Jakarta
  • ME-Riyadh

A single object cannot be larger than 4.76837158203125 TB (500 MB × 10,000, or 5,000,000 MB ≈ 4.77 TB); otherwise, the migration may fail.

Preparations

  • Preparing a Huawei account

    Before using MgC, prepare a HUAWEI ID or an IAM user that can access MgC, and obtain an AK/SK pair for the account or IAM user. For details about how to obtain an access key, see Making Preparations.

  • Creating a migration project

    On the MgC console, create a migration project. For details, see Managing Migration Projects.

  • Creating an OBS bucket

    On Huawei Cloud OBS, create a Standard bucket in the target region for storing URL list files and receiving source data. For details, see Creating a Bucket.

    If an IAM user is used for migration, the IAM user must have the read and write permissions for the target bucket. For details, see Granting an IAM User the Read/Write Permissions for a Bucket.

  • Creating a migration cluster

    You can create a dedicated migration cluster for this task. A cluster consists of a master node and several list nodes and migration nodes. For details about how to create a cluster, see Creating a Migration Cluster.

Step 1: Generating URLs for Sharing and Downloading Ceph Files

Replace bucket01 and http://100.93.xxx.xxx:7480 in the following steps with the actual Ceph S3 bucket name and the web access address and port of the Ceph RGW (RADOS Gateway) service.

Replace the following placeholders in the examples with the actual values:

  • <BUCKET-NAME>: name of the source S3 bucket
  • <FILE-NAME>: name of the JSON file to be created
  • <DOMAIN>: actual domain name or IP address of the Ceph RGW service
  • <PORT>: actual access port of the Ceph RGW service

If data in the bucket to be migrated can already be accessed using a browser, skip steps 1 to 3 and go to step 4.

  1. Run the following command to check whether an access policy has been configured for the source bucket:

    s3cmd info s3://<BUCKET-NAME> 
    • If the value of Policy is none in the command output, no access policy is configured for the bucket. Go to step 2.

    • If the value of Policy is not none in the command output, copy and save the policy information so that you can restore the policy after the migration.
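
    For example, for the example bucket, the check looks like the following. The output shown here is an abridged sketch of what s3cmd info typically prints; the exact fields vary by s3cmd version, and only the Policy line matters for this step:

    s3cmd info s3://bucket01
    s3://bucket01/ (bucket):
       Location:  default
       Payer:     BucketOwner
       Policy:    none
       CORS:      none
       ACL:       owner: FULL_CONTROL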

  2. On the server where the s3cmd tool is installed, use a text editor to create a JSON file (with a name of your choice) containing an S3 bucket policy that allows objects to be read from the specified S3 bucket. Copy the following content into the file, replace <BUCKET-NAME> with the actual S3 bucket name, and save the file.

    {
            "Statement": [{
                    "Effect": "Allow",
                    "Principal": "*",
                    "Action": "s3:GetObject",
                    "Resource": "arn:aws:s3:::<BUCKET-NAME>/*"
            }]
    }

    For more parameter settings, see Example Amazon S3 Bucket Policies.

  3. Use the s3cmd command line tool to set a bucket policy that allows public access to the files in the bucket. The command is in the following format:

    s3cmd setpolicy <FILE-NAME>.json s3://<BUCKET-NAME> 

    Replace <FILE-NAME> with the name of the JSON file created in step 2 and <BUCKET-NAME> with the actual S3 bucket name.
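
    For example, if the policy file created in step 2 is named policy.json (a name chosen here for illustration) and the source bucket is the example bucket01, the command is:

    s3cmd setpolicy policy.json s3://bucket01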

  4. Run the following command to list all files in the bucket and export the result to a text file:

    s3cmd ls -r s3://<BUCKET-NAME> >> <FILE-URL>.txt

    Replace <BUCKET-NAME> with the actual S3 bucket name and <FILE-URL> with the name of the local file that you want to save the result to, for example, s3url.txt.
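
    With the example bucket name and s3url.txt as the output file, the command becomes the following. Note that >> appends to an existing file, so delete any previous s3url.txt first to avoid duplicate entries:

    s3cmd ls -r s3://bucket01 >> s3url.txt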

  5. Open the generated list file (s3url.txt in this example) to view the shared S3 addresses of all files in the bucket. To generate URLs that can be opened in a browser, replace s3:// and everything before it with http://<DOMAIN>:<PORT>/. Enter a URL in the address bar of a browser. If the file can be accessed, the policy is configured correctly. If access is denied and AccessDenied is returned, repeat steps 1 to 3 to set the access policy again.

    For example, suppose the domain name or IP address of the Ceph RGW service is 100.93.xxx.xxx, the port number is 7480, the bucket name is bucket01, and there are two files (dragon.png and index.html) in the bucket. The exported shared S3 address list is as follows:
    2024-07-26 03:09         3987  s3://bucket01/dragon.png
    2024-07-26 02:01         1701  s3://bucket01/index.html
    Replace s3:// and everything before it (that is, the content from the date up to and including s3://) with http://100.93.xxx.xxx:7480/. The generated URL list is as follows:
    http://100.93.xxx.xxx:7480/bucket01/dragon.png
    http://100.93.xxx.xxx:7480/bucket01/index.html

  6. Using the method and requirements described in the previous step, replace all shared S3 addresses in the list file with URLs. If the list contains a large number of S3 addresses, you can use a text editor (such as Notepad++) to replace them in batches, or use the awk sketch after this list.
  7. Based on the MgC requirements, edit the URL list file to include the shared URLs and file names in the following format:

    <SHARED-URL> <FILE-NAME>

    A shared URL and its file name are separated by a tab character. The name of a file in a subfolder must contain the subfolder path. For more requirements and restrictions on the URL list file, see What Are the Restrictions on Using MgC for Storage Migration?

    For example:
    http://100.93.xxx.xxx:7480/bucket01/dragon.png dragon.png
    http://100.93.xxx.xxx:7480/bucket01/index.html index.html

  8. After editing all URLs as required and verifying that the URLs are correct, save the URL list file.
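
If the bucket contains many files, the tab-separated URL list can also be generated directly from the file exported in step 4, instead of editing it by hand as described in steps 5 to 7. The following awk one-liner is a minimal sketch: it assumes the example RGW address, port, and bucket name used above, takes s3url.txt as input, and writes to a hypothetical output file urllist.txt; object keys containing spaces or tab characters would need extra handling.

    awk -v host="http://100.93.xxx.xxx:7480" -v bucket="bucket01" '{
        s3url = $4                          # s3cmd ls fields: date, time, size, s3 address
        url = s3url
        sub(/^s3:\/\//, host "/", url)      # build the browser-accessible URL
        key = s3url
        sub("^s3://" bucket "/", "", key)   # object key, including any subfolder path
        printf "%s\t%s\n", url, key         # URL and file name separated by a tab
    }' s3url.txt > urllist.txt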

Step 2: Uploading the URL List File to the OBS Bucket

  1. Sign in to the OBS console. In the navigation pane, choose Buckets.
  2. In the bucket list, click the created OBS bucket to go to the Objects page.
  3. Click Create Folder, enter a folder name (for example, cephUrl), and click OK.
  4. Click the name of the folder created in the previous step, and then click Upload Object.
  5. Upload the URL list file (s3url.txt in this example) to the folder in either of the following ways:

    • Drag the URL list file to the Upload Object box and click Upload.
    • In the Upload Object box, click add files, select the URL list file, and click Upload.
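
If you prefer the command line, the upload can also be done with Huawei Cloud's obsutil tool. This is a sketch that assumes obsutil is installed and configured with the AK/SK of an account that can write to the bucket; replace <OBS-BUCKET-NAME> with the name of the target OBS bucket:

    obsutil cp s3url.txt obs://<OBS-BUCKET-NAME>/cephUrl/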

Step 3: Creating a Storage Migration Workflow

  1. Sign in to the MgC console.
  2. In the navigation pane on the left, choose Migrate > Workflows. In the upper left corner of the page, select the migration project you created.
  3. Click Create Workflow in the upper right corner of the page.

  4. Select Storage Migration and click Configure Workflow.

  5. Set workflow basics based on Table 1.

    Table 1 Basic parameters

    • Name: User-defined.
    • Region: Select the region where the target bucket is located from the drop-down list.
    • Description: User-defined.
    • Cluster: Select the created cluster.

  6. Configure the migration source and target based on Table 2 and Table 3.

    Table 2 Parameters for configuring a migration source

    • Location Type: Select HTTP/HTTPS Source.
    • List Path: Enter the name of the folder (cephUrl/ in this example) where the URL list file is stored. Note that the folder name must end with a slash (/).

    Table 3 Parameters for configuring a migration target

    • Location Type: Select Huawei Cloud OBS.
    • AK and SK: Enter the AK/SK pair of the target Huawei Cloud account. The account must have the read and write permissions for the target bucket.
    • Bucket: Select the created OBS bucket.
    • Endpoint: Enter the endpoint of the region where the target bucket is located. For example, if the target bucket is located in the CN North-Beijing4 region of Huawei Cloud, enter obs.cn-north-4.myhuaweicloud.com. You can view the endpoint in the OBS bucket overview.
    • Specify Prefix (optional): Specify a prefix to rename or relocate objects migrated to the target bucket. For example, if you specify the prefix /D, source file /A/B/C.txt will be relocated to /D/A/B/C.txt after being migrated to the target bucket. For details, see Adding a Name Prefix or Path Prefix to Migrated Objects.

  7. Configure the migration settings based on Table 4.

    Table 4 Migration settings

    • Task Type: Select List migration to migrate the objects recorded in the list files.
    • Concurrent Subtasks: Specify the maximum number of concurrent subtasks. There cannot be more than 10 concurrent subtasks for each online migration node. For example, if there are 2 online migration nodes, you can specify at most 20 concurrent subtasks.
    • Overwrite Existing: Choose one of the following options:
      • Never: Files existing at the migration target will never be overwritten.
        WARNING:
        • If you choose Never for the initial migration, the attributes of involved parent folders at the source will not be migrated to the target. As a result, the folder attributes may be incomplete at the target. To avoid this issue, use the Never option with caution for the initial migration.
        • If a migration task is paused or interrupted and then restarted or resumed, the Never option will cause the system to skip files that were not completely migrated earlier, but the task may still be marked as successful. This affects data integrity. To avoid this issue, use the Never option with caution.
      • Always: Files existing at the migration target will always be overwritten.
      • If older or different size:
        • Files that already exist at the target will be overwritten if they are older than or have different sizes from the paired files at the source.
        • Verification will be performed for folders after their contents are migrated. Folders that already exist at the target will be overwritten if they have different last modification times, sizes, or permissions from the paired folders at the source.
        NOTE: For empty folders, the overwrite policy is the same as that for files.
    • Clear Cluster: Determine whether to clear the migration cluster after the migration is complete.
      • If you select this option, a step for clearing the migration cluster will be created in the workflow. You can also choose whether to clear resources used by the cluster, such as NAT gateways, security groups, and VPCEP resources.
      • If you do not select this option, a step for clearing the migration cluster will not be created in the workflow.

  8. (Optional) Configure advanced options based on Table 5.

    Table 5 Advanced options

    • Target Storage Class: Choose the storage class that your data will be migrated to in the target bucket. For details about storage classes, see Introduction to Storage Classes.
    • Enable KMS Encryption:
      • If you do not select this option, objects are in the same encryption status before and after the migration.
      • If you select this option, all migrated objects will be encrypted before they are stored in the target bucket.
      NOTE:
      • Using KMS to encrypt migrated data may slow down the migration speed by about 10%.
      • This option is only available when KMS is supported in the region you are migrating to.
    • Filter Source Data: Filter the files to be migrated by applying filters. For details about filters, see Source Data Filters.
    • Send SMN Notification: Determine whether to use SMN to get notifications about migration results.
      • If you do not select this option, no SMN messages are sent after the migration is complete.
      • If you select this option, after the migration is complete, SMN messages are sent to the subscribers of the selected topic. You can select the language and trigger conditions for sending messages.
    • Limit Traffic: Allocate the maximum bandwidth to be used by the workflow during a specified period.
      • If you do not select this option, migration traffic is not limited.
      • If you select this option, limit the migration traffic by setting Start Time, End Time, and Bandwidth Limit. For example, if you set Start Time to 08:00, End Time to 12:00, and Bandwidth Limit to 20 MB/s, the maximum migration speed is limited to 20 MB/s when the migration task runs from 08:00 to 12:00. The migration speed is not limited beyond this period.
      NOTE:
      • The rate limit ranges from 0 MB/s to 1,048,576 MB/s.
      • A maximum of five rules can be added.
      • The time is the local standard time of the region you are migrating to.
    • Schedule Migration: Schedule the migration to run during a specified period.
      • If you do not select this option, you need to manually start or stop the migration.
      • If you select this option, the migration runs during the specified period and stops beyond that period. For example:
        • If you set Start Time to 08:00 and End Time to 12:00, the migration task runs from 08:00 to 12:00 every day. The migration stops beyond that period.
        • If you set Start Time to 12:00 and End Time to 08:00, the migration runs from 12:00 of the current day to 08:00 of the next day. The migration stops beyond that period.

  9. Click Next: Confirm.
  10. Confirm the workflow settings and click Confirm. In the displayed dialog box, click Confirm to run the workflow immediately.
  11. In the workflow list, click the workflow name to go to its details page. You can view the configuration information and migration progress of the workflow.

Step 4: Restoring the Bucket Access Policy

After the migration is complete, restore the access policy of the source bucket.

  • If the Policy value obtained in step 1 was none, run the following command to delete the public access policy you added:
    s3cmd delpolicy s3://<BUCKET-NAME>
  • If the command output in step 1 contained an access policy, perform the following steps:
    1. Run the following command to delete the public access policy you added:
      s3cmd delpolicy s3://<BUCKET-NAME>
    2. Run the following command to restore the original access policy from the saved policy information:
      s3cmd setpolicy <Saved original policy>.json s3://<BUCKET-NAME>
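
For example, if the example source bucket originally had a policy that you saved to a file named original-policy.json (a hypothetical name), the restore commands are:

    s3cmd delpolicy s3://bucket01
    s3cmd setpolicy original-policy.json s3://bucket01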