Updated on 2025-08-20 GMT+08:00

Migrating Data from Ceph to Huawei Cloud OBS over HTTP

This section describes how to use MgC storage migration workflows to migrate data from Ceph to Huawei Cloud OBS over HTTP.

Constraints

The size limit for a single object is 4.76837158203125 TB (500 MB × 10,000 parts). If an object exceeds this limit, the migration may fail.
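As a quick arithmetic check, the figure above comes directly from the stated multipart limits (10,000 parts of 500 MB each), converted from MB to TB in binary units:

```shell
# Single-object size limit: 10,000 parts × 500 MB, converted MB -> GB -> TB
awk 'BEGIN { printf "%.14f TB\n", 500 * 10000 / 1024 / 1024 }'
```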

Preparations

  • Preparing a Huawei account

    Before using MgC, prepare a HUAWEI ID or an IAM user that can access MgC and obtain an AK/SK pair for the account or IAM user. For details, see Preparations.

  • Creating an application migration project

    Create a migration project on the MgC console. For details, see Managing Migration Projects. Set Project Type to Application migration.

  • Creating an OBS bucket

    On Huawei Cloud OBS, create a Standard bucket in the target region for storing URL list files and receiving source data. For details, see Creating a Bucket.

    If an IAM user is used for migration, the IAM user must have the read and write permissions for the target bucket. For details, see Granting an IAM User the Read/Write Permissions for a Bucket.

  • Creating a migration cluster

    Create a dedicated migration cluster for this migration. A cluster consists of a master node as well as several list and migration nodes. For details about how to create a cluster, see Creating a Migration Cluster.

Step 1: Generate URLs for Sharing and Downloading Ceph Files

In this section, bucket01 and http://100.93.xxx.xxx:7480 are used as examples. Replace them with your actual Ceph S3 bucket name and the Ceph RGW web access address and port.

Replace the following in the example with the actual values:

  • <bucket-name>: the bucket name
  • <file-name>: the name of the JSON file to be created
  • <domain>: the actual domain name or IP address of the Ceph RGW (RADOS Gateway) service
  • <port>: the actual access port of the Ceph RGW service

If the data in the bucket to be migrated can already be accessed using a browser, the bucket allows public read access. In that case, skip steps 1 to 3 and go to step 4.

  1. Run the following command to check whether an access policy has been configured for the source bucket:

    s3cmd info s3://<bucket-name> 
    • If the value of Policy is none in the command output, no access policy is configured for the bucket. Go to step 2.

    • If the value of Policy is not none in the command output, save the policy information for post-migration restoration.

  2. On the server where the s3cmd tool is installed, use a text editor to create a JSON file with a custom name. Copy the bucket policy below, which allows public read access to the objects in the specified S3 bucket, into the file. Replace <bucket-name> with the actual S3 bucket name, then save the file and exit the editor.

    {
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket-name>/*"
        }]
    }

    For more information, see Examples of Amazon S3 Bucket Policies.

  3. Use the s3cmd command line tool to set a bucket policy that allows public access to the files in the bucket.

    s3cmd setpolicy <file-name>.json s3://<bucket-name> 

    Replace <file-name> with the name of the JSON file created in step 2 and <bucket-name> with the actual S3 bucket name.

  4. List all files in the bucket and save the output to a text file:

    s3cmd ls -r s3://<bucket-name> > <file-URL>.txt

    Replace <bucket-name> with the actual S3 bucket name and <file-URL> with the name of the local file to save the output to. In this example, the output is saved to s3url.txt.

  5. Open the generated list file (s3url.txt in this example) to view the S3 URLs of all files in the bucket. Modify a URL by replacing s3:// and everything before it with your actual domain name and port in the format http://<domain>:<port>/. Test the modified URL in a browser. If the file opens, the access policy is configured correctly. If you receive an AccessDenied error, repeat steps 1 to 4 to configure the access policy again.

    For example, if the Ceph RGW service is running at IP address 100.93.xxx.xxx on port 7480, and the bucket named bucket01 contains two files (dragon.png and index.html), the original S3 URLs would be as follows:
    2024-07-26 03:09         3987  s3://bucket01/dragon.png
    2024-07-26 02:01         1701  s3://bucket01/index.html
    Replace s3:// and all preceding content with http://<domain>:<port>/. In this example, replace everything from the date up to s3:// with http://100.93.xxx.xxx:7480/. The resulting URLs are as follows:
    http://100.93.xxx.xxx:7480/bucket01/dragon.png
    http://100.93.xxx.xxx:7480/bucket01/index.html

  6. Update all S3 URLs in the list file based on the method described in the previous step. If the file contains a large number of entries, you can use a text editor (such as Notepad++) to perform batch replacements efficiently.
  7. Edit the URL list file to meet MgC requirements, ensuring that it includes both the shareable URLs and corresponding file names.

    <shared-URL> <file-name>

    Each shared URL must be followed by its corresponding file name, separated by a tab character. To move files to a specific subfolder at the target location, include the subfolder path in the file names. For additional requirements and limitations on the URL list file, see What Are the Restrictions on Using MgC for Storage Migration?

    For example:
    http://100.93.xxx.xxx:7480/bucket01/dragon.png dragon.png
    http://100.93.xxx.xxx:7480/bucket01/index.html index.html

  8. After editing and verifying all URLs, save the URL list file.
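Steps 5 to 7 above can also be scripted. The following is a minimal sketch, not part of the official procedure: it uses the example endpoint 100.93.xxx.xxx:7480 (substitute your actual <domain>:<port>), assumes the output file names s3url.txt and mgc_list.txt, and simulates the s3cmd ls output with a here-document so the commands can be run standalone.

```shell
# Example input: two lines mimicking the s3cmd ls -r output from step 4.
cat > s3url.txt <<'EOF'
2024-07-26 03:09         3987  s3://bucket01/dragon.png
2024-07-26 02:01         1701  s3://bucket01/index.html
EOF

# 1) sed: replace the date, time, size, and "s3://" prefix of each line
#    with the RGW endpoint, producing a shareable HTTP URL.
# 2) awk: append a tab and the object key (the path after the bucket
#    name), which MgC uses as the file name at the target.
sed -E 's|^.*s3://|http://100.93.xxx.xxx:7480/|' s3url.txt |
awk -F/ '{ key = $5; for (i = 6; i <= NF; i++) key = key "/" $i
           print $0 "\t" key }' > mgc_list.txt

cat mgc_list.txt
```

Review mgc_list.txt before uploading it; object keys that contain spaces or non-ASCII characters may need additional URL handling.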

Step 2: Upload the URL List File to the OBS Bucket

  1. Sign in to the OBS console. In the navigation pane, choose Buckets.
  2. In the bucket list, click the created OBS bucket to open the Objects page.
  3. Click Create Folder, enter a folder name (for example, cephUrl), and click OK.
  4. Click the name of the folder created in the previous step. Click Upload Object.
  5. Upload the URL list file (s3url.txt in this example) to the folder in either of the following ways:

    • Drag the URL list file to the Upload Object box and click Upload.
    • In the Upload Object box, click add files, select the URL list file, and click Upload.

Step 3: Create a Storage Migration Workflow

  1. Sign in to the MgC console. In the navigation pane, under Project, select the created application migration project from the drop-down list.
  2. In the navigation pane, choose Workflows.
  3. Click Create Workflow in the upper right corner of the page.
  4. Select Storage Migration Template and click OK to open the page for creating a workflow.
  5. Set workflow basics based on Table 1.

    Table 1 Basic settings

    • Name: User-defined.
    • Region: Select the region where the target bucket is located from the drop-down list.
    • Description: User-defined.
    • Cluster: Select the created cluster.

  6. Configure the migration source and target based on Table 2 and Table 3.

    Table 2 Parameters for configuring a migration source

    • Location Type: Select HTTP/HTTPS Source.
    • List Path: Enter the name of the folder (cephUrl/ in this example) where the URL list file is stored. The folder name must be suffixed with a slash (/).

    Table 3 Parameters for configuring a migration target

    • Location Type: Select Huawei Cloud OBS.
    • AK/SK: Enter the AK/SK pair of the target Huawei Cloud account. The account must have the read and write permissions for the target bucket.
    • Bucket: Select the created OBS bucket.
      NOTE: Only Standard and Infrequent Access buckets are supported.
    • Endpoint: Enter the endpoint of the region where the target bucket is located. For example, if the target bucket is located in the CN North-Beijing4 region of Huawei Cloud, enter obs.cn-north-4.myhuaweicloud.com.
      NOTE: You can view the endpoint in the OBS bucket overview.
    • Specify Prefix: This parameter is optional. Specify a prefix to rename or relocate objects migrated to the target bucket. For example, if you specify the prefix /D, the source file /A/B/C.txt will be stored as /D/A/B/C.txt in the target bucket. For details, see Adding a Name Prefix or Path Prefix to Migrated Objects.

  7. Configure migration settings based on Table 4.

    Table 4 Migration settings

    • Task Type
      • Full migration: Migrates all data in a source bucket or in specific paths.
      • List migration: Migrates objects recorded in list files. In the List Path box, enter the path of the object list files stored in the target bucket. Restrictions on an object list file vary with the target location.
        • Target location: Huawei Cloud OBS
          • An object list file cannot exceed 30 MB.
          • An object list file must be a .txt file, and its Content-Type metadata must be text/plain.
          • An object list file must be in UTF-8 without BOM.
          • Each line in an object list file can contain only one object name, and the object name must be URL encoded.
          • Each line in an object list file cannot exceed 16 KB, or the migration will fail.
          • The Content-Encoding metadata of an object list file must be left empty, or the migration will fail.
          • An object list file can contain a maximum of 10,000 lines.
        • Target location: NAS
          • An object list file cannot exceed 30 MB.
          • An object list file must be a .txt file.
          • An object list file must be in UTF-8 without BOM.
          • Each line in an object list file can contain only one object name, and the object name must be URL encoded.
          • Each line in an object list file cannot exceed 16 KB, or the migration will fail.
          • An object list file can contain a maximum of 10,000 lines.
      • Prefix migration: This option is only available for cloud storage migration. If you enter a file name or name prefix in the Prefix box, only the objects that exactly match the specified name or prefix are migrated.
        NOTICE:
        • If the files to be migrated are stored in the root directory of the source bucket, enter their name prefixes directly. If the files are stored in a non-root directory, enter their directories and name prefixes in the format <folder-name>/<prefix>.
        • Use commas (,) to separate multiple prefixes.
    • Listing Mode
      NOTE: This parameter is available only when Task Type is set to Full migration or Prefix migration.
      • Serial: This is the default listing mode if the source is a bucket.
      • Parallel: This is the default listing mode if the source is a parallel file system (PFS). If this mode is selected when the source is a bucket, the listing operation may take a long time.
    • Concurrent Subtasks: User-defined. Each online migration node supports a maximum of 10 concurrent subtasks. For example, if there are 2 online migration nodes, you can configure up to 20 concurrent subtasks.
    • Overwrite Existing
      • Never: Files existing at the target will never be overwritten.
        WARNING:
        • If you choose Never for the initial migration, the attributes of involved parent folders at the source will not be migrated to the target, so the folder attributes may be incomplete at the target. Use the Never option with caution for the initial migration.
        • If you choose Never, restarting a migration after an interruption or pause may lead to incomplete data migration, even though the task may appear successful. This could impact data integrity, so use the Never option with caution.
      • Always: Files existing at the target will always be overwritten.
      • If older or different size:
        • The system replaces existing target files if they are older than or differ in size from their source counterparts. Files with matching modification times and sizes remain unchanged and are skipped.
        • The system verifies folders after their content is migrated. Folders that already exist at the target are overwritten if their last modification times, sizes, or permissions differ from those of the paired folders at the source.
          NOTE: Empty folders are overwritten based on the same policy as files.
      • If different CRC64 checksum: If a source object has a CRC64 checksum different from that of the paired target object, the source object overwrites the target one. Otherwise, the source object is skipped during the migration. If either object lacks a CRC64 checksum, their sizes and last modification times are compared instead.
        NOTE:
        • This option is only available for migration within Huawei Cloud or from Alibaba Cloud or Tencent Cloud.
        • Using this option requires that target OBS buckets be whitelisted for the CRC64 feature.
    • Consistency Verification
      • Size and last modified: With this default method, the system checks data consistency by comparing object size and last modification time.
      • CRC64 checksum: The system checks data consistency by comparing the CRC64 values in the metadata of source and target objects. If a source object or the paired target object does not have a CRC64 checksum, the OMS-calculated CRC64 value is used for verification. CRC64 verification may generate extra public traffic and request costs. For details, see Consistency Verification.
        NOTE:
        • This option is only available for migration within Huawei Cloud, from AWS, Alibaba Cloud, or Tencent Cloud, or from NAS_NFS_V3_MOUNT and NAS_NFS_V3_PROTOCOL sources.
        • Using this option requires that target OBS buckets be whitelisted for the CRC64 feature.
    • Migrate Metadata: Determine whether to migrate metadata.
      • If you select this option, object metadata will be migrated.
      • If you do not select this option, only the Content-Type and Content-Encoding metadata will be migrated.
    • Clear Cluster: Determine whether to clear the migration cluster after the migration is complete.
      • If you select this option, a step for clearing the migration cluster is added to the workflow. You can also choose whether to clear the resources used by the cluster, such as NAT gateways, security groups, and VPCEP resources.
      • If you do not select this option, no clearing step is added to the workflow, but the migration cluster and its resources are automatically deleted 30 days after the workflow is created.

  8. (Optional) Configure advanced options based on Table 5.

    Table 5 Advanced options

    • Target Storage Class: Choose the storage class that your data will be migrated to. For details about storage classes, see How Do I Choose Storage Classes?
      NOTE: CRC64-based consistency verification is not available for the Archive and Deep Archive storage classes. Even if you choose the CRC64-based verification method, the system automatically uses the object size and last modification time to verify data consistency.
    • Enable KMS Encryption
      • If you do not select this option, whether migrated data is encrypted in the target bucket depends on the server-side encryption setting of the bucket.
      • If you select this option, all migrated objects will be encrypted before they are stored in the target bucket.
      NOTE:
      • Using KMS to encrypt migrated data may slow down the migration speed by about 10%.
      • This option is only available when KMS is supported in the region you are migrating to.
    • Filter Source Data: Filter the files to be migrated by applying filters. For details about filters, see Source Data Filters.
    • Send SMN Notification: Determine whether to use SMN to get notifications about migration results.
      • If you do not select this option, no SMN messages will be sent after the migration is complete.
      • If you select this option, SMN messages will be sent to the subscribers of the selected topic after the migration is complete. You can select the language and trigger conditions for sending messages.
    • Limit Traffic: Set the maximum bandwidth to be used by the migration workflow during different periods.
      • If you do not select this option, migration traffic is not limited.
      • If you select this option, limit the migration traffic by setting Start Time, End Time, and Bandwidth Limit. For example, if you set Start Time to 08:00, End Time to 12:00, and Bandwidth Limit to 20 MB/s, the maximum migration speed is limited to 20 MB/s from 08:00 to 12:00, and the speed is not limited outside this period.
      NOTE:
      • The bandwidth limit ranges from 1 MB/s to 1,048,576 MB/s.
      • A maximum of five rules can be added.
      • The time is the local standard time of the region you are migrating to.
    • Schedule Migration: Schedule the migration to run during a specified period.
      • If you do not select this option, you need to manually start or stop the migration.
      • If you select this option, the migration runs during the specified period and stops outside it. For example:
        • If you set Start Time to 08:00 and End Time to 12:00, the migration runs from 08:00 to 12:00 every day and stops outside that period.
        • If you set Start Time to 12:00 and End Time to 08:00, the migration runs from 12:00 of the current day to 08:00 of the next day and stops outside that period.

  9. Click Next: Confirm.
  10. Confirm the workflow settings and click Confirm. In the displayed dialog box, click Confirm to run the workflow immediately.
  11. In the workflow list, click the workflow name to go to its details page. You can view the configuration information and migration progress of the workflow.

Step 4: Restore the Bucket Access Policy

After the migration is complete, restore the access policy of the source bucket.

  • If the command output in step 1 is Policy: none, run the following command to delete the added public access policy:
    s3cmd delpolicy s3://<bucket-name> 
  • If the command output in step 1 contains an access policy, perform the following steps:
    1. Run the following command to delete the added public access policy:
      s3cmd delpolicy s3://<bucket-name> 
    2. Run the following command to restore the original access policy:
      s3cmd setpolicy <saved-original-policy>.json s3://<bucket-name>