Updated on 2025-05-07 GMT+08:00

Survey

Big data migration is the process of transferring big data clusters, task scheduling platforms, and applications from one operating environment to another. It is divided into the three modules listed below. This section covers the migration of big data clusters and task scheduling platforms. For details about big data application migration, see Application Cloud Migration. For each migration type, this section focuses on the aspects that set it apart from the others.

  • Big data cluster migration: This is the relocation of big data clusters, including their storage, compute, and management components, to a new operating environment. It requires cluster reconfiguration and data migration, and must account for the data migration method, network throughput, system compatibility, and data consistency.
  • Big data task scheduling migration: This is the migration of existing big data task scheduling systems, workflows, and scheduling policies to a new operating environment. It requires assessing task dependencies, adapting or rebuilding tasks where necessary, optimizing performance, and then deploying, testing, and verifying the migrated tasks.
  • Big data application migration: This refers to the migration of individual big data applications from one operating environment to another.
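The task dependency assessment mentioned for task scheduling migration can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes the dependencies have already been exported from the scheduling platform into a mapping, and the task names are hypothetical. It uses the Python standard library's graphlib module to produce an order in which every task is migrated after the tasks it depends on.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on.
task_dependencies = {
    "load_orders": set(),
    "load_customers": set(),
    "join_orders_customers": {"load_orders", "load_customers"},
    "daily_report": {"join_orders_customers"},
}


def migration_order(dependencies):
    """Return the tasks ordered so that each task follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle, which
    usually means the exported workflow definition needs to be reviewed.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = migration_order(task_dependencies)
print(order)
```

Tasks with no mutual dependency (here, the two load tasks) may appear in either order, so the result should be read as one valid sequence rather than the only one.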

The big data migration process is as follows:

Figure 1 Big data migration process

For details about how to migrate big data applications, see Application Cloud Migration. With regard to application migration, this section describes only the precautions that are specific to big data applications.

The following outlines each phase of the big data migration process:
  1. Survey: Conduct a comprehensive assessment of the existing big data platform, detailing its current version, configuration specifications, resource quantities, data types, data volume, and the types and number of associated tasks.
  2. Design: Design the big data deployment architecture, data migration solution, task migration solution, and data verification solution.
  3. Deployment: Deploy the big data platform, including cluster deployment and task scheduling platform deployment.
  4. Migration: Migrate data and tasks.
  5. Verification: Verify data and tasks.
  6. Switchover: Switch over the big data application.
  7. Assurance: After the switchover, monitor the services in real time and provide dedicated O&M assurance for a defined period.

Use the survey methods described in Big Data Survey to assess the current status of the big data cluster, the big data task scheduling platform, and the big data applications.
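One way to keep the survey results usable in the design phase is to record them in a structured form. The sketch below is illustrative only: the class name, field names, and sample values are assumptions rather than part of any survey template, and the transfer-time estimate is a rough calculation (data volume divided by effective throughput) that assumes decimal units and a hypothetical link utilization factor.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSurvey:
    """Survey items for one big data cluster (field names are illustrative)."""

    platform_version: str
    node_count: int
    data_volume_tb: float
    data_types: list[str] = field(default_factory=list)
    # Number of tasks per task type, for example {"Hive SQL": 120}.
    task_counts: dict[str, int] = field(default_factory=dict)

    def total_tasks(self) -> int:
        """Total number of tasks across all task types."""
        return sum(self.task_counts.values())


def estimate_transfer_hours(data_volume_tb: float,
                            bandwidth_gbps: float,
                            utilization: float = 0.7) -> float:
    """Rough transfer time in hours for a given volume and link speed.

    Assumes decimal units (1 TB = 8000 Gbit). The utilization factor is an
    assumed share of the link that the migration can actually use.
    """
    if bandwidth_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("bandwidth must be positive and utilization in (0, 1]")
    gigabits = data_volume_tb * 8000
    seconds = gigabits / (bandwidth_gbps * utilization)
    return seconds / 3600


# Sample values are made up for illustration.
survey = ClusterSurvey(
    platform_version="example-3.x",
    node_count=50,
    data_volume_tb=100.0,
    data_types=["ORC", "Parquet"],
    task_counts={"Hive SQL": 120, "Spark": 40},
)
hours = estimate_transfer_hours(survey.data_volume_tb, bandwidth_gbps=10, utilization=0.8)
print(f"{survey.total_tasks()} tasks, about {hours:.1f} hours of transfer")
```

An estimate like this only bounds the raw copy time. It does not cover verification, incremental synchronization, or throttling applied to protect production workloads.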