Updated on 2025-05-07 GMT+08:00

Data Survey

Table 1 describes what to survey about the data, the purpose of each item, and typical examples.

Table 1 Data survey

| Survey Content | Purpose | Example |
| --- | --- | --- |
| Data type | Select a proper migration tool based on the data type. | HDFS, HBase, and MySQL |
| Data volume | Obtain the historical data volume to evaluate the historical data migration period. Obtain the daily incremental data volume to evaluate the daily incremental data synchronization period. | Historical data: X PB. Daily incremental data: Y TB |
| Data layers | Survey the data layers to determine the migration priority and data verification standards. | Data access layer, intermediate layer, and result layer |
| Data permissions | Determine how to migrate permission data based on the permission control component used at the source. | Sentry and Ranger |
| Data importance | Distinguish core data from non-core data to set migration priorities and data verification standards. | Core data: transaction data. Non-core data: log data |
| Data update frequency | Determine the data migration plan and verification plan based on the update frequency. | Daily, weekly, monthly, or real-time updates |
| Task execution interval | Schedule data migration and data verification so that their peak hours do not coincide with service peak hours. | Offline tasks run before and after working hours |
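
The data volume item in Table 1 comes down to simple arithmetic: divide the volume to move by the effective transfer rate. The following is a minimal sketch of that estimate, assuming a dedicated link with a known bandwidth and a constant utilization factor. The function names, the 10 Gbit/s link, the 0.7 utilization, and the sample volumes are hypothetical and are not part of the survey; replace them with the values obtained for X and Y.

```python
def transfer_seconds(volume_tb: float, bandwidth_gbps: float,
                     utilization: float = 0.7) -> float:
    """Seconds needed to move volume_tb over a link of bandwidth_gbps.

    Uses decimal units (1 TB = 8000 Gbit). `utilization` is the assumed
    share of the nominal bandwidth that migration traffic can actually use.
    """
    if volume_tb < 0 or bandwidth_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid volume, bandwidth, or utilization")
    gigabits = volume_tb * 8000
    return gigabits / (bandwidth_gbps * utilization)


def historical_migration_days(volume_pb: float, bandwidth_gbps: float,
                              utilization: float = 0.7) -> float:
    """Days needed to migrate the historical data (1 PB = 1000 TB)."""
    return transfer_seconds(volume_pb * 1000, bandwidth_gbps, utilization) / 86400


def daily_sync_hours(volume_tb: float, bandwidth_gbps: float,
                     utilization: float = 0.7) -> float:
    """Hours needed each day to synchronize the daily incremental data."""
    return transfer_seconds(volume_tb, bandwidth_gbps, utilization) / 3600


# Hypothetical inputs: 1 PB of historical data, 10 TB of daily increments,
# a 10 Gbit/s link, 70% usable bandwidth.
print(round(historical_migration_days(1, 10), 1))  # 13.2
print(round(daily_sync_hours(10, 10), 1))          # 3.2
```

If the daily synchronization estimate approaches 24 hours, incremental data accumulates faster than it can be synchronized, so the bandwidth or the migration plan needs to be revisited.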

Most of this information is collected from the existing big data platform. Questionnaires and interviews are used to supplement and confirm it.
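
Once the task execution intervals in Table 1 are known, the windows left for migration and verification can be derived by subtracting the busy intervals from the day. The following sketch illustrates one way to do this, assuming intervals are given as whole hours within a single day. The function name `free_windows` and the sample intervals are hypothetical.

```python
def free_windows(busy, day_start=0, day_end=24):
    """Return the gaps in [day_start, day_end) not covered by a busy interval.

    `busy` is a list of (start_hour, end_hour) tuples; intervals may
    overlap or touch and may be given in any order.
    """
    gaps = []
    cursor = day_start  # earliest hour not yet known to be busy
    for start, end in sorted(busy):
        if start >= day_end:
            break
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        gaps.append((cursor, day_end))
    return gaps


# Hypothetical survey result: service peak from 09:00 to 18:00, offline
# tasks from 06:00 to 09:00 and from 18:00 to 21:00.
busy = [(9, 18), (6, 9), (18, 21)]
print(free_windows(busy))  # [(0, 6), (21, 24)]
```

The resulting windows are candidates only; whether they are long enough depends on the data volume and transfer rate.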