Updated on 2026-05-20 GMT+08:00

LakeFormation

Huawei Cloud LakeFormation is an enterprise-class, one-stop data lake construction service. It provides unified metadata management, fine-grained permission control, and compatibility with open-source ecosystems through decoupled storage and compute. These capabilities help enterprises build and operate data lakes efficiently.

DataArts Migration can efficiently migrate data from and to LakeFormation.

How It Works

DataArts Migration integrates with LakeFormation by writing native OBS files. It can write Hive data in PARQUET or ORC format to both partitioned and non-partitioned tables. Writing native files gives high write performance.
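To illustrate what native OBS files look like for a Hive table, the following minimal sketch builds the directory for a partition using the conventional Hive layout (key=value directories under the table location). The bucket name, table location, and partition columns are hypothetical, and partition values are assumed to need no escaping. The layout that DataArts Migration actually produces is not described here.

```python
def partition_path(table_location: str, partition: dict[str, str]) -> str:
    """Return the directory that holds one partition's files.

    Conventional Hive layout: <table location>/<key1>=<value1>/<key2>=<value2>
    """
    base = table_location.rstrip("/")
    if not partition:
        # Non-partitioned table: files sit directly under the table location.
        return base
    # Partition columns are nested in the order they are given.
    parts = [f"{key}={value}" for key, value in partition.items()]
    return "/".join([base, *parts])


# Hypothetical table location on OBS.
location = "obs://example-bucket/warehouse/sales.db/orders"
print(partition_path(location, {"dt": "2026-05-20", "region": "cn-north"}))
print(partition_path(location, {}))
```

Under this layout, a partitioned table and a non-partitioned table differ only in whether data files are nested under key=value directories.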

Preparation and Constraints

  • Network requirements

    The LakeFormation data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.

  • Required permissions
    • LakeFormation metadata read and write permissions: DataArts Migration reads and writes LakeFormation data. The LakeFormation CommonOperations or LakeFormation FullAccess system policy must be assigned to DataArts Migration. For details, see LakeFormation Permissions.
    • OBS write permission: DataArts Migration reads files from and writes files to OBS. The OBS OperateAccess or OBS Administrator system policy must be assigned to DataArts Migration.
  • Table format restrictions

    Currently, DataArts Migration can write data only to Hive tables in LakeFormation.

Supported Data Types

The following table lists the types of LakeFormation data that can be written.

Data Type       LakeFormation Data Type   Write
Numeric         TINYINT                   √
                SMALLINT                  √
                INT                       √
                BIGINT                    √
                FLOAT                     √
                DOUBLE                    √
                DECIMAL                   √
Boolean         BOOLEAN                   √
Character       CHAR                      √
                VARCHAR                   √
                STRING                    √
Date/Time       DATE                      √
                TIMESTAMP                 √
Binary          BYTEA                     √
Complex type    ARRAY                     √
                MAP                       √
                UNIONTYPE                 x
                STRUCT                    x
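The table above can be turned into a pre-flight check that flags columns a job cannot write. The sketch below treats the types marked x (UNIONTYPE and STRUCT) as not writable and the remaining listed types as writable. The function names and sample schema are hypothetical. It inspects only the outermost type of each column; how types nested inside ARRAY or MAP are handled is not stated in this document.

```python
import re

# Types listed in the table above without an "x" mark.
WRITABLE_TYPES = {
    "TINYINT", "SMALLINT", "INT", "BIGINT", "FLOAT", "DOUBLE", "DECIMAL",
    "BOOLEAN", "CHAR", "VARCHAR", "STRING", "DATE", "TIMESTAMP", "BYTEA",
    "ARRAY", "MAP",
}


def base_type(column_type: str) -> str:
    """Strip parameters: 'decimal(10,2)' -> 'DECIMAL', 'array<int>' -> 'ARRAY'."""
    head = re.split(r"[(<]", column_type.strip(), maxsplit=1)[0]
    return head.strip().upper()


def unwritable_columns(schema: dict[str, str]) -> list[str]:
    """Return the names of columns whose outermost type is not writable."""
    return [name for name, ctype in schema.items()
            if base_type(ctype) not in WRITABLE_TYPES]


# Hypothetical destination schema.
schema = {
    "id": "bigint",
    "price": "decimal(10,2)",
    "tags": "array<string>",
    "address": "struct<city:string,zip:string>",
}
print(unwritable_columns(schema))
```

Types that do not appear in the table at all are also reported, because the check is an allowlist rather than a blocklist.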

Supported File Storage Formats

The following table lists the LakeFormation file storage formats.

Data Source Storage Format   Write
PARQUET                      √
ORC                          √
AVRO                         x
JSON                         x
XML                          x
CSV                          x
TEXT                         x
RC                           x
SEQUENCE                     x
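One way to apply this restriction before creating a job is to look at the input format class recorded in a Hive table's metadata. The sketch below maps standard Apache Hive and Hadoop input format class names to the storage formats in the table above and accepts only PARQUET and ORC. Whether LakeFormation exposes table metadata in exactly this form is an assumption, and the function names are hypothetical.

```python
# Standard Apache Hive / Hadoop input format classes and the storage
# format each one implies.
INPUT_FORMAT_TO_STORAGE = {
    "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat": "PARQUET",
    "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat": "ORC",
    "org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat": "AVRO",
    # Plain text input; CSV and JSON tables typically also use this class
    # and differ only in their SerDe.
    "org.apache.hadoop.mapred.TextInputFormat": "TEXT",
    "org.apache.hadoop.mapred.SequenceFileInputFormat": "SEQUENCE",
    "org.apache.hadoop.hive.ql.io.RCFileInputFormat": "RC",
}

# Formats the table above lists as writable.
WRITABLE_FORMATS = {"PARQUET", "ORC"}


def storage_format(input_format_class: str) -> str:
    """Map an input format class name to a storage format, or 'UNKNOWN'."""
    return INPUT_FORMAT_TO_STORAGE.get(input_format_class, "UNKNOWN")


def is_writable_format(input_format_class: str) -> bool:
    """True only for tables stored as PARQUET or ORC."""
    return storage_format(input_format_class) in WRITABLE_FORMATS


print(is_writable_format("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"))
print(is_writable_format("org.apache.hadoop.mapred.TextInputFormat"))
```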

Supported Migration Scenarios

DataArts Migration supports the following modes for synchronizing on-premises data:

  • Single table synchronization

    DataArts Migration can synchronize a single table or file when ingesting data into a data lake or migrating data to the cloud.

  • Database and table shard synchronization

    DataArts Migration can synchronize data from multiple databases and tables when ingesting data into a data lake or migrating data to the cloud.

  • Entire DB migration

    DataArts Migration can synchronize all data in an on-premises database when ingesting data into a data lake or migrating data to the cloud.

Database and table shard synchronization and entire DB migration are not supported in all regions. The following table lists the supported LakeFormation data migration scenarios.

Supported Migration Scenario

Single Table Read

Single Table Write

Database/Table Shard Read

Database/Table Shard Write

Entire DB Read

Entire DB Write

Supported

x

x

x

x

Core Capabilities

  • Connection configuration

    Configuration Item      Supported   Description
    AK/SK authentication    √           AK/SK authentication is used to access LakeFormation.
    Agency authentication   √           An IAM agency authorizes roles to access the service.

  • Write capabilities

    Configuration Item                Supported              Description
    Write Mode                        LOAD, LOAD OVERWRITE   LOAD appends data to the destination table and is suitable for writing incremental data. LOAD OVERWRITE overwrites the data in the destination table or partition.
    Dirty Data Processing             x                      Not supported currently. Dirty data processing writes abnormal data to a dirty data bucket so that a small amount of abnormal data does not cause a job to fail.
    Concurrent Write                  √                      Concurrent write uses cluster resources in parallel to increase the data write speed.
    Table Creation in Editing State   √                      A destination table can be created while configuring a job that migrates data from a semi-structured or structured data source to LakeFormation.
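The difference between the LOAD and LOAD OVERWRITE write modes can be shown with a small in-memory model. This is only a sketch of the semantics described above, not how DataArts Migration is implemented: the table is a hypothetical dict that maps a partition key to its rows, and a non-partitioned table would use a single key.

```python
def load(table: dict, partition: str, rows: list, overwrite: bool = False) -> dict:
    """Write rows into one partition of an in-memory table.

    overwrite=False models LOAD: rows are appended to existing data.
    overwrite=True models LOAD OVERWRITE: existing data in the target
    partition is replaced; other partitions are left untouched.
    """
    if overwrite:
        table[partition] = list(rows)
    else:
        table.setdefault(partition, []).extend(rows)
    return table


table = {}
load(table, "dt=2026-05-19", ["a", "b"])
load(table, "dt=2026-05-20", ["c"])
load(table, "dt=2026-05-20", ["d"])                  # LOAD: appends
print(table["dt=2026-05-20"])                        # ['c', 'd']
load(table, "dt=2026-05-20", ["e"], overwrite=True)  # LOAD OVERWRITE: replaces
print(table)
```

LOAD suits incremental runs because earlier rows are kept, while LOAD OVERWRITE suits reruns that must leave the target table or partition holding only the newly written data.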

Creating a Data Source

Create a data source in Management Center. For details, see Configuring Data Connection Parameters.

Creating an Offline Data Migration Job

Create a LakeFormation migration job in DataArts Factory. For details, see Creating an Offline Processing Migration Job.