Updated on 2026-05-20 GMT+08:00

MRS HBase data source

The data integration service supports MRS HBase 1.x and 2.x, meeting data synchronization requirements of different users in different deployment environments.

Preparation and Constraints

  • Network requirements

    The MRS HBase data source must be able to communicate with the CDM network so that data can be transmitted. For details, see Enabling Network Connectivity.

  • Required permissions
    • Read permission: To read data from HBase, grant the HBase read-only permission to the IAM user or user group used by the data integration service, for example, through the MRS ReadOnlyAccess system policy. Alternatively, create a custom policy that grants read-related permissions such as SELECT.
    • Write permission: To write data to HBase, in addition to the preceding read permission, grant the HBase write permission to the IAM user or user group used by the data integration service, for example, through the MRS CommonOperations or MRS FullAccess system policy. Alternatively, create a custom policy that grants write-related permissions such as INSERT INTO TABLE and CREATE TABLE.
  • Access port: The bound ports vary depending on the MRS version. For details, see Common Ports for MRS Cluster Services.
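    Table 1 Service ports

    | Service     | Protocol    | Port Number         | Usage                                                                                    |
    |-------------|-------------|---------------------|------------------------------------------------------------------------------------------|
    | MRS Manager | TCP         | 28443               | Download the cluster configuration.                                                      |
    | MRS Manager | TCP         | 20009               | CAS authentication.                                                                      |
    | MRS Manager | TCP         | 20029               | Internal communication of Manager.                                                       |
    | KDC         | TCP and UDP | 21730, 21731, 21732 | Kerberos authentication.                                                                 |
    | HDFS        | TCP         | 8020                | HDFS NameNode service port.                                                              |
    | HDFS        | TCP         | 9866                | HDFS DataNode service port.                                                              |
    | HBase       | TCP         | 16000               | HBase Master RPC port.                                                                   |
    | HBase       | TCP         | 16020               | HBase RegionServer RPC port.                                                             |
    | ZooKeeper   | TCP         | 2181                | ZooKeeper service port used for communication between the client and ZooKeeper cluster.  |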
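
Before creating the data source, it can help to confirm that the listed ports are reachable from the machine that runs the migration. The following is a minimal sketch using only the Python standard library. The function names and the `MRS_TCP_PORTS` dictionary are illustrative and not part of any MRS or DataArts tool; the port numbers are copied from Table 1 and may differ in your MRS version. The probe covers TCP only, so the UDP side of the KDC ports is not checked.

```python
import socket

# Port numbers copied from Table 1; they may vary with the MRS version.
# The KDC ports also use UDP, which this TCP-only probe does not cover.
MRS_TCP_PORTS = {
    "MRS Manager": [28443, 20009, 20029],
    "KDC": [21730, 21731, 21732],
    "HDFS": [8020, 9866],
    "HBase": [16000, 16020],
    "ZooKeeper": [2181],
}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_cluster(host: str, ports=None, timeout: float = 3.0) -> dict:
    """Probe every listed port on one host; returns {(service, port): reachable}."""
    ports = MRS_TCP_PORTS if ports is None else ports
    return {
        (service, port): tcp_port_open(host, port, timeout)
        for service, port_list in ports.items()
        for port in port_list
    }
```

For example, `check_cluster("192.0.2.10", timeout=1.0)` probes all ports on one node, where `192.0.2.10` is a placeholder address. In a real cluster the services usually run on different nodes, so each service would be probed against its own host.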

Supported Migration Scenarios

DataArts Migration supports the following offline synchronization modes:

  • Single table synchronization

    DataArts Migration can synchronize a single table or file when ingesting data into a data lake or migrating data to the cloud.

  • Database and table shard synchronization

    DataArts Migration can synchronize data from multiple databases and tables when ingesting data into a data lake or migrating data to the cloud.

  • Entire DB migration

    DataArts Migration can synchronize data from an on-premises database when ingesting data into a data lake or migrating data to the cloud.

Database and table shard synchronization and entire DB migration are available only in some regions. The following table lists the migration scenarios supported by MRS HBase.

Supported Migration Scenario

Single Table Read

Single Table Write

Database/Table Shard Read

Database/Table Shard Write

Entire DB Read

Entire DB Write

Supported

x

x

x

Core Capabilities

  • Connection configuration

    | Configuration Item  | Supported        | Description |
    |---------------------|------------------|-------------|
    | Authentication Mode | SIMPLE, KERBEROS | The MRS cluster can be accessed in SIMPLE/KERBEROS authentication mode. |
    | Supported Versions  | 1.x, 2.x         | Supports the read and write capabilities of HBase 1.x/2.x. |
    | SSL Authentication  | x                | HBase SSL authentication access, which ensures data transmission security. Currently, this capability is not supported. |

  • Read capabilities

    | Configuration Item    | Supported | Description |
    |-----------------------|-----------|-------------|
    | Incremental read      |           | The [RowKey condition] or [Start/End time] mode is supported to implement incremental read. |
    | Shard concurrency     |           | Supports horizontal sharding based on regions and multi-thread parallel extraction, significantly improving throughput efficiency. |
    | Custom fields         |           | You can add computed columns, constant columns, or masking functions for tasks to meet personalized service requirements. |
    | Dirty data processing |           | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |
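
The incremental read and shard concurrency items can be pictured together: a RowKey condition and a start/end time narrow the data to read, and each region is then scanned by its own worker thread. The sketch below is a conceptual model only. It uses an in-memory list as a stand-in for an HBase table, and every name in it (`Cell`, `clip_range`, `scan_shard`, `incremental_read`) is hypothetical; it is not the DataArts Migration implementation or the HBase client API. The time window is half-open, [min_ts, max_ts), which mirrors the convention HBase uses for scan time ranges.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, NamedTuple, Optional, Tuple

KeyRange = Tuple[bytes, bytes]  # [start, end); b"" means "unbounded"


class Cell(NamedTuple):
    row_key: bytes
    timestamp_ms: int
    value: bytes


def clip_range(region: KeyRange, start_row: bytes, stop_row: bytes) -> Optional[KeyRange]:
    """Intersect one region's key range with the requested RowKey condition."""
    region_start, region_end = region
    lo = max(region_start, start_row)  # b"" sorts first, so max() handles "unbounded"
    if not region_end:
        hi = stop_row
    elif not stop_row:
        hi = region_end
    else:
        hi = min(region_end, stop_row)
    if hi and lo >= hi:
        return None  # the region lies entirely outside the requested range
    return lo, hi


def scan_shard(table: List[Cell], key_range: KeyRange, min_ts: int, max_ts: int) -> List[Cell]:
    """Return cells in one shard whose timestamp falls in [min_ts, max_ts)."""
    lo, hi = key_range
    return [
        c for c in table
        if c.row_key >= lo and (not hi or c.row_key < hi)
        and min_ts <= c.timestamp_ms < max_ts
    ]


def incremental_read(table, regions, start_row=b"", stop_row=b"",
                     min_ts=0, max_ts=2**63 - 1, workers=4) -> List[Cell]:
    """Scan each overlapping region in its own thread and merge results in region order."""
    shards = [r for r in (clip_range(reg, start_row, stop_row) for reg in regions) if r]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda rng: scan_shard(table, rng, min_ts, max_ts), shards)
        return [cell for part in parts for cell in part]
```

In this model, a job that runs periodically would pass the previous run's end time as `min_ts` and the current run's start time as `max_ts`, so that each run picks up only the cells written in between.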

  • Write capabilities

    | Configuration Item       | Supported | Description |
    |--------------------------|-----------|-------------|
    | Clear Data Before Import |           | Data can be cleared before being imported. |
    | Dirty data processing    |           | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |
    | Concurrent write         |           | Concurrent write improves efficiency. |
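
The interplay between concurrent write and dirty data processing can be sketched as follows: records are split into batches, each batch is written by a worker thread, and records that cannot be written are set aside instead of failing the job. This is an illustrative model with assumptions stated up front: the sink is a plain dictionary rather than an HBase table, the dirty data bucket is a Python list, and the `max_dirty` threshold is a hypothetical parameter introduced here to show how a job might still fail when too many records are abnormal.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class DirtyDataLimitExceeded(Exception):
    """Raised when more records were rejected than the job tolerates."""


def write_batch(batch, sink, dirty, lock):
    """Write one batch; route records that fail validation to the dirty list."""
    for record in batch:
        try:
            row_key = record["row_key"]
            if not row_key:
                raise ValueError("empty row key")
            columns = dict(record["columns"])
        except (KeyError, TypeError, ValueError) as exc:
            with lock:
                dirty.append((record, repr(exc)))  # keep the record and the reason
        else:
            with lock:
                sink[row_key] = columns  # stand-in for an HBase put


def concurrent_write(records, sink, batch_size=2, workers=4, max_dirty=10):
    """Write records in parallel batches and return the list of dirty records."""
    batches = [records[i:i + batch_size] for i in range(0, len(records), batch_size)]
    dirty, lock = [], Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any unexpected worker error.
        list(pool.map(lambda b: write_batch(b, sink, dirty, lock), batches))
    if len(dirty) > max_dirty:
        raise DirtyDataLimitExceeded(f"{len(dirty)} dirty records exceed limit {max_dirty}")
    return dirty
```

With a small number of bad records the call returns them for later inspection while the valid records are written; once the count passes `max_dirty`, the sketch raises an error so that a badly malformed source does not go unnoticed.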

Creating a Data Source

Create a data source in Management Center. For details, see Configuring Data Connection Parameters.

Creating an Offline Data Migration Job

Create an offline processing migration job in DataArts Studio data development. For details, see Creating an Offline Processing Migration Job.