DMS for Kafka

Huawei Cloud DMS for Kafka is a message queue service based on open-source Apache Kafka. It provides high throughput, data persistence, horizontal scalability, and stream data processing capabilities.

DataArts Migration supports different versions of Huawei Cloud DMS for Kafka. This ensures that you can seamlessly access the latest features of DMS for Kafka while maintaining compatibility with historical versions for stable data transmission and processing.

Preparation and Constraints

Network requirements
The DMS for Kafka data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.

Required permissions

Read permission: Grant the read-only permission of DMS for Kafka to the IAM user or user group of DataArts Migration through the DMS ReadOnlyAccess system policy. You can also create a custom policy to grant read permissions, such as the permission to query instance information.
Write permission: Grant the write permission of DMS for Kafka to the IAM user or user group of DataArts Migration through the DMS UserAccess or DMS FullAccess system policy. You can also create a custom policy to grant write permissions, such as the permission to create and modify instances.

Enabling ports: When configuring the DMS for Kafka data source, ensure that the following ports are enabled in the firewall and security group so that DataArts Migration can communicate with DMS for Kafka instances.

**Table 1** Service ports
Service	Port Type	Port Number	Usage
DMS-Kafka	TCP	9092	Intranet plaintext access port
DMS-Kafka	TCP	9093	Intranet ciphertext access port
DMS-Kafka	TCP	9094	Internet plaintext access port
DMS-Kafka	TCP	9095	Internet ciphertext access port
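
Before creating the data source, it can help to confirm that the ports in Table 1 are reachable from the machine that needs access. The following is a minimal sketch that uses only the Python standard library. The helper names (port_is_reachable, check_ports) are hypothetical and are not part of DataArts Migration or DMS for Kafka.

```python
import socket

# Ports from Table 1; the labels are shortened forms of the "Usage" column.
DMS_KAFKA_PORTS = {
    9092: "intranet plaintext",
    9093: "intranet ciphertext",
    9094: "Internet plaintext",
    9095: "Internet ciphertext",
}


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host: str, ports=DMS_KAFKA_PORTS, timeout: float = 3.0) -> dict:
    """Map each port number to True or False depending on reachability."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}
```

For example, check_ports("192.0.2.10") probes all four ports on a placeholder address. A successful TCP connection only shows that the firewall and security group allow traffic. It does not verify Kafka authentication or SSL settings.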

Supported Data Types

DataArts Migration reads and writes DMS for Kafka records in JSON format, and infers the record types. The following table lists the supported JSON data types.

JSON Data Type	Read	Write
STRING	√	√
INTEGER	√	√
LONG	√	√
DOUBLE	√	√
BOOLEAN	√	√
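
The type inference described above can be illustrated with a small sketch. This is not the DataArts Migration implementation. In particular, the split between INTEGER (32-bit range) and LONG (64-bit range) is an assumption made for this example, and the function names are hypothetical.

```python
import json

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def infer_json_type(value) -> str:
    """Map a decoded JSON value to one of the type names in the table above."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        # Assumption: INTEGER is 32-bit and LONG is 64-bit.
        if INT32_MIN <= value <= INT32_MAX:
            return "INTEGER"
        if INT64_MIN <= value <= INT64_MAX:
            return "LONG"
        raise ValueError(f"integer out of 64-bit range: {value}")
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    raise ValueError(f"unsupported JSON value: {value!r}")


def infer_record_types(record_json: str) -> dict:
    """Infer a type name for each top-level field of one JSON record."""
    record = json.loads(record_json)
    return {field: infer_json_type(value) for field, value in record.items()}
```

In this sketch, values that fall outside the table (null, arrays, and nested objects) raise ValueError rather than being mapped to a guessed type.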

Supported Migration Scenarios

DataArts Migration supports the following offline synchronization modes:

Single table synchronization
DataArts Migration synchronizes a single table or file when ingesting data into a data lake or migrating data to the cloud.
Database and table shard synchronization
DataArts Migration synchronizes data from multiple databases and tables when ingesting data into a data lake or migrating data to the cloud.
Entire DB migration
DataArts Migration synchronizes data from an on-premises database when ingesting data into a data lake or migrating data to the cloud.

Database and table shard synchronization and entire DB migration are supported only in some regions. The following table lists the supported DMS for Kafka migration scenarios.

Supported Migration Scenario	Single Table Read	Single Table Write	Database/Table Shard Read	Database/Table Shard Write	Entire DB Read	Entire DB Write
Supported	√	√	x	√	x	x

Core Capabilities

Connection configuration

Configuration Item	Supported	Description
SSL authentication	√	SSL authentication is supported for accessing DMS for Kafka. This ensures secure data transmission.
Connection attribute optimization	√	You can optimize connection attributes as needed, such as adjusting the connection timeout interval and heartbeat interval, to improve performance and stability.

Read capabilities

Configuration Item	Supported	Description
Incremental read	√	Incremental Kafka data can be filtered and read based on the start time and end time policy.
Shard concurrency	√	Kafka data can be read from different shards (Kafka partitions) concurrently. This improves resource utilization and read performance, and is suitable for large datasets.
Data type parsing	JSON/CSV	Data in JSON and CSV formats can be parsed.
Nested JSON data parsing	√	Nested JSON data structures can be parsed. JSON data with multiple layers of nested fields can be correctly processed, ensuring data integrity and accuracy.
Custom fields	√	You can add computed columns, constant columns, or masking functions for tasks to meet personalized service requirements.
Dirty data processing	x	Writing abnormal data to a dirty data bucket, which would prevent job failures caused by a small amount of abnormal data, is not supported.
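
Two of the read capabilities above, incremental read by time window and nested JSON parsing, can be sketched as follows. This is an illustration only, not how DataArts Migration is implemented. The function names are hypothetical, and the half-open window [start, end) and the dot-separated column names are assumptions made for this example.

```python
import json
from datetime import datetime, timezone


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def in_time_window(record_ts_ms: int, start_ms: int, end_ms: int) -> bool:
    # Assumption: start is inclusive and end is exclusive.
    return start_ms <= record_ts_ms < end_ms


def flatten(obj: dict, prefix: str = "", sep: str = ".") -> dict:
    """Flatten nested JSON objects into a single-level dict of dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path, sep))
        else:
            # Scalars and arrays are kept as-is in this sketch.
            flat[path] = value
    return flat


def read_window(records, start: datetime, end: datetime) -> list:
    """Keep records inside the window and flatten their JSON payloads.

    records is an iterable of (timestamp_ms, json_text) pairs.
    """
    start_ms, end_ms = to_epoch_ms(start), to_epoch_ms(end)
    return [
        flatten(json.loads(payload))
        for ts, payload in records
        if in_time_window(ts, start_ms, end_ms)
    ]
```

With a start time of 2024-01-01 00:00 UTC and an end time of 2024-01-02 00:00 UTC, only records whose timestamps fall within that day are returned, and a payload such as {"user": {"name": "a"}} becomes the single column user.name.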

Write capabilities

Configuration Item	Supported	Description
Data type parsing	JSON/CSV	Data in JSON and CSV formats can be written.
Concurrent write	√	Concurrent write can fully utilize cluster resources to improve the data write speed.
Dirty data processing	x	Writing abnormal data to a dirty data bucket, which would prevent job failures caused by a small amount of abnormal data, is not supported.