DMS for Kafka
Huawei Cloud DMS for Kafka is a message queue service based on open-source Apache Kafka. It provides high throughput, data persistence, horizontal scalability, and stream data processing capabilities.
DataArts Migration supports different versions of Huawei Cloud DMS for Kafka. This ensures that you can seamlessly access the latest features of DMS for Kafka while maintaining compatibility with historical versions for stable data transmission and processing.
Preparation and Constraints
- Network requirements
The DMS for Kafka data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.
- Required permissions
- Read permission: Grant the read-only permission of DMS for Kafka to the IAM user or user group of DataArts Migration through the DMS ReadOnlyAccess system policy. You can also create a custom policy to grant read permissions, such as the permission to query instance information.
- Write permission: Grant the write permission of DMS for Kafka to the IAM user or user group of DataArts Migration through the DMS UserAccess or DMS FullAccess system policy. You can also create a custom policy to grant write permissions, such as the permission to create and modify instances.
- Enabling ports: When configuring the DMS for Kafka data source, ensure that the following ports are enabled in the firewall and security group so that DataArts Migration can communicate with DMS for Kafka instances.
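Table 1 Service ports
| Service | Port Type | Port Number | Usage |
|---|---|---|---|
| DMS-Kafka | TCP | 9092 | Intranet plaintext access port |
| DMS-Kafka | TCP | 9093 | Intranet ciphertext access port |
| DMS-Kafka | TCP | 9094 | Internet plaintext access port |
| DMS-Kafka | TCP | 9095 | Internet ciphertext access port |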
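Before configuring the data source, it can help to confirm that the ports listed in Table 1 are reachable from the host that runs the migration. The following is a minimal sketch using only the Python standard library. The helper names `is_port_reachable` and `check_dms_kafka_ports` are hypothetical and are not part of DataArts Migration or DMS for Kafka. A successful TCP connection only shows that the firewall and security group allow traffic; it does not verify Kafka authentication.

```python
import socket

# Port numbers and usages are copied from Table 1.
DMS_KAFKA_PORTS = {
    9092: "Intranet plaintext access",
    9093: "Intranet ciphertext access",
    9094: "Internet plaintext access",
    9095: "Internet ciphertext access",
}


def is_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_dms_kafka_ports(host: str, timeout: float = 3.0) -> dict:
    """Map each Table 1 port to whether it accepted a TCP connection."""
    return {
        port: is_port_reachable(host, port, timeout)
        for port in DMS_KAFKA_PORTS
    }
```

For example, `check_dms_kafka_ports("<broker address>")` returns a dictionary such as `{9092: True, 9093: False, ...}`, where the broker address is whatever your instance exposes. Only the ports that match your access mode (intranet or Internet, plaintext or ciphertext) need to be open.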
Supported Data Types
DataArts Migration reads and writes DMS for Kafka records in JSON format, and infers the record types. The following table lists the supported JSON data types.
| JSON Data Type | Read | Write |
|---|---|---|
| STRING | √ | √ |
| INTEGER | √ | √ |
| LONG | √ | √ |
| DOUBLE | √ | √ |
| BOOLEAN | √ | √ |
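To illustrate what inferring record types from a JSON record can look like, the sketch below maps each top-level value of a JSON object to one of the type names in the table. It is not the inference logic that DataArts Migration actually uses. In particular, the rule that integers outside the signed 32-bit range become LONG is an assumption, and the function names are hypothetical.

```python
import json

# Assumption: INTEGER covers the signed 32-bit range; larger values are LONG.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def infer_json_type(value) -> str:
    """Return the table's type name for a single decoded JSON value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER" if INT32_MIN <= value <= INT32_MAX else "LONG"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, str):
        return "STRING"
    # null, arrays, and nested objects are outside the five listed types.
    raise TypeError(f"unsupported JSON value: {value!r}")


def infer_record_types(raw: str) -> dict:
    """Infer a type name for every top-level field of a JSON object."""
    record = json.loads(raw)
    return {key: infer_json_type(val) for key, val in record.items()}
```

Calling `infer_record_types('{"id": 7, "price": 9.5, "active": true}')` yields a field-to-type mapping that can be compared with the destination schema before a job is started.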
Supported Migration Scenarios
DataArts Migration supports the following offline synchronization modes:
- Single table synchronization
DataArts Migration supports table/file synchronization in data ingestion into a data lake or data migration to the cloud.
- Database and table shard synchronization
DataArts Migration supports synchronization of data from multiple databases and tables in data ingestion into a data lake or data migration to the cloud.
- Entire DB migration
DataArts Migration supports synchronization of data from an on-premises database in data ingestion into a data lake or data migration to the cloud.
Database and table shard synchronization and entire DB migration are not supported in all regions. The following table lists the supported DMS for Kafka migration scenarios.
| Supported Migration Scenario | Single Table Read | Single Table Write | Database/Table Shard Read | Database/Table Shard Write | Entire DB Read | Entire DB Write |
|---|---|---|---|---|---|---|
| Supported | √ | √ | x | √ | x | x |
Core Capabilities
- Connection configuration
| Configuration Item | Supported | Description |
|---|---|---|
| SSL authentication | √ | SSL authentication is supported for accessing DMS for Kafka. This ensures secure data transmission. |
| Connection attribute optimization | √ | You can optimize connection attributes as needed, such as adjusting the connection timeout interval and heartbeat interval, to improve performance and stability. |
- Read capabilities
| Configuration Item | Supported | Description |
|---|---|---|
| Incremental read | √ | Incremental Kafka data can be filtered and read based on the start time and end time policy. |
| Shard concurrency | √ | Kafka data can be read from different shards concurrently. This improves resource utilization and read performance, and is suitable for large datasets. |
| Data type parsing | JSON/CSV | Data in JSON and CSV formats can be parsed. |
| Nested JSON data parsing | √ | Nested JSON data structures can be parsed. JSON data with multiple layers of nested fields can be correctly processed, ensuring data integrity and accuracy. |
| Custom fields | √ | You can add computed columns, constant columns, or masking functions for tasks to meet personalized service requirements. |
| Dirty data processing | x | Abnormal data cannot be written to a dirty data bucket, a mechanism that would otherwise prevent job failures caused by a small amount of abnormal data. |
- Write capabilities
| Configuration Item | Supported | Description |
|---|---|---|
| Data type parsing | JSON/CSV | Data in JSON and CSV formats can be written. |
| Concurrent write | √ | Concurrent write can fully utilize cluster resources to improve the data write speed. |
| Dirty data processing | x | Abnormal data cannot be written to a dirty data bucket, a mechanism that would otherwise prevent job failures caused by a small amount of abnormal data. |
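Two of the read capabilities above, incremental read by start and end time and nested JSON parsing, can be pictured with a small standalone sketch. It does not use a Kafka client and is not how DataArts Migration implements these features. The function names, the half-open time window (start inclusive, end exclusive), and the dot-separated column names for nested fields are all assumptions chosen for illustration.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def flatten_json(obj: Dict[str, Any], parent: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested JSON objects into a single-level dict.

    Nested keys are joined with `sep`; arrays and scalars are kept as-is.
    """
    flat: Dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten_json(value, path, sep))
        else:
            flat[path] = value
    return flat


def read_window(
    records: Iterable[Tuple[int, str]], start_ms: int, end_ms: int
) -> Iterator[Dict[str, Any]]:
    """Yield flattened records whose timestamp lies in [start_ms, end_ms).

    `records` is an iterable of (timestamp in milliseconds, JSON string)
    pairs, standing in for messages read from a topic.
    """
    for timestamp_ms, payload in records:
        if start_ms <= timestamp_ms < end_ms:
            yield flatten_json(json.loads(payload))
```

With this sketch, a message such as `{"id": 1, "user": {"name": "a"}}` becomes a row with the columns `id` and `user.name`, and a message whose timestamp falls outside the window is skipped rather than parsed.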
Creating a Data Source
Create a data source in Management Center. For details, see Configuring Data Connection Parameters.
Creating an Offline Data Migration Job
Create a Kafka migration job in DataArts Factory. For details, see Creating an Offline Processing Migration Job.