PostgreSQL
DataArts Migration supports RDS for PostgreSQL on the cloud and the on-premises PostgreSQL data source. It is also compatible with GaussDB, Greenplum, and Kingbase. It meets data synchronization requirements in different deployment environments.
Preparation and Constraints
- Network requirements
  The PostgreSQL data source must be able to communicate with CDM so that data can be transmitted. For details, see Enabling Network Connectivity.
- Database connection permissions
  - Database connection permissions: The account must have the CONNECT permission, which allows it to connect to the specified database.
  - Network access permissions: In the pg_hba.conf configuration file, allow the IP address of DataArts Migration to access the database.
- Table operation permission requirements
  - USAGE permission on schemas: To view objects in a schema, the account must have the USAGE permission on the schema.
  - Reading data from an on-premises PostgreSQL database: The account must have the read-only permission (SELECT) on the table to be synchronized so that data can be read securely and accurately.
  - Writing data to an on-premises PostgreSQL database: The account must have write permissions (INSERT, DELETE, and UPDATE) on the table to be synchronized so that data can be correctly written to the destination table.
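The permissions listed above map onto a handful of GRANT statements and one pg_hba.conf entry. The following is a minimal sketch that only renders those statements as text so that a database administrator can review them before running them. The role, database, schema, and table names, the client address, and the scram-sha-256 authentication method are illustrative assumptions, not values required by DataArts Migration.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_statements(role, database, schema, tables, mode="read"):
    """Render GRANT statements for a migration account.

    mode="read" grants SELECT; mode="write" grants INSERT, UPDATE, DELETE.
    All names passed in are hypothetical examples.
    """
    if mode not in ("read", "write"):
        raise ValueError("mode must be 'read' or 'write'")
    privileges = "SELECT" if mode == "read" else "INSERT, UPDATE, DELETE"
    statements = [
        # CONNECT lets the account open a session on the database.
        f"GRANT CONNECT ON DATABASE {quote_ident(database)} TO {quote_ident(role)};",
        # USAGE lets the account see objects inside the schema.
        f"GRANT USAGE ON SCHEMA {quote_ident(schema)} TO {quote_ident(role)};",
    ]
    for table in tables:
        target = f"{quote_ident(schema)}.{quote_ident(table)}"
        statements.append(
            f"GRANT {privileges} ON TABLE {target} TO {quote_ident(role)};"
        )
    return statements


def pg_hba_line(database, role, client_cidr, method="scram-sha-256"):
    """Render a pg_hba.conf 'host' record: type, database, user, address, method."""
    return f"host    {database}    {role}    {client_cidr}    {method}"


if __name__ == "__main__":
    for stmt in grant_statements("cdm_reader", "sales", "public", ["orders"]):
        print(stmt)
    print(pg_hba_line("sales", "cdm_reader", "192.0.2.10/32"))
```

Identifiers are double-quoted here, which makes them case-sensitive in PostgreSQL; pass names exactly as they are stored in the catalog.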
Driver Selection
| Driver Name | How to Obtain | Recommended Version |
|---|---|---|
| POSTGRESQL | | 42.3.4 |
Supported Data Types
DataArts Migration supports the following field types and their common variants in PostgreSQL 12 Community Edition. This ensures that DataArts Migration can correctly read and write data.
| Category | Field Type | PostgreSQL Read | PostgreSQL Write |
|---|---|---|---|
| Integer | smallint | √ | √ |
| Integer | int | √ | √ |
| Integer | bigint | √ | √ |
| Integer | smallserial | √ | √ |
| Integer | serial | √ | √ |
| Integer | bigserial | √ | √ |
| Floating point number | float | √ | √ |
| Floating point number | double precision | √ | √ |
| Floating point number | real | √ | √ |
| Numeric | decimal(p,s) | √ | √ |
| Numeric | numeric | √ | √ |
| Character | char | √ | √ |
| Character | varchar | √ | √ |
| Character | text | √ | √ |
| Time | date | √ | √ |
| Time | timestamp | √ | √ |
| Time | timestamptz | √ | √ |
| Time | time | √ | √ |
| Time | timetz | √ | √ |
| Time | interval | √ | √ |
| Binary | bytea | √ | √ |
| Network | inet | x | x |
| Currency | money | √ | √ |
| Bit | bit | √ | √ |
| Bit | varbit | √ | √ |
| Boolean | boolean | √ | √ |
| Others | int1 (GaussDB) | √ | √ |
| Others | jsonb | √ | x |
| Others | uuid | x | x |
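Before creating a job, it can help to check a table's column types against the support table above. The sketch below encodes only the rows with an unsupported direction (inet, jsonb, uuid) plus the set of fully supported type names, and reports columns that cannot be read or written. The function name, the input format (a mapping of column name to type name), and the choice to report types that are missing from the table are assumptions made for illustration.

```python
# Types listed in the table as supported for both read and write.
FULLY_SUPPORTED = {
    "smallint", "int", "bigint", "smallserial", "serial", "bigserial",
    "float", "double precision", "real", "decimal", "numeric",
    "char", "varchar", "text",
    "date", "timestamp", "timestamptz", "time", "timetz", "interval",
    "bytea", "money", "bit", "varbit", "boolean", "int1",
}

# (read supported, write supported) for the remaining listed types.
PARTIAL = {
    "inet": (False, False),
    "jsonb": (True, False),
    "uuid": (False, False),
}


def normalize(type_name):
    """Lowercase the type name and drop parameters such as (10,2)."""
    return type_name.lower().split("(")[0].strip()


def unsupported_columns(columns, direction):
    """Return (column, type) pairs that the table marks as unsupported.

    direction is "read" or "write". Types that do not appear in the table
    at all are reported too, because their support is unknown.
    """
    if direction not in ("read", "write"):
        raise ValueError("direction must be 'read' or 'write'")
    index = 0 if direction == "read" else 1
    problems = []
    for column, type_name in columns.items():
        base = normalize(type_name)
        if base in FULLY_SUPPORTED:
            continue
        if base in PARTIAL and PARTIAL[base][index]:
            continue
        problems.append((column, base))
    return problems


if __name__ == "__main__":
    cols = {"id": "bigint", "price": "decimal(10,2)", "ip": "INET", "doc": "jsonb"}
    print(unsupported_columns(cols, "read"))
    print(unsupported_columns(cols, "write"))
```

The lookup uses the short type names from the table, so long-form names such as "timestamp with time zone" would need to be mapped to their short forms first.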
Supported Migration Scenarios
DataArts Migration supports the following offline synchronization modes:
- Single table synchronization
DataArts Migration supports table/file synchronization in data ingestion into a data lake or data migration to the cloud.
- Database and table shard synchronization
DataArts Migration supports synchronization of data from multiple databases and tables in data ingestion into a data lake or data migration to the cloud.
- Entire DB migration
DataArts Migration supports synchronization of data from an entire on-premises database in data ingestion into a data lake or data migration to the cloud. For details about the supported data source types, see the data source types supported by entire database synchronization.
Database and table shard synchronization and entire DB migration are not supported in all regions. The following table lists the supported PostgreSQL migration scenarios.
| Supported Migration Scenario | PostgreSQL Single Table Read | PostgreSQL Single Table Write | PostgreSQL Database/Table Shard Read | PostgreSQL Database/Table Shard Write | PostgreSQL Entire DB Read | PostgreSQL Entire DB Write |
|---|---|---|---|---|---|---|
| Supported | √ | √ | √ (supported in some regions) | √ | √ (supported in some regions) | x |
Core Capabilities
- Connection configuration
| Configuration Item | Supported | Description |
|---|---|---|
| User/AK | √ | User authentication ensures connection security. |
| SSL encryption | √ | SSL encryption ensures secure data transmission. Currently, SSL authentication can be enabled only for RDS. |
| SSL authentication | √ | Currently, SSL authentication can be enabled only for RDS. The standard Huawei Cloud CA certificate is used for authentication. |
| Private certificate | x | Private certificates are not supported. |
| Connection configuration optimization | √ | Connection configuration such as connectTimeout can be optimized to improve connection performance. |
| Custom driver | √ | Custom drivers are supported and provide better flexibility. |
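As an illustration of connection tuning, the PostgreSQL JDBC driver accepts properties such as connectTimeout, socketTimeout (both in seconds), and sslmode as URL parameters. The sketch below assembles such a URL. The host, database, and values are placeholders, and whether DataArts Migration takes these settings as URL parameters or as separate connection properties in the console is not stated here, so treat the URL form as an assumption.

```python
from urllib.parse import urlencode


def jdbc_url(host, port, database, **properties):
    """Build a PostgreSQL JDBC URL with optional driver properties.

    Property names are passed through unchanged; the caller is responsible
    for using names that the driver version in use actually recognizes.
    """
    base = f"jdbc:postgresql://{host}:{port}/{database}"
    query = urlencode(properties)
    return f"{base}?{query}" if query else base


if __name__ == "__main__":
    # Hypothetical endpoint and timeouts, for illustration only.
    print(jdbc_url("db.example.com", 5432, "sales",
                   connectTimeout=10, socketTimeout=300, sslmode="require"))
```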
- Read capabilities
| Configuration Item | Supported | Description |
|---|---|---|
| Shard concurrency | √ | Horizontal sharding based on primary keys or common fields and multi-thread concurrent extraction significantly improve the throughput and efficiency. |
| Dirty data processing | √ | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |
| Custom fields | √ | You can add computed columns, constant columns, or masking functions for tasks to meet personalized service requirements. |
| Incremental read | √ | Where conditions and the SQL mode enable incremental data reading. |
| Stream and batch reading | Batch reading | Batch reading improves efficiency when there is a small or medium amount of data. |
| Optimization of the number of rows read | √ | You can set Fetch Size in the connection to properly control the amount of data to be transmitted. This improves performance and prevents a transmission delay or the system from being overloaded when there is a large amount of data. |
| View read | √ | Data can be read from views. This enables flexible data integration and processing. |
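To make shard concurrency and incremental read more concrete, the sketch below splits an integer key range into contiguous slices and renders one WHERE predicate per slice, optionally combined with an extra filter such as a timestamp condition. This is an illustration of the general range-splitting idea, not the product's internal algorithm. The function name, the column names, and the assumption of an integer shard key are hypothetical.

```python
def shard_predicates(column, lo, hi, shards, extra_where=None):
    """Split the inclusive integer range [lo, hi] into WHERE predicates.

    Returns at most `shards` predicates that cover the range without gaps
    or overlaps. `extra_where` is an optional additional condition, for
    example an incremental filter on a timestamp column.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    total = hi - lo + 1
    shards = min(shards, total)          # never create empty shards
    size, remainder = divmod(total, shards)
    predicates = []
    start = lo
    for i in range(shards):
        # The first `remainder` shards take one extra key each.
        end = start + size + (1 if i < remainder else 0) - 1
        predicate = f"{column} >= {start} AND {column} <= {end}"
        if extra_where:
            predicate = f"({predicate}) AND ({extra_where})"
        predicates.append(predicate)
        start = end + 1
    return predicates


if __name__ == "__main__":
    for p in shard_predicates("id", 1, 10, 3,
                              extra_where="updated_at >= '2024-01-01'"):
        print("SELECT * FROM orders WHERE " + p)
```

Each predicate can be handed to a separate reader thread. Even splitting works best when key values are spread roughly uniformly across the range.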
- Write capabilities
| Configuration Item | Supported | Description |
|---|---|---|
| Data source optimization parameters | √ | Optimization parameters such as batchSize and socketTimeout are supported for the data source. They improve write performance. |
| Dirty data processing | √ | Abnormal data can be written to the dirty data bucket to prevent job failures caused by a small amount of abnormal data. |
| Conflict resolution | x | The conflict resolution mechanism is not supported. |
| Pre- and post-import processing | √ | Operations such as preSql and delete can clean and process data before and after data import. |
| Concurrent write | √ | Concurrent write improves efficiency. |
| Optimization of the number of written rows | x | Setting the number of rows written by each request in the connection, which controls the amount of data transmitted per request, is not supported for this data source. |
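The idea behind dirty data processing can be sketched as follows: rows that fail to write are set aside with the error message instead of failing the job immediately, and the job fails only when the number of bad rows exceeds a tolerance. The function name, the `write_row` callback, and the `max_dirty` threshold are assumptions made for illustration; they are not DataArts Migration parameters.

```python
def write_with_dirty_bucket(rows, write_row, max_dirty):
    """Write rows one by one, collecting failures instead of aborting.

    Returns (written_count, dirty), where dirty is a list of
    (row, error_message) pairs. Raises RuntimeError as soon as more than
    `max_dirty` rows have failed.
    """
    written = 0
    dirty = []
    for row in rows:
        try:
            write_row(row)
            written += 1
        except Exception as exc:
            # Keep the bad row and the reason so it can be inspected later.
            dirty.append((row, str(exc)))
            if len(dirty) > max_dirty:
                raise RuntimeError(
                    f"dirty row limit exceeded: {len(dirty)} > {max_dirty}"
                ) from exc
    return written, dirty


if __name__ == "__main__":
    def fake_write(row):
        # Stand-in for a real database write; rejects NULL amounts.
        if row["amount"] is None:
            raise ValueError("amount is null")

    sample = [{"amount": 1}, {"amount": None}, {"amount": 3}]
    print(write_with_dirty_bucket(sample, fake_write, max_dirty=1))
```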
Creating a Data Source
Create a data source in Management Center. For details, see Configuring Data Connection Parameters.
Creating an Offline Data Migration Job
Create a PostgreSQL migration job in DataArts Factory. For details, see Creating an Offline Processing Migration Job.