Help Center> Data Replication Service> Best Practices> Real-Time Synchronization> From On-Premises PostgreSQL to RDS for PostgreSQL
Updated on 2024-07-15 GMT+08:00

From On-Premises PostgreSQL to RDS for PostgreSQL

DRS supports data synchronization from on-premises MySQL databases to RDS for PostgreSQL instances. With DRS, you can synchronize databases online with zero downtime and your services and databases can remain operational during migration.

This section describes how to configure DRS to migrate data from an on-premises PostgreSQL database to a RDS for PostgreSQL instance. The following network types are supported:

  • VPN
  • Public network

Diagram

Figure 1 VPN network
Figure 2 Public network+SSL connection

Synchronization Process

Figure 3 Flowchart

Synchronization Suggestions (Important)

  • Database synchronization is closely impacted by a wide range of environmental and operational factors. To ensure the synchronization goes smoothly, you are advised to perform a test run before the actual synchronization to help you detect and resolve any potential issues in advance. Recommendations on how to minimize any potential impacts on your data base are provided in this section.
  • It is strongly recommended that you start a task during off-peak hours. A less active database is easier to synchronize successfully. If the data is fairly static, there is less likely to be any severe performance impacts during the synchronization.

Notes on Synchronization (Important)

Before creating a real-time synchronization task, read this section carefully.

For details, see Precautions in Real-Time Synchronization.

Synchronization Preparations

  1. Permissions
    Table 1 lists the source and destination database user permissions required in full and incremental synchronizations from on-premises PostgreSQL databases to PostgreSQL DB instances on the current cloud.
    Table 1 Account permissions

    Synchronization Type

    Full

    Full+Incremental

    Source

    The CONNECT permission for databases, the USAGE permission for schemas, the SELECT permission for tables, the SELECT permission for sequences, and the SELECT permission for system table catalog pg_catalog.pg_authid (used for synchronizing user passwords)

    The CONNECT permission for databases, the USAGE permission for schemas, the SELECT permission for tables, the SELECT permission for sequences, the SELECT permission for system table catalog pg_catalog.pg_authid (used for synchronizing user passwords), the UPDATE, DELETE, and TRUNCATE permissions for tables that do not have primary keys, and the permission to create replication connections

    NOTE:
    • The UPDATE, DELETE, and TRUNCATE permissions for tables that do not have primary keys are only used to temporarily lock tables to ensure data consistency after the migration.
    • To add the permission to create replication connections, perform the following steps:
      1. Add host replication <src_user_name> <drs_instance_ip>/32 md5 before all configurations in the pg_hba.conf file of the source database.
      2. Run select pg_reload_conf(); in the source database as user SUPERUSER, or restart the DB instance to apply the changes.

    Destination

    • Database-level: The CREATEDB permission is required.
    • Table-level:
      • To synchronize databases, the CREATEDB permission is required.
      • To synchronize a schema, the CONNECT and CREATE permissions for the database that contains the schema are required.
      • To synchronize objects in a schema, the CONNECT permission for the database that contains the schema, and the USAGE and CREATE permissions for the schema that contain the objects are required.
    • Synchronization user: The CREATEROLE permission is required.
    • Synchronization user permissions: The default privilege cannot be modified. Otherwise, the object permissions of the destination database may be inconsistent with those of the source database.
    NOTE:

    To synchronize event triggers, text search parsers, and text search templates, the destination database version must be RDS for PostgreSQL 11.11 or later, and the destination database user must be user root or a member of user root.

    • Source database permissions:

      The source database user must have all the required permissions listed in Table 1. If the permissions are insufficient, create a user that has all of the permissions on the source database.

    • Destination database permissions:

      If the destination database is a RDS for PostgreSQL instance, the initial account can be used.

  2. Network settings
    • Source database network settings:

      You can synchronize data from on-premises PostgreSQL databases to RDS for PostgreSQL databases on the current cloud through a VPN or public network. Enable public accessibility or establish a VPN for the on-premises PostgreSQL databases based on your service requirements. You are advised to synchronize data through a public network, which is more convenient and cost-effective.

    • Destination database network settings:
      • If you want to access the destination databases through a VPN, enable the VPN service first so that the source database can communicate with the destination database.
      • If the source database attempts to access the destination database through a public network, no further configuration is required.
  3. Security rules
    1. Source database security group settings:
      • If the synchronization is performed over a public network, add the EIP of the DRS synchronization instance to the network whitelist of the source PostgreSQL database to enable the source database to communicate with the current cloud. Before configuring the network whitelist, you need to obtain the EIP of the synchronization instance.
        The IP address on the Configure Source and Destination Databases page is the EIP of the synchronization instance.
        Figure 4 Synchronization instance EIP
      • If the synchronization is performed over a VPN network, add the private IP address of the DRS synchronization instance to the network whitelist of the source database to enable the source database to communicate with the destination database.

        The IP address on the Configure Source and Destination Databases page is the private IP address of the synchronization instance.

      If you do take this step, then once the synchronization is complete, you should delete this item from the whitelist or your system will insecure.

    2. Destination database security group settings:

      By default, the destination database and the DRS synchronization instance are in the same VPC and can communicate with each other. No further configuration is required.

  4. Other

    DRS can synchronize only some types of DDL statements of PostgreSQL. For details, see related guides. Properly plan the source database services. Do not perform DDL operations that are not supported during synchronization.

Procedure

  1. Create a synchronization task.

    1. Log in to the management console and choose Databases > Data Replication Service to go to the DRS console.
    2. On the Data Synchronization Management page, click Create Synchronization Task.
    3. On the displayed page, specify the task name, description, and synchronization instance details.
      Figure 5 Synchronization instance information
      Table 2 Task settings

      Parameter

      Description

      Region

      The region where the replication instance is deployed. You can change the region. To reduce latency and improve access speed, select the region closest to your workloads.

      Project

      The project corresponds to the current region and can be changed.

      Task Name

      The task name consists of 4 to 50 characters, starts with a letter, and can contain only letters (case-insensitive), digits, hyphens (-), and underscores (_).

      Description

      The description consists of a maximum of 256 characters and cannot contain the following special characters: =<>&'\"

      Table 3 Synchronization instance settings

      Parameter

      Description

      Data Flow

      To the cloud

      Source DB Engine

      Select PostgreSQL.

      Destination DB Engine

      Select PostgreSQL.

      Network Type

      Select Public network. Enabling SSL is recommended. It may slow down the synchronization by 20% to 30% but it ensures data security.

      Destination DB Instance

      The RDS for PostgreSQL instance you created.

      Synchronization Mode

      • Full+Incremental

        This synchronization mode allows you to synchronize data in real time. After a full synchronization initializes the destination database, an incremental synchronization parses logs to ensure data consistency between the source and destination databases.

      • Full

        All objects and data in non-system databases are synchronized to the destination database at a time. This mode is applicable to scenarios where service interruption is acceptable.

      • Incremental
      NOTE:

      If you select the Full+Incremental synchronization mode, ongoing changes made to the data will be synchronized to the destination database in real time, ensuring that the source database remain accessible.

    4. On the Configure Source and Destination Databases page, wait until the synchronization instance is created. Then, specify source and destination database information and click Test Connection for both the source and destination databases to check whether they have been connected to the synchronization instance. After the connection tests are successful, select the check box before the agreement and click Next.
      Figure 6 Source and destination database details
      Table 4 Source database settings

      Parameter

      Description

      IP Address or Domain Name

      The IP address or domain name of the source database.

      Port

      The port of the source database.

      Range: 1 - 65535

      Database Username

      A username for the source database.

      Database Password

      The password for the database username.

      SSL Connection

      To improve data security during synchronization over a public network, you are advised to enable SSL to encrypt synchronization links and upload a CA certificate.

      Table 5 Destination database settings

      Parameter

      Description

      DB Instance Name

      The RDS for PostgreSQL instance you have selected during the synchronization instance creation is displayed by default and cannot be changed.

      Database Username

      The username for accessing the destination RDS for PostgreSQL instance.

      Database Password

      The password for the database username.

    5. On the Set Synchronization Task page, select the synchronization object and user.
      Figure 7 Synchronization mode
      Table 6 Synchronization object

      Parameter

      Description

      Flow Control

      You can choose whether to control the flow.

      Incremental Conflict Policy

      The real-time synchronization function provides conflict policies for you to choose from if the synchronized data conflicts with existing data (such as the source and destination databases containing the same primary or unique keys) in the destination database.

      Select any of the following conflict policies:

      • Ignore

        The system will skip the conflicting data and continue the subsequent synchronization process.

      • Report error

        The synchronization task will be stopped and fail.

      • Overwrite

        Conflicting data will be overwritten.

      In the following scenarios, you can select Ignore or Overwrite. In other scenarios, you are advised to select Report error.

      • Data already exists in the destination database.
      • Multiple source databases are synchronized to one destination database.
      • Data in the destination database is updated manually.

      Synchronize

      Available options: Index, Incremental DDLs, and Populate materialized views during the full synchronization phase

      Populate materialized views during the full synchronization phase: This option takes effect only for materialized views that was populated in the source database. This operation affects the full synchronization performance. You perform this operation after the full synchronization is complete.

      Synchronization Object

      DRS supports database- and table-level synchronization. You can select databases or tables for synchronization based on your service requirements. A single task can synchronize objects from only one database. Database name mapping is supported.

      • Database-level synchronization synchronizes all objects in the source database to the destination database.
      • Table-level synchronization synchronizes the selected table objects to the destination database.
        NOTE:

        If you select table-level synchronization, the selected table may have dependencies on other objects in the database. If the referenced objects are not selected and do not exist in the destination database, the task will fail. Ensure that all referenced objects are selected before synchronization or select database-level synchronization.

      Synchronize Account

      During the synchronization, you can synchronize accounts based on your service requirements.

      There are two types of accounts: accounts that can be synchronized and accounts that cannot be synchronized. For accounts that cannot be synchronized, you can view details in the Remarks column and determine whether to synchronize accounts and permissions based on your service requirements.

    6. On the Check Task page, check the synchronization task.
      • If any check fails, review the cause and rectify the fault. After the fault is rectified, click Check Again.

        For details about how to handle check failures, see Checking Whether the Source Database Is Connected in Data Replication Service User Guide.

      • If all check items are successful, click Next.
        Figure 8 Task Check

        You can proceed to the next step only when all checks are successful. If there are any items that require confirmation, view and confirm the details first before proceeding to the next step.

    7. On the Confirm Task page, specify Start Time and click Next.
      • Set Start Time to Start upon task creation or Start at a specified time based on site requirements.
      • After a synchronization task is started, the performance of the source and destination databases may be affected. You are advised to start a synchronization task during off-peak hours.
    8. After the task is submitted, go back to the Data Synchronization Management page to view the task status.

  2. Manage the migration task.

    The migration task contains two phases: full migration and incremental migration. You can manage them in different phases.
    • Full synchronization

      Viewing the synchronization progress: Click the target full synchronization task, and on the Migration Progress tab, you can see the synchronization progress of the structure, data, indexes, and synchronization objects. When the progress reaches 100%, the synchronization is complete.

    • Incremental synchronization
      • Viewing the synchronization delay: After the full synchronization is complete, an incremental synchronization starts. On the Data Synchronization Management page, click the target synchronization task. On the displayed page, click Synchronization Progress to view the synchronization delay of the incremental synchronization. If the synchronization delay is 0s, the destination database is being synchronized with the source database in real time. You can also view the data consistency on the Synchronization Comparison tab.
        Figure 9 Viewing the synchronization delay
      • Viewing the synchronization comparison results: To minimize service downtime, click the name of an incremental synchronization task. On the Synchronization Comparison page, create a comparison task.

        For details, see Comparing Migration Items in Data Replication Service User Guide.

  3. Cut over services.

    You are advised to start the cutover process during off-peak hours. At least one complete data comparison is performed during off-peak hours. To obtain accurate comparison results, start data comparison at a specified time point during off-peak hours. If it is needed, select Start at a specified time for Comparison Time. Due to slight time difference and continuous operations on data, inconsistent comparison results may be generated, reducing the reliability and validity of the results.

    1. Interrupt services first. If the workload is not heavy, you may not need to interrupt the services.
    2. Run the following statement on the source database and check whether any new sessions execute SQL statements within the next 1 to 5 minutes. If there are no new statements executed, the service has been stopped.
      select * from pg_stat_activity;

      The process list queried by the preceding statement includes the connection of the DRS synchronization instance. If no additional session executes SQL statements, the service has been stopped.

    3. On the Synchronization Progress page, view the synchronization delay. When the delay is displayed as 0s and remains stable for a period, then you can perform a data-level comparison between the source and destination databases. For details about the time required, refer to the results of the previous comparison.
      • If there is enough time, compare all objects.
      • If there is not enough time, use the data-level comparison to compare the tables that are frequently used and that contain key business data or inconsistent data.
    4. Determine an appropriate time to cut the services over to the destination database. After services are restored and available, the synchronization is complete.

  4. Complete the synchronization.

    1. Stop the synchronization task. After databases and services are synchronized to the destination database, to prevent operations on the source database from being synchronized to the destination database to overwrite data, you can stop the synchronization task. This operation only deletes the synchronization instance, and the synchronization task is still displayed in the task list. You can view or delete the task. You will not be charged for the synchronization task after the task is stopped.
    2. Delete the synchronization task. After the synchronization task is complete, you can delete it. After the synchronization task is deleted, it will not be displayed in the task list.