Help Center/ Data Replication Service/ Best Practices/ Real-Time Synchronization/ From ECS-hosted PostgreSQL to RDS for PostgreSQL
Updated on 2024-07-15 GMT+08:00

From ECS-hosted PostgreSQL to RDS for PostgreSQL

DRS helps you synchronize data from PostgreSQL databases on ECSs to PostgreSQL instances on the current cloud. With DRS, you can synchronize databases online with zero downtime and your services and databases can remain operational during migration.

This section describes how to use DRS to synchronize data from a PostgreSQL database on an ECS to a PostgreSQL instance on the current cloud. The following network scenarios are supported:

  • Source and destination databases are in the same VPC.
  • Source and destination databases are in different VPCs.

Diagram

Figure 1 Source and destination databases in the same VPC
Figure 2 Source and destination databases in the same region and different VPCs

Synchronization Process

Figure 3 Flowchart

Synchronization Suggestions (Important)

  • Database synchronization is closely impacted by a wide range of environmental and operational factors. To ensure the synchronization goes smoothly, you are advised to perform a test run before the actual synchronization to help you detect and resolve any potential issues in advance. Recommendations on how to minimize any potential impacts on your data base are provided in this section.
  • It is strongly recommended that you start a task during off-peak hours. A less active database is easier to synchronize successfully. If the data is fairly static, there is less likely to be any severe performance impacts during the synchronization.

Notes on Synchronization (Important)

Before creating a real-time synchronization task, read this section carefully.

For details, see Precautions in Real-Time Synchronization.

Synchronization Preparations

  1. Permissions
    Table 1 lists the source and destination database user permissions required in full and incremental synchronization from PostgreSQL databases on ECS to the PostgreSQL DB instances on the current cloud.
    Table 1 Account permissions

    Synchronization Type

    Full

    Full+Incremental

    Source

    The CONNECT permission for databases, the USAGE permission for schemas, the SELECT permission for tables, the SELECT permission for sequences, and the SELECT permission for system table catalog pg_catalog.pg_authid (used for synchronizing user passwords)

    The CONNECT permission for databases, the USAGE permission for schemas, the SELECT permission for tables, the SELECT permission for sequences, the SELECT permission for system table catalog pg_catalog.pg_authid (used for synchronizing user passwords), the UPDATE, DELETE, and TRUNCATE permissions for tables that do not have primary keys, and the permission to create replication connections

    NOTE:
    • The UPDATE, DELETE, and TRUNCATE permissions for tables that do not have primary keys are only used to temporarily lock tables to ensure data consistency after the migration.
    • To add the permission to create replication connections, perform the following steps:
      1. Add host replication <src_user_name> <drs_instance_ip>/32 md5 before all configurations in the pg_hba.conf file of the source database.
      2. Run select pg_reload_conf(); in the source database as user SUPERUSER, or restart the DB instance to apply the changes.

    Destination

    • Database-level: The CREATEDB permission is required.
    • Table-level:
      • To synchronize databases, the CREATEDB permission is required.
      • To synchronize a schema, the CONNECT and CREATE permissions for the database that contains the schema are required.
      • To synchronize objects in a schema, the CONNECT permission for the database that contains the schema, and the USAGE and CREATE permissions for the schema that contain the objects are required.
    • Synchronization user: The CREATEROLE permission is required.
    • Synchronization user permissions: The default privilege cannot be modified. Otherwise, the object permissions of the destination database may be inconsistent with those of the source database.
    NOTE:

    To synchronize event triggers, text search parsers, and text search templates, the destination database version must be RDS for PostgreSQL 11.11 or later, and the destination database user must be user root or a member of user root.

    • Source database permissions:

      The source database user must have all the required permissions listed in Table 1. If the permissions are insufficient, create a user that has all of the permissions on the source database.

    • Destination database permissions:

      If the destination database is a PostgreSQL database on the current cloud, the initial account can be used.

  2. Network settings
    • The source and destination databases must be in the same region.
    • The source and destination databases can be either in the same VPC or different VPCs.
      • If the source and destination databases are in different VPCs, the subnets of the source and destination databases are required to be in different CIDR blocks. You need to create a VPC peering connection between the two VPCs. For details, see VPC Peering Connection Creation Procedure.
      • If the source and destination databases are in the same VPC, the networks are interconnected by default.
  3. Security rules
    • In the same VPC, the network is connected by default. You do not need to set a security group.
    • In different VPCs, establish a VPC peering connection between the two VPCs. You do not need to set a security group.
  4. Other

    DRS can synchronize only some types of DDL statements of PostgreSQL. For details, see related guides. Properly plan the source database services. Do not perform DDL operations that are not supported during synchronization.

Procedure

  1. Create a synchronization task.

    1. Log in to the management console and choose Databases > Data Replication Service to go to the DRS console.
    2. On the Data Synchronization Management page, click Create Synchronization Task.
    3. On the displayed page, specify the task name, description, and synchronization instance details.
      Figure 4 Synchronization instance information
      Table 2 Task settings

      Parameter

      Description

      Region

      The region where the replication instance is deployed. You can change the region. To reduce latency and improve access speed, select the region closest to your workloads.

      Project

      The project corresponds to the current region and can be changed.

      Task Name

      The task name consists of 4 to 50 characters, starts with a letter, and can contain only letters (case-insensitive), digits, hyphens (-), and underscores (_).

      Description

      The description consists of a maximum of 256 characters and cannot contain the following special characters: =<>&'\"

      Table 3 Synchronization instance settings

      Parameter

      Description

      Data Flow

      To the cloud

      Source DB Engine

      Select PostgreSQL.

      Destination DB Engine

      Select PostgreSQL.

      Network Type

      Select a VPC network. Enabling SSL is recommended. It may slow down the synchronization by 20% to 30% but it ensures data security.

      Destination DB Instance

      The PostgreSQL instance you purchased.

      Synchronization Type

      • Full+Incremental

        This synchronization mode allows you to synchronize data in real time. After a full synchronization initializes the destination database, an incremental synchronization parses logs to ensure data consistency between the source and destination databases.

      • Full

        All objects and data in non-system databases are synchronized to the destination database at a time. This mode is applicable to scenarios where service interruption is acceptable.

      • Incremental
      NOTE:

      If you select the Full+Incremental synchronization type, data generated during the full synchronization will be synchronized to the destination database with zero downtime, ensuring that both the source and destination databases remain accessible.

    4. On the Configure Source and Destination Databases page, wait until the synchronization instance is created. Then, specify source and destination database information. You are advised to click Test Connection for both the source and destination databases to check whether they have been connected to the synchronization instance. After the connection tests are successful, select the check box before the agreement and click Next.

      Source databases are classified into two types: self-built databases on ECSs and RDS DB instances. Configure parameters based on different scenarios.

      • Scenario 1: Self-built databases on ECS - source database configuration
        Figure 5 Self-build on ECS - source database information
        Table 4 Self-build on ECS - source database information

        Parameter

        Description

        Source Database Type

        Select Self-built on ECS.

        VPC

        A dedicated virtual network in which the source database is located. It isolates networks for different services. You can select an existing VPC or create a VPC.

        Subnet

        A subnet provides dedicated network resources that are logically isolated from other networks, improving network security. The subnet must be in the AZ where the source database resides. You need to enable DHCP for creating the source database subnet.

        IP Address or Domain Name

        The IP address or domain name of the source database.

        Port

        Enter an integer ranging from 1 to 65535, which indicates the port number of the source database.

        Database Username

        A username for the source database.

        Database Password

        The password for the database username.

        SSL Connection

        To improve data security during synchronization over a public network, you are advised to enable SSL to encrypt synchronization links and upload a CA certificate.

      • Scenario 2: RDS DB instance - source database configuration
        Figure 6 RDS DB instance - source database information
        Table 5 RDS DB instance - source database information

        Parameter

        Description

        Source Database Type

        Select an RDS DB instance.

        DB Instance Name

        Select the PostgreSQL instance to be synchronized as the source DB instance.

        Database Username

        A username for the source database.

        Database Password

        The password for the database username.

      Figure 7 Destination database settings
      Table 6 Destination database settings

      Parameter

      Description

      DB Instance Name

      The RDS for PostgreSQL instance you have selected during the synchronization instance creation is displayed by default and cannot be changed.

      Database Username

      The username for accessing the destination PostgreSQL DB instance.

      Database Password

      The password for the database username.

    5. On the Set Synchronization Task page, select the synchronization object and user.
      Figure 8 Synchronization mode
      Table 7 Synchronization object

      Parameter

      Description

      Incremental Conflict Policy

      The real-time synchronization function provides conflict policies for you to choose from if the synchronized data conflicts with existing data (such as the source and destination databases containing the same primary or unique keys) in the destination database.

      Select any of the following conflict policies:

      • Ignore

        The system will skip the conflicting data and continue the subsequent synchronization process.

      • Report error

        The synchronization task will be stopped and fail.

      • Overwrite

        Conflicting data will be overwritten.

      In the following scenarios, you can select Ignore or Overwrite. In other scenarios, you are advised to select Report error.

      • Data already exists in the destination database.
      • Multiple source databases are synchronized to one destination database.
      • Data in the destination database is updated manually.

      Synchronize

      Available options: Index, Incremental DDLs, and Populate materialized views during the full synchronization phase

      Populate materialized views during the full synchronization phase: This option takes effect only for materialized views that was populated in the source database. This operation affects the full synchronization performance. You perform this operation after the full synchronization is complete.

      Synchronization Object

      Objects can be synchronized at the database or table level. You can synchronize the entire database or some tables based on service requirements. For a single task, you can synchronize objects in only one database of an instance. Database name mapping is supported.

      • Database-level synchronization synchronizes all objects in the source database to the destination database.
      • Table-level synchronization synchronizes the selected table objects to the destination database.
        NOTE:

        If you select table-level synchronization, the selected table may have dependencies on other objects in the database. If the referenced objects are not selected and do not exist in the destination database, the task will fail. Ensure that all referenced objects are selected before synchronization or select database-level synchronization.

      Synchronize Account

      During the synchronization, you can synchronize accounts based on your service requirements.

      There are two types of accounts: accounts that can be synchronized and accounts that cannot be synchronized. For accounts that cannot be synchronized, you can view details in the Remarks column and determine whether to synchronize accounts and permissions based on your service requirements.

    6. On the Check Task page, check the synchronization task.
      • If any check fails, review the cause and rectify the fault. After the fault is rectified, click Check Again.

        For details about how to handle check failures, see Checking Whether the Source Database Is Connected in Data Replication Service User Guide.

      • If all check items are successful, click Next.
        Figure 9 Task Check

        You can proceed to the next step only when all checks are successful. If there are any items that require confirmation, view and confirm the details first before proceeding to the next step.

    7. On the Confirm Task page, specify Start Time and click Next.
      • Set Start Time to Start upon task creation or Start at a specified time based on site requirements.
      • After a synchronization task is started, the performance of the source and destination databases may be affected. You are advised to start a synchronization task during off-peak hours.
    8. After the task is submitted, go back to the Data Synchronization Management page to view the task status.

  2. Manage the migration task.

    The migration task contains two phases: full migration and incremental migration. You can manage them in different phases.
    • Full synchronization

      Viewing the synchronization progress: Click the target full synchronization task, and on the Migration Progress tab, you can see the synchronization progress of the structure, data, indexes, and synchronization objects. When the progress reaches 100%, the synchronization is complete.

    • Incremental synchronization
      • Viewing the synchronization delay: After the full synchronization is complete, an incremental synchronization starts. On the Data Synchronization Management page, click the target synchronization task. On the displayed page, click Synchronization Progress to view the synchronization delay of the incremental synchronization. If the synchronization delay is 0s, the destination database is being synchronized with the source database in real time. You can also view the data consistency on the Synchronization Comparison tab.
        Figure 10 Viewing the synchronization delay
      • Viewing the synchronization comparison results: To minimize service downtime, click the name of an incremental synchronization task. On the Synchronization Comparison page, create a comparison task.

        For details, see Comparing Migration Items in Data Replication Service User Guide.

  3. Cut over services.

    You are advised to start the cutover process during off-peak hours. At least one complete data comparison is performed during off-peak hours. To obtain accurate comparison results, start data comparison at a specified time point during off-peak hours. If it is needed, select Start at a specified time for Comparison Time. Due to slight time difference and continuous operations on data, inconsistent comparison results may be generated, reducing the reliability and validity of the results.

    1. Interrupt services first. If the workload is not heavy, you may not need to interrupt the services.
    2. Run the following statement on the source database and check whether any new sessions execute SQL statements within the next 1 to 5 minutes. If there are no new statements executed, the service has been stopped.
      select * from pg_stat_activity;

      The process list queried by the preceding statement includes the connection of the DRS synchronization instance. If no additional session executes SQL statements, the service has been stopped.

    3. On the Synchronization Progress page, view the synchronization delay. When the delay is displayed as 0s and remains stable for a period, then you can perform a data-level comparison between the source and destination databases. For details about the time required, refer to the results of the previous comparison.
      • If there is enough time, compare all objects.
      • If there is not enough time, use the data-level comparison to compare the tables that are frequently used and that contain key business data or inconsistent data.
    4. Determine an appropriate time to cut the services over to the destination database. After services are restored and available, the synchronization is complete.

  4. Complete the synchronization.

    1. Stop the synchronization task. After databases and services are synchronized to the destination database, to prevent operations on the source database from being synchronized to the destination database to overwrite data, you can stop the synchronization task. This operation only deletes the synchronization instance, and the synchronization task is still displayed in the task list. You can view or delete the task. You will not be charged for the synchronization task after the task is stopped.
    2. Delete the synchronization task. After the synchronization task is complete, you can delete it. After the synchronization task is deleted, it will not be displayed in the task list.