Updated on 2024-09-25 GMT+08:00

From MySQL to GaussDB(for MySQL)

Supported Source and Destination Databases

Table 1 Supported databases

Source DB

Destination DB

  • ECS-hosted MySQL 5.6, 5.7, and 8.0
  • On-premises MySQL 5.6, 5.7, and 8.0
  • MySQL 5.6, 5.7, and 8.0 on other clouds

GaussDB(for MySQL) Primary/Standby

Database Account Permission Requirements

When using DRS to create a workload replay task, you are advised to ensure that permissions of the source database account are the same as those of the destination database account before starting the task.

Precautions

To ensure smooth workload replay, read the following notes before creating a task.

Table 2 Precautions

Type

Restrictions

Starting a task

  • Source database requirements:
    • The source database can be a self-managed MySQL database or a MySQL database on another cloud (such as ApsaraDB RDS for MySQL or PolarDB for MySQL), provided that audit logs or insight logs can be enabled and exported.
    • SQL workload files have been recorded on the source database and uploaded to an OBS bucket on Huawei Cloud. DRS obtains the workload files from the OBS bucket.
  • Destination database requirements:
    • The destination database must be GaussDB(for MySQL).
    • Baseline data has been prepared in the destination database. The closer the baseline data collection time is to the start of workload capture on the source database, the more accurately the replay simulates the source workload.
  • Workload file requirements:
    • If a workload file contains SQL delimiters (such as ^^), a parsing exception may occur. As a result, the replay task fails.
    • Every SQL statement in a workload file must be complete. If any SQL statement in the audit logs you provide is truncated, a parsing exception may occur.
    • The size of a single SQL statement in a workload file cannot exceed 1 MB.
    • If other statements are inserted into a transaction, a deadlock may occur.
    • Only .gz and .zip files can be uploaded.
  • Other notes:
    • If configuration parameters (such as innodb_buffer_pool_size and sql_mode) of the source database are inconsistent with those of the destination database, the replay may be slow or may fail.
    • If a workload file is added or deleted while a task is being edited, select Parse and Reset when resetting the task, and then replay the workload file again. For details, see Resetting a Replay Task.
    • Statements are replayed concurrently. DDL and DML statements that fall within the same 10-second batch are executed together, so statements may be executed out of order.

Parsing a workload file

After a file is selected for parsing, it cannot be renamed.

Replaying a database workload

Only SELECT, INSERT, DELETE, UPDATE, and DDL statements are supported.

Stopping a task

A finished task cannot be restarted.
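
The workload file requirements in Table 2 can be checked locally before files are uploaded to OBS. The following is a minimal sketch and not part of DRS. The function names check_file_name and check_statement are hypothetical. It assumes that 1 MB means 1,048,576 bytes of UTF-8 text, that file extensions are matched case-sensitively, and that CREATE, ALTER, DROP, TRUNCATE, and RENAME stand in for "DDL", which may not match the exact set DRS accepts.

```python
from typing import List

MAX_STATEMENT_BYTES = 1024 * 1024  # assumption: "1 MB" = 1,048,576 bytes
ALLOWED_EXTENSIONS = (".gz", ".zip")
DML_KEYWORDS = {"SELECT", "INSERT", "DELETE", "UPDATE"}
# Assumption: an illustrative DDL keyword set; DRS may accept a different one.
DDL_KEYWORDS = {"CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME"}


def check_file_name(name: str) -> bool:
    """Return True if the file name ends in .gz or .zip."""
    return name.endswith(ALLOWED_EXTENSIONS)


def check_statement(sql: str) -> List[str]:
    """Return the problems found in one SQL statement (empty list if none)."""
    problems = []
    if len(sql.encode("utf-8")) > MAX_STATEMENT_BYTES:
        problems.append("statement exceeds 1 MB")
    words = sql.split(None, 1)
    first = words[0].upper() if words else ""
    if first not in DML_KEYWORDS | DDL_KEYWORDS:
        problems.append(f"unsupported statement type: {first or '(empty)'}")
    return problems
```

The check looks only at the first keyword, so a statement that begins with a comment is reported as unsupported. Treat a report as a prompt to inspect the statement rather than as a final verdict.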

Prerequisites

Procedure

  1. On the Workload Replay Management page, click Create Workload Replay Task.
  2. On the Create Replay Instance page, select a region and project, specify the task name, description, and the replay instance details, and click Create Now.

    • Task information description
      Figure 1 Workload replay task information
      Table 3 Task information

      Parameter

      Description

      Region

      The region where the replay instance is deployed. You can change the region.

      Project

      The project corresponds to the current region and can be changed.

      Task Name

      The task name must start with a letter and consist of 4 to 50 characters. It can contain only letters, digits, hyphens (-), and underscores (_).

      Description

      The description can contain up to 256 characters and cannot contain special characters !=<>&'\"

    • Replay instance information
      Figure 2 Replay instance information

      Table 4 Replay instance settings

      Parameter

      Description

      Data Flow

      Select To the cloud.

      • Current cloud refers to the workload replay scenario where both source and destination databases are Huawei Cloud DB instances.
      • To the cloud refers to the workload replay scenario where the destination database is a Huawei Cloud DB instance and data needs to be transferred.

      Source DB Engine

      Select MySQL.

      Source DB From

      Platform that hosts the source database. The audit log format varies depending on the source platform. For details, see Audit Log Format.

      Destination DB Engine

      Select GaussDB(for MySQL).

      Network Type

      Public network is used as an example.

      Available options: Public network, VPC, VPN or Direct Connect

      Destination DB Instance

      The GaussDB(for MySQL) DB instance you created. Ensure that baseline data has been prepared in the destination database.

      Replay Instance Subnet

      Select the subnet where the replay instance is located. You can also click View Subnets to go to the network console to view the subnet where the instance resides.

      By default, the DRS instance and the destination DB instance are in the same subnet. Select the subnet where the DRS instance resides and make sure that the subnet has available IP addresses. To ensure that the replay instance can be created, only subnets with DHCP enabled are displayed.

      Specify EIP

      This parameter is available when you select Public network for Network Type. Select an EIP to be bound to the DRS instance. DRS will automatically bind the specified EIP to the DRS instance and unbind the EIP after the task is complete.

      For details about the data transfer fee generated using a public network, see EIP Price Calculator.

    • AZ
      Figure 3 AZ
      Table 5 Task AZ

      Parameter

      Description

      AZ

      Select the AZ where you want to create the DRS task. Selecting the one housing the source or destination database can provide better performance.

    • Enterprise Project and Tags
      Figure 4 Enterprise Project and Tags
      Table 6 Enterprise Project and Tags

      Parameter

      Description

      Enterprise Project

      An enterprise project you would like to use to centrally manage your cloud resources and members. Select an enterprise project from the drop-down list. The default project is default.

      For more information about enterprise project, see Enterprise Management User Guide.

      To customize an enterprise project, click Enterprise in the upper right corner of the console. The Enterprise Project Management Service page is displayed. For details, see Creating an Enterprise Project in Enterprise Management User Guide.

      Tags

      • This setting is optional. Adding tags helps you better identify and manage your tasks. Each task can have up to 20 tags.
      • If your organization has configured tag policies for DRS, add tags to tasks based on the policies. If a tag does not comply with the policies, task creation may fail. Contact your organization administrator to learn more about tag policies.
      • After a task is created, you can view its tag details on the Tags tab. For details, see Tag Management.

    If a task fails to be created, DRS retains the task for three days by default. After three days, the task automatically stops.
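
The Task Name and Description rules in Table 3 can be checked before the form is submitted. The sketch below is illustrative only. The function names are hypothetical, "letters" is assumed to mean ASCII letters, and the backslash is conservatively treated as one of the forbidden description characters.

```python
import re

# Starts with a letter, 4 to 50 characters in total,
# only letters, digits, hyphens, and underscores.
TASK_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]{3,49}")

# Assumption: the backslash is treated as forbidden along with ! = < > & ' "
FORBIDDEN_DESCRIPTION_CHARS = set("!=<>&'\"\\")
MAX_DESCRIPTION_LENGTH = 256


def is_valid_task_name(name: str) -> bool:
    """Return True if the whole name satisfies the Task Name rule."""
    return TASK_NAME_RE.fullmatch(name) is not None


def is_valid_description(text: str) -> bool:
    """Return True if the description is short enough and has no forbidden characters."""
    if len(text) > MAX_DESCRIPTION_LENGTH:
        return False
    return not (set(text) & FORBIDDEN_DESCRIPTION_CHARS)
```

For example, is_valid_task_name("replay-task_01") returns True, while is_valid_task_name("1task") returns False because the name starts with a digit.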

  3. After the replay instance is created, on the Configure Source and Destination Databases page, specify parameters in Source Database, Destination Database, and Task Settings. Then, click Test Connection for the destination database to check whether the destination database has been connected to the replay instance. After the connection test is successful, click Next.

    • Source database information
      Figure 5 Source database information
      Table 7 Source database settings

      Parameter

      Description

      Workload File Source

      Specifies where the workload file in the source database is from.

      Access Key ID (AK)

      Access key ID, which is a unique identifier used in conjunction with a secret access key to sign requests cryptographically.

      Secret Access Key (SK)

      Used together with the access key ID to sign requests cryptographically. It identifies a request sender and prevents the request from being modified.

      • Follow the principle of least privilege and grant the AK/SK only the permissions it needs. If both temporary and permanent AKs/SKs are available, use a temporary AK/SK. Use a permanent AK/SK only when a temporary one cannot meet requirements, for example, when downloading a large number of logs takes so long that a temporary AK/SK would expire.
      • AK/SK information of the user is encrypted and temporarily stored in the system until the task is deleted.

      Security Token

      When a temporary AK/SK is used, Security Token must be used, and the recommended validity period is 24 hours. Otherwise, OBS bucket information may fail to be obtained during workload replay.

      Bucket Name

      Name of the OBS bucket for storing workload files.

      Endpoint

      OBS provides an endpoint for each region. An endpoint can be considered as the domain name of OBS in a region, and is used to process access requests from the region.

      Workload File Prefix

      Prefix of a file name in the OBS bucket. Only files whose names start with this prefix will be displayed.

      Workload Type

      Only Audit log is supported.

      Workload File

      Select the required workload file.
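
The relationships among the settings in Table 7 can be sketched as a small data structure. This is an illustration only and does not call OBS or DRS. The class ObsWorkloadSource, its field names, and the placeholder values in the usage example are hypothetical. The sketch models two rules from the table: a security token accompanies a temporary AK/SK, and only files whose names start with the prefix are shown.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ObsWorkloadSource:
    """Hypothetical container for the source database settings."""
    access_key_id: str
    secret_access_key: str
    bucket_name: str
    endpoint: str
    file_prefix: str = ""
    temporary_credentials: bool = False
    security_token: Optional[str] = None

    def problems(self) -> List[str]:
        """Return the configuration problems found (empty list if none)."""
        issues = []
        if self.temporary_credentials and not self.security_token:
            issues.append("Security Token is required with a temporary AK/SK")
        for label, value in (("Bucket Name", self.bucket_name),
                             ("Endpoint", self.endpoint)):
            if not value:
                issues.append(f"{label} is empty")
        return issues

    def visible_files(self, object_names: Iterable[str]) -> List[str]:
        """Keep only the object names that start with the configured prefix."""
        return [name for name in object_names if name.startswith(self.file_prefix)]
```

A usage example with placeholder values:

```python
source = ObsWorkloadSource(
    access_key_id="AK_PLACEHOLDER",
    secret_access_key="SK_PLACEHOLDER",
    bucket_name="my-bucket",
    endpoint="obs.example.com",  # placeholder, not a real endpoint
    file_prefix="audit/",
    temporary_credentials=True,
)
print(source.problems())
print(source.visible_files(["audit/a.gz", "other/b.gz"]))
```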

    • Destination database information
      Figure 6 Destination database information
      Table 8 Destination database settings

      Parameter

      Description

      DB Instance Name

      The GaussDB(for MySQL) instance you selected when creating the task. This parameter cannot be changed.

      Replay Connection IP Address

      The primary node IP address of a DB instance is selected by default, but if the instance has a proxy IP address, you can also select that address if needed.

      Database Username

      The username for accessing the destination database.

      Database Password

      The password for the database username.

      The username and password of the destination database are encrypted and temporarily stored on the DRS instance host during the workload replay. After the task is deleted, the username and password are permanently deleted.

    • Task Settings
      Figure 7 Task settings
      Table 9 Task settings

      Parameter

      Description

      SQL Type

      Select the SQL type to be replayed to the destination database. The default value is SELECT. The available options are SELECT, INSERT, UPDATE, DELETE, and DDL.

      Replay Mode

      You can select Performance or Transaction.

      • In performance mode, you can set how many concurrent connections are allowed. SQL statements are replayed to the destination database based on a set number of connections. The SQL execution sequence in the source database may be different from that in the destination database. The replay performance is better.
      • In transaction mode, you cannot set how many concurrent connections are allowed. The number of connections is dynamically adjusted based on the connections in the source database logs to ensure that transaction SQL statements in the same connection of the source database are executed in sequence.

      Filter out SQLs

      The system filters out SQL statements that match the conditions you enter, so that they are not replayed to the destination database. Matching is fuzzy and case-insensitive. You can configure up to 10 filtering rules.

      Filter out SQLs Without Conditions

      Filters out SELECT, UPDATE, and DELETE statements that do not contain a WHERE clause.

      Maximum Concurrent Connections

      The number of replay threads configured for a workload replay task. The default value is 8. The value ranges from 1 to 100.

      Acceleration Configuration

      The number of SQL statements replayed, expressed as a percentage of the number executed on the source database within the same period. The actual rate cannot exceed the maximum performance of the workload replay task. The value can be Unlimited, 100%, or 200%.
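
The two filtering options in Table 9 can be approximated as follows. This is a rough sketch and not how DRS implements filtering. The function name should_replay is hypothetical, "fuzzy matching" is assumed to mean a case-insensitive substring match, and the WHERE check is a plain keyword search that would also match the word inside a string literal or a subquery.

```python
import re
from typing import List

MAX_FILTER_RULES = 10
_WHERE_RE = re.compile(r"\bwhere\b", re.IGNORECASE)
_CONDITIONAL_TYPES = ("SELECT", "UPDATE", "DELETE")


def should_replay(sql: str, rules: List[str], drop_unconditional: bool = False) -> bool:
    """Return False if the statement would be filtered out, True otherwise."""
    if len(rules) > MAX_FILTER_RULES:
        raise ValueError("at most 10 filtering rules are allowed")

    # Filter out SQLs: assumed to be a case-insensitive substring match.
    lowered = sql.lower()
    if any(rule and rule.lower() in lowered for rule in rules):
        return False

    # Filter out SQLs Without Conditions: SELECT/UPDATE/DELETE with no WHERE.
    if drop_unconditional:
        words = sql.split(None, 1)
        first = words[0].upper() if words else ""
        if first in _CONDITIONAL_TYPES and not _WHERE_RE.search(sql):
            return False
    return True
```

With the rule "audit_tmp", the statement "select * from AUDIT_TMP where id = 1" is filtered out because matching ignores case. With drop_unconditional enabled, "DELETE FROM t" is filtered out, while "INSERT INTO t VALUES (1)" is still replayed.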

  4. On the Check Task page, check the replay task.

    • If any check fails, review the cause and rectify the fault. After the fault is rectified, click Check Again.
    • If all check items are successful, click Next.

  5. On the displayed page, specify Start Time, Send Notifications, SMN Topic, and Stop Abnormal Tasks After. Confirm that the settings are correct, and then click Submit.

    Figure 8 Task startup settings
    Table 10 Task startup settings

    Parameter

    Description

    Start Time

    Set Start Time to Start upon task creation or Start at a specified time based on site requirements.

    NOTE:

    After a replay task is started, the performance of the source and destination databases may be affected. You are advised to start a replay task during off-peak hours.

    Send Notifications

    This parameter is optional. If it is enabled, the system sends a notification to the specified recipients when an exception occurs during workload replay.

    SMN Topic

    This parameter is available only after you enable Send Notifications, create a topic on the SMN console, and add a subscriber to the topic.

    For details, see Simple Message Notification User Guide.

    Stop Abnormal Tasks After

    Number of days after which an abnormal task automatically stops. The value must range from 14 to 100. The default value is 14.

    NOTE:

    Tasks in the abnormal state are still charged. If tasks remain in the abnormal state for a long time, they cannot be resumed. Abnormal tasks running longer than the period you set (unit: day) will automatically stop to avoid unnecessary fees.
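
The constraints in Table 10 can be expressed as a short validation routine. The sketch below is illustrative only. The class StartupSettings and the function validate_startup are hypothetical, and treating SMN Topic as mandatory once Send Notifications is enabled is an assumption rather than a documented rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StartupSettings:
    """Hypothetical container for the task startup settings."""
    send_notifications: bool = False
    smn_topic: Optional[str] = None
    stop_abnormal_after_days: int = 14  # documented default


def validate_startup(settings: StartupSettings) -> List[str]:
    """Return the problems found in the startup settings (empty list if none)."""
    errors = []
    # Assumption: a topic must be chosen once notifications are enabled.
    if settings.send_notifications and not settings.smn_topic:
        errors.append("SMN Topic is required when Send Notifications is enabled")
    if not 14 <= settings.stop_abnormal_after_days <= 100:
        errors.append("Stop Abnormal Tasks After must be between 14 and 100 days")
    return errors
```

The default settings pass validation, whereas StartupSettings(stop_abnormal_after_days=7) is rejected because the value is below the minimum of 14 days.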

  6. After the task is submitted, view and manage it on the Workload Replay Management page.

    • You can view the task status. For more information about task status, see Task Statuses.
    • You can click the refresh icon in the upper right corner to view the latest task status.
    • By default, DRS retains a task in the Configuration state for three days. After three days, DRS automatically deletes background resources, but the task status remains unchanged. When you reconfigure the task, DRS applies for resources for the task again.
    • For a public network task, DRS needs to delete background resources after you stop the task. The EIP bound to the task cannot be restored to the Unbound state until background resources are deleted.