Updated on 2023-11-27 GMT+08:00

Scaling Out a Cluster

If you need more compute and storage resources, add nodes on the management console to scale out the cluster.

  • If a cluster is billed in yearly/monthly mode, new nodes in the cluster will also be billed in this mode.
  • When scaling out a standard data warehouse cluster, use the same storage specifications as those of the existing cluster.
  • Nodes cannot be added to a hybrid data warehouse (standalone).

After the data in a data warehouse is deleted, the occupied disk space may not be released, resulting in dirty data and disk waste. Therefore, if you need to scale out your cluster due to insufficient storage capacity, run the VACUUM command to reclaim the storage space first. If the used storage capacity is still high after you run the VACUUM command, you can scale out your cluster. For details about the VACUUM syntax, see section VACUUM in SQL Syntax Reference.
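
For reference, the following is a minimal sketch of reclaiming space from an SQL client before deciding on a scale-out. The table name public.sales_fact is a placeholder; VACUUM FULL rewrites the table and holds an exclusive lock while it runs, so use it only in a maintenance window.

    VACUUM public.sales_fact;        -- Reclaim space occupied by deleted rows.
    VACUUM FULL public.sales_fact;   -- Optional: rewrite the table and return space to the OS (exclusive lock).
    ANALYZE public.sales_fact;       -- Refresh optimizer statistics afterwards.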

Impact on the System

  • Before the scale-out, exit the client connections that have created temporary tables because temporary tables created before or during the scale-out will become invalid and operations performed on these temporary tables will fail. Temporary tables created after the scale-out will not be affected.
  • After you start a scale-out task, the cluster automatically creates a snapshot before the scale-out begins.
  • During the scale-out, functions such as cluster restart, scale-out, snapshot creation, database administrator password resetting, and cluster deletion are disabled.
  • During an offline scale-out, the cluster automatically restarts. Therefore, the cluster stays Unavailable for a period of time. After the cluster is restarted, the status becomes Available. After scale-out, the system dynamically redistributes user data among all nodes in the cluster.
  • During offline scale-out, stop all services or run only a few query statements. During table redistribution, a shared lock is added to tables, and all insert, update, and delete operations as well as DDL operations on these tables are blocked for a long time, which may cause a lock wait timeout. After a table is redistributed, you can access it again. Do not run queries that take more than 20 minutes during the redistribution (20 minutes is the default wait time for acquiring the write lock during redistribution); otherwise, data redistribution may fail due to a lock wait timeout. A query for spotting such long-running sessions is sketched after this list.
  • During the node addition phase of an online scale-out, the cluster is locked and database objects are checked. Do not create or delete databases or tablespaces during this period; otherwise, the cluster may fail to be locked.
  • During online scale-out, you can perform insert, update, and delete operations on tables, but data updates are still blocked for a short period of time. Redistribution consumes a large amount of CPU and I/O resources, which greatly affects job performance. Therefore, perform redistribution when services are stopped or during periods of light load. Phase-based scale-out is also recommended: perform high-concurrency redistribution during periods of light load, and stop redistribution or perform low-concurrency redistribution during periods of heavy load.
  • If a new snapshot is created for the cluster after the scale-out, the new snapshot contains data on the newly added nodes.
  • If the cluster scale-out fails, the database automatically performs the rollback operation in the background so that the number of nodes in the cluster can be restored to that before the scale-out.
    • If the rollback is successful and the cluster can be normally used, you can perform Scale Out again. If the scale-out still fails, contact the technical support.
    • If the database fails to be rolled back due to some exceptions, the cluster may become Unavailable. In this case, you cannot perform Scale Out or restart the cluster. Contact the technical support.
  • In the cloud-native 9.0.2 scale-out scenario, if the number of buckets allocated to each DN falls outside the range [3, 20], automatic bucket scaling is triggered. You can view the total number of buckets using the GUC parameter table_buckets (a sketch of checking this parameter follows this list).
    • Currently, bucket scaling supports only the offline mode. The procedure is the same as the existing scale-out procedure; the system automatically determines whether bucket scaling is required and performs it.
    • During the scaling process, the cluster restarts. The restart takes several minutes. During the restart, all connections are closed.
    • After the restart is complete, the database can be read but cannot be written until data redistribution is complete.

    For example, if the total number of buckets in the current cluster is 32, the logical cluster has 9 DNs, and the number of DNs needs to be expanded to 15, then 32/15 = 2 (rounded down), which is outside the range [3, 20], so automatic bucket scaling is triggered.
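
For reference, a minimal sketch of checking the bucket configuration from an SQL client, assuming the parameter can be read with SHOW; the DN counts below are taken from the example above.

    SHOW table_buckets;   -- Total number of buckets in the cluster, for example 32.
    -- Buckets per DN = table_buckets / number of DNs, rounded down.
    -- Before scale-out: 32 / 9  = 3, which is within [3, 20].
    -- After scale-out:  32 / 15 = 2, which is outside [3, 20], so automatic bucket scaling is triggered.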
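
Before an offline redistribution, it can help to identify sessions that have been running queries for a long time. The following is a minimal sketch, assuming the PostgreSQL-compatible pg_stat_activity view is available, using the 20-minute default write-lock wait time as the threshold.

    -- List sessions whose current query has been running for more than 20 minutes.
    SELECT pid, usename, datname, query_start, now() - query_start AS runtime, query
    FROM pg_stat_activity
    WHERE state = 'active'
      AND now() - query_start > interval '20 minutes';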

Prerequisites

  • The cluster to be scaled out is in the Available or Unbalanced state.
  • The number of nodes to be added must be less than or equal to the number of available nodes. Otherwise, the scale-out is not allowed.
  • To scale out a cluster as an IAM user, ensure that the IAM user has permissions for VPC, ECS, and BMS.

Scaling Out a Cluster

  • A cluster becomes read-only during scale-out. Exercise caution when performing this operation.
  • To ensure data security, you are advised to create a snapshot before the scale-out. For details about how to create a snapshot, see Manual Snapshots.
  • After you start a scale-out, the system first checks for scale-out prerequisites. If your cluster fails the check, modify configurations as prompted and try again. For details, see What Do I Do If the Scale-out Check Fails?
  1. Log in to the GaussDB(DWS) management console.
  2. Click Clusters.

    All clusters are displayed by default.

  3. In the Operation column of the target cluster, choose More > Scale Node > Scale Out. The scale-out page is displayed.

    In yearly/monthly billing mode, the number of nodes in the discount package is not displayed. The time remaining and the expiration time are displayed.

    Figure 1 Cluster scale-out

  4. Specify the number of nodes to be added.

    • The number of nodes after the scale-out must be at least three more than the original number. The maximum number of nodes that can be added depends on the available quota, and the total number of nodes after the scale-out cannot exceed 256.

      If the node quota is insufficient, click Increase quota to submit a service ticket and apply for higher node quota.

    • The flavor of the new nodes must be the same as that of the existing nodes in the cluster.
    • The new nodes use the same VPC, subnet, and security group as the original cluster.
    • The number of nodes to be added to a multi-AZ cluster must be a multiple of 3.

  5. Configure advanced parameters.

    • If you choose Default, Scale Online will be disabled, Auto Redistribution will be enabled, and Redistribution Mode will be Offline by default.
    • If you choose Custom, you can configure the following advanced configuration parameters for online scale-out:
      • Scale Online: Online scale-out can be enabled. During online scale-out, data can be added, deleted, modified, and queried in the database, and some DDL syntax is supported. Errors are reported for unsupported syntax.
      • Auto Redistribution: Automatic redistribution can be enabled. If automatic redistribution is enabled, data will be redistributed immediately after the scale-out is complete. If this function is disabled, only the scale-out is performed. In this case, to redistribute data, select a cluster and choose More > Scale Node > Redistribute.
      • Redistribution Concurrency: If automatic redistribution is enabled, you can set the number of concurrent redistribution tasks. The value range is 1 to 32. The default value is 4.
      • Redistribution Mode: It can be set to Online or Offline. After confirming that the information is correct, click OK in the displayed dialog box.

  6. Click Next: Confirm.
  7. Click Submit.

    • After you submit the scale-out application, the task information of the cluster changes to Scaling out, and the process takes several minutes. During the scale-out, the cluster automatically restarts, so the cluster status stays Unavailable for a while. After the restart, the status changes to Available. In the last phase of the scale-out, the system dynamically redistributes user data in the cluster, during which the cluster is in the Read-only state.
    • The scale-out is successful only when the cluster is in the Available state and the Scaling out task information is no longer displayed. You can then use the cluster. To verify the new nodes from an SQL client, see the sketch after this procedure.
    • If Scale-out failed is displayed, the cluster fails to be scaled out.
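
After the scale-out completes, the new nodes can also be verified from an SQL client. The following is a minimal sketch, assuming the PGXC_NODE system catalog is accessible to your database user.

    -- Count the nodes known to the cluster; the Datanode count should reflect
    -- the newly added nodes once the scale-out has succeeded.
    SELECT node_type, count(*) AS node_count
    FROM pgxc_node
    GROUP BY node_type;   -- 'C' = Coordinator (CN), 'D' = Datanode (DN)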

Scaling Out with Idle Nodes

For a large-scale cluster, to ensure reliability, prepare ECS or BMS nodes in advance by referring to Adding Nodes, and then scale out the cluster with these idle nodes.

  • Disable automatic redistribution when you scale out a large-scale cluster to facilitate retries upon failures for improved reliability.
  • After the scale-out is complete, manually perform redistribution to ensure that multiple retries can be performed in this phase.

Precautions

  • Available nodes must be added to the cluster in advance so that there are idle nodes to use for the scale-out.
  • The anti-affinity rule requires that the number of idle nodes to be added be an integer multiple of the cluster ring size. For example, if the ring size is 3, add 3, 6, or 9 idle nodes.
  • After you start a scale-out, the system first checks for scale-out prerequisites. If your cluster fails the check, modify configurations as prompted and try again. For details, see What Do I Do If the Scale-out Check Fails?

Procedure

  1. Log in to the GaussDB(DWS) management console.
  2. Click Clusters. All clusters are displayed by default.
  3. In the Operation column of the target cluster, choose More > Scale Node > Scale Out.

    If there are idle nodes in the cluster, the system displays a message asking you whether to add nodes.

  4. Configure the scale-out and redistribution parameters as required. For details, see Scaling Out a Cluster.

    Then click Next: Confirm.

  5. Confirm the information and click Submit.

Viewing Scaling Details

  1. Log in to the GaussDB(DWS) management console.
  2. Choose Clusters.
  3. In the Task Information column of a cluster, click View Details.

  4. Check the scale-out status of the cluster on the scaling details page.