Updated on 2024-09-06 GMT+08:00

Hot Row Update

Hot rows indicate the rows that are frequently added, deleted, modified, and queried in a database in the following scenarios, such as flash sales, concert ticket booking, and train ticket booking for hot routes. When a transaction updates data in a row, the row needs to be locked. Only one transaction can update a row at a time, and other transactions can continue to be executed only after the row lock is released, so the performance of existing hot row update is poor. Traditional sharding policies are ineffective to improve the processing performance.

GaussDB(for MySQL) optimizes hot row update, which can be automatically or manually enabled. After hot row update is enabled, hot rows can be upgraded efficiently.

Principles

The following figure shows the architecture of GaussDB(for MySQL) hot row update. There are two parts: Counter_hash and Group_hash. Counter_hash is used to automatically determine which rows are hot rows. Group_hash consists of multiple hotspot groups and is used to update hot rows. Each hotspot group corresponds to a hot row. Each hotspot group consists of multiple batches to ensure that the statements that update hot rows can be committed alternately.

Constraints

  • The kernel version of your GaussDB(for MySQL) instance is 2.0.54.240600 or later.
  • Usage constraints:
    • In a WHERE condition, only the primary key or unique index can be used for equivalent matching, and only one record can be updated.
    • Index columns cannot be modified.
    • The modifications take effect only for the columns of the integer type.
    • Only two elements in a hot row record can be added or subtracted. The first element is the same as the left side of the equal sign (=) and meets the constraints such as unique index. Value assignment is not allowed. Assuming that the column c is a to-be-modified column and the column d is a common column, only operations similar to c=c+1 or c=c-1 are allowed, and operations such as c=d+1, c=1+c, c=c+1+1, and c=1+c+1 are not allowed.
    • This function applies only for implicit transactions. That is, AUTOCOMMIT must be set to ON and cannot be used in BEGIN and COMMIT transactions.
    • HOTSPOT is used to explicitly mark hot row update transactions, or set rds_hotspot_auto_detection_threshold to a value other than 0 to enable automatic hot row update identification. For details about how to use rds_hotspot_auto_detection_threshold, see the parameter description.
    • The isolation level of transactions in a database must be READ COMMITTED (RC).
    • This function cannot be used in stored functions, triggers, and events. Otherwise, the following error is reported on the client:
      HOTSPOT hints can not be used in stored function, trigger or event
  • Behavior change: In a hotspot transaction group, except the transactions that failed to be executed or that were killed in the update phase, other transactions are committed in batches and recorded in redo logs and undo logs. These transactions can only be committed or rolled back in a centralized manner and cannot be rolled back separately. Dozens to hundreds of transactions can be committed in each batch.

Parameter Description

Table 1 Parameter description

Parameter

Description

rds_hotspot

Whether to enable hot row update. ON: The function is enabled.

rds_hotspot_follower_wait_commit_interval

Sleep time in microseconds before a follower transaction is blocked when waiting for the leader transaction logs to be persisted. For instances with slow log persistence, you are advised to increase the value. For instances with fast log persistence, you are advised to set this parameter to 0 so that follower transactions are blocked without sleeping.

rds_hotspot_leader_wait_follower_interval

Time interval, in microseconds, that the leader transaction check whether the follower transactions update records In lower concurrency, you are advised to set this parameter to a smaller value to avoid performance deterioration. In higher concurrency, you are advised to set this parameter to a larger value to improve performance. If queries per second (QPS) exceeds 200,000, you are advised to set this parameter to 100 or a larger value.

rds_hotspot_auto_detection_threshold

Whether to enable automatic identification for hot rows. The value 0 indicates that the function is disabled. If the value is not 0, it indicates the threshold for identifying hot rows. When the number of row updates per second exceeds the threshold, the function is enabled.

rds_hotspot_batch_size_lower_limit

Recommended minimum size of each batch of hot transactions. The size of each batch of transactions should be as large as possible. However, this is not strictly guaranteed. When the leader finds that all followers to be waited for have arrived, the batch of transactions enters the commit state.

rds_hotspot_max_memory_size

Maximum memory occupied by groups and counters during a hot row update. When the memory occupied by a group exceeds the threshold, the memory occupied by the group is cleared. When the memory occupied by a counter exceeds the threshold, the memory occupied by the counter is cleared. The system attempts to clear the old memory only when a new memory is applied for.

rds_hotspot_enable_time_statistics

Whether to enable status statistics related to the update time of hot rows. The value ON indicates this function is enabled.

Status Description

Table 2 Status description

Status

Description

Hotspot_total_trx

Total transactions using the hot row upgrade function.

Hotspot_update_errors

Transactions that failed to update hot rows. These transactions do not affect the commit of other transactions that update how rows.

Hotspot_trx_rollbacked

Number of transactions that are successfully updated but are finally rolled back. When the leader transaction decides to roll back, all follower transactions roll back together.

Hotspot_trx_committed

Number of transactions that are successfully committed to update hot rows.

Hotspot_batch_size

Number of transactions that are to update hot rows at a time. These transactions are committed in batches.

Hotspot_batch_wait_time

Time in microseconds that the next batch of transactions that update hot rows waits for the previous batch of transaction to release the lock. When a batch of transactions to update rows is committed, the rows are locked.

Hotspot_leader_wait_follower_time

Time in microseconds for the leader to wait for the followers in the current batch to complete record update.

Hotspot_leader_total_time

Total time spent by the leader transaction in updating hot rows in the current batch, in microseconds.

Hotspot_follower_total_time

Total time spent by a follower transaction in updating hot rows in the current batch, in microseconds.

Hotspot_follower_wait_commit_time

Time for a follower to wait for the leader to persist logs in a batch, in microseconds.

Hotspot_group_counts

Number of groups. Each hot row update corresponds to a group, and transactions in the group are committed in batches.

Hotspot_counter_counts

Number of counters. Counters are used to automatically determine whether a hot row is updated. When the statistical value in a counter meets the requirement, a group is created for hot row update.

New Keywords

The following table lists new keywords.

Table 3 New keywords

Keyword

Description

HOTSPOT

Indicates that hot row update is enabled.

NOT_MORE_THAN

(Optional) Indicates that the target value is not greater than a certain value.

NOT_LESS_THAN

(Optional) Indicates that the target value is not less than a certain value.

The preceding keywords are placed at the end of a SQL statement. HOTSPOT must be placed at the beginning. NOT_MORE_THAN and NOT_LESS_THAN can be placed at any position.

For example, if id is a primary key column and c is an INT column, the following syntax is supported:

UPDATE c=c+1 where id=10 HOTSPOT;
UPDATE c=c+1 where id=10 HOTSPOT NOT_MORE_THAN 100; // The value of the c column is not greater than 100.
UPDATE c=c-1 where id=10 HOTSPOT NOT_LESS_THAN 0; // The value of the c column is not less than 0.
UPDATE c=c+1 where id=10 HOTSPOT NOT_MORE_THAN 100 NOT_LESS_THAN 0; // The value of the c column is not greater than 100 and not less than 0.
UPDATE c=c+1 where id=10 HOTSPOT NOT_LESS_THAN 0 NOT_MORE_THAN 100; // The value of the c column is not greater than 100 and not less than 0.

When any value exceeds the value of NOT_MORE_THAN or NOT_LESS_THAN, the following error is reported to the client:

HOTSPOT field value exceeds limit

Example

  1. Create a table and prepare data.
    CREATE TABLE test.hotspot1 (
      `id` int NOT NULL primary key,
      `c` int NOT NULL DEFAULT '0'
    ) ENGINE=InnoDB;
    INSERT INTO test.hotspot1 VALUES (1, 1);
  2. Enable hot row update.
    SET GLOBAL rds_hotspot = ON;
  3. Change the isolation level to AUTOCOMMIT.
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SET SESSION AUTOCOMMIT = ON;
  4. Initiates an update for rows with HOTSPOT keyword.
    UPDATE test.hotspot1 SET c=c+1 WHERE id=1 HOTSPOT;
  5. Check the update status of hot rows.
    SHOW STATUS like "%hotspot%";

    Performance test

    • Test environments

      Instance specifications: 8 vCPUs | 32 GB, 32 vCPUs | 128 GB

      ECS specifications: 32 vCPUs | 64 GB

      Region: CN North-Beijing

      Test tool: sysbench-1.0.18

      Data model:

      • One table with one data record
      • Eight tables, with each table containing one data record
    • Parameter configuration:

      rds_hotspot=ON

      transaction_isolation=READ-COMMITTED

      max_prepared_stmt_count=1048576

      rds_global_sql_log_bin=OFF

    • Test method

      Definition of the data tables required for the test:

      CREATE TABLE sbtest (id int NOT NULL AUTO_INCREMENT,k int NOT NULL DEFAULT '0',PRIMARY KEY (id));

      Test statement:

      UPDATE sbtest%u SET k=k+1 WHERE id=1 hotspot;

    • Test scenarios and results

      Test scenario 1: updating a single hot row of an instance with 8 vCPUs and 32 GB

      Test result: The performance of all concurrent requests were improved to different degrees. The performance of 64 or less concurrent requests was not improved significantly, but the performance of 128 or more concurrent requests were improved significantly (up to 9.26 times).

      Test scenario 2: Updating a single hot row of an instance with 32 vCPUs and 128 GB

      Test result: The performance of 128 or more concurrent requests was improved significantly, by 639 times.

      Test scenario 3: Updating eight hot rows of an instance with 32 vCPUs and 128 GB

      Test result: The performance of 256 or less concurrent requests was not improved, but the performance of 512 or more concurrent requests was improved significantly, by 78.