Updated on 2024-12-30 GMT+08:00

Hot Row Update

Hot rows indicate the rows that are frequently added, deleted, modified, and queried in a database in the following scenarios, such as flash sales, concert ticket booking, and train ticket booking for hot routes. When a transaction updates data in a row, the row needs to be locked. Only one transaction can update a row at a time, and other transactions can continue to be executed only after the row lock is released, so the performance of existing hot row update is poor. Traditional sharding policies are ineffective to improve the processing performance.

TaurusDB optimizes hot row update, which can be automatically or manually enabled. After hot row update is enabled, hot rows can be updated efficiently.

Principles

The following figure shows the architecture of TaurusDB hot row update. There are two parts: Counter_hash and Group_hash. Counter_hash is used to automatically determine which rows are hot rows. Group_hash consists of multiple hotspot groups and is used to update hot rows. Each hotspot group corresponds to a hot row. Each hotspot group consists of multiple batches to ensure that the statements that update hot rows can be committed alternately.

Constraints

  • The kernel version of your TaurusDB instance must be 2.0.54.240600 or later.
  • Usage constraints:
    • In a WHERE condition, only a primary key or unique index can be used for equality matching, and only one record can be updated.
    • Index columns cannot be modified.
    • The modifications apply only for the columns of the integer type.
    • Only two elements in a hot row record can be added or subtracted. The first element is the same as the left side of the equal sign (=) and meets the constraints such as unique indexes. Value assignment is not allowed. Assuming that c is the column to be modified and d is a common column, only operations similar to c=c+1 or c=c-1 are allowed. Operations such as c=d+1, c=1+c, c=c+1+1, and c=1+c+1 are not allowed.
    • This function applies only for implicit transactions. That is, AUTOCOMMIT must be set to ON and cannot be used in BEGIN and COMMIT transactions.
    • You need to use HOTSPOT to explicitly mark hot row update transactions, or set rds_hotspot_auto_detection_threshold to a value other than 0 to enable automatic hot row update identification. For details about how to use rds_hotspot_auto_detection_threshold, see the parameter description.
    • The isolation level of transactions in a database must be READ COMMITTED (RC).
    • This function cannot be used in stored functions, triggers, or events. Otherwise, the following error is reported on the client:
      HOTSPOT hints can not be used in stored function, trigger or event
  • Behavior change: In a hotspot transaction group, except the transactions that failed to be executed or that were killed in the update phase, other transactions are committed in batches and recorded in redo logs and undo logs. These transactions can only be committed or rolled back in batches and cannot be rolled back separately. Dozens to hundreds of transactions can be committed in each batch.

Parameter Description

Table 1 Parameter description

Parameter

Description

rds_hotspot

Whether to enable hot row update. ON: The function is enabled.

rds_hotspot_follower_wait_commit_interval

Sleep time in microseconds before a follower transaction is blocked when waiting for the leader transaction logs to be persisted. For instances with slow log persistence, you are advised to increase the value. For instances with fast log persistence, you are advised to set this parameter to 0 so that follower transactions are blocked without sleeping.

rds_hotspot_leader_wait_follower_interval

Time interval, in microseconds, that the leader transaction in a hot row update waits for the follower transaction to update records. In low concurrency, you are advised to set this parameter to a smaller value to avoid performance deterioration. In high concurrency, you are advised to set this parameter to a larger value to improve performance. If queries per second (QPS) exceeds 200,000, you are advised to set this parameter to 100 or a larger value.

rds_hotspot_auto_detection_threshold

Whether to enable automatic identification for hot rows. The value 0 indicates that the function is disabled. If the value is not 0, it indicates the threshold for identifying hot rows. When the number of row updates per second exceeds the threshold, hot row update is enabled.

rds_hotspot_batch_size_lower_limit

Recommended minimum size for each batch of hot transactions. Each batch should strive to reach this size as much as possible. However, this is not strictly guaranteed. When the leader finds that all followers to be waited for have arrived, the batch of transactions enters the commit state.

rds_hotspot_max_memory_size

Maximum memory occupied by groups and counters during a hot row update. When the memory occupied by a group exceeds the threshold, the memory occupied by the group is cleared. When the memory occupied by a counter exceeds the threshold, the memory occupied by the counter is cleared. The system attempts to clear the old memory only when a new memory is applied for.

rds_hotspot_enable_time_statistics

Whether to enable status statistics related to the update time of hot rows. The value ON indicates this function is enabled.

Status Description

Table 2 Status description

Status

Description

Hotspot_total_trx

Total transactions using the hot row upgrade function.

Hotspot_update_errors

Transactions that failed to update hot rows. These transactions do not affect the commit of other transactions that update how rows.

Hotspot_trx_rollbacked

Number of transactions that are successfully updated but are finally rolled back. When the leader transaction decides to roll back, all follower transactions roll back together.

Hotspot_trx_committed

Number of transactions that are successfully committed to update hot rows.

Hotspot_batch_size

Number of transactions that are to update hot rows at a time. These transactions are committed in batches.

Hotspot_batch_wait_time

Time in microseconds that the next batch of transactions that are to update hot rows waits for the previous batch of transaction to release the lock. When a batch of transactions to update rows is committed, the rows are locked.

Hotspot_leader_wait_follower_time

Time in microseconds for the leader to wait for the followers in the current batch to complete record update.

Hotspot_leader_total_time

Total time spent by the leader transaction in updating hot rows in the current batch, in microseconds.

Hotspot_follower_total_time

Total time spent by a follower transaction in updating hot rows in the current batch, in microseconds.

Hotspot_follower_wait_commit_time

Time for a follower to wait for the leader to persist logs in the current batch, in microseconds.

Hotspot_group_counts

Number of groups. Each hot row update corresponds to a group, and transactions in the group are committed in batches.

Hotspot_counter_counts

Number of counters. Counters are used to automatically determine whether a hot row is updated. When the statistical value in a counter meets the requirement, a group is created for hot row update.

New Keywords

The following table lists new keywords.

Table 3 New keywords

Keyword

Description

HOTSPOT

Indicates that hot row update is enabled.

NOT_MORE_THAN

(Optional) Indicates that the target value is not greater than a certain value.

NOT_LESS_THAN

(Optional) Indicates that the target value is not less than a certain value.

The preceding keywords are placed at the end of an SQL statement. HOTSPOT must be placed at the beginning. NOT_MORE_THAN and NOT_LESS_THAN can be placed at any position.

For example, if id is a primary key column and c is an INT column, the following syntax is supported:

UPDATE c=c+1 where id=10 HOTSPOT;
UPDATE c=c+1 where id=10 HOTSPOT NOT_MORE_THAN 100; // The value of the c column is not greater than 100.
UPDATE c=c-1 where id=10 HOTSPOT NOT_LESS_THAN 0; // The value of the c column is not less than 0.
UPDATE c=c+1 where id=10 HOTSPOT NOT_MORE_THAN 100 NOT_LESS_THAN 0; // The value of the c column is not greater than 100 and not less than 0.
UPDATE c=c+1 where id=10 HOTSPOT NOT_LESS_THAN 0 NOT_MORE_THAN 100; // The value of the c column is not greater than 100 and not less than 0.

When any value exceeds the value of NOT_MORE_THAN or NOT_LESS_THAN, the following error is reported to the client:

HOTSPOT field value exceeds limit

Example

  1. Create a table and prepare data.
    CREATE TABLE test.hotspot1 (
      `id` int NOT NULL primary key,
      `c` int NOT NULL DEFAULT '0'
    ) ENGINE=InnoDB;
    INSERT INTO test.hotspot1 VALUES (1, 1);
  2. Enable hot row update.
    SET GLOBAL rds_hotspot = ON;
  3. Change the isolation level to AUTOCOMMIT.
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SET SESSION AUTOCOMMIT = ON;
  4. Initiate an update with HOTSPOT keyword.
    UPDATE test.hotspot1 SET c=c+1 WHERE id=1 HOTSPOT;
  5. Check the update status of hot rows.
    SHOW STATUS like "%hotspot%";

Performance test

  • Test environments

    Instance specifications: 8 vCPUs | 32 GB, 32 vCPUs | 128 GB

    ECS specifications: 32 vCPUs | 64 GB

    Region: CN North-Beijing4

    Test tool: sysbench-1.0.18

    Data models:

    • One table with one data record
    • Eight tables, with each table containing one data record
  • Parameter settings

    rds_hotspot=ON

    transaction_isolation=READ-COMMITTED

    max_prepared_stmt_count=1048576

    rds_global_sql_log_bin=OFF

  • Test method

    Definition of the data tables required for the test:

    CREATE TABLE sbtest (id int NOT NULL AUTO_INCREMENT,k int NOT NULL DEFAULT '0',PRIMARY KEY (id));

    Test statement:

    UPDATE sbtest%u SET k=k+1 WHERE id=1 hotspot;

  • Test scenarios and results

    Test scenario 1: updating a single hot row of an instance with 8 vCPUs and 32 GB

    Test result: The performance of all concurrent requests was improved to different degrees. The performance of 64 or less concurrent requests was not improved significantly, but the performance of 128 or more concurrent requests was improved significantly (up to 9.26 times).

    Test scenario 2: Updating a single hot row of an instance with 32 vCPUs and 128 GB

    Test result: The performance of 128 or more concurrent requests was improved significantly, by 639 times.

    Test scenario 3: Updating eight hot rows of an instance with 32 vCPUs and 128 GB

    Test result: The performance of 256 or less concurrent requests was not improved, but the performance of 512 or more concurrent requests was improved significantly, by 78 times.