Self-Healing of Read Replicas upon Replication Latency
TaurusDB is a cloud-native database with decoupled compute and storage. The primary node and read replicas share the underlying storage data. To keep the data cached in memory consistent, read replicas, after communicating with the primary node, read the redo log generated by the primary node from Log Stores and use it to update their in-memory caches.
Communications Between the Primary Node and Read Replicas
Although the primary node and read replicas share underlying storage data, they still need to communicate with each other.
- Content sent by the primary node to read replicas: a description of the redo log, such as the latest LSN of the redo log and the information needed to read the log internally.
- Content sent by read replicas to the primary node:
  - Views of the read replicas. The views store the transaction list. The primary node can purge undo logs based on the view of each read replica.
  - recycle_lsn values of the read replicas. recycle_lsn indicates the minimum LSN of the data pages that a read replica may read; a read replica never reads a data page with an LSN smaller than its recycle_lsn value. The primary node collects the recycle_lsn value of each read replica to determine the position up to which the underlying redo log can be cleared.
  - Basic information about each read replica, such as the ID of the read replica and the timestamp of its latest message. The primary node uses this information to manage read replicas.
After the communications, the read replicas can read the redo log and update the visibility of data.
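The following is a minimal sketch of the payloads described above. It is illustrative only, not TaurusDB source code; all class, field, and function names are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PrimaryToReplicaMsg:
    latest_lsn: int      # latest LSN of the redo log on the primary node
    log_read_info: str   # description needed to read the log from Log Stores

@dataclass
class ReplicaToPrimaryMsg:
    replica_id: str            # identifies the read replica
    timestamp: float           # timestamp of the latest message, for liveness
    active_txn_view: List[int] # transaction list; bounds undo log purging
    recycle_lsn: int           # minimum LSN of pages this replica may still read

def redo_clear_position(msgs: List[ReplicaToPrimaryMsg]) -> int:
    # The primary node can safely clear the underlying redo log below the
    # smallest recycle_lsn reported by any read replica.
    return min(m.recycle_lsn for m in msgs)
```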
How Read Replica Latency Is Calculated
Read replica latency refers to the amount of time between when data is updated on the primary node and when that updated data can be read on a read replica.

Read replicas read the redo log to update cached data. visible lsn records the LSN up to which a read replica has replayed the redo log; it is the maximum LSN of the data pages that can be read from that replica. flush_to_disk_lsn records the LSN of the latest redo log generated each time a data record is updated or inserted on the primary node; it is the maximum LSN of the data pages accessible on the primary node.

Read replica latency is calculated from the values of visible lsn and flush_to_disk_lsn. For example, at time t1, flush_to_disk_lsn is 100 and visible lsn is 80. After read replicas replay the redo log for a period of time, at time t2, flush_to_disk_lsn is 130 and visible lsn is 100. The read replica latency is then t2 - t1: the time it took visible lsn to catch up with the flush_to_disk_lsn value recorded at t1.
How Read Replicas Advance the Visible LSN
The speed at which read replicas advance the visible LSN is the crucial factor affecting latency.
Read replicas advance the visible LSN as follows (see the sketch after these steps):
1. Read replicas communicate with the primary node to obtain the LSN and description of the latest redo log.
2. Read replicas read the redo log from Log Stores to memory.
3. Read replicas parse the redo log, invalidate metadata in memory, and update views in memory.
4. Read replicas advance the visible LSN.
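The following is a minimal sketch of this advancement loop. All functions on the replica object are hypothetical placeholders, not TaurusDB APIs; the structure simply mirrors the four steps above.

```python
def advance_visible_lsn(replica):
    # 1. Communicate with the primary node to learn the latest redo log
    #    position and how to read the log.
    latest_lsn, log_desc = replica.query_primary()

    # 2. Read the redo log range [visible_lsn, latest_lsn) from Log Stores.
    redo = replica.read_from_log_stores(replica.visible_lsn, latest_lsn, log_desc)

    # 3. Parse the log, invalidate stale cached metadata, update in-memory views.
    for record in redo:
        replica.invalidate_cached_metadata(record)
        replica.update_views(record)

    # 4. Only now does the replayed data become visible to queries on this
    #    replica.
    replica.visible_lsn = latest_lsn
```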
In most cases, latency between the primary node and read replicas is minimal. However, in certain scenarios, such as when the primary node executes a large number of DDL statements, the latency can become significant.
Self-Healing Policy
If read replica latency is significant, users cannot obtain the latest data from read replicas, which may affect data consistency. To address this, the current policy is that if the latency exceeds the default threshold (30s), the read replicas reboot. After the reboot, the read replicas read the latest data from shared storage, and the latency is eliminated.
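A minimal sketch of this policy is shown below. The threshold value comes from the text; the function names and the reboot hook are assumptions, not TaurusDB APIs.

```python
LATENCY_THRESHOLD_S = 30.0  # default threshold from the text

def maybe_self_heal(replica, latency_s: float) -> None:
    if latency_s > LATENCY_THRESHOLD_S:
        # Rebooting discards the stale in-memory cache; after the restart the
        # replica reads the latest data directly from shared storage, so it
        # resumes with no latency relative to the primary node.
        replica.reboot()
```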