Updated on 2024-06-03 GMT+08:00

Log Replay

recovery_time_target

Parameter description: Specifies the time for a standby node to write and replay logs.

Parameter type: integer.

Unit: second

Value range: 0 to 3600

0 indicates that log flow control is disabled. A value from 1 to 3600 indicates that the standby node can finish writing and replaying logs within the specified number of seconds, so that it can quickly assume the primary role. If recovery_time_target is too small, the performance of the primary node is affected. If it is too large, log flow is not effectively controlled.

Default value: 60

Setting method: This is a SIGHUP parameter. Set it based on instructions in Table 1.

Setting suggestion: Retain the default value.

recovery_max_workers

Parameter description: Specifies the maximum number of concurrent replayer threads.

This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Value range: an integer ranging from 0 to 20

Default value: 4

recovery_parallelism

Parameter description: Specifies the actual number of replayer threads. This parameter is read-only.

This is a POSTMASTER parameter. Its value is affected by recovery_max_workers and recovery_parse_workers. If either of them is greater than 0, recovery_parallelism is recalculated.

Value range: an integer ranging from 1 to 2147483647

Default value: 1

queue_item_size

Parameter description: Specifies the maximum length of the task queue of each redo replayer thread.

This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Value range: a value ranging from 1 to 65535.

Default value: 560

recovery_parse_workers

Parameter description: Specifies the number of ParseRedoRecord threads for the ultimate RTO feature.

  1. This parameter must be used together with recovery_redo_workers. If both recovery_parse_workers and recovery_redo_workers are greater than 1, ultimate RTO is enabled. If you do not want to enable ultimate RTO, retain the default value 1 of recovery_parse_workers.
  2. Ensure that replication_type is set to 1 when ultimate RTO is enabled.
  3. If both ultimate RTO and parallel replay are enabled, ultimate RTO takes effect and parallel replay does not.
  4. Ultimate RTO does not support flow control. Flow control is determined by the parameter recovery_time_target.
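The notes above can be sketched as a small decision function. This is only an illustration of the documented rules, not the database's own logic: the function name, the parallel_replay_enabled flag, and the returned labels (including "serial_replay") are assumptions made for the example.

```python
def replay_mode(recovery_parse_workers: int,
                recovery_redo_workers: int,
                parallel_replay_enabled: bool,
                replication_type: int) -> str:
    """Return which replay feature takes effect under the rules in the notes.

    The labels returned here are hypothetical names for this example.
    """
    # Ultimate RTO is enabled only when both worker counts are greater than 1.
    if recovery_parse_workers > 1 and recovery_redo_workers > 1:
        # The notes require replication_type = 1 when ultimate RTO is enabled.
        if replication_type != 1:
            raise ValueError("ultimate RTO requires replication_type = 1")
        # Ultimate RTO takes precedence over parallel replay.
        return "ultimate_rto"
    if parallel_replay_enabled:
        return "parallel_replay"
    return "serial_replay"


if __name__ == "__main__":
    print(replay_mode(2, 3, True, 1))   # both features requested
    print(replay_mode(1, 8, True, 1))   # parse workers left at the default
```

With recovery_parse_workers left at its default of 1, the recovery_redo_workers value has no effect, which matches the description of recovery_redo_workers.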

Parameter type: integer.

Unit: none

Value range: 1 to 16

Default value: 1

Setting method: This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Setting suggestion: For the recommended values of recovery_parse_workers and recovery_redo_workers for different CPU counts, memory sizes, and deployment modes, see Table 1 Parameter settings for different CPUs, memory sizes, and deployment modes.

After ultimate RTO is enabled, the total number of extra replayer threads started by the standby node = recovery_parse_workers × (recovery_redo_workers + 2) + 5. More replayer threads occupy more CPU, memory, and I/O resources. Set these parameters based on the actual hardware configuration. Otherwise, the system may fail to start due to insufficient resources. In hybrid deployment scenarios, host performance may be affected.
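As a quick check of the thread-count formula, the following sketch computes the number of extra replayer threads and compares it with the "Number of Replayer Threads" values in rows 5 to 8 of Table 1. The function name is chosen for this example only.

```python
def extra_replayer_threads(recovery_parse_workers: int,
                           recovery_redo_workers: int) -> int:
    """Extra replayer threads started by the standby node with ultimate RTO."""
    return recovery_parse_workers * (recovery_redo_workers + 2) + 5


# (recovery_parse_workers, recovery_redo_workers, threads) from Table 1, rows 5 to 8.
TABLE_ROWS = [(2, 3, 15), (2, 2, 13), (2, 8, 25), (2, 4, 17)]

if __name__ == "__main__":
    for parse_workers, redo_workers, expected in TABLE_ROWS:
        computed = extra_replayer_threads(parse_workers, redo_workers)
        print(parse_workers, redo_workers, computed, computed == expected)
```

For example, 2 × (3 + 2) + 5 = 15, which matches row 5 of the table.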

recovery_redo_workers

Parameter description: Specifies the number of PageRedoWorker threads corresponding to each ParseRedoRecord thread when the ultimate RTO feature is enabled. recovery_redo_workers must be used together with recovery_parse_workers. The value of recovery_redo_workers takes effect only when recovery_parse_workers is greater than 1.

Parameter type: integer.

Unit: none

Value range: 1 to 8

Default value: 1

Setting method: This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Setting suggestion: For the recommended values of recovery_parse_workers and recovery_redo_workers for different CPU counts, memory sizes, and deployment modes, see Table 1 Parameter settings for different CPUs, memory sizes, and deployment modes.

Table 1 Parameter settings for different CPUs, memory sizes, and deployment modes

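| No. | Number of CPUs | Memory (GB) | Hybrid Deployment or Not | recovery_parse_workers | recovery_redo_workers | Number of Replayer Threads | Remarks |
|-----|----------------|-------------|--------------------------|------------------------|-----------------------|----------------------------|---------|
| 1 | 4 | - | - | 1 | 1 | - | Ultimate RTO is not recommended. |
| 2 | 8 | - | Yes | 1 | 1 | - | Ultimate RTO is not recommended. |
| 3 | 8 | 64 | No | 1 | 1 | - | Ultimate RTO is not recommended. |
| 4 | 16 | 128 | Yes | 1 | 1 | - | Ultimate RTO is not recommended. |
| 5 | 16 | 128 | No | 2 | 3 | 15 | - |
| 6 | 32 | 256 | Yes | 2 | 2 | 13 | - |
| 7 | 32 | 256 | No | 2 | 8 | 25 | - |
| 8 | 64 | 512 | Yes | 2 | 4 | 17 | - |
| 9 | 64 | 512 | No | 2 | 8 | 25 | Set the parameter to the recommended value for larger hardware specifications. |
| 10 | 96 | 768 | - | 2 | 8 | 25 | Set the parameter to the recommended value for larger hardware specifications. |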

enable_page_lsn_check

Parameter description: Specifies whether to enable the data page LSN check. During replay, the current LSN of the data page is checked to see if it is the expected one.

This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Value range: Boolean

Default value: on

recovery_min_apply_delay

Parameter description: Specifies the replay delay of the standby node.

Parameter type: integer.

Unit: millisecond

Value range: 0 to INT_MAX

Default value: 0 (no delay added)

Setting method: This is a SIGHUP parameter. Set it based on instructions in Table 1.

Setting suggestion: See the notes below.

  • This parameter does not take effect on the primary node. It must be set on the standby node that requires a delay. You are advised to set this parameter on an asynchronous standby node. Note, however, that if a delayed asynchronous standby node is promoted to primary, the RTO will be long.
  • The delay time is calculated based on the transaction commit timestamp on the primary node and the current time on the standby node. Therefore, ensure that the clocks of the primary and standby nodes are synchronized.
  • If the delay time is too long, the disk that stores Xlog files on the standby node may become full. Therefore, set the delay time based on the disk size.
  • Operations without transactions are not delayed.
  • After the primary/standby switchover, if the original primary node needs to be delayed, you need to manually set this parameter.
  • When synchronous_commit is set to remote_apply, synchronous replication is affected by the delay. Each commit message is returned only after the replay on the standby node is complete.
  • Using this feature also delays hot_standby_feedback, which may cause the primary node to bloat, so be careful when using both.
  • After a DDL operation (such as DROP or TRUNCATE) that holds an AccessExclusive lock is performed on an object on the primary node, a query on that object on the standby node during the delayed replay of the record returns only after the lock is released.
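The second note above says that the delay is calculated from the commit timestamp on the primary node and the current time on the standby node. The sketch below illustrates that calculation and why clock skew matters. The function name and the exact arithmetic are assumptions for illustration, not the database's implementation.

```python
from datetime import datetime, timedelta, timezone


def remaining_delay_ms(commit_time_primary: datetime,
                       now_standby: datetime,
                       recovery_min_apply_delay_ms: int) -> int:
    """Milliseconds the standby would still wait before replaying a commit."""
    # The record becomes eligible for replay once the configured delay has
    # elapsed since the commit timestamp taken on the primary node.
    apply_at = commit_time_primary + timedelta(milliseconds=recovery_min_apply_delay_ms)
    remaining = (apply_at - now_standby) / timedelta(milliseconds=1)
    return max(0, int(remaining))


if __name__ == "__main__":
    commit = datetime(2024, 6, 3, 12, 0, 0, tzinfo=timezone.utc)
    in_sync = commit + timedelta(seconds=20)   # standby clock agrees with primary
    fast = in_sync + timedelta(seconds=30)     # standby clock runs 30 s fast
    print(remaining_delay_ms(commit, in_sync, 60000))  # 40000
    print(remaining_delay_ms(commit, fast, 60000))     # 10000
```

In this example, a standby clock that runs 30 seconds fast shortens the effective delay by 30 seconds, which is why the clocks of the primary and standby nodes need to be synchronized.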

dcf_truncate_dump_info_level

Parameter description: Specifies whether to print the LSN truncated by the DCF and the subsequent LSNs.

Parameter type: integer.

Unit: none

Value range: 0 to 2

  • 0: disabled.
  • 1: prints all LSNs truncated by the DCF (Xlogs whose LSN is greater than or equal to the truncated LSN).
  • 2: prints all LSNs truncated by the DCF and prints warning-level logs when the LSNs flushed to disks are greater than the truncated LSNs.

Default value: 0 (disabled)

Setting method: This is a SIGHUP parameter. Set it based on instructions in Table 1.

Setting suggestion: Retain the default value.

redo_bind_cpu_attr

Parameter description: Specifies how replayer threads are bound to CPU cores. Only the sysadmin user can access this parameter. This is a POSTMASTER parameter. Set it based on instructions in Table 1.

Value range: a non-empty string. The value is case-insensitive.
  • 'nobind': Threads are not bound to cores.
  • 'nodebind: 1, 2': Threads are bound to the CPU cores in NUMA groups 1 and 2.
  • 'cpubind: 0-30': Threads are bound to CPU cores 0 to 30.
  • 'cpuorderbind: 16-32': Threads are bound one per CPU core, starting from core 16. If the range contains too few cores, the remaining threads are not bound. You are advised to specify a range that contains at least recovery_parallelism + 1 cores.

Default value: 'nobind'

  • This parameter is used for core binding in the Arm environment. You are advised to bind all replayer threads to the same NUMA group for better performance. In hybrid deployment scenarios, you are advised to bind the replayer threads of different nodes on the same host to different NUMA groups.
  • The core binding range specified by this parameter must be different from the core binding range specified by the GUC parameter thread_pool_attr and the CPU core IDs specified by the GUC parameters wal_rec_writer_bind_cpu, walwriteraux_bind_cpu, and wal_receiver_bind_cpu.
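The sizing advice for 'cpuorderbind' can be checked with a short script before the parameter is changed. The sketch below is a hypothetical helper, not part of the database. It assumes that the range is inclusive at both ends (as the 'cpubind: 0-30' description suggests) and that the advice means the range should contain at least recovery_parallelism + 1 cores.

```python
def cpuorderbind_core_count(attr: str) -> int:
    """Count the cores in a value such as 'cpuorderbind: 16-32'."""
    kind, _, spec = attr.partition(":")
    # The parameter value is case-insensitive.
    if kind.strip().lower() != "cpuorderbind":
        raise ValueError("not a cpuorderbind value: " + attr)
    start, _, end = spec.strip().partition("-")
    first, last = int(start), int(end)
    if last < first:
        raise ValueError("empty core range: " + attr)
    # Assumption: both ends of the range are included.
    return last - first + 1


def range_is_sufficient(attr: str, recovery_parallelism: int) -> bool:
    """True if the range holds at least recovery_parallelism + 1 cores."""
    return cpuorderbind_core_count(attr) >= recovery_parallelism + 1


if __name__ == "__main__":
    print(cpuorderbind_core_count("cpuorderbind: 16-32"))   # 17
    print(range_is_sufficient("cpuorderbind: 16-32", 16))   # True
    print(range_is_sufficient("cpuorderbind: 16-32", 17))   # False
```

The read-only parameter recovery_parallelism reports the actual number of replayer threads, so its current value is the natural input for such a check.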