Updated on 2024-05-20 GMT+08:00

Logical Decoding Options

  • General options:
    • include-xids:

      Specifies whether the decoded data column contains XID information.

      Valid value: 0 and 1. The default value is 1.

      • 0: The decoded data column does not contain XID information.
      • 1: The decoded data column contains XID information.
    • skip-empty-xacts:

      Specifies whether to ignore empty transaction information during decoding.

      Valid value: 0 and 1. The default value is 0.

      • 0: The empty transaction information is not ignored during decoding.
      • 1: The empty transaction information is ignored during decoding.
    • include-timestamp:

      Specifies whether decoded information contains the commit timestamp.

      Valid value: 0 and 1. The default value is 0.

      • 0: The decoded information does not contain the commit timestamp.
      • 1: The decoded information contains the commit timestamp.
    • only-local:

      Specifies whether to decode only local logs.

      Valid value: 0 and 1. The default value is 1.

      • 0: Non-local logs and local logs are decoded.
      • 1: Only local logs are decoded.
    • force-binary:

      Specifies whether to output the decoding result in binary format.

      Value range: 0

      • 0: The decoding result is output in text format.
    • white-table-list:

      Whitelist parameter, including the schema and table name to be decoded.

      Value range: a string that contains table names in the whitelist. Different tables are separated by commas (,). An asterisk (*) is used to fuzzily match all tables. Schema names and table names are separated by periods (.). No space character is allowed. For example:

      select * from pg_logical_slot_peek_changes('slot1', NULL, 4096, 'white-table-list', 'public.t1,public.t2,*.t3,my_schema.*');
    • max-txn-in-memory:

      Memory control parameter. The unit is MB. If the memory occupied by a single transaction is greater than the value of this parameter, data is flushed to disks.

      Value range: an integer ranging from 0 to 100. The default value is 0, indicating that memory control is disabled.

    • max-reorderbuffer-in-memory:

      Memory control parameter. The unit is GB. If the total memory (including the cache) of transactions being concatenated in the sender thread is greater than the value of this parameter, the current decoding transaction is flushed to disks.

      Value range: an integer ranging from 0 to 100. The default value is 0, indicating that memory control is disabled.

    • include-user:

      Specifies whether the BEGIN logical log of a transaction records the username of the transaction. The user name of a transaction refers to the authorized user, that is, the login user who executes the session corresponding to the transaction. The user name does not change during the execution of the transaction.

      Valid value: 0 and 1. The default value is 0.

      • 0: The BEGIN logical log of a transaction does not contain the username of the transaction.
      • 1: The BEGIN logical log of a transaction records the username of the transaction.
    • exclude-userids:

      Specifies the OID of a blacklisted user.

      Value range: a string, which specifies the OIDs of blacklisted users. Multiple OIDs are separated by commas (,). The system does not check whether the OIDs exist.

    • exclude-users:

      Name list of blacklisted users.

      Value range: a string, which specifies the names of blacklisted users. Multiple names are separated by commas (,). The system does not check whether the names exist.

    • dynamic-resolution:

      Specifies whether to dynamically parse the names of blacklisted users.

      Valid value: 0 and 1. The default value is 1.

      • 0: If the parameter is set to 0, an error is reported and the logic decoding exits when the decoding detects that the user does not exist in blacklist exclude-users.
      • 1: If the parameter is set to 1, decoding continues when it detects that the user does not exist in blacklist exclude-users.
    • standby-connection:

      Specifies whether to restrict decoding only on the standby node. This option is set only for stream decoding.

      Value range: Boolean. The default value is false.

      • true: Only the standby node can be connected for decoding. When the primary node is connected for decoding, an error is reported and the system exits.
      • false: The primary or standby node can be connected for decoding.
    • sender-timeout:

      Heartbeat timeout threshold between the kernel and the client. This option is set only for stream decoding. If no message is received from the client within the period, the logic decoding stops and disconnects from the client. The unit is ms.

      Value range: an integer ranging from 0 to 2147483647. The default value depends on the value of the GUC parameter logical_sender_timeout.

    • change-log-max-len:

      Upper limit of the logical log length.

      Value range: 1 to 65535. The default value is 4096.

      If the upper limit is exceeded, the memory is destroyed and re-allocated for storage. If the length is too long, the memory usage increases. If the length is too short, the memory is frequently allocated and released.

    • max-decode-to-sender-cache-num:

      Cache threshold for sending decoding logs.

      Value range: 1 to 65535. The default value is 4096.

      If the threshold is not exceeded, the logs are not sent temporarily. If the threshold is exceeded, the logs are sent. This reduces the burden of frequent log sending.

    • enable-heartbeat:

      Specifies whether to generate heartbeat logs. This option is set only for stream decoding.

      Value range: Boolean. The default value is false.

      • true: Heartbeat logs are generated.
      • false: Heartbeat logs are not generated.

      If the heartbeat log output option is enabled, heartbeat logs will be generated. The heartbeat logs can be in the binary or TEXT/JSON format. For a binary heartbeat log message, it starts with a character 'h' and then the heartbeat log content: an 8-byte uint64 LSN indicating the end position of WAL reading when the heartbeat logical log is sent, an 8-byte uint64 LSN indicating the location of the WAL that has been flushed to disks when the heartbeat logical log is sent, and an 8-byte int64 timestamp (starting from January 1, 1970) indicating the timestamp when the latest decoded transaction log or checkpoint log is generated; then, it ends with character 'F'. TEXT/JSON heartbeat log messages that are sent in batches end with '0'. There is no such terminator for each TEXT/JSON heartbeat log message. For details, see the following figure.

    • parallel-decode-num:

      Number of decoder threads for parallel decoding. This option is set only for stream decoding. When the system function is called, this option is invalid and only the value range is verified.

      Value range: The value 1 indicates that decoding is performed based on the original serial logic. Other values indicate that parallel decoding is enabled. The default value is 1.

      If parallel-decode-num is not set (the default value is 1) or is explicitly set to 1, the options in the following "Parallel decoding" cannot be configured.

    • output-order:

      Specifies whether to use the CSN sequence to output decoding results. This option is set only for stream decoding. When the system function is called, this option is invalid and only the value range is verified.

      Valid value: 0 or 1 of the int type. The default value is 0.

      • 0: The decoding results are sorted by transaction COMMIT LSN. This mode can be used only when the value of confirmed_csn of the decoding replication slot is set to 0 (not displayed). Otherwise, an error is reported.
      • 1: The decoding results are sorted by transaction CSN. This mode can be used only when the value of confirmed_csn of the decoding replication slot is not set to 0. Otherwise, an error is reported.
    • auto-advance:

      Determines whether to automatically advance logical replication slots. This parameter is set only for streaming decoding.

      Value range: Boolean. The default value is false.

      • true: The logical replication slot is advanced to the current decoding position when all sent logs are confirmed and there is no transaction to be sent.
      • false: The replication service invokes the log confirmation interface to advance the logical replication slot.
    • skip-generated-columns:

      Specifies whether to skip generated columns in the logical decoding result. This parameter is invalid for UPDATE and DELETE on old tuples, and the corresponding tuples always output the generated columns.

      Value range: Boolean. The default value is false.

      • true: The decoding result of generated columns is not output.
      • false: The decoding result of generated columns is output.
  • Parallel decoding:
    The following configuration options are set only for stream decoding.
    • decode-style:

      Specifies the decoding format.

      Valid value: 'j', 't', or 'b' of the char type, indicating the JSON, TEXT, or binary format, respectively. The default value is 'b', indicating binary decoding.

      For the JSON and TEXT formats, in the decoding result sent in batches, the uint32 consisting of the first four bytes of each decoding statement indicates the total number of bytes of the statement (the four bytes occupied by the uint32 are excluded, and 0 indicates that the decoding of this batch ends). The 8-byte uint64 indicates the corresponding LSN (begin corresponds to first_lsn, commit corresponds to end_lsn, and other values correspond to the LSN of the statement).

    The binary encoding rules are as follows:

    1. The first four bytes represent the total number of bytes of the decoding result of statements following the statement-level delimiter letter P (excluded) or the batch end character F (excluded). If the value is 0, the decoding of this batch ends.
    2. The next eight bytes (uint64) indicate the corresponding LSN (begin corresponds to first_lsn, commit corresponds to end_lsn, and other values correspond to the LSN of the statement).
    3. The next one-byte letter can be B, C, I, U, or D, representing BEGIN, COMMIT, INSERT, UPDATE, or DELETE.
    4. If B is used in the step 3:
      1. The next eight bytes (uint64) indicate the CSN.
      2. The next eight bytes (uint64) indicate first_lsn.
      3. (Optional) If the next 1-byte letter is T, the following four bytes (uint32) indicate the timestamp length for committing the transaction. The following characters with the same length are the timestamp character string.
      4. (Optional) If the next one-byte letter is N, the following four bytes (uint32) indicate the length of the transaction username. The following characters with the same length are the transaction username.
      5. Because there may still be a decoding statement subsequently, a 1-byte letter P or F is used as a separator between statements. P indicates that there are still decoded statements in this batch, and F indicates that this batch is completed.
    5. If C is used in the step 3:
      1. (Optional) If the next 1-byte letter is X, the following eight bytes (uint64) indicate XID.
      2. (Optional) If the next 1-byte letter is T, the following four bytes (uint32) indicate the timestamp length. The following characters with the same length are the timestamp character string.
      3. When logs are sent in batches, decoding results of other transactions may still exist after a COMMIT log is decoded. If the next 1-byte letter is P, the batch still needs to be decoded. If the letter is F, the batch decoding ends.
    6. If I, U, or D is used in the step 3:
      1. The next two bytes (uint16) indicate the length of the schema name.
      2. The schema name is read based on the preceding length.
      3. The next two bytes (uint16) indicate the length of the table name.
      4. The table name is read based on the preceding length.
      5. (Optional) If the next 1-byte letter is N, it indicates a new tuple. If the letter is O, it indicates an old tuple. In this case, the new tuple is sent first.
        1. The following two bytes (uint16) indicate the number of columns to be decoded for the tuple, which is recorded as attrnum.
        2. The following procedure is repeated for attrnum times.
          1. The next two bytes (uint16) indicate the length of the column name.
          2. The column name is read based on the preceding length.
          3. The next four bytes (uint32) indicate the OID of the current column type.
          4. The next four bytes (uint32) indicate the length of the value (stored in the character string format) in the current column. If the value is 0xFFFFFFFF, it indicates null. If the value is 0, it indicates a character string whose length is 0.
          5. The column value is read based on the preceding length.
      6. Because there may still be a decoding statement after, if the next one-byte letter is P, it indicates that the batch still needs to be decoded, and if the next one-byte letter is F, it indicates that decoding of the batch ends.
    • sending-batch:

      Specifies whether to send messages in batches.

      Valid value: 0 or 1 of the int type. The default value is 0.

      • 0: The decoding results are sent one by one.
      • 1: When the accumulated size of decoding results reaches 1 MB, decoding results are sent in batches.

      In the scenario where batch sending is enabled, if the decoding format is 'j' or 't', before each original decoding statement, a uint32 number is added indicating the length of the decoding result (excluding the current uint32 number), and a uint64 number is added indicating the LSN corresponding to the current decoding result.

      In the scenario where the CSN sequence is used to output decoding results, batch sending is limited to a single transaction (that is, if a transaction has multiple small statements, batch sending is used). The batch sending function is not used to send multiple transactions in the same batch.

    • parallel-queue-size:

      Specifies the length of the queue for interaction between parallel logic decoding threads.

      Value range: an integer ranging from 2 to 1024. The value must be an integer power of 2. The default value is 128.

      The queue length is positively correlated with the memory usage during decoding.

    • logical-reader-bind-cpu:

      Specifies the CPU core ID bound to the reader thread.

      Value range: –1 to 65535. If this parameter is not specified, cores are not bound.

      The default value is –1, indicating that cores are not bound. The value –1 cannot be manually set. Ensure that the core ID is within the total number of logical cores of the machine. Otherwise, an error is reported. If multiple threads are bound to the same core, the load of the core increases and the performance deteriorates.

    • logical-decoder-bind-cpu-index:

      Specifies the CPU core ID bound to the logical decoder thread.

      Value range: –1 to 65535. If this parameter is not specified, cores are not bound.

      The default value is –1, indicating that cores are not bound. The value –1 cannot be manually set. Ensure that the core ID is within the total number of logical cores of the machine and is less than the value of [Number of CPU cores – Number of parallel logical decoders]. Otherwise, an error is reported.

      Starting from the specified core ID, the number of newly started threads increases by one.

      If multiple threads are bound to the same core, the load of the core increases and the performance deteriorates.

      When the GaussDB performs logical decoding and replays logs, a large number of CPU resources are occupied. Related threads such as WAL writer, WAL sender, WAL receiver and PageRedo are in the performance bottleneck. If these threads can be bound to a fixed CPU, frequent CPU switchovers caused by operating system scheduling threads can be reduced. In this way, the performance overhead caused by cache miss is also reduced, improving the process handling speed. If the user scenario has performance requirements, you can optimize the configuration by referring to the following core binding example.

      • The following is an example of setting parameters:
        1. walwriter_cpu_bind=1
        2. walwriteraux_bind_cpu=2
        3. wal_receiver_bind_cpu=4
        4. wal_rec_writer_bind_cpu=5
        5. wal_sender_bind_cpu_attr='cpuorderbind:7-14'
        6. redo_bind_cpu_attr='cpuorderbind:16-19'
        7. logical-reader-bind-cpu=20
        8. logical-decoder-bind-cpu-index=21
      • In the example, cores 1, 2, 3, 4, 5, and 6 are bound using the GUC tool. The command is as follows:

        gs_guc set -Z datanode -N all -I all -c "walwriter_cpu_bind=1"

        In the example, cores 7 and 8 are bound when a decoding request is initiated by a JDBC client.

      • In the example, walwriter_cpu_bind=1 indicates that the thread can run on CPU core 1;

        cpuorderbind:7-14 indicates that each started thread is bound to CPU cores 7 to 14 in sequence. If the CPU cores in the range are used up, the newly started threads do not participate in core binding;

        logical-decoder-bind-cpu-index indicates that the started threads are bound to CPU cores 21, 22, 23 and so on.

      • The core binding principle is that one thread occupies one CPU core. For details about GUC parameters in the example, see the Administrator Guide.
      • Inappropriate core binding, for example, binding multiple threads to one CPU core, may cause performance deterioration.
      • You can run the lscpu command to view CPU(s), that is, the number of logical CPU cores in your environment.

      If the number of logical CPU cores is less than 36, you are advised not to use this core binding policy. In this case, you are advised to use the default configuration (no parameter setting).