GTM Mode

To meet different concurrency and consistency requirements, GaussDB provides GTM-Lite and GTM-Free modes. The main difference between GTM-Lite and GTM-Free lies in the pressure of the central transaction management node GTM and the transaction processing process. In GTM-Lite mode, the pressure on the central transaction processing node is reduced, the transaction processing process is further optimized, and the GTM performance and concurrency bottleneck are reduced. In this way, the transaction processing capability is improved to a great extent while ensuring consistency. In GTM-Free mode, the central transaction management node does not participate in transaction management, eliminating the single point of failure and achieving higher transaction processing performance. However, in terms of consistency, all transactions can be completed to ensure external read consistency. Strong consistency read of distributed transactions is not supported. Transaction consistency that depends on query results, such as insert into select * from, is not supported.

The current version does not support the switchover between the two modes. You are advised to use the default GTM mode during installation. The GTM mode can remain unchanged before and after the upgrade.

The related GUC parameters include enable_gtm_free and gtm_option. You can run the show statement in gsql to query the current GTM mode.

SHOW enable_gtm_free;
SHOW gtm_option;

The method of determining the mode is as follows:

GTM-Lite mode: enable_gtm_free=off and gtm_option=1
GTM-Free mode: enable_gtm_free=on or gtm_option=2

In GTM-Lite mode, GaussDB supports strong consistency and complete syntaxes for distributed transactions. In GTM-Free mode, eventual consistent execution and concurrency control are used for distributed transactions. Therefore, the usage scenarios and methods of some syntaxes are restricted. If the restricted syntaxes are required due to service requirements, refactoring is required based on the understanding of the eventual consistent behavior. The involved restricted syntaxes and refactoring suggestions are as follows:

General principle: Specify a proper distribution key (by using DISTRIBUTE BY) for all user tables.
- You are advised to distribute data evenly.
- You are advised to use the join condition in a query as the distribution key to ensure that the join query does not cause data flow between DNs.
- You are advised to select the primary key of the table as the distribution key.
SELECT:
- During table query, the WHERE condition must contain the equivalent query condition of all distribution keys.
- Do not use subqueries in the SELECT target columns. Otherwise, the plan may fail to be pushed down to DNs for execution, affecting the execution performance.
DML:
By default, cross-node transactions are not supported. If the executed DML statement contains cross-node transactions, an error is reported. There are two scenarios:
1. If a user statement is split into multiple independent statements in the database, the error message "INSERT/UPDATE/DELETE/MERGE contains multiple remote queries under GTM-free mode Unsupport DML two phase commit under gtm free mode. modify your SQL to generate light-proxy or fast-query-shipping plan" is displayed. In this case, you need to modify the statement to execute it on a single node.
  An example is as follows:
  
  insert into t select * from b where b.c = xx;
  
  Assume that the distribution keys of the t and b tables are different, and the WHERE condition filters out only one data record. If enable_stream_operator is disabled, the preceding query is split into two independent statements and executed in serial mode. First, run select * from b where b.c = xx to extract data from a DN to the target record. Then, run the insert into t statement to deliver the extracted target record to another DN to complete the insertion. In GTM-Free mode, the preceding error is reported when such a statement is executed. Similar errors may also be reported for create table as select * from and DELETE, JOIN, and INSERT statements with subqueries.
  
  Refactoring solution: Before statement execution, add the set enable_stream_operator=on command to enable the streaming operator so that service statements can be pushed down for execution.
2. If a user statement is executed on multiple nodes in the database, the error message "Your SQL needs more than one datanode to be involved in" will be displayed. In this case, you are advised to modify the statement so that it can be executed on a single node.
  An example is as follows:
  
  insert into t values(3,3),(1,1);
  
  If (3,3) and (1,1) are distributed on different DNs, the execution of the preceding statement involves two DNs. In GTM-Free mode, the preceding error is reported when such a statement is executed.
  
  Refactoring method: If the preceding statement needs to be executed on multiple nodes, add a hint to the statement to prevent errors. The hint is as follows:
  
  insert /*+ multinode */ into t values(3,3),(1,1);
  
  There are similar constraints on the DELETE and UPDATE statements. You are advised to add the equivalent filtering condition of the distribution key to the WHERE condition in the DELETE and UPDATE statements.
It is recommended that application_type be set to perfect_sharding_type in the JDBC connection string in the development phase. In this way, errors are reported for all SQL statements for cross-node read and write operations, prompting developers to optimize statements as soon as possible.