2023-04 |
START WITH recursion performance optimized |
The performance of the following statements is optimized.
- START WITH...CONNECT BY statements
- START WITH...CONNECT BY statements with the connect_by_isleaf field
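A minimal sketch of the optimized statement shapes (the employees table and its columns are illustrative, not part of the product):

```sql
-- Traverse a hierarchy from the root; connect_by_isleaf marks leaf rows.
SELECT employee_id, manager_id, LEVEL, connect_by_isleaf AS is_leaf
FROM employees
START WITH manager_id IS NULL
CONNECT BY PRIOR employee_id = manager_id;
```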
|
Viewing running SQL statements in the current stored procedure |
View SQL statements that are being executed in the current stored procedure. |
Session-level HA attributes |
- Session-level standby node connection detection and strong synchronization of session-level logs are supported.
- Session-level standby node connection detection: After this function is enabled, the system checks whether the standby node can be connected. If the standby node cannot be connected, the primary node reports an error.
- Strong synchronization of session-level logs: After this function is enabled, sessions between the primary and standby nodes are strongly synchronized.
- Single-node mode:
- Parameters for session-level standby node connection detection can be set, but the function does not take effect. After data is written to the primary node, a success message is returned.
- Parameters for strong synchronization of session-level logs can be set, but the function does not take effect. After data is written to the primary node, a success message is returned.
- A full build of the standby node is supported.
- During a full build (the standby node cannot be connected):
Parameters for session-level standby node connection detection can be set, but the function does not take effect. After data is written to the primary node, a success message is returned.
After the parameters for strong synchronization of session-level logs are configured, the function takes effect. After most_available_sync is enabled and keep_sync_window is configured, new sessions for strong synchronization enter the maximum availability mode if the disconnection duration of the standby node exceeds the value of keep_sync_window.
- When the standby node is restoring the logs generated during the full build:
Parameters for session-level standby node connection detection can be set, and the function can be enabled.
Parameters for strong synchronization of session-level logs can be set, and the function can be enabled.
- When the network is disconnected, the detection timeliness depends on parameters tcp_user_timeout and wal_sender_timeout. When the primary and standby nodes are disconnected (including scenarios where the wal receiver process on the standby node exits normally or crashes), the connection status of the standby node is updated synchronously.
Specifications:
- One primary node and one standby node are deployed. The data in the primary node is asynchronously synchronized to the standby node.
- If the standby node cannot be connected, errors are reported for write requests and the service needs to retry them.
|
CURRENT OF CURSOR |
The CURRENT OF CURSOR syntax is supported.
Restrictions for CURSOR statements are as follows:
- Only simple queries of a single table in the SELECT statement are supported.
- Subqueries are not supported.
- LIMIT/OFFSET is not supported.
- START WITH is not supported.
- WITH statements are not supported.
- Multiple tables are not supported.
- Partitioned tables are not supported.
- Row-store tables are not supported.
- Hash bucket tables are not supported.
- ORDER BY is not supported.
Restrictions for UPDATE statements are as follows:
- Only a single table can be updated.
- UPDATE statements can be used only in stored procedures.
- Multiple tables are not supported.
- Multiple WHERE clauses are not supported.
- WITH/USING and ORDER BY are not supported.
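Within these restrictions, a typical use looks like the following sketch (Oracle-compatible PL/SQL; the table and column names are illustrative, and the exact declaration syntax may vary by compatibility mode):

```sql
CREATE TABLE t1 (id int, val int);

CREATE OR REPLACE PROCEDURE update_current()
AS
    CURSOR c1 IS SELECT id, val FROM t1;  -- simple single-table query only
    rec t1%ROWTYPE;
BEGIN
    OPEN c1;
    LOOP
        FETCH c1 INTO rec;
        EXIT WHEN c1%NOTFOUND;
        -- Update the row the cursor is currently positioned on.
        UPDATE t1 SET val = rec.val + 1 WHERE CURRENT OF c1;
    END LOOP;
    CLOSE c1;
END;
/
```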
|
Read-Only detection capability of CM disks enhanced |
The capability is enhanced in the following aspects:
- The read-only status is obtained from the database to ensure accuracy.
- Read-only quorum arbitrates only the nodes that exceed the threshold. Other nodes are not affected.
- When the primary node is read-only, a primary/standby switchover is automatically performed. An available standby node is promoted to primary to ensure that the instance can work properly.
|
O&M database connection optimized |
Peer authentication is not used. In addition, concurrency conflicts are resolved and O&M performance is improved. |
Unified name GaussDB |
The name is changed to GaussDB. |
Automatic start of redistribution after lock timeout during scale-out |
The automatic retry mechanism is provided for scale-out. If the lock times out multiple times, the mechanism automatically exits and retries the scale-out after a period of time, switching tables during off-peak hours to reduce the impact on user services. |
Software component lifecycle matching the product lifecycle |
The software component lifecycle can match the product lifecycle.
- Update underlying software components (such as open-source, third-party, and self-developed software).
- Fix security vulnerabilities promptly within the product version lifecycle.
|
Single-replica deployment enhanced (for non-production environments) |
- Monitoring metrics are supported.
- An alarm can be triggered when a node is set to read-only because its disk is full.
- O&M interface reporting adapts to the single-node mode.
- CM makes the read-only alarm information persistent on DCC.
|
Optimizer adaptation and robustness improved |
The operator preference mechanism provides assurance if the cost model fails, and multi-version plan selection as well as intelligent statistics are enhanced, improving the adaptability and robustness of the optimizer.
- In the cost convergence scenario, the operator preference mechanism is provided to prevent the optimizer from selecting a poor plan that is incorrectly estimated. In this way, the plan performance is improved by 50% when the statistics are invalid.
- The robustness of intelligent statistics is enhanced, and statistics of multiple columns are integrated to create a statistics model, improving the evaluation performance of high-frequency values by 50%. The overall performance is improved by 20% to 30%.
- Currently, many slow queries can be resolved only by using service hints. Multi-version plans support more scenarios and provide an autonomous escape mechanism that automatically resolves more than 10% of slow queries in specific scenarios. The performance deterioration in the TPC-C test model is less than 5%. This is available only to primary/standby instances.
|
Data restoration in extreme scenarios |
In extreme scenarios, users can restore their data by exporting disk data files, which are parsed into a format that the COPY command can identify. The data can then be imported to a new DB instance as needed.
Generally, data restoration is based on existing redundancy technologies. This solution is used only when none of the following can restore the cluster to the expected state:
- Node recovery and primary/standby build
- Disaster recovery
- Backup and restoration
- Hot patch
- Other means expected to recover the cluster
Constraints:
- Only flushed data is parsed.
- Data consistency cannot be ensured during offline data parsing.
- Xlog replay cannot be guaranteed.
- The database is offline.
- Only heap table data in Astore and Ustore can be parsed.
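After disk data files have been parsed into a COPY-identifiable format, the import step follows the standard COPY pattern (a sketch; the table name and file path are illustrative):

```sql
-- Import previously exported data into a table on the new DB instance.
COPY recovered_table FROM '/data/recovered/recovered_table.dat';
```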
|
Undo files of Ustore managed by the segment-page mechanism |
The undo files of Ustore are managed by the segment-page mechanism. This prevents file handle leakage when a large number of 1 MB undo files are generated. |
Global secondary indexes for distributed instances |
Global secondary indexes for distributed instances
- Global unique indexes can be created on non-distribution keys.
- The performance of point queries on columns other than the base table distribution key is three times that of common index queries.
- Astore is supported.
Constraints:
- Common UB-tree indexes can be created. Local indexes and expression indexes are not supported.
- GSIs whose distribution is the same as that of base tables cannot be created.
- The GSI distribution key in a base table cannot be updated.
- CREATE GSI CONCURRENTLY and PARALLEL are not supported.
- Hash-based GSIs can be created for row-store Astore tables and for partitioned tables whose base tables are hash-distributed. Replicated tables, bucketed tables, segment-page tables, list/range-distributed tables, Ustore tables, and column-store tables cannot be used as base tables. GSIs support only hash-based distribution.
- GSIs can be used only in distributed instances.
- If a column name of a base table duplicates an internal identifier prefixed with _new$$ or _NEW$$ (including ctid, xc_node_hash, xmin, xmax, or tableoid), creating a GSI on that base table fails.
- The GSI can be created only in GTM-Lite mode.
- IUD returning is not supported for base tables with GSIs.
- TABLE ACCESS BY INDEX ROWID is not supported.
- VACUUM FULL, COPY, GDS, and REINDEX TABLE/DATABASE are not supported. All GSIs will be skipped during execution.
- The following are not supported: UPSERT, MERGE INTO, HOT UPDATE, max_datanode_for_plan (in PBE mode), logical replication, CLUSTER, and FOR UPDATE/SHARE/NOWAIT/SKIP LOCKED.
- Compatibility syntax such as START WITH is not supported.
- A maximum of 27 columns (including index keys and distribution keys) are supported.
- Other constraints are the same as those of hash distribution, UBTree, LP/FQS, and common indexes.
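As a sketch of how a GSI might be declared under these constraints (the CREATE GLOBAL INDEX keyword placement is an assumption, and all table and column names are illustrative):

```sql
-- A hash-distributed row-store base table.
CREATE TABLE orders (order_id int, customer_id int)
    DISTRIBUTE BY HASH (order_id);

-- A global secondary index on a non-distribution key enables point queries
-- on customer_id without visiting every DN.
CREATE GLOBAL INDEX gsi_orders_customer ON orders (customer_id);
```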
|
CURRENT OF CURSOR |
The CURRENT OF CURSOR syntax supports ORDER BY. |
SQL features supported by primary/standby Ustore instances |
SQL features supported by primary/standby Ustore instances
- Global temporary tables
- Column constraints
- Parallel query
- Statistics enhancement: Ustore index active page estimation and cost adaptation.
|
Out parameter overloading and capability enhancement for stored procedures |
- SQL PATCH scope: DML within a stored procedure
- Out function overloading scope: Functions with Out parameters can be created and invoked. Only Out parameters in direct value assignment can be returned. For example, in a = func(b), b is an out parameter and is returned.
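The direct-assignment pattern above can be sketched in Oracle-compatible PL/SQL (all names are illustrative; the exact declaration syntax may vary by compatibility mode):

```sql
-- A function with an OUT parameter; only direct assignment returns it.
CREATE OR REPLACE FUNCTION func(b OUT int) RETURN int
AS
BEGIN
    b := 10;        -- OUT parameter returned to the caller
    RETURN 1;
END;
/

DECLARE
    a int;
    b int;
BEGIN
    a := func(b);   -- direct assignment: b receives the OUT value
END;
/
```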
|
Overload escape in distributed scenarios |
Overload escape is supported in distributed scenarios.
- When slow SQL statements account for a certain proportion of a thread pool, the CANCEL operation is triggered.
- Control or circuit breaker measures are taken for SQL statements that occupy a large number of resources to prevent normal SQL statements from being executed due to slow SQL statements.
|
SQL PATCH in distributed instances |
SQL PATCH in distributed instances
Constraints:
- SQL PATCH operations cannot be synchronized between CNs.
- SQL PATCH cannot be used for DNs.
- If the lightproxy plan delivered by a CN to a single node triggers SQL PATCH, the plan is degraded to the FQS plan delivered by the CN to a single node, and SQL PATCH takes effect.
- Similar to HINT, SQL PATCH affects the capability and specifications of distributed plans.
- For distributed instances, if a stored procedure is pushed down, the DML in the stored procedure generates plans on DNs and is not affected by SQL PATCH.
|
Ultimate RTO when standby nodes can process read requests |
Based on the ultimate RTO technology, standby nodes can process read requests.
Constraints:
- The DDL statement replay conflicts with the read-only mode of standby nodes.
- When standby nodes process read requests, they consume CPU, memory, and disk space. The memory and disk space are configured based on GUC parameters. If used memory and disk space exceed the configured GUC values, the system will forcibly reclaim the overused resources.
|
Syntax extension and cost model optimized |
Syntax extension and cost model optimized
- Auto analyze:
- Auto analyze is triggered when a large amount of data is inserted and updated in a short period of time to quickly process recycle task tables.
- By default, the sampling ratio and histogram precision are automatically increased based on the table size to improve the analysis accuracy of large tables.
- The adaptive estimation algorithm is provided to improve the NDV estimation accuracy in typical scenarios.
- HINT can be used:
- In parameterized paths.
- In Bitmap Scan operations.
- Across query blocks.
- The Plan Trace function can display the complete process of generating a query plan.
This is available only to primary/standby instances.
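GaussDB plan hints use the /*+ ... */ comment form; a minimal sketch of an index-scan hint (the table and index names are illustrative):

```sql
-- Force an index scan on t1 via a hint; hints can now also apply to
-- parameterized paths, Bitmap Scan operations, and across query blocks.
SELECT /*+ indexscan(t1 idx_t1_c1) */ c1, c2
FROM t1
WHERE c1 = 42;
```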
|
Index creation and rebuild online |
- Indexes (including common indexes, global indexes, and local indexes) can be concurrently created and rebuilt in Astore.
- Indexes (including common indexes, global indexes, and local indexes) can be concurrently created and rebuilt in Ustore.
Specifications:
- The online DML performance deteriorates by no more than 10%.
- Online index fields cannot be added, deleted, or modified.
- Only common tables and level-1 partitioned tables are supported.
|
UstoreDFX and other features enhanced for primary/standby instances |
Primary/standby instances support the following Ustore features after the related whitelist is enabled.
- DFX capabilities are supplemented, and auxiliary verification for DML operations and REDO/UNDO functions is enhanced.
- VACUUM FULL is supported.
- The UNDOMETA flush frequency and fields are reduced.
- FSC structure is embedded and its performance is optimized.
|
Basic operator performance improved |
The performance of basic operators is improved.
- SRF
- Expression framework flattened
- AGG optimized
- INNER UNIQUE optimized
- Expression operator optimized
- Parser optimized
- Printtup optimized
- Communication performance optimized
- Index scanning operators optimized
- Predicates optimized
- Subquery optimized
- Noise floor elimination for basic operators
- Indexes optimized
Performance improvement of basic operators: The end-to-end TPC-H performance is improved by 20%, and the performance of key operators (including full table scan operators, expression operators, and aggregation operators) is improved by 40%. |
SQL M* syntax compatibility |
GaussDB is compatible with some M* features:
- Is compatible with RANGE, HASH, and LIST partition syntax of MySQL databases.
- Allows users to add columns before the first column of a table or after a specified column.
- Adapts to the MySQL database syntax for modifying column names or definitions.
- Allows users to configure common character sets and collations for schemas, tables, and columns.
- Adapts to the modified CREATE, ALTER, and DROP EVENT syntax.
- Is compatible with SELECT INTO.
- Supports the SET TRANSACTION syntax for setting the isolation level and read/write mode.
The following M* syntax is supported:
- The GROUP_CONCAT function is available for primary/standby instances.
- User-defined variables in a session can be set through SET statements.
- SET is enhanced in global variable configuration.
- Prefix indexes are supported.
- Delimiters are supported.
- DELETE statements can be used for deleting multiple tables.
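Several of the items above can be sketched together (MySQL-compatible mode assumed; all table, column, and variable names are illustrative):

```sql
SET @threshold := 10;                        -- user-defined session variable

-- GROUP_CONCAT aggregates names per department into one string.
SELECT dept_id, GROUP_CONCAT(name SEPARATOR ', ') AS members
FROM emp
WHERE salary > @threshold
GROUP BY dept_id;

-- Prefix index on the first 10 characters of a column.
CREATE INDEX idx_emp_name ON emp (name(10));
```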
|
Fully-encrypted database framework with software and hardware integrated |
Fully-encrypted database framework with software and hardware integrated: Based on the virtual TEE, the fully-encrypted framework can encrypt memory and ensure security isolation.
- Secure key transmission channel is supported.
- The fully-encrypted execution framework integrates software and hardware and supports the virtual TEE execution.
- Server-side encryption operators support range queries and sorting operations. A whitelist is used to restrict syntax operations.
- The JDBC client driver is supported.
This is available only to primary/standby instances. |
Logic decoding capability enhanced |
- Logical replication supports distributed strong consistency.
- Sequence preserving is ensured when distributed transactions are committed.
- Sequence preserving is ensured in the same transaction on a single DN.
- Sequence preserving is not supported between different DNs of the same transaction.
Constraints: This feature is only suitable for GTM-Lite mode.
- Logic decoding performance views are provided.
- Log sending speed (Mbit/s).
- Log read speed (Mbit/s).
- Log parsing speed (Mbit/s).
Constraints: For parallel decoding, only the parallel decoding performance views of the primary nodes for primary/standby instances are provided.
- Distributed logical decoding is supported. DRS can directly connect to each standby DN in the service cluster to perform logical decoding, including decoding of logs replayed during cluster scale-out.
- The following two monitoring metrics are added:
- Maximum number of transaction lines
- Number of large transactions (with more than 4096 lines)
- A heartbeat message is added so that external tools can accurately calculate the synchronization latency.
- The latest WAL timestamp is provided for external tools to calculate the latency.
- Parallel decoding provides functions to query the latest WAL timestamp. Serial and parallel decoding provide heartbeat messages.
- The logical decoding performance can reach 300 Mbit/s. The following are optimized:
- Performance of saving data to the database
- Single-slot decoding performance of primary nodes
- Concurrent replay performance of standby nodes
- Single-slot decoding performance of standby nodes
|
Partitioned table functions and performance optimized |
- LIST and RANGE multi-column partitioning are supported.
- The number of LIST partition keys can be increased from 1 to 16.
- The number of RANGE partition keys can be increased from 4 to 16.
- Range/List/Hash (RLH) partitions:
- RLH partitioned tables mean LIST, RANGE, and HASH tables.
- LIST/RANGE tables support node group distribution.
- The partition syntax is optimized as follows:
- LIST and level-2 partitions support SPLIT and MERGE functions.
- LIST, HASH, and level-2 partitions support CREATE TABLE ... LIKE statements.
Constraints: DDL and DML concurrency across partitions is not supported in distributed instances.
- Continuous improvement of partitioned table performance:
- The partitioned table performance in services is improved by 30%.
- Partition implicit type and function pruning are optimized.
This is available only to primary/standby instances.
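A minimal sketch of multi-column RANGE partitioning as described above (the table, columns, and boundary values are illustrative):

```sql
-- Up to 16 partition keys are now supported for RANGE partitions.
CREATE TABLE sales (region int, sale_date date, amount numeric)
PARTITION BY RANGE (region, sale_date)
(
    PARTITION p1   VALUES LESS THAN (10, '2023-01-01'),
    PARTITION p2   VALUES LESS THAN (20, '2024-01-01'),
    PARTITION pmax VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
```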
|
SQL O* syntax compatibility improved |
- There are 167 new compatibility items for system views, system functions, and advanced packages.
- System functions: 39
- System views: 124
- Advanced packages: 4
- Distributed instances support the Go driver.
- The dblink is supported.
- PIVOT/UNPIVOT is supported.
- The XML data type and related interfaces are supported.
- Distributed instances support packages.
- Primary/standby and distributed instances support ROWNUM.
- The following 11 XMLDOM interfaces and the XMLPARSER.NEWPARSER interface are supported:
- XMLDOM.GETFIRSTCHILD
- XMLDOM.ITEM
- XMLDOM.DOMNODE
- XMLDOM.MAKEELEMENT
- XMLDOM.GETCHILDNODES
- XMLDOM.DOMNODELIST
- XMLDOM.GETNODEVALUE
- XMLDOM.GETLENGTH
- XMLDOM.GETNODENAME
- XMLDOM.GETELEMENTSBYTAGNAME
- XMLDOM.GETTAGNAME
- XMLPARSER.NEWPARSER
- The functions of tableof are enhanced.
- The array can be nested with tableof.
- Distributed instances support tableof.
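Minimal sketches of two of the items above, ROWNUM and PIVOT (the tables and columns are illustrative):

```sql
-- Oracle-style ROWNUM: fetch the first five rows.
SELECT * FROM emp WHERE ROWNUM <= 5;

-- PIVOT: turn quarter values into columns of summed amounts.
SELECT * FROM (
    SELECT dept_id, quarter, amount FROM sales_q
)
PIVOT (SUM(amount) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'));
```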
|
WDR optimization and DFX support |
- WDR snapshots:
- If a statement fails to be executed due to timeout, subsequent statements continue to be executed, and the overall execution of the WDR snapshot is not affected.
- The failure of WDR snapshots during the vacuum process is fixed.
- The retention period of WDR snapshots is no longer fixed at 8 days.
- WDR reports:
- The WDR report generation speed is further accelerated.
- The WDR report format is optimized.
- WDR objects are sorted based on multiple dimensions (such as dead tuples, live tuples, vacuum, analyze, scanned rows).
- There is an interface for querying GUC parameters such as search_path configured in a session.
- WDR reports can be read by the standby nodes.
- L0 FULL SQL plans can be viewed.
|
Optimization on memory usage of stored procedures in global PL/SQL cache |
A shared cache for stored procedures addresses the high memory usage caused by stored procedure compatibility features. The existing session-level PL/SQL cache is optimized.
- Global PL/SQL Cache: All sessions share PL/SQL compilation intermediate products, and their execution status is maintained independently.
- Syntax node optimization: The data structure of syntax nodes is optimized to reduce the memory usage of a single compilation product.
This is available only to primary/standby instances. |
ODBC Gplan reuse |
ODBC supports a statement pool function (similar to the JDBC Statement Pool) and Gplan reuse to improve SQL performance. |
Go drivers for fully-encrypted databases |
A fully-encrypted database can:
- Insert 60,000 rows per second.
- Query 90,000 rows per second.
- Update temporary tables.
- Support HIS key management.
- Support server-side encryption and decryption functions.
- Update CMK keys.
- Support load balancing for JDBC.
Go language drivers are available for fully-encrypted databases. |
Minimum RTO due to improved primary/standby synchronization |
The restriction that a standby node can reconnect to the primary node only after the replay is complete is removed, and the RTO performance is optimized. |
User-level audit settings |
The audit function is enhanced as follows:
- User-level audit is supported. The GUC parameter full_audit_users is added to set the list of audited users. Audit logs are recorded for all SQL statements executed by users in the list. Administrators' operations can be fully audited to audit administrators' access to user data.
- The audit operation can be configured based on client information. The GUC parameter no_audit_client is added to configure the list of clients that do not need to be audited. Audit logs are not recorded for specified client apps and IP addresses (such as cm_agent@127.0.0.1).
- Audit logs are recorded for high-risk system function calls.
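A sketch of how the new GUC parameters might be configured (full_audit_users and no_audit_client come from this feature; whether they accept ALTER SYSTEM SET or require a different configuration path is an assumption, and the values are illustrative):

```sql
-- Audit every SQL statement run by these users.
ALTER SYSTEM SET full_audit_users = 'admin_user,audit_target';
-- Skip audit records for a trusted local agent.
ALTER SYSTEM SET no_audit_client = 'cm_agent@127.0.0.1';
```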
|
Strong security authentication supported by Roach |
Roach supports SSL and Kerberos security certificate authentication. |
Security hardening |
The following security vulnerabilities are fixed:
CVE-2023-30608
CVE-2023-29469
CVE-2023-28484
CVE-2021-28235
CVE-2023-27538
CVE-2023-27536
CVE-2023-27535
CVE-2023-27534
CVE-2023-0466
CVE-2023-0465
CVE-2023-0464
CVE-2023-27533
CVE-2023-23931
CVE-2023-23915
CVE-2023-23914
CVE-2023-23916
CVE-2022-4304
CVE-2022-4450
CVE-2023-0215
CVE-2023-0286
CVE-2022-42898
CVE-2022-43552
CVE-2022-43551
CVE-2022-40303
CVE-2022-40304
CVE-2022-32221
CVE-2022-42916
CVE-2022-42915
CVE-2022-1941
CVE-2022-35252
CVE-2022-37434
CVE-2022-32205
CVE-2022-32208
CVE-2022-32206
CVE-2022-2097
CVE-2022-32207
CVE-2022-2068
CVE-2022-27782
CVE-2022-27781
CVE-2021-22570
CVE-2022-29824
CVE-2022-1292
CVE-2022-27775
CVE-2022-27774
CVE-2022-27776
CVE-2022-22576
CVE-2021-3520
CVE-2021-37750 |