Updated on 2024-12-23 GMT+08:00

Precautions

  • Parallel query is in the open beta test (OBT) phase. You are advised to use it in the test environment.
  • The GaussDB(for MySQL) engine version must be MySQL 8.0.22 or later.
  • Both read replicas and primary nodes support parallel query. Parallel query consumes a lot of compute resources (such as CPU and memory). To ensure instance stability, parallel query is disabled by default on primary nodes of GaussDB(for MySQL) instances whose kernel version is 2.0.42.230600 or later. To use parallel query, contact customer service.
  • Parallel query is suitable for the following scenarios:
    • Full table scans, index scans, index range scans, index reverse scans, index point queries, and index pushdown
    • Single-table queries, multi-table joins, views, subqueries, and partial CTE queries
    • Multiple JOIN algorithms, including BNL JOIN, BKA JOIN, HASH JOIN, NESTED LOOP JOIN, SEMI JOIN, ANTI JOIN, and OUTER JOIN
    • Multiple subqueries, including conditional subqueries, scalar subqueries, some correlated subqueries, non-correlated subqueries, and derived tables
    • Multiple data types, including Integer, Character, Floating Point, and Time
    • Arithmetic expressions (+, -, *, %, /, |, and &), conditional expressions (<, <=, >, >=, <>, BETWEEN/AND, and IN), logical operations (OR, AND, and NOT), and common functions (Character, Integer, and Time), and aggregation functions (COUNT/SUM/AVG/MIN/MAX)

      The COUNT aggregate function can only be executed concurrently when innodb_parallel_select_count is disabled.

    • Non-partitioned table queries, and queries for a single partition of partitioned tables
    • ORDER BY, GROUP BY/DISTINCT, LIMIT/OFFSET, WHERE/HAVING, and column projection
    • UNION/UNION ALL queries
    • EXPLAIN statements to view execution plans, including traditional Explain statements, EXPLAIN FORMAT=TREE, and EXPLAIN FORMAT=JSON
  • Parallel query is not suitable for the following scenarios:
    • Non-query statements
    • Window functions
    • Triggers
    • Prepared statements
    • Spatial indexes
    • System tables, temporary tables, and non-InnoDB tables
    • Full-text indexes
    • Stored procedures
    • Subqueries that cannot be converted to semi-joins
    • Statements that do not meet the ONLY_FULL_GROUP_BY rules
    • Index Merge statements
    • HASH JOIN operations, during which data overflows to disks
    • Lock queries, such as SERIALIZABLE isolation level, FOR UPDATE or SHARE LOCK
    • Recursive queries
    • WITH ROLLUP
    • Statements with keyword HIGH_PRIORITY
    • No line of data returned in the execution result. (The execution plan shows: Zero limit, Impossible WHERE, Impossible HAVING, No matching min/max row, Select tables optimized away, Impossible HAVING noticed after reading const tables, or no matching row in const table)
    • Columns with type ZEROFILL. Its column values can be optimized to constants.
    • Generated columns, BLOB, TEXT, JSON, and GEOMETRY
    • Spatial functions (such as SP_WITHIN_FUNC)
    • DISTINCT clauses in aggregate functions, such as SUM(DISTINCT), AVG(DISTINCT), and COUNT(DISTINCT)
    • GROUP_CONCAT
    • JSON_ARRAYAGG and JSON_OBJECTAGG
    • User-defined functions
    • STD, STDDEV, and STDDEV_POP
    • VARIANCE, VAR_POP, and VAR_SAMP
    • BIT_AND, BIT_OR, and BIT_XOR
    • set_user_var
    • RAND functions with parameters
    • json_* (such as json_length and json_type)
    • st_distance
    • get_lock
    • is_free_lock, is_used_lock, release_lock, and release_all_locks
    • sleep
    • xml_str
    • weight_string
    • REF functions (VIEW_REF, OUTER_REF, and AGGREGATE_REF)
    • SHA, SHA1, SHA2, and MD5
    • row_count
    • User-related functions (such as user, current_user, session_user, and system_user)
    • extractvalue
    • GeomCollection, GeometryCollection, LineString, MultiLineString, MultiPoint, MultiPolygon, and Polygon
    • MASTER_POS_WAIT
    • Spatial relationship functions, such as MBRContains, MBRCoveredBy, MBRCovers, MBRDisjoint, MBREquals, MBRIntersects, MBROverlaps, MBRTouches, and MBRWithin
    • Point
    • PS_CURRENT_THREAD_ID()
    • PS_THREAD_ID(CONNECTION_ID())
    • WAIT_FOR_EXECUTED_GTID_SET
    • WAIT_UNTIL_SQL_THREAD_AFTER_GTIDS
    • UNCOMPRESS (COMPRESS ())
    • STATEMENT_DIGEST_TEXT
    • Functions BINARY and CONVERT
    • Functions starting with ST_
  • The execution results of parallel queries may be incompatible with that of serial queries.
    • Number of errors or alarms

      If an error or alarm message is displayed during serial queries, the error or alarm message will be displayed in each worker thread during the parallel queries. As a result, the total number of error or alarm messages increases.

    • Precision

      During the parallel queries, if there is a function type in a SELECT statement, additional stored procedures will be generated in the intermediate results. As a result, compared with serial queries, the precision of the floating point part in parallel queries may be different, and the final result may be slightly different.

    • Truncation

      During the parallel queries, if there is a function type in a SELECT statement, additional stored procedures will be generated in the intermediate results. In this process, the calculation result of the function needs to be cached, and data truncation may occur (generally due to data type conversion, for example, covering a floating-point value to a character string). As a result, the final result is different from the serial queries.

    • Sequence of result sets

      Because tasks are executed by multiple worker threads during parallel queries, the sequence of the returned result set may not be consistent with that of serial queries. In the case of a query with LIMIT, this problem is more likely to occur. If fields of GROUP BY are invisible characters, the sequence of the returned result set is also different.

    • UNION ALL result sets

      UNION ALL ignores sort operators. The sequence of the returned result set in parallel execution may be different from that in non-parallel execution. In the case of a query with LIMIT, the result sets are different.