Esta página ainda não está disponível no idioma selecionado. Estamos trabalhando para adicionar mais opções de idiomas. Agradecemos sua compreensão.

Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive

SQL Self-Diagnosis

Updated on 2024-11-05 GMT+08:00

Performance issues may occur when you run the INSERT/UPDATE/DELETE/SELECT/MERGE INTO or CREATE TABLE AS statement. The product supports automatic performance diagnosis and saves related diagnosis information to Real-time Top SQL. When enable_resource_track is set to on, the diagnosis information is dumped to Historical Top SQL. You can query the warning column in the GS_WLM_SESSION_STATISTICS, GS_WLM_SESSION_HISTORY, and GS_WLM_SESSION_INFO views to obtain reference information for performance tuning.

  • Alarms that can trigger SQL self-diagnosis depend on the settings of resource_track_level.

    When resource_track_level is set to query, you can diagnose alarms such as uncollected multi-column/single-column statistics, unpruned partitions, and failure of pushing down SQL statements. When resource_track_level is set to perf or operator, all alarms can be diagnosed.

  • Whether a SQL plan will be diagnosed depends on the settings of resource_track_cost.

    A SQL plan will be diagnosed only if its execution cost is greater than resource_track_cost. You can use the EXPLAIN keyword to check the plan execution cost.

  • When EXPLAIN PERFORMANCE or EXPLAIN VERBOSE is executed, SQL self-diagnosis information, except the ones without multi-column statistics, will be generated. For details, see SQL Execution Plan.

Alarms Related to SQL Execution Performance

Currently, the following alarms on performance issues will be reported:

  1. Statistics of a single column or multiple columns are not collected.

    If statistics of a single column or multiple columns are not collected, an alarm is reported. To handle this alarm, you are advised to perform ANALYZE on related tables. For details, see Updating Statistics and Optimizing Statistics.

    If no statistics are collected for the OBS foreign table and HDFS foreign table in the query statement, an alarm indicating that statistics are not collected will be reported. Because the ANALYZE performance of the OBS foreign table and HDFS foreign table is poor, you are not advised to perform ANALYZE on these tables. Instead, you are advised to use the ALTER FOREIGN TABLE syntax to modify the totalrows attribute of the foreign table to correct the estimated number of rows.

    Example alarms:

    The statistics about a table are not collected.

    Statistic Not Collect
            schema_test.t1

    The statistics about a single column are not collected.

    Statistic Not Collect
            schema_test.t2(c1)

    The statistics about multiple columns are not collected.

    Statistic Not Collect
            schema_test.t3((c1,c2))

    The statistics about a single column and multiple columns are not collected.

    Statistic Not Collect
            schema_test.t4(c1)
            schema_test.t5((c1,c2))
  2. Partitions are not pruned.

    When a partitioned table is queried, the partition is pruned based on the constraints on the partition key to improve the query performance. However, the partition table may not be pruned due to improper constraints, deteriorating the query performance. For details, see Case: Rewriting SQL Statements and Eliminating Prune Interference.

  3. SQL statements are not pushed down.

    The cause details are displayed in the alarms. For details, see Optimizing Statement Pushdown.

    The potential causes for the pushdown failure are as follows:
    • Caused by functions

      The function name is displayed in the diagnosis information. Function pushdown is determined by the shippable attribute of the function. For details, see the CREATE FUNCTION syntax.

    • Caused by syntax

      The diagnosis information displays the syntax that causes the pushdown failure. For example, if the statement contains the With Recursive, Distinct On, or row expression and the return value is of the record type, an alarm is reported, indicating that the syntax does not support pushdown.

    Example alarms:

    SQL is not plan-shipping 
            "enable_stream_operator" is off
    
    SQL is not plan-shipping 
            "Distinct On" can not be shipped
    
    SQL is not plan-shipping
            "v_test_unshipping_log" is VIEW that will be treated as Record type can't be shipped
  4. Vectorized plans are not supported.

    For SQL statements that cannot use vectorized plans, detailed reasons why vectorized plans cannot be used are reported.

    Common reasons are as follows:

    • The target column contains functions whose return type is a set.
    • The target column or query condition, the distribution key of the Stream operator, and the Limit and Offset clauses contain expressions that cannot be vectorized (such as geospatial types, array expressions, Row expressions, XML expressions, and functions whose parameters or return values contain the refcursor type).
    • The Group By clause contains an array-equivalent judgment statement.
    • GC_FDW and LOG_FDW do not support vectorization.
    • The plan contains operators such as Cte Scan, Recursive Union, Merge Append, and Lock Rows.

    Example alarms:

    SQL is un-vectorized
             Function regexp_split_to_table that returns set is un-vectorized
    
    SQL is un-vectorized
             Array expression is un-vectorized
    
    SQL is un-vectorized
             Function array_agg is un-vectorized
    
    SQL is un-vectorized
             RecursiveUnion is un-vectorized
  5. In a hash join, the larger table is used as the inner table.

    An alarm will be reported if the number of rows in the inner table reaches or exceeds 10 times of that in the foreign table, more than 100,000 inner-table rows are processed on each DN in average, and data has been flushed to disks. You can check the query_plan column in GS_WLM_SESSION_HISTORY to check whether hash joins are used. In this scenario, you need to adjust the sequence of the HashJoin internal and foreign tables. For details, see Join Order Hints.

    Example alarm:

    Execute diagnostic information
    PlanNode[7] Large Table is INNER in HashJoin "Vector Hash Aggregate"

    In the preceding command, 7 indicates the operator whose ID is 7 in the query_plan column.

  6. nestloop is used in a large-table equivalent join.

    An alarm will be reported if nested loop is used in an equivalent join where more than 100,000 larger-table rows are processed on each DN in average. You can check the query_plan column of GS_WLM_SESSION_HISTORY to see if nested loop is used. In this scenario, you need to adjust the table join mode and disable the NestLoop join mode between the current internal and foreign tables. For details, see Join Operation Hints.

    Example alarm:

    Execute diagnostic information
            PlanNode[5] Large Table with Equal-Condition use Nestloop"Nested Loop"
  7. A large table is broadcasted.

    An alarm will be reported if more than 100 thousand of rows are broadcasted on each DN in average. In this scenario, the broadcast operation of the Broadcast lower-layer operator needs to be disabled. For details, see Stream Operation Hints.

    Example alarm:

    Execute diagnostic information
            PlanNode[5] Large Table in Broadcast "Streaming(type: BROADCAST dop: 1/2)"
  8. Data skew occurs.

    An alarm will be reported if the number of rows processed on any DN exceeds 100 thousand, and the number of rows processed on a DN reaches or exceeds 10 times of that processed on another DN. Generally, this alarm is generated due to storage layer skew or computing layer skew. For details, see Optimizing Data Skew.

    Example alarm:

    Execute diagnostic information
           PlanNode[6] DataSkew:"Seq Scan", min_dn_tuples:0, max_dn_tuples:524288
  9. The index is improper.

    During base table scanning, an alarm is reported if the following conditions are met:

    • For row-store tables:
      • When the index scanning is used, the ratio of the number of output lines to the number of scanned lines is greater than 1/1000 and the number of output lines is greater than 10,000.
      • When sequential scanning is used, the number of output lines to the number of scanned lines is less than 1/1000, the number of output lines is less than or equal to 10,000, and the number of scanned lines is greater than 10,000.
    • For column-store tables:
      • When the index scanning is used, the ratio of the number of output lines to the number of scanned lines is greater than 1/10000 and the number of output lines is greater than 100.
      • When sequential scanning is used, the number of output lines to the number of scanned lines is less than 1/10,000, the number of output lines is less than or equal to 100, and the number of scanned lines is greater than 10,000.

    For details, see Optimizing Operators. You can also refer to Case: Creating an Appropriate Index and Case: Setting Partial Cluster Keys.

    Example alarms:

    Execute diagnostic information
            PlanNode[4] Indexscan is not properly used:"Index Only Scan", output:524288, filtered:0, rate:1.00000
            PlanNode[5] Indexscan is ought to be used:"Seq Scan", output:1, filtered:524288, rate:0.00000

    The diagnosis result is only a suggestion for the current SQL statement. You are advised to create an index only for frequently used filter criteria.

  10. Estimation is inaccurate.

An alarm will be reported if the maximum number or the estimated maximum number of rows processed on a DN is over 100,000, and the larger number reaches or exceeds 10 times of the smaller one. In this scenario, you can refer to Rows Hints to correct the estimation on the number of rows, so that the optimizer can re-design the execution plan based on the correct number.

Example alarm:

Execute diagnostic information
        PlanNode[5] Inaccurate Estimation-Rows: "Hash Join" A-Rows:0, E-Rows:52488

Constraints

  1. An alarm contains a maximum of 2048 characters. If the length of an alarm exceeds this value (for example, a large number of long table names and column names are displayed in the alarm when their statistics are not collected), a warning instead of an alarm will be reported.
    WARNING, "Planner issue report is truncated, the rest of planner issues will be skipped"
  2. If a query statement contains the Limit operator, alarms of operators lower than Limit will not be reported.
  3. For alarms about data skew and inaccurate estimation, only alarms on the lower-layer nodes in a plan tree will be reported. This is because the same alarms on the upper-level nodes may be triggered by problems on the lower-layer nodes. For example, if data skew occurs on the Scan node, data skew may also occur in operators (for example, Hashagg) at the upper layer.

Usamos cookies para aprimorar nosso site e sua experiência. Ao continuar a navegar em nosso site, você aceita nossa política de cookies. Saiba mais

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback