Esta página ainda não está disponível no idioma selecionado. Estamos trabalhando para adicionar mais opções de idiomas. Agradecemos sua compreensão.

Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive
Help Center/ GaussDB(DWS)/ Troubleshooting/ Cluster Performance/ Data Skew Causes Slow SQL Statement Execution and Operations Fail on Large Tables

Data Skew Causes Slow SQL Statement Execution and Operations Fail on Large Tables

Updated on 2025-01-06 GMT+08:00

Symptom

SQL statement execution is slow and SQL statements cannot be executed on large tables.

Possible Causes

The distribution modes supported by GaussDB(DWS) are hash, replication, and roundrobin (supported by 8.1.2 clusters and later versions). If the created table is distributed in Hash mode and the distribution key is not specified, the first column of the table is selected as the distribution key. In this case, skew may occur. Table skew has the following negative impacts.

  • The SQL execution performance is poor because data is distributed only on some DNs. When the SQL statement is executed, only some DNs are involved in computing, and the advantage of distributed computing is not leveraged.
  • The usage of resources, especially disks, will be skewed. That is, the usage of some disks may be close to the upper limit, but the usage of other disks is low.
  • The CPU usage of some nodes may be excessively high.

Cause Analysis

  1. Log in to the GaussDB(DWS) management console. On the Clusters page, locate the target cluster. In the Operation column of the target cluster, click Monitoring Panel. Choose Monitoring > Node Monitoring. Click the Disks tab to view the disk usage.

    NOTE:

    Check the usage of each data disk. It is found that the usage is uneven among data disks. Generally, the difference between the highest and the lowest disk usage is small. If the difference exceeds 5%, data skew may occur.

  2. Connect to the database and check the job operating status in the waiting view. It is found that the job waits for being processed by one or some DNs.

    1
    SELECT wait_status, count(*) as cnt FROM pgxc_thread_wait_status WHERE wait_status not like '%cmd%' AND wait_status not like '%none%' and wait_status not like '%quit%' group by 1 order by 2 desc;
    

  3. The explain performance of the slow statement shows that the scan time and number of scan rows in the base table of each DN are unbalanced.

    1
    explain performance select avg(ss_wholesale_cost) from store_sales;
    

    • Time of scanning a base table: The fastest DN takes 5 ms, and the slowest DN takes 1173 ms.
    • Data distribution: Some DNs have 22,831,616 rows and other DNs have no row, resulting in data skew.

  4. You can detect data skew by using the skew check interface.

    1
    SELECT table_skewness('store_sales');
    

    1
    SELECT table_distribution('public','store_sales');
    

  1. The resource monitoring result shows that the CPU usage and I/O of some nodes are significantly higher than those of other nodes.

Handling Procedure

How to find the skewed table
  1. If the number of tables in the database is less than 10,000, use the PGXC_GET_TABLE_SKEWNESS view to query data skew of all tables in the database.
    1
    SELECT * FROM pgxc_get_table_skewness ORDER BY totalsize DESC;
    
  2. If the number of tables in the database is greater than 10,000, it may take a long time (hours) to query the entire database and calculate skew columns in the PGXC_GET_TABLE_SKEWNESSPGXC_GET_TABLE_SKEWNESS view. You are advised to perform the following operations by referring to the definition of the PGXC_GET_TABLE_SKEWNESS view:
    • In 8.1.2 and earlier cluster versions, the table_distribution()table_distribution() function is used to optimize calculation by customizing output and reducing output columns. For example:
      1
      2
      3
      4
      5
      6
      SELECT schemaname,tablename,max(dnsize) AS maxsize, min(dnsize) AS minsize 
      FROM pg_catalog.pg_class c 
      INNER JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace 
      INNER JOIN pg_catalog.table_distribution() s ON s.schemaname = n.nspname AND s.tablename = c.relname 
      INNER JOIN pg_catalog.pgxc_class x ON c.oid = x.pcrelid AND x.pclocatortype = 'H' 
      GROUP BY schemaname,tablename;
      
    • For clusters of 8.1.3 and later versions, the gs_table_distribution()gs_table_distribution() function can be used to check data skew of all tables in the database. The gs_table_distribution() function is better than the table_distribution() function when you query all tables in the database. In a large cluster with a large amount of data, use the gs_table_distribution() function.
      1
      2
      3
      4
      5
      6
      SELECT schemaname,tablename,max(dnsize) AS maxsize, min(dnsize) AS minsize 
      FROM pg_catalog.pg_class c 
      INNER JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace 
      INNER JOIN pg_catalog.gs_table_distribution() s ON s.schemaname = n.nspname AND s.tablename = c.relname 
      INNER JOIN pg_catalog.pgxc_class x ON c.oid = x.pcrelid AND x.pclocatortype = 'H' 
      GROUP BY schemaname,tablename;
      
      NOTE:

      Run the following statement to query large tables:

      1
      SELECT schemaname||'.'||tablename as table, sum(dnsize) as size FROM gs_table_distribution() group by 1 order by 2 desc limit 10;
      

      Run the following statement to query the table skew rate:

       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      16
      17
      18
      19
      20
      21
      22
      23
      24
      25
      26
      27
      28
      29
      WITH skew AS
      (
              SELECT
                      schemaname,
                      tablename,
                      pg_catalog.sum(dnsize) AS totalsize,
                      pg_catalog.avg(dnsize) AS avgsize,
                      pg_catalog.max(dnsize) AS maxsize,
                      pg_catalog.min(dnsize) AS minsize,
                      (pg_catalog.max(dnsize) - pg_catalog.min(dnsize)) AS skewsize,
                      pg_catalog.stddev(dnsize) AS skewstddev
              FROM pg_catalog.pg_class c
              INNER JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
              INNER JOIN pg_catalog.gs_table_distribution() s ON s.schemaname = n.nspname AND s.tablename = c.relname
              INNER JOIN pg_catalog.pgxc_class x ON c.oid = x.pcrelid AND x.pclocatortype IN('H', 'N')
              GROUP BY schemaname,tablename
      )
      SELECT
              schemaname,
              tablename,
              totalsize,
              avgsize::numeric(1000),
              (maxsize/totalsize)::numeric(4,3)  AS maxratio,
              (minsize/totalsize)::numeric(4,3)  AS minratio,
              skewsize,
              (skewsize/avgsize)::numeric(4,3)  AS skewratio,
              skewstddev::numeric(1000)
      FROM skew
      WHERE totalsize > 0;
      
Methods of selecting a distribution key for a table
  1. If the distinct value of the column is large and no obvious data skew occurs, you can define multiple columns as a distribution key.
    View the size of distinct.
    1
    SELECT count(distinct column1) FROM table;
    
    Check whether there are data skews:
    1
    SELECT count(*) cnt, column1 FROM table group by column1 order by cnt limit 100;
    
  2. Select the columns where JOIN or GROUP BY statement is frequently used to reduce the use of STREAM.
  3. It is advised against using these distribution key selection methods:
    1. The default value of the distribution key (the first column) is used.
    2. The distribution key is generated through the auto-increment of sequences.
    3. The distribution key is generated using a random number. This method is recommended only when any column or any combination of two columns is skewed.

Usamos cookies para aprimorar nosso site e sua experiência. Ao continuar a navegar em nosso site, você aceita nossa política de cookies. Saiba mais

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback