Updated on 2025-08-11 GMT+08:00

Overview

This manual is a comprehensive collection of best practices in using GaussDB(DWS). For each task, a detailed description of the solution and operation procedures is provided. Check out this manual based on the task at hand.

Table 1 Best practices

Category

Document Name

Description

Import/Export

Importing Data to GaussDB(DWS)

Describes how to import data to and export data from GaussDB(DWS) using tools.

Transferring Data Between OBS Buckets and GaussDB(DWS) Clusters

Describes how to upload sample data to OBS, access data in OBS buckets using OBS foreign tables, import data to GaussDB(DWS), or export data from a GaussDB(DWS) table to an OBS bucket.

Using GDS to Import Table Data from a Remote Server to a GaussDB(DWS) Cluster

Describes how to use GDS to import data on a remote server to GaussDB(DWS).

Importing Table Data from MRS Hive to a GaussDB(DWS) Cluster

Describes how to enable GaussDB(DWS) to remotely access or read MRS data sources using an HDFS foreign table.

Enabling Cross-Cluster Access of Hive Metastore Through an External Schema

Describes how to enable a decoupled storage and compute cluster to remotely access HiveMetaStore metadata using an external schema.

Importing Table Data from DLI to a GaussDB(DWS) Cluster

Describes how to import data from DLI to GaussDB(DWS) using the GaussDB(DWS) foreign table function.

Exporting ORC Data from a GaussDB(DWS) Cluster to an MRS Cluster

Describes how to export data in ORC format to MRS using an HDFS foreign table.

Data migration

Using CDM to Migrate Oracle Data to a GaussDB(DWS) Cluster

Describes how to migrate data from Oracle tables to GaussDB(DWS).

Using CDM to Migrate MySQL Data to a GaussDB(DWS) Cluster

Describes how to use Cloud Data Migration (CDM) to migrate MySQL data to GaussDB(DWS) in batches.

Using a Flink Job of DLI to Synchronize MySQL Data to a GaussDB(DWS) Cluster in Real Time

Describes how to use a Flink job of DLI to synchronize MySQL data to GaussDB(DWS) in real time.

Using CDM to Migrate Data from Hologres to a GaussDB(DWS) Cluster

Describes how to use CDM to migrate data from Hologres to GaussDB(DWS).

Using Kettle to Migrate Small Tables from AWS Redshift to a GaussDB(DWS) Cluster

Describes how to use the open-source tool Kettle to migrate data from AWS Redshift to GaussDB(DWS).

Using CDM to Migrate Data from AnalyticDB for MySQL to a GaussDB(DWS) Cluster

Describes how to use CDM to migrate data from AnalyticDB for MySQL to GaussDB(DWS).

Using a Flink Job of DLI to Synchronize Kafka Data to a GaussDB(DWS) Cluster in Real Time

Describes how to use a Flink job of DLI to synchronize consumption data of DMS for Kafka to GaussDB(DWS) in real time.

Migrating Data Between DWS Clusters Using GDS

Describes how to migrate 15 million rows of data between two GaussDB(DWS) clusters within minutes based on the high concurrency of GDS import and export.

Data analysis

Using GaussDB(DWS) to Query Vehicle Routes at Traffic Checkpoints in Seconds

Describes how to analyze passing vehicles at checkpoints. In this practice, 890 million data records from checkpoints are loaded to a single database table on GaussDB(DWS) for accurate and fuzzy query, demonstrating the ability of GaussDB(DWS) to perform high-performance query for historical data.

Using GaussDB(DWS) to Analyze the Supply Chain Requirements of a Company

Describes how to load the sample data set from OBS to a GaussDB(DWS) cluster and perform data queries, demonstrating multi-table analysis and theme analysis of GaussDB(DWS) in data analysis scenarios.

Using DWS to Analyze the Operational Status of a Retail Department Store

Describes how to load the daily business data of each retail store from OBS to the corresponding table in the data warehouse, and then summarize and query KPIs such as the store revenue, customer flow, monthly sales ranking, monthly customer flow conversion rate, monthly price-rent ratio, and sales per unit area.

Interconnecting GaussDB(DWS) with Power BI

Describes how to install Power BI on a Windows ECS and connect it to GaussDB(DWS) using an on-premises data gateway.

Decoupled storage and compute

DWS 3.0 Decoupled Storage and Compute Usage Suggestions and Performance Optimization

Describes the performance optimization and precautions of the decoupled storage and compute clusters.

Data development

Using the GaussDB(DWS) Turbo Engine to Improve Data Query Performance

Describes what the Turbo engine is. Compared to the original column-store execution engines, the Turbo engine optimizes memory and disk storage formats for strings and numeric types, and enhances the performance of common operators like SORT, AGGREGATE (AGG), JOIN, and SCAN. This results in doubling the overall performance of the executor and significantly reducing compute costs.

Cutting Costs by Switching Between Cold and Hot Data Storage in GaussDB(DWS)

Describes hot and cold data switchover of GaussDB(DWS) to ensure that in scenarios where the data volume increases sharply, data can be classified into hot data and cold data based on the data usage frequency of services and managed at different levels. This improves the analysis performance and reduces costs.

Cutting Partition Maintenance Costs for the E-commerce and IoT Industries by Leveraging GaussDB(DWS)'s Automatic Partition Management Feature

Describes how GaussDB(DWS) automatically manages partitions. Based on the table-level parameters (period and ttl), GaussDB(DWS) can automatically create partitions and delete expired partitions. This function is available for time-based partitioned tables (such as orders and IoT data). This function frees users from maintaining partitioned tables, significantly reducing O&M costs and improving query performance.

Improving Development Efficiency by Leveraging GaussDB(DWS)'s View Decoupling and Rebuilding Function

Describes when and how to use the automatic view rebuilding function. Base table objects cannot be modified independently due to view and table dependency. To solve this problem, GaussDB(DWS) supports view decoupling and rebuilding.

Best Practices for Using HStore Tables

Shows how to update and delete traditional column-store tables using the delta table mechanism of HStore tables, optimizing storage and performance.

Best Practice of Converting a Time Series Table to an HStore Table

Describes the benefits of HStore tables over time series tables in terms of data import, compression ratio, and query performance. Therefore, you are advised to replace time series tables with HStore tables.

Best Practices of GIN Index

Describes how to use GIN indexes to search through array and JSONB types, as well as how to conduct full-text searches.

Encrypting and Decrypting Data Columns

Describes SQL function encryption provided by GaussDB(DWS). Data encryption is widely used in information systems to prevent unauthorized access and data leakage. As the core of an information system, GaussDB(DWS) also provides data encryption functions, including transparent encryption and SQL function encryption.

Managing Data Permissions Through Views

Describes how to use views to allow various users to access specific data within a given table, ensuring data permissions management and security.

Database management

Role-based Access Control (RBAC)

Provides an example of applying the role-based access control (RBAC) model to GaussDB(DWS). RBAC grants permissions to roles so that users can obtain the permissions of these roles by associating with the roles.

Configuring Read-Only Permissions

Describes how to configure the read-only permission for an IAM user.

Best Practices for User Management

Describes the permissions of system administrators and common users and describes how to create users and query user information.

Viewing Table and Database Information

Provides SQL examples for querying tables and databases.

Best Practices of Database SEQUENCE

Describes best practices and examples for creating and managing sequences.

Performance tuning

Optimizing Table Structure Design to Enhance GaussDB(DWS) Query Performance

Describes how to optimize table performance in GaussDB(DWS) by properly designing the table structure (for example, by selecting the table model, table storage mode, compression level, distribution mode, distribution column, partitioned table, and local clustering).

Excellent Practices for SQL Queries

Describes how to adjust SQL statements according to certain rules to improve the SQL execution while ensuring correct results.

Data Skew Queries

Provides methods for querying data skew.

Analyzing SQL Statements That Are Being Executed to Handle GaussDB(DWS) Performance Issues

Describes how to use the PG_STAT_ACTIVITY view to analyze and locate SQL problems, such as excessive SQL connections, long SQL query time, and SQL query blocking, that developers may encounter during development.

Cluster management

Binding Different Resource Pools to Two Types of Jobs to Balance Load for GaussDB(DWS)

Demonstrates how to use GaussDB(DWS) for resource management, helping enterprises eliminate bottlenecks in concurrent queries. SQL jobs can run smoothly without affecting each other and consume less resources than before.

Scaling Options for GaussDB(DWS) with a Coupled Storage-Compute Architecture

Describes a critical feature for cloud services, scalability. It refers to cloud services' ability to increase or decrease compute and storage resources to meet changing demand, achieving a balance between performance and cost.

Security management

Security Best Practices

Provides actionable guidance for enhancing the overall security of your service data when using GaussDB(DWS).