Updated on 2022-12-16 GMT+08:00

GaussDB(DWS) Architecture and Advantages

GaussDB(DWS) clusters are distributed parallel database clusters based on the shared-nothing architecture.

Figure 1 GaussDB(DWS) architecture

This architecture gives GaussDB(DWS) the following key advantages:

GaussDB(DWS) can be used in multiple cloud forms, including HUAWEI CLOUD, HUAWEI CLOUD Stack, HCSO, and Intelligent EdgeSite (IES), satisfying differentiated deployment and O&M demands.

GaussDB(DWS) uses the Huawei-developed GaussDB database kernel and is compatible with PostgreSQL. The GaussDB database has evolved from a standalone OLTP database into an enterprise-grade, MPP-based, distributed OLAP database designed for massive data analysis.

DWS comes in four types: standard data warehouse, cloud data warehouse, stream data warehouse, and hybrid data warehouse. They can process large volumes of data across multiple industries and provide a unified management platform with the following advantages and features:

Ease of use

  • Visualized one-stop management

    GaussDB(DWS) allows you to easily complete the entire process from project concept to production deployment. With the GaussDB(DWS) management console, you do not need to install data warehouse software or deploy data warehouse servers. GaussDB(DWS) offers you a high-performance and high-availability enterprise-grade data warehouse cluster in a couple of minutes.

    With just a few clicks, you can easily connect applications to the data warehouse, back up data, restore data, and monitor data warehouse resources and performance.

  • Heterogeneous database migration tools

    GaussDB(DWS) provides migration tools for migrating Oracle and Teradata SQL scripts to GaussDB(DWS).

High performance

  • Cloud-based distributed architecture

    GaussDB(DWS) adopts the MPP architecture, in which service data is distributed across many nodes. Data analytics tasks are executed in parallel on the nodes where the data is stored.

  • Queries on trillions of data records answered within seconds

    GaussDB(DWS) improves query performance by executing multi-thread operators in parallel, using the vectorized computing engine to run instructions in parallel in registers, and using the Low Level Virtual Machine (LLVM) compiler to eliminate redundant conditional checks.

    GaussDB(DWS) provides you with a better data compression ratio (column-store), higher index performance (column-store), and better point update and query (row-store) performance.

  • Fast data loading

    GaussDB(DWS) provides you with GDS, a high-speed parallel bulk data loading tool.
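
The distributed execution model described under "Cloud-based distributed architecture" can be illustrated with a small sketch. This is not GaussDB(DWS) code: the node count, the CRC32 hash, and every function name are assumptions chosen for illustration, and threads stand in for cluster nodes. It only shows the general idea that each row is stored on exactly one node and that each node aggregates its own rows before the partial results are merged.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # hypothetical cluster size, for illustration only


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Pick a node by hashing the distribution key (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(rows, num_nodes: int = NUM_NODES):
    """Place each (key, amount) row on exactly one node (shared-nothing)."""
    nodes = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        nodes[node_for(key, num_nodes)].append((key, amount))
    return nodes


def local_sum(node_rows):
    """Aggregate only the rows stored on one node."""
    totals = defaultdict(int)
    for key, amount in node_rows:
        totals[key] += amount
    return dict(totals)


def parallel_sum(rows, num_nodes: int = NUM_NODES):
    """Run the per-node aggregation in parallel, then merge the partial results."""
    nodes = distribute(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, nodes))
    merged = {}
    for partial in partials:
        # A given key always hashes to the same node, so partial results never overlap.
        merged.update(partial)
    return merged


if __name__ == "__main__":
    sales = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
    print(parallel_sum(sales))
```

Because the grouping key is also the distribution key in this sketch, no rows have to move between nodes during aggregation, which is the property that lets the work run where the data is stored.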

High scalability

  • On-demand scale-out: With the shared-nothing open architecture, nodes can be added at any time to enhance the data storage, query, and analysis capabilities of the system. Up to 2048 nodes can be deployed.
  • Near-linear performance after scale-out: Capacity and performance increase linearly with the cluster scale, with a linear rate of 0.8.
  • Service continuity: During scale-out, data can be added, deleted, modified, and queried, and DDL operations (DROP/TRUNCATE/ALTER TABLE) can be performed. Online table-level scale-out ensures service continuity.
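
As a rough worked example of the scale-out figures above, the sketch below estimates the performance gain from adding nodes. The source does not define how the linear rate of 0.8 is measured; reading it as the fraction of the ideal linear gain that is actually achieved is an assumption made here, and the function name is hypothetical.

```python
MAX_NODES = 2048    # upper limit on cluster size stated above
LINEAR_RATE = 0.8   # linear rate stated above


def estimated_speedup(old_nodes: int, new_nodes: int, rate: float = LINEAR_RATE) -> float:
    """Estimate the performance multiplier after scaling out.

    Assumption: the linear rate is the share of the ideal (perfectly linear)
    gain that is retained, so speedup = 1 + rate * (ideal - 1).
    """
    if not 0 < old_nodes <= new_nodes <= MAX_NODES:
        raise ValueError("need 0 < old_nodes <= new_nodes <= MAX_NODES")
    ideal = new_nodes / old_nodes
    return 1.0 + rate * (ideal - 1.0)


if __name__ == "__main__":
    # Doubling an 8-node cluster: the ideal gain is 2x, the estimate is lower.
    print(round(estimated_speedup(8, 16), 2))
```

Under this reading, doubling the node count yields roughly 1.8 times the original performance rather than 2 times.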

Robust reliability

  • ACID

    GaussDB(DWS) supports atomicity, consistency, isolation, and durability (ACID), which ensures strong data consistency for distributed transactions.

  • Comprehensive HA design

    All software processes of GaussDB(DWS) run in active/standby mode. Logical components, such as the CNs and DNs of each cluster, also work in active/standby mode. This ensures data reliability and consistency as well as service continuity even if a single point of failure (SPOF) occurs.

  • Security

    GaussDB(DWS) supports transparent data encryption and can connect to DBSS. Together with network isolation and security group rule settings, this better protects user privacy and data security. In addition, GaussDB(DWS) supports automatic full and incremental backup of data for higher reliability.

Convergent analysis

  • Convergence of multiple modes: You can perform direct calculation and convergent analysis of stream, time series, GIS, full-text, and AI data on GaussDB(DWS).
  • Convergence of multiple sources: You can use standard SQL statements to query data on the Hadoop Distributed File System (HDFS) and object storage service (OBS) without data migration.
  • Cluster acceleration: Express, a shared cluster based on OBS data access, is provided for more efficient convergent computing and analysis.

Strong security

  • Transparent encryption: Database data files are encrypted to prevent malicious attackers from bypassing the database permission control mechanism at the OS layer or stealing disks to access user data.
  • Data masking: Built-in masking functions for digit, character, and time types are provided. Masking rules can also be customized so that sensitive data stays protected while big data remains efficiently accessible.
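
The data masking idea above can be sketched in a few lines. The functions below are hypothetical helpers written for illustration, not the built-in masking functions of GaussDB(DWS), whose names and behavior are not given in this section. The sketch shows one masking function each for digits, characters, and time values, plus a custom rule applied per column.

```python
from datetime import datetime


def mask_digits(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Mask every digit except the last `keep_last` digits; keep separators."""
    to_mask = max(sum(ch.isdigit() for ch in value) - keep_last, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


def mask_chars(value: str, keep_first: int = 1, mask_char: str = "*") -> str:
    """Keep the first `keep_first` characters and mask the rest."""
    return value[:keep_first] + mask_char * max(len(value) - keep_first, 0)


def mask_time(value: datetime) -> datetime:
    """Coarsen a timestamp to the start of its year."""
    return value.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


def mask_email(value: str) -> str:
    """Custom rule: mask the local part of an address, keep the domain."""
    local, sep, domain = value.partition("@")
    return mask_chars(local) + sep + domain


def apply_rules(row: dict, rules: dict) -> dict:
    """Apply a masking function to each column that has a rule."""
    return {col: rules[col](val) if col in rules else val for col, val in row.items()}


if __name__ == "__main__":
    rules = {"card": mask_digits, "name": mask_chars,
             "signup": mask_time, "email": mask_email}
    row = {"id": 7, "card": "6222-0212-3456-7890", "name": "Alice",
           "signup": datetime(2022, 12, 16, 10, 30), "email": "alice@example.com"}
    print(apply_rules(row, rules))
```

Columns without a rule, such as `id` here, pass through unchanged, so queries can still read non-sensitive data directly.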