
Product Architecture

Updated at: Jul 14, 2021 GMT+08:00

GaussDB(DWS) is a distributed parallel database cluster with a shared-nothing architecture, as shown in Figure 1.

Figure 1 GaussDB(DWS) product architecture
  • Application layer

    Data loading tools, Extract-Transform-Load (ETL) tools, Business Intelligence (BI) tools, and data mining and analysis tools can be integrated with GaussDB(DWS) through standard interfaces. GaussDB(DWS) is compatible with PostgreSQL, and its SQL syntax has been adapted for compatibility with Oracle and Teradata, so applications can be migrated to GaussDB(DWS) with few changes.

  • Interface

    Applications can connect to GaussDB(DWS) through standard JDBC 4.0 and ODBC 3.5.

  • GaussDB(DWS) (MPP cluster)

    The cluster consists of modules used for data management. Figure 2 shows these modules and Table 1 describes their functions.

  • Automatic data backup

    Cluster snapshots can be automatically backed up to Object Storage Service (OBS), an EB-level object storage service. The cluster can therefore be backed up periodically during off-peak hours, so that data can be recovered after a cluster exception occurs.

    A snapshot is a complete backup of GaussDB(DWS) at a specified time point. It records all configuration data and service data of the cluster at the specified moment.

  • Tool chain

    The following tools are provided: the parallel data loading tool General Data Service (GDS), the syntax migration tool DSC, and the SQL development tool Data Studio. Cluster O&M can be monitored through a console.

Figure 2 shows the logical architecture of GaussDB(DWS). For details, see Table 1.

Figure 2 Cluster logical architecture

Table 1 Cluster architecture description

  • Global Transaction Manager (GTM): generates and maintains globally unique information, such as the global transaction ID, transaction snapshot, and timestamp.
  • Workload Manager (WLM): controls the allocation of system resources to prevent service congestion and system crashes caused by excessive workload.
  • Coordinator (CN): receives access requests from applications and returns execution results to the client. The CN breaks down tasks and allocates task fragments to different DNs for parallel processing.
  • Datanode (DN): stores service data by column, by row, or in hybrid mode; executes data query tasks; and returns execution results to CNs.
  • Storage: functions as the server's local storage resources to store data permanently.
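
The division of work between a CN and the DNs can be illustrated with a small scatter-gather sketch. It is a conceptual model only, not GaussDB(DWS) code: the in-memory `DATANODES` dictionary, the function names, and the choice of an average as the query are all assumptions made for illustration. Each DN computes a partial result on its own slice, and the CN merges the partial results before answering the client.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three DNs, each holding one slice of a table.
DATANODES = {
    "dn1": [12, 7, 30],
    "dn2": [5, 18],
    "dn3": [21, 9, 4, 11],
}


def dn_partial_sum(rows):
    """Task fragment that a DN runs against its own local slice."""
    return sum(rows), len(rows)


def cn_average(datanodes):
    """CN role: dispatch one fragment per DN, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(datanodes)) as pool:
        partials = list(pool.map(dn_partial_sum, datanodes.values()))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


if __name__ == "__main__":
    print(cn_average(DATANODES))  # prints 13.0
```

No DN sees another DN's rows in this sketch, which mirrors the shared-nothing design: only the small partial results travel back to the CN.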

DNs in a cluster store data on disks. Figure 3 shows the logical objects on each DN and the relationships among them.

  • Database: A database manages various data objects. Databases are isolated from each other.
  • Data file segment: A data file that stores the data of only one table. A table containing more than 1 GB of data is stored in multiple data file segments.
  • Table: Each table belongs to only one database.
  • Block: The basic unit of database management. Its default size is 8 KB.
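
The relationship between table size, data file segments, and blocks can be worked through with a little arithmetic. The sketch below assumes that "1 GB" means exactly 1 GiB, that "8 KB" means 8192 bytes, and that a table always occupies at least one segment. The function name is hypothetical, and the calculation ignores any per-block or per-file overhead.

```python
BLOCK_SIZE = 8 * 1024       # default block size from the text: 8 KB
SEGMENT_LIMIT = 1024 ** 3   # assumption: "1 GB" means 1 GiB per data file segment


def storage_layout(table_bytes):
    """Return (segments, blocks) needed to hold table_bytes of data."""
    # Integer ceiling division avoids floating-point rounding on large sizes.
    blocks = (table_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE
    segments = max(1, (table_bytes + SEGMENT_LIMIT - 1) // SEGMENT_LIMIT)
    return segments, blocks


if __name__ == "__main__":
    print(SEGMENT_LIMIT // BLOCK_SIZE)          # blocks per full segment: 131072
    print(storage_layout(5 * 1024 ** 3 // 2))   # a 2.5 GiB table: (3, 327680)
```

Under these assumptions, a 2.5 GiB table spans three data file segments: two full ones and a third that is half full.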

Data can be distributed in REPLICATION, ROUNDROBIN, or HASH mode. You can set the distribution mode when creating a table. ROUNDROBIN applies only to foreign tables.
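
The three distribution modes can be compared with a minimal sketch of how each one might decide which DNs receive a row. The page does not spell out the placement rules, so the sketch relies on the usual meaning of the terms: HASH places a row on one DN chosen from its distribution key, REPLICATION keeps a full copy on every DN, and ROUNDROBIN deals rows to DNs in turn. The function names are hypothetical, and `zlib.crc32` is only a deterministic stand-in for whatever hash function the product uses.

```python
import zlib


def hash_placement(key, dn_count):
    """HASH: the row lands on exactly one DN, chosen from its distribution key."""
    # crc32 is an illustrative stand-in, not the product's hash function.
    return [zlib.crc32(str(key).encode("utf-8")) % dn_count]


def replication_placement(dn_count):
    """REPLICATION: every DN keeps a full copy of the row."""
    return list(range(dn_count))


def roundrobin_placement(row_number, dn_count):
    """ROUNDROBIN: rows are dealt to DNs in turn, regardless of content."""
    return [row_number % dn_count]


if __name__ == "__main__":
    dn_count = 3  # hypothetical cluster size
    for row_number, key in enumerate(["alice", "bob", "carol", "alice"]):
        print(key,
              hash_placement(key, dn_count),
              roundrobin_placement(row_number, dn_count),
              replication_placement(dn_count))
```

In this model, rows with the same key always map to the same DN under HASH, whereas ROUNDROBIN spreads rows evenly without looking at their content.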

Figure 3 Database logical architecture
