Product Architecture

Updated at: Jul 15, 2020 GMT+08:00

DWS is a distributed, parallel database cluster built on a shared-nothing architecture, as shown in Figure 1.

Figure 1 DWS product architecture
  • Application layer

    Data loading tools, Extract-Transform-Load (ETL) tools, Business Intelligence (BI) tools, and data mining and analysis tools can be integrated with DWS through standard interfaces. DWS is compatible with PostgreSQL, and its SQL syntax has been adapted for compatibility with Oracle and Teradata, so applications can be migrated to DWS with few changes.

  • Interface

    Applications can connect to DWS through standard JDBC 4.0 and ODBC 3.5.

  • DWS (MPP cluster)

    The cluster consists of modules used for data management. Figure 2 shows these modules and Table 1 describes their functions.

  • Automatic data backup

    Cluster snapshots can be automatically backed up to Object Storage Service (OBS), an EB-level object storage service. Backups can run periodically during off-peak hours so that data can be recovered after a cluster exception occurs.

    A snapshot is a complete backup of DWS at a specified time point. It records all configuration data and service data of the cluster at the specified moment.

  • Tool chain

    The following tools are provided: General Data Service (GDS) for parallel data loading, Migration Tool for syntax migration, and Data Studio for SQL development. Cluster O&M can be monitored through a console.

Figure 2 shows the logical architecture of DWS. For details, see Table 1.

Figure 2 Cluster logical architecture

Table 1 Cluster architecture description

Name    | Description
GTM     | Global Transaction Manager: generates and maintains globally unique information, such as the global transaction ID, transaction snapshot, and timestamp.
WLM     | Workload Manager: controls the allocation of system resources to prevent service congestion and system crashes caused by excessive workload.
CN      | Coordinator: receives access requests from applications and returns execution results to the client. The CN breaks down tasks and allocates task fragments to different DNs for parallel processing.
DN      | Datanode: stores service data by row, by column, or in hybrid mode, executes data query tasks, and returns execution results to CNs.
Storage | Local storage resources of the server, used to store data persistently.
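
The division of work between the CN and the DNs described in Table 1 can be illustrated with a small simulation. This is only a sketch of the scatter-gather idea, not how DWS is implemented: the data in `DN_SHARDS` and the functions `dn_partial_sum` and `cn_average` are hypothetical names introduced for illustration, and threads stand in for separate DN processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: each inner list stands for the rows stored on one DN.
DN_SHARDS = [
    [120, 80, 45],
    [300, 15],
    [60, 60, 60, 20],
]


def dn_partial_sum(rows):
    """Work done on one DN: aggregate only the locally stored rows."""
    return sum(rows), len(rows)


def cn_average(shards):
    """Work done on the CN: send a task fragment to every DN, then merge."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(dn_partial_sum, shards))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count


result = cn_average(DN_SHARDS)
print(result)
```

Each simulated DN touches only its own rows, and the CN combines the partial sums and counts into the final average, which matches the result of averaging all rows in one place.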

DNs in a cluster store data on disks. Figure 3 logically describes the objects on each DN and the relationship among them.

  • Database: Each database manages its own data objects and is isolated from other databases.
  • Data file segment: A data file. Each segment stores data of only one table. A table containing more than 1 GB of data is stored in multiple data file segments.
  • Table: One table belongs to only one database.
  • Block: The basic unit of database management. Its default size is 8 KB.
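
The sizes above allow a rough estimate of how a table is laid out on a DN. The following sketch assumes that 1 GB means 2^30 bytes, that the 8 KB default block size is in use, and that per-block headers and free space are ignored, so real figures will differ. The function name `storage_layout` is hypothetical.

```python
BLOCK_SIZE = 8 * 1024        # default block size from the text: 8 KB
SEGMENT_LIMIT = 1024 ** 3    # segment size limit from the text: 1 GB (assumed 2**30 bytes)


def ceil_div(numerator, denominator):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-numerator // denominator)


def storage_layout(table_bytes):
    """Return (segments, blocks) needed to hold table_bytes of raw data."""
    segments = ceil_div(table_bytes, SEGMENT_LIMIT)
    blocks = ceil_div(table_bytes, BLOCK_SIZE)
    return segments, blocks


blocks_per_segment = SEGMENT_LIMIT // BLOCK_SIZE
segments, blocks = storage_layout(5 * SEGMENT_LIMIT // 2)  # a 2.5 GB table
print(blocks_per_segment, segments, blocks)
```

Under these assumptions, a full segment holds 131072 blocks, and a table with 2.5 GB of data occupies three segments.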

Data can be distributed in REPLICATION, ROUNDROBIN, or HASH mode. You set the distribution mode when creating a table. ROUNDROBIN applies only to foreign tables.
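
To make the three modes concrete, the sketch below simulates how rows might be placed on DNs. It is an illustration only: the function `distribute` is hypothetical, and `zlib.crc32` is a stand-in for whatever hash function DWS actually uses, which this page does not specify.

```python
import zlib
from itertools import cycle


def distribute(rows, num_dns, mode, key=None):
    """Return one list of rows per DN for the given distribution mode."""
    dns = [[] for _ in range(num_dns)]
    if mode == "REPLICATION":
        # Every DN keeps a full copy of the table.
        for dn in dns:
            dn.extend(rows)
    elif mode == "ROUNDROBIN":
        # Rows are dealt to the DNs in turn.
        for row, dn in zip(rows, cycle(dns)):
            dn.append(row)
    elif mode == "HASH":
        # The hash of the distribution column decides the target DN
        # (crc32 is an assumed stand-in hash function).
        for row in rows:
            digest = zlib.crc32(str(row[key]).encode("utf-8"))
            dns[digest % num_dns].append(row)
    else:
        raise ValueError(f"unknown distribution mode: {mode}")
    return dns


rows = [{"id": i, "city": c} for i, c in enumerate(["A", "B", "A", "C", "B", "A"])]
replicated = distribute(rows, 3, "REPLICATION")
round_robin = distribute(rows, 3, "ROUNDROBIN")
hashed = distribute(rows, 3, "HASH", key="city")
```

In this simulation, REPLICATION stores all six rows on every DN, ROUNDROBIN spreads them evenly, and HASH places rows with the same `city` value on the same DN, which is what allows joins and aggregations on the distribution column to run locally.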

Figure 3 Database logical architecture
