Updated on 2023-03-30 GMT+08:00

Data Distribution in a Distributed System

Background

DWS uses a two-layer data layout mechanism to achieve high-performance querying and import of PB-level data. At the first layer, users specify a data distribution policy (hash distribution or replication distribution) when creating a table. When data is written to the system, the system uses that policy to determine the node on which the data is stored. At the second layer, each node partitions the data it stores according to partitioning rules.
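The two layers can be illustrated with a minimal Python sketch. It is not DWS code: the cluster size, the column names (`customer_id`, `sale_date`), the range boundaries, and the use of CRC32 as the hash function are all assumptions chosen for illustration, and range partitioning is used only as one example of a partitioning rule.

```python
import zlib
from bisect import bisect_right

# Hypothetical cluster size and range boundaries, chosen for illustration only.
NUM_NODES = 3
PARTITION_BOUNDS = [20230101, 20230401, 20230701]


def node_for_hash(key: str, num_nodes: int = NUM_NODES) -> int:
    """Layer 1 (hash distribution): map a distribution key to one node.

    CRC32 is a stand-in; the hash function DWS actually uses is not
    described in this section.
    """
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition_for(value: int, bounds=PARTITION_BOUNDS) -> int:
    """Layer 2: pick a partition inside a node.

    Partition i holds values below bounds[i]; the last partition holds
    every value at or above the final bound.
    """
    return bisect_right(bounds, value)


def place_row(row: dict, policy: str = "hash") -> list:
    """Return the (node, partition) pairs where a row would be stored."""
    if policy == "hash":
        nodes = [node_for_hash(row["customer_id"])]
    elif policy == "replication":
        # Replication distribution: every node keeps a full copy.
        nodes = list(range(NUM_NODES))
    else:
        raise ValueError(f"unknown distribution policy: {policy}")
    partition = partition_for(row["sale_date"])
    return [(node, partition) for node in nodes]


row = {"customer_id": "C-1001", "sale_date": 20230215}
hash_placement = place_row(row, "hash")
replicated_placement = place_row(row, "replication")
```

Under these assumptions, a hash-distributed row lands on exactly one node, while a replicated row is stored on every node. In both cases the partition inside the node depends only on the partitioning column, not on the distribution policy.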