Help Center/ Distributed Database Middleware/ FAQs/ DDM Usage/ How Does DDM Perform Sharding?
Updated on 2024-07-30 GMT+08:00

How Does DDM Perform Sharding?

Distributed databases use shard-based storage, which removes capacity bottlenecks of single-node databases caused by a large amount of data in one unsharded table. Therefore, when creating a schema and logical tables, you need to consider your actual conditions and determine whether to create sharded tables and which sharding rule should be used.

Avoid cross-shard JOIN operations on the data that is stored in different shards, to ensure optimal performance and resource availability.

  • Whether logical tables are sharded

    DDM supports three types of logical tables: broadcast, sharded, and unsharded. You can select the most appropriate type based on your specific needs. For details, see Creating a Table.

    • Unsharded: One physical table is created and stores data only in the first shard.
    • Broadcast: The same physical table is created in each shard and stores the same data.
    • Sharded: The same physical table is created in each shard, and data is distributed to all shards based on a certain sharding rule.
  • Sharding rule of logical tables

    The selection of a sharding key is important for each logical table. Selecting the sharding key based on your needs is recommended. If an entity relationship exists between different logical tables, select the same field as the sharding key to avoid cross-shard JOIN.

Pay attention to the following suggestions before determining whether to use sharding:

  • Do not shard tables that each have less than 10 million data records.
  • Shard the tables that each have more than 10 million data records. Storing data in different sharded tables removes performance bottlenecks caused by a large amount of data in one table, while also improving concurrency capability. Select an appropriate sharding key in advance.
  • Avoid across-table JOIN operations during service reading and cross-shard operations for any individual transaction.
  • Include the sharding key into query conditions to avoid scanning table shards in all sharded tables.