How Do I Configure the Shard Key for a MongoDB Sharded Cluster?
MongoDB shards data at the collection level and uses the shard key to distribute a collection's documents across the shards.
You choose the shard key when you shard a collection. Each document contains the shard key, which is either a single indexed field or an indexed compound of fields. MongoDB partitions the data into chunks according to the shard key and distributes the chunks evenly among the shards. To divide data into chunks by shard key, MongoDB uses two sharding methods: range-based sharding and hashed sharding.
Shard Key Type | Description | Application Scenario |
---|---|---|
Range-based sharding | Range-based sharding divides data into contiguous ranges determined by the shard key values. It is the default sharding method if no other option is specified. It allows efficient queries where reads target documents within a contiguous range: the router (mongos) determines which chunks store the required data and forwards the request to the corresponding shards. | Recommended when the shard key has high cardinality and low frequency, and the shard key value does not change monotonically. |
Hashed sharding | Hashed sharding uses a hashed index to partition data across your sharded cluster and to create chunks. It provides a more even data distribution across the sharded cluster: documents are assigned to chunks by the hash of the shard key value, so they are spread pseudo-randomly across chunks and therefore across shards. | Recommended when the shard key values have high cardinality (a large number of distinct values) or change monotonically. |
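The difference between the two methods for a monotonically increasing shard key can be illustrated with a small simulation. This is a simplified sketch, not MongoDB's actual placement logic: it treats each shard as a single chunk, the shard count and range boundaries are invented, and MD5 modulo the shard count only stands in for MongoDB's internal hash function.

```javascript
const crypto = require("crypto");

// Hypothetical setup: 4 shards, each holding exactly one chunk.
const SHARD_COUNT = 4;

// Range-based: each shard owns one contiguous range of key values.
// boundaries[i] is the exclusive upper bound of shard i; the last shard is unbounded.
function rangeShard(key, boundaries) {
  const idx = boundaries.findIndex((upper) => key < upper);
  return idx === -1 ? boundaries.length : idx;
}

// Hashed: the shard is chosen from a hash of the key value.
// MD5 modulo the shard count is a stand-in, not MongoDB's internal hash.
function hashedShard(key, shardCount) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % shardCount;
}

// Count how many keys each shard receives under a given placement function.
function countPerShard(keys, pickShard) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[pickShard(key)] += 1;
  return counts;
}

// New inserts with ever-growing keys, such as an auto-incrementing ID.
const newKeys = Array.from({ length: 1000 }, (_, i) => 3000 + i);
const boundaries = [1000, 2000, 3000]; // the last shard owns every key >= 3000

const rangeCounts = countPerShard(newKeys, (k) => rangeShard(k, boundaries));
const hashedCounts = countPerShard(newKeys, (k) => hashedShard(k, SHARD_COUNT));

console.log("range :", rangeCounts);  // every write lands on the last shard
console.log("hashed:", hashedCounts); // writes are spread across the shards
```

With range-based placement, all new writes pile onto the shard that owns the highest range, which is why a monotonically changing key is a poor fit for it.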
Once you shard a collection, the shard key and the shard key values are immutable. To change the shard key value of a document, delete the document, modify the value, and then insert the document again.
NOTE: The shard key does not support array (multikey) indexes, text indexes, or geospatial indexes.
Range-based Sharding
- Run the following command to enable database sharding:
sh.enableSharding(database)
database is the name of the database for which sharding is to be enabled.
- Configure the collection's shard key.
sh.shardCollection(namespace, key)
- namespace is a string in the format <database>.<collection> that specifies the full namespace of the target collection.
- key is the index specification document for the shard key, for example { <field>: 1 } for a range-based shard key.
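The two range-based steps above can be wrapped in a small helper. This is a minimal sketch: the function name shardByRange and the example database, collection, and field names are hypothetical, and sh is assumed to be the sharding helper that mongosh provides when connected to a mongos.

```javascript
// Enable sharding on a database, then shard one collection
// with a range-based (ascending) shard key.
// `sh` is passed in so the helper can run in mongosh or be exercised with a stub.
function shardByRange(sh, database, collection, field) {
  sh.enableSharding(database);
  const namespace = `${database}.${collection}`;
  // { <field>: 1 } requests a ranged shard key on that field.
  return sh.shardCollection(namespace, { [field]: 1 });
}

// Example call inside mongosh (all names are placeholders):
// shardByRange(sh, "mydb", "orders", "customerId");
```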
Hashed Sharding
- Run the following command to enable database sharding:
sh.enableSharding(database)
database is the name of the database for which sharding is to be enabled.
- Set hashed shard keys.
sh.shardCollection("<database>.<collection>", { <shard key>: "hashed" }, false, { numInitialChunks: <number of preconfigured chunks> })
The value of numInitialChunks is calculated as follows: db.collection.stats().size / (10*1024*1024*1024), that is, the collection size in bytes divided by 10 GB.
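The numInitialChunks calculation can be sketched as a small function. It reads the formula as the collection size divided by 10 × 1024³ bytes; the function name, the rounding up, and the minimum of one chunk are assumptions added for the illustration.

```javascript
const TEN_GIB = 10 * 1024 * 1024 * 1024; // divisor from the formula, in bytes

// sizeBytes is the value reported by db.<collection>.stats().size.
function estimateInitialChunks(sizeBytes) {
  // Round up so that a partial 10 GiB slice still gets its own chunk,
  // and never return fewer than one chunk (both choices are assumptions).
  return Math.max(1, Math.ceil(sizeBytes / TEN_GIB));
}

console.log(estimateInitialChunks(35 * 1024 ** 3)); // 4
```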
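If the collection already contains data, first run the following command to create a hashed index on the shard key field:
db.<collection>.createIndex({ <shard key>: "hashed" })
Then run the following command to shard the collection using the hashed shard key:
sh.shardCollection("<database>.<collection>", { <shard key>: "hashed" })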
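For a collection that already holds data, the index-then-shard sequence can be sketched as follows. The helper name shardExistingByHash and the example names are hypothetical. db and sh are assumed to be the objects mongosh provides when connected to a mongos, with db switched to the target database and sharding already enabled for it.

```javascript
// Build the hashed index first, then shard the collection on the same key.
// `db` and `sh` are passed in so the helper can run in mongosh or with stubs.
function shardExistingByHash(db, sh, collection, field) {
  const key = { [field]: "hashed" };
  // Create the hashed index that will support the shard key.
  db.getCollection(collection).createIndex(key);
  // Shard the collection using the same hashed key specification.
  return sh.shardCollection(`${db.getName()}.${collection}`, key);
}

// Example call inside mongosh (all names are placeholders):
// shardExistingByHash(db, sh, "orders", "customerId");
```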