Load Imbalance of Cluster Instances
It is common that load is imbalanced between shard nodes in a cluster instance. If the shard key is incorrectly selected, no chunk is preset, and the load balancing speed between shard nodes is lower than the data insertion speed, load imbalance may occur.
This section describes how to fix load imbalance.
Fault Locating
- Connect to a database from the client.
- Run the following command to check the shard information:
sh.status()
mongos> sh.status() \--- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("60f9d67ad4876dd0fe01af84") } shards: { "_id" : "shard_1", "host" : "shard_1/172.16.51.249:8637,172.16.63.156:8637", "state" : 1 } { "_id" : "shard_2", "host" : "shard_2/172.16.12.98:8637,172.16.53.36:8637", "state" : 1 } active mongoses: "4.0.3" : 2 autosplit: Currently enabled: yes balancer: Currently enabled: yes Currently running: yes Collections with active migrations: test.coll started at Wed Jul 28 2021 11:40:41 GMT+0000 (UTC) Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 300 : Success databases: { "_id" : "test", "primary" : "shard_2", "partitioned" : true, "version" : { "uuid" : UUID("d612d134-a499-4428-ab21-b53e8f866f67"), "lastMod" : 1 } } test.coll shard key: { "_id" : "hashed" } unique: false balancing: true chunks: shard_1 20 shard_2 20
- databases lists databases for which you enable enableSharding.
- test.coll is the collection namespace. test indicates the name of the database where the collection is located, and coll indicates the name of the collection for which sharding is enabled.
- shard key is the shard key of the previous collection. _id: indicates that the shard is hashed based on _id. _id: -1 indicates that the shard is sharded based on the range of _id.
- chunks indicates the distribution of shards.
- Analyze the shard information based on the query result in 2.
- If no shard information is queried, the collections are not sharded.
Run the following command to enable sharding:
mongos> sh.enableSharding("<database>") mongos> use admin mongos> db.runCommand({shardcollection:"<database>.<collection>",key:{"keyname":<value> }})
- If an improper shard key is selected, the load may be imbalanced. For example, if a large number of requests are processed on a range of shards, the load on these shards is heavier than other shards, causing load imbalance.
You can redesign the shard key, for example, changing ranged sharding to hashed sharding.
mongos> db.runCommand({shardcollection:"<database>.<collection>",key:{"keyname":<value> }})
- If a sharding mode is determined, it cannot be changed easily. The sharding mode must be fully considered in the design phase.
- If a large amount of data is inserted and the data volume exceeds the load capacity of a single shard, shard imbalance occurs and the storage usage of the primary shard is too high.
You can run the following command to check the network connection of the server and ceck whether the amount of data transmitted by each network adapter reaches the upper limit.
sar -n DEV 1 //1 is the interval. Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil Average: lo 1926.94 1926.94 25573.92 25573.92 0.00 0.00 0.00 0.00 Average: A1-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: A1-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: NIC0 5.17 1.48 0.44 0.92 0.00 0.00 0.00 0.00 Average: NIC1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: A0-0 8173.06 92420.66 97102.22 133305.09 0.00 0.00 0.00 0.00 Average: A0-1 11431.37 9373.06 156950.45 494.40 0.00 0.00 0.00 0.00 Average: B3-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Average: B3-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
- rxkB/s is the number of KBs received per second.
- txkB/s is the number of KBs sent per second.
After the check is complete, press Ctrl+Z to exit.
If the network load is too high, analyze MQL statements, optimize the roadmap, reduce bandwidth consumption, and increase specifications to expand network throughput.
- Check whether there are sharded collections that do not carry ShardKey. In this case, requests are broadcast, which increases the bandwidth consumption.
- Control the number of concurrent threads on the client to reduce the network bandwidth traffic.
- If the problem persists, increase instance specifications in a timely manner. High-specification nodes can provide higher network throughput.
- If no shard information is queried, the collections are not sharded.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.