How Do I Delete Orphaned Documents in MongoDB Sharded Clusters?
What Is an Orphaned Document?
In a sharded cluster, orphaned documents are documents on a shard that also exist in chunks on other shards. They are left behind by failed chunk migrations or by migration cleanup that did not complete because of an abnormal shutdown.
Checking Orphaned Documents
- Connect to the mongos node as user root or a privileged account, replace dbName and collName in the following command with the names of the database and collection to be checked, and run the command:
db.getSiblingDB("dbName").collName.find().readPref("secondary").readConcern("local").explain("executionStats")
- Check chunkSkips in the SHARDING_FILTER stage in the command output. The value of chunkSkips indicates the number of orphaned documents in the checked collection on the current shard. If the value is greater than 0, the shard contains orphaned documents.
- The query sets its read preference with readPref. When the read preference is secondary, as in the command above, the statement is executed on the secondary nodes of the instance.
- The method of checking orphaned documents in a DDS instance is the same as that in a self-managed MongoDB database.
- Run the preceding statement once for each collection to be checked.
- The command for querying orphaned documents scans all documents of the collection on all shard nodes. If the collection contains a large amount of data, the query takes a long time and puts query pressure on the DB instance. For this reason, running the query command is not recommended.
- Delete orphaned documents during off-peak hours (see "Procedure" below). To do so, you must connect to the shard nodes of the cluster instance. By default, connections to the shard nodes of a DDS cluster instance are disabled. If you cannot connect to the shard nodes, enable Shard IP Address first, or submit a service ticket.
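The chunkSkips check described above can be done by eye, but on a cluster with several shards it may be easier to total the values programmatically. The following is a minimal sketch, not part of the DDS tooling: countChunkSkips is a hypothetical helper name, and it assumes that chunkSkips appears as a plain number on each SHARDING_FILTER stage of the explain result.

```javascript
// Hypothetical helper: walks an explain("executionStats") result and sums
// chunkSkips over every SHARDING_FILTER stage it finds.
// Assumption: chunkSkips is reported as a plain JavaScript number.
function countChunkSkips(node) {
  if (node === null || typeof node !== "object") {
    return 0;
  }
  var total = 0;
  if (node.stage === "SHARDING_FILTER" && typeof node.chunkSkips === "number") {
    total += node.chunkSkips;
  }
  // Recurse into nested stages, per-shard arrays, and any other sub-documents.
  Object.keys(node).forEach(function (key) {
    total += countChunkSkips(node[key]);
  });
  return total;
}

// Possible use in the mongo shell (dbName and collName are placeholders):
// var result = db.getSiblingDB("dbName").collName.find()
//   .readPref("secondary").readConcern("local").explain("executionStats");
// print(countChunkSkips(result));
```

A total greater than 0 suggests that at least one shard holds orphaned documents for the collection. The helper walks the whole result instead of following a fixed path, because the nesting of the explain output can differ between server versions.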
Migration Impact
During cluster migration, DRS extracts full data from every shard. A normal document and its orphaned copy are on different shards, and DRS migrates both. If the conflict policy of DRS for MongoDB migration is Ignore, the copy that reaches the destination first is stored and the later copy is ignored, which can result in data inconsistency.
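The effect of the Ignore policy can be illustrated with a toy model. The sketch below is not DRS code: migrateWithIgnore, shardA, and shardB are hypothetical names, and the model only assumes what is stated above, namely that the first copy of a document to arrive at the destination is kept.

```javascript
// Toy model only: copies documents to a destination keyed by _id and skips
// any _id that is already present (first copy wins).
function migrateWithIgnore(shards) {
  var destination = {};
  shards.forEach(function (docs) {
    docs.forEach(function (doc) {
      if (!Object.prototype.hasOwnProperty.call(destination, doc._id)) {
        destination[doc._id] = doc;
      }
    });
  });
  return destination;
}

// Assumed scenario: shardA holds an orphaned copy that may be stale,
// shardB holds the current copy of the same document.
var shardA = [{ _id: 1, qty: 5 }];
var shardB = [{ _id: 1, qty: 9 }];

var migrated = migrateWithIgnore([shardA, shardB]);
// migrated[1].qty is 5 here, because shardA was processed first.
```

If the shards are processed in the opposite order, the current copy is kept instead. The outcome depends on arrival order, which is why cleaning up orphaned documents before migration is safer.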
Procedure
- Download cleanupOrphaned.js.
- Edit the cleanupOrphaned.js script file and replace test with the name of the database whose orphaned documents are to be cleared.
- Run the following command to clear the orphaned documents of all collections in the specified database on the shard node:
mongo --host ShardIP --port Primaryport --authenticationDatabase database -u username -p password cleanupOrphaned.js
- ShardIP: indicates the IP address of the shard node.
- Primaryport: indicates the service port of the primary shard node.
- database: indicates the name of the database used for authentication.
- username: indicates the username for logging in to the database.
- password: indicates the password for logging in to the database.
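The contents of the downloadable cleanupOrphaned.js are not reproduced here. For orientation only, the sketch below shows one way a script of this kind could drive MongoDB's cleanupOrphaned admin command on a shard primary. cleanupNamespace and runAdminCommand are hypothetical names. The loop assumes a server version on which the command returns stoppedAtKey while more ranges remain. On versions that return no stoppedAtKey, the loop ends after one pass.

```javascript
// Hypothetical helper, NOT the contents of the downloadable script.
// runAdminCommand: function that sends one command document to the admin
//                  database of the shard primary and returns the reply.
// namespace:       "<database>.<collection>" of a sharded collection.
function cleanupNamespace(runAdminCommand, namespace) {
  var nextKey = {};
  var passes = 0;
  while (nextKey !== null && nextKey !== undefined) {
    var reply = runAdminCommand({
      cleanupOrphaned: namespace,
      startingFromKey: nextKey
    });
    passes += 1;
    // Stop on failure or timeout instead of retrying forever.
    if (!reply || reply.ok !== 1) {
      return { ok: false, passes: passes, reply: reply };
    }
    // Absent when no further ranges are left to clean.
    nextKey = reply.stoppedAtKey;
  }
  return { ok: true, passes: passes };
}

// Possible use in the mongo shell, connected to the shard primary
// ("test" is the placeholder database name from the procedure):
// var dbName = "test";
// db.getSiblingDB(dbName).getCollectionNames().forEach(function (coll) {
//   printjson(cleanupNamespace(
//     function (cmd) { return db.adminCommand(cmd); },
//     dbName + "." + coll
//   ));
// });
```

Passing the command runner as a parameter keeps the loop independent of the shell's global db object, so the control flow can be checked against a stub before it is run on a real shard.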