Help Center> Document Database Service> Best Practices> How Do I Troubleshoot High Memory Usage of DDS DB Instances?
Updated on 2023-09-26 GMT+08:00

How Do I Troubleshoot High Memory Usage of DDS DB Instances?

During DDS usage, if your memory usage is too high or close to 100%, service requests will be responded slowly and OOM risks will increase, affecting stable services.

This section describes how to troubleshoot the high memory usage of DDS DB instances by checking the number of database connections, analyzing slow requests, and checking cursors. After the analysis and optimization, query performance will be improved, indexes will be used more efficiently for all requests, and cursors will be used in a standard manner. If the memory usage increases due to business growth, you are advised to upgrade the specifications in a timely manner.

Checking the Number of Connections

  1. Check the percentage of connections. The total number of connections cannot exceed 80% of the maximum number of connections supported by the current instance. If there are too many connections, the memory and multi-thread context overhead increases, affecting the delay in request processing.
  2. Configure a connection pool. It is recommended that the maximum number of connections in a connection pool be 200. For details, see How Do I Query and Limit the Number of Connections?

Analyzing Slow Query Logs

In addition to reducing the number of connections, pay attention to the memory overhead of a single request to avoid full table scan and memory sorting in query statements.

  1. Query slow query logs generated for the current DB instance.
  2. Analyze slow query logs to locate the cause of memory usage increase. The following is an example of a slow request log. The log shows that a full table scan is performed for the request, 1,561,632 documents are scanned, and no index is used for query.
    {
            "op" : "query",
            "ns" : "taiyiDatabase.taiyiTables$10002e",
            "query" : {
                    "find" : "taiyiTables",
                    "filter" : {
                            "filed19" : NumberLong("852605039766")
                    },
                    "shardVersion" : [
                            Timestamp(1, 1048673),
                            ObjectId("5da43185267ad9c374a72fd5")
                    ],
                    "chunkId" : "10002e"
            },
            "keysExamined" : 0,
            "docsExamined" : 1561632,
            "cursorExhausted" : true,
            "numYield" : 12335,
            "locks" : {
                    "Global" : {
                            "acquireCount" : {
                                    "r" : NumberLong(24672)
                            }
                    },
                    "Database" : {
                            "acquireCount" : {
                                    "r" : NumberLong(12336)
                            }
                    },
                    "Collection" : {
                            "acquireCount" : {
                                    "r" : NumberLong(12336)
                            }
                    }
            },
            "nreturned" : 0,
            "responseLength" : 157,
            "protocol" : "op_command",
            "millis" : 44480,
            "planSummary" : "COLLSCAN",
            "execStats" : {
                  "stage" : "SHARDING_FILTER",                                                                                                                                       [3/1955]
                    "nReturned" : 0,
                    "executionTimeMillisEstimate" : 43701,
                    "works" : 1561634,
                    "advanced" : 0,
                    "needTime" : 1561633,
                    "needYield" : 0,
                    "saveState" : 12335,
                    "restoreState" : 12335,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "chunkSkips" : 0,
                    "inputStage" : {
                            "stage" : "COLLSCAN",
                            "filter" : {
                                    "filed19" : {
                                            "$eq" : NumberLong("852605039766")
                                    }
                            },
                            "nReturned" : 0,
                            "executionTimeMillisEstimate" : 43590,
                            "works" : 1561634,
                            "advanced" : 0,
                            "needTime" : 1561633,
                            "needYield" : 0,
                            "saveState" : 12335,
                            "restoreState" : 12335,
                            "isEOF" : 1,
                            "invalidates" : 0,
                            "direction" : "forward",
                            "docsExamined" : 1561632
                    }
            },
            "ts" : ISODate("2019-10-14T10:49:52.780Z"),
            "client" : "172.16.36.87",
            "appName" : "MongoDB Shell",
            "allUsers" : [
                    {
                            "user" : "__system",
                            "db" : "local"
                    }
            ],
           "user" : "__system@local"
    }
    The following stages can be causes for a slow query:
    • COLLSCAN involves a full collection (full table) scan.
      • When a request (such as query, update, and delete) requires a full table scan, a large amount of memory resources are occupied. If you find COLLSCAN in the slow query log, memory resources may be occupied.
      • If such requests are frequent, create indexes for the fields to be queried.
    • docsExamined involves a full collection (full table) scan.
      • You can view the value of docsExamined to check the number of documents scanned. A larger value indicates a higher memory usage.
    • IXSCAN and keysExamined scan indexes.
    • SORT and hasSortStage may involve sorting a large amount of data.

    If the value of the hasSortStage parameter is true, the query request involves sorting. If the sorting cannot be implemented through indexes, the query results are sorted, and sorting is a memory intensive operation. In this scenario, you need to create indexes for fields that are frequently sorted.

    If there is SORT, you can use indexing to improve sorting speed.

    An excessive number of indexes can affect the write and update performance. You are advised to create indexes based on the ESR principle to improve query efficiency.

    • "Equality" refers to an exact match on a single value. Place fields that require exact matches first in an index.
    • "Sort" determines the order for results. A sort condition follows equality matches.
    • "Range" filters scan fields. Place a range filter after a sort condition.

Checking Cursors

If cursors are used improperly, the memory usage may increase and the cursors may not be released for a long time. When a cursor is used on the client, release it in a timely manner (For details, see official description).
  1. Check whether a cursor is set to noTimeout. By default, the database automatically releases a cursor 10 minutes later. The following is an example of cursor timeout code provided by the Java driver:
    MongoCursor<Document> cursor = collection.find(query)
                        .maxTime(10, TimeUnit.MINUTES)
                        .iterator();
  2. Check whether the client actively releases a cursor after the cursor is used. The following is an example of releasing a cursor in the Java driver:
    cursor.close()
  3. If cursors are set to noTimeout and they cannot be released on the client, restart the instance node with high memory usage to release these cursors. You are advised to optimize the service code to avoid setting noTimeout cursors and release them after using them.