Updated on 2025-11-14 GMT+08:00

Working with Indexes

Scenarios

In database management, indexes play a crucial role in enhancing query performance. You can create both single-field indexes and composite indexes to optimize queries involving multiple fields.

Periodic analysis and optimization of index performance are essential. The database management system usually provides tools and commands to evaluate index usage and identify performance issues. It's important to note that maintaining indexes comes with a significant cost. Operations such as insertion, update, and deletion can trigger index updates, which may affect performance. It is essential to choose an index strategy that fits the specific application scenario, balancing query performance and maintenance costs.

Key to Creating Indexes

  • Do not use aggregation or query statements without filter criteria. Ensure that there are indexes on the fields used in the filter criteria (for example, the fields specified in the $match, find, update, and remove statements in aggregations). This prevents full table scans and ensures more efficient query performance.
  • Use high-selectivity filter criteria to narrow down the query scope and ensure that indexes exist on high-selectivity fields. It is recommended to create indexes only on fields where the number of duplicate values is less than 1% of the total number of documents in the collection. For example, if your collection contains 100,000 documents, create indexes only on fields where the number of duplicate values is no more than 1,000.
  • In composite indexes, it is recommended to place high-selectivity fields that can narrow down the query scope at the beginning.
  • In scenarios where a large amount of data needs to be sorted, you should create indexes on the sort field based on the exact match filter criteria. For example, if the filter condition is {a:xx, b:xx, c:xx, f:xx } and the sort condition is {e:1}, you should create composite index {a:1,b:1,c:1,f:1,e:1} to avoid in-memory sorts.
  • For statements that include equality, range, and sort, the optimal composite index follows the ESR principle: equality fields are placed at the beginning, sort fields are placed in the middle, and range fields are placed at the end.
  • If a query includes a collation, the collation option must be specified when creating an index for the query field.
  • Before deleting an index, ensure that no index is being created on the secondary node (you can use the currentOp command to check the ongoing operations). If an index is being created on the secondary node, do not delete the index immediately. Otherwise, the secondary node may enter a lock wait state, leading to unexpected issues.

Impact of Index Creation

After an index creation request is submitted, the existing documents in the current data are scanned to establish the mapping between index fields and their disk locations. This process consumes I/O and compute resources. Therefore, consider the following:

  • If an index to be created involves a large amount of data, create it during off-peak hours to minimize impact on production services.
  • If multiple indexes need to be created for a given collection, use the createIndexes method instead of createIndex. This allows multiple indexes to be created with a single scan, reducing the compute resources and I/O overhead associated with index creation.
  • Plan the indexes required for core service tables in advance.

Impact of Indexes on Writing Data

While indexes can enhance query performance by reducing the need to scan every document in a collection, they come with certain trade-offs. Each index on a collection requires the database to update both the collection and the relevant index fields whenever a document is inserted, updated, or deleted. For example, if a collection has nine indexes, the database must execute ten write operations before confirming the operation to the client. As a result, each additional index increases write latency, I/O operations, and overall storage usage. For optimal performance, review and minimize the number of indexes in your collections, and add only those that are necessary to enhance performance for common queries. You are advised to keep the number of indexes per collection to 10 or fewer.