Help Center> Document Database Service> Best Practices> Proper Use of Data Definition Languages (DDL) Statements
Updated on 2024-01-18 GMT+08:00

Proper Use of Data Definition Languages (DDL) Statements

A data definition language (DDL) is a SQL used to create, modify, and delete the structure of databases and collections.

There are the following common DDLs in DDS:

  • createCollection: used to create a collection in a specified database.
  • drop: used to delete a specified collection.
  • createIndex: used to create one or more indexes in a collection to improve query efficiency.
  • dropIndex: used to delete a specified index from a collection.
  • shardCollection: Run the sh.shardCollection command to modify the sharding rule of a collection so that the collection is distributed in different shards.
  • collection.stats: used to view metadata of a specified collection, such as the number of documents and storage size.
  • db.stats: used to view metadata of a specified database, such as the number of collections and storage.

Precautions for Using DDLs

  • Creating indexes, deleting indexes, and deleting collections consume a large amount of I/O or compute resources. Perform these operations during off-peak hours to prevent affecting services.
  • Do not execute multiple DDLs at the same time. Otherwise, the execution may fail due to blocking. For example, do not create an index and delete a collection at the same time.
  • Create necessary indexes only to prevent storage waste caused by redundant indexes. For example, do not create an index based on the prefix field of a composite index.
  • When creating an index, you need to create a background index using db.<collection_name>.createIndex({ <field_name>: <index_type> }, { background: true }). Note that creating a background index does not block other services, but still consumes a large amount of I/O resources.
  • Before deleting an index, perform a performance test to ensure that the index to be deleted does not affect the query performance.
  • Do not create too many collections. If there are too many collections, a large amount of metadata is generated, extra resources are consumed, and it is too difficult to perform maintenance. For example, do not name a collection by date or create a new collection every day. Instead, store the date as a field in a given collection.
  • Before deleting a collection, confirm the collection name with caution because data cannot be directly restored after the collection is deleted. You are advised to back up important data first.
  • If you want to delete a collection, do not use the remove or delete command without filtering conditions. Instead, use db.<collection_name>.drop() to delete a collection. If a query condition is specified for the remove or delete command, the corresponding index must be created.
  • In a cluster instance, if a collection has a large storage space, you need to shard the collection. For details, see Sharding.
  • You can use db.stats() and db.<collection_name>.stats() to view the metadata (such as the number of documents and storage size) of databases and collections. The information is important for performance optimization and capacity planning.