Updated on 2024-11-29 GMT+08:00

Configuration File solrconfig.xml in the Solr Config Set

The solrconfig.xml file configures how Solr builds indexes, processes queries, and loads components. This section describes common configurations. For details about how to optimize parameters, see Solr Performance Tuning.

indexConfig

  • <writeLockTimeout>1000</writeLockTimeout>

    writeLockTimeout indicates the maximum time, in milliseconds, that an IndexWriter instance waits to obtain the write lock. If the instance does not obtain the write lock within this period, the index write operation throws an exception.

  • <maxIndexingThreads>8</maxIndexingThreads>

    maxIndexingThreads indicates the maximum number of threads that can index documents at the same time. The default value is 8.

  • <ramBufferSizeMB>100</ramBufferSizeMB>

    ramBufferSizeMB indicates the size of the memory buffer used for indexing, in MB. When the buffered updates exceed this size, they are flushed to disk. The default value is 100.

  • <maxBufferedDocs>1000</maxBufferedDocs>

    maxBufferedDocs indicates the maximum number of documents buffered in memory before they are written to disk. When the number of buffered documents reaches this value, an index flush is triggered. If ramBufferSizeMB is also set, whichever limit is reached first triggers the flush.

  • <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">

    <int name="maxMergeAtOnce">10</int>

    <int name="segmentsPerTier">10</int>

    </mergePolicy>

    mergePolicy indicates the index merge policy, which selects the segments to be merged. maxMergeAtOnce indicates the maximum number of segments that can be merged at a time. segmentsPerTier indicates the number of segments allowed in each tier before a merge is triggered.

  • <mergeFactor>10</mergeFactor>

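    mergeFactor controls how often segment files are merged, that is, roughly how many segments of similar size can accumulate before they are merged into one. A smaller value results in fewer segments, which speeds up searches but slows down indexing. A larger value speeds up indexing but slows down searches. With TieredMergePolicy, mergeFactor sets both maxMergeAtOnce and segmentsPerTier.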

  • <lockType>${solr.lock.type:native}</lockType>
    • single is intended for a read-only index, or for cases where no other process modifies the index data.
    • native indicates the NativeFSLockFactory implementation in Lucene, which uses the file locking provided by the operating system.
    • simple indicates the SimpleFSLockFactory implementation in Lucene, which locks the index by creating the write.lock file on the hard disk.
  • <unlockOnStartup>false</unlockOnStartup>

    If this parameter is set to true, any locks held by IndexWriter and commit operations are released when Solr starts. This bypasses the lock mechanism of Lucene, so exercise caution when using this parameter. If lockType is set to single, the Lucene lock mechanism is not affected regardless of whether this parameter is set to true or false.
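
As a rough illustration, the settings described above might be assembled into a single <indexConfig> element as follows. This is only a sketch: the values are the ones shown in this section rather than tuning recommendations, mergeFactor is left out because the mergePolicy block already sets the same limits, and which elements are accepted depends on the Solr version deployed (an assumption to verify against your cluster).

```xml
<indexConfig>
  <!-- Wait up to 1000 ms for the IndexWriter write lock -->
  <writeLockTimeout>1000</writeLockTimeout>
  <!-- At most 8 threads index documents at the same time -->
  <maxIndexingThreads>8</maxIndexingThreads>
  <!-- Flush when buffered updates reach 100 MB or 1000 documents -->
  <ramBufferSizeMB>100</ramBufferSizeMB>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <!-- Segment merge policy and its limits -->
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
  </mergePolicy>
  <!-- Uses the solr.lock.type property if it is set, otherwise "native" -->
  <lockType>${solr.lock.type:native}</lockType>
  <!-- Keep existing locks when Solr starts -->
  <unlockOnStartup>false</unlockOnStartup>
</indexConfig>
```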

Update

  • <updateHandler class="solr.DirectUpdateHandler2">

    updateHandler indicates the class that handles index updates. DirectUpdateHandler2 is a high-performance update handler that supports soft commits.

  • <autoCommit>

    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>

    <openSearcher>true</openSearcher>

    </autoCommit>

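    maxTime indicates the maximum time, in milliseconds, that index data can remain in memory before it is automatically committed to disk (hard commit). If openSearcher is set to true, a new searcher is opened after the commit so that the committed changes become visible to queries.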

  • <autoSoftCommit>

    <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime>

    </autoSoftCommit>

    maxTime indicates the maximum time, in milliseconds, before newly indexed documents become searchable (soft commit). A soft commit makes index changes that are still in memory visible to queries without synchronizing the index to the hard disk. If this parameter is set to -1, automatic soft commits are disabled, and you must set commit to true in the indexing request so that the submitted documents can be searched. When indexing in batches, set maxTime to -1 and set openSearcher of autoCommit to true. Generally, do not query the index while a batch indexing job is running.
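
Put together, the update settings above might look like the following sketch. It only reuses the values shown in this section; the comments give the intervals in seconds for readability.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: write changes to disk at most 15 s after an update -->
  <autoCommit>
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
    <openSearcher>true</openSearcher>
  </autoCommit>
  <!-- Soft commit: make changes searchable within about 1 s -->
  <autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime>
  </autoSoftCommit>
</updateHandler>
```

The ${name:default} form reads a system property and falls back to the value after the colon. Assuming you can pass JVM options to the Solr process, starting it with -Dsolr.autoSoftCommit.maxTime=-1 would therefore disable automatic soft commits for batch indexing without editing the file. How such options are supplied depends on the deployment.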

Query

The <query> element contains the configuration items related to index queries.

  • filterCache: stores unordered sets of the IDs of documents that match a filter. Caching filters improves query performance, because repeated requests that use the same filter can obtain the result set quickly. A common scenario is to cache a filter and then run subsequent refinement queries that use the filter to limit the number of documents to search.
  • queryResultCache: caches query results as ordered sets of document IDs.
  • documentCache: caches Lucene documents, keyed by the internal Lucene document ID (not to be confused with the unique Solr ID). The internal Lucene document ID can change due to index operations, so the data in this cache cannot be warmed up.
  • Named caches: indicates user-defined caches, which can be used by custom Solr plug-ins.

    Each cache declaration accepts a maximum of four attributes:

    • class: indicates the Java class that implements the cache.
    • size: indicates the maximum number of entries.
    • initialSize: indicates the initial size of the cache.
    • autowarmCount: indicates the number of entries copied from the old cache to warm up the new cache. A larger value produces more cache hits right after the new cache is created, but the warm-up takes longer.

    Note: For all cache types, balance memory usage, CPU usage, and disk access when setting cache parameters. For details, see Solr Performance Tuning.
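
For reference, cache declarations inside <query> might look like the following sketch. The class names solr.FastLRUCache and solr.LRUCache are cache implementations shipped with stock Solr releases and are assumed to be available here (newer Solr releases use other implementations). The sizes are illustrative only, and myUserCache is a hypothetical name for a named cache. Stock Solr spells the warm-up attribute autowarmCount.

```xml
<query>
  <!-- Unordered sets of document IDs that match filters -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Ordered sets of document IDs for query results -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Lucene documents; not warmed up because internal IDs can change -->
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Named cache for a custom plug-in (hypothetical name) -->
  <cache name="myUserCache" class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="0"/>
</query>
```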