Solr Public Read/Write Optimization Suggestions
Scenario
Optimize the public read/write performance of Solr as an MRS cluster administrator.
Prerequisites
The Solr client has been installed.
Procedure
When Solr collection data is stored on a local disk or HDFS, you can optimize the configuration in the following ways:
- Optimize the Schema configuration.
  - Define uniqueKey as the long type.
    - The long type offers better query performance than the string type. If uniqueKey is currently defined as the string type, you can establish a long-to-string mapping at the service layer and use the long value as uniqueKey.
    - It is recommended that the uniqueKey field be set to required="true".
    - It is recommended that the uniqueKey field be set to docValues="true".
    - For better query performance, explicitly specify uniqueKey as the return field (the fl parameter) in queries.
  - Set docValues="true" for fields that are used for sorting and statistics. This effectively reduces memory usage.
    After configuring docValues="true", you do not need to configure stored="true".
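As a rough illustration of the schema recommendations above, the following Python sketch renders a uniqueKey field definition and builds a query string that returns only that field. The field name id, the field type name long, and the helper function names are assumptions for illustration; the type name must match a fieldType that is actually defined in your schema, and other field attributes are omitted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical field name; replace it with the uniqueKey of your schema.
UNIQUE_KEY = "id"


def unique_key_field_xml(name: str = UNIQUE_KEY) -> str:
    """Render a <field> element that follows the recommendations above."""
    field = ET.Element(
        "field",
        {
            "name": name,
            "type": "long",  # assumed fieldType name for a long field
            "required": "true",
            "docValues": "true",
        },
    )
    return ET.tostring(field, encoding="unicode")


def unique_key_query(q: str, rows: int = 10, name: str = UNIQUE_KEY) -> str:
    """Build a query string that explicitly returns only the uniqueKey field."""
    return urlencode({"q": q, "fl": name, "rows": rows})


field_xml = unique_key_field_xml()
query_string = unique_key_query("*:*")
print(field_xml)
print(query_string)
```

The query string can be appended to the select URL of a collection. Restricting fl to uniqueKey is what allows the docValues-based retrieval described later on this page to apply.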
- Optimize the indexing scheme.
You can modify solrconfig.xml to achieve the following optimizations.
| Configuration Item | Changed To |
| --- | --- |
| Improve indexing speed by increasing the number of indexing threads | <maxIndexingThreads>${solr.maxIndexingThreads:16}</maxIndexingThreads> |
| Increase the RAM buffer used for indexing documents | <ramBufferSizeMB>1024</ramBufferSizeMB> |
| Increase the merge factor of index segments | <mergeFactor>20</mergeFactor> |
| Lengthen the automatic hard commit interval (autoCommit) | <maxTime>${solr.autoCommit.maxTime:30000}</maxTime> |
| Lengthen the automatic soft commit interval (autoSoftCommit) | <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime> |
| Obtain the value of uniqueKey from docValues | <useDocValueGetField>true</useDocValueGetField> |
| Sort docIds so that the disk is read sequentially | <sortDocIdBeforeGetDoc>true</sortDocIdBeforeGetDoc> |
| Cache docIds to avoid reading the disk a second time | <useQuickFirstMatch>true</useQuickFirstMatch> |
useDocValueGetField is used in the following scenarios:
- The returned field (fl) is uniqueKey.
- uniqueKey is of a numeric type (long, int, float, or double).
- The uniqueKey field is set to docValues="true".
useQuickFirstMatch is used in the following scenario:
- Indexed data is not modified (for example, by deletion or merging) after it has been written.
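Several of the solrconfig.xml values above use the ${name:default} form: the value is taken from the named property if it is set, and the text after the colon is used otherwise. The Python sketch below only mimics that substitution rule so the behavior is easy to see; the resolve helper is hypothetical and is not Solr's implementation.

```python
import re
from typing import Mapping

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve(text: str, properties: Mapping[str, str]) -> str:
    """Replace each ${name:default} with the property value or the default."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value or default for property {name!r}")

    return _PLACEHOLDER.sub(substitute, text)


hard_commit = "<maxTime>${solr.autoCommit.maxTime:30000}</maxTime>"

# No property set: the default after the colon is used.
default_value = resolve(hard_commit, {})

# Property set (hypothetical override): it takes precedence over the default.
overridden = resolve(hard_commit, {"solr.autoCommit.maxTime": "60000"})

print(default_value)
print(overridden)
```

In practice this means the values in the table act as defaults that can still be overridden per instance without editing solrconfig.xml again.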
- Optimize the query scheme.
Caching plays an essential role in Solr, which uses the following three types of cache:
- Filter cache, which stores the results of filter queries (the fq parameter) and faceted searches.
- Document cache, which stores the stored fields of Lucene documents.
- Query result cache, which stores query results.
Solr also relies on the internal cache of Lucene, which cannot be controlled by users.
You can tune a Solr search instance by adjusting these three caches. Before adjusting the parameters, obtain the following information about the Solr instance:
- Number of documents in a collection: Log in to Manager. Choose Cluster > Name of the desired cluster > Service > Solr. On the Solr web UI, click any SolrServerAdmin(XX) to go to the Solr Admin page. Select a core in the target collection from Core Selector and click Query > Execute Query. The numFound value in the response is the number of documents.
- Number of filters: user expectation (for example, 200)
- Maximum number of documents returned by a query: user expectation (for example, 100)
- Number of distinct query and sort combinations: user expectation (for example, 500)
- Number of fields in a query: user expectation (for example, 3)
- Number of concurrent queries on the instance: user expectation (for example, 10)
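The document count from the first item can also be read from a query response outside the Admin page. The following Python sketch builds a match-all query that returns no rows and extracts numFound from a JSON response body. The base URL, the collection name, and the sample response are assumptions for illustration; the sketch does not send the request, because a secured cluster needs authentication that is not shown here.

```python
import json
from urllib.parse import urlencode


def count_query_url(base_url: str, collection: str) -> str:
    """Build a select URL that matches every document but returns no rows."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url}/{collection}/select?{params}"


def num_found(response_body: str) -> int:
    """Extract numFound from a Solr JSON query response."""
    return int(json.loads(response_body)["response"]["numFound"])


# Hypothetical response body, shaped like a Solr JSON query response.
sample_body = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 1250000, "start": 0, "docs": []}}'
)

# Hypothetical host and collection name.
url = count_query_url("https://solr.example.com/solr", "my_collection")
document_count = num_found(sample_body)

print(url)
print(document_count)
```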
You can configure the cache by modifying the solrconfig.xml file. (For details about how to modify the file, see 3.)
| Cache Type | Modification Solution |
| --- | --- |
| Filter cache | <filterCache class="solr.FastLRUCache" size="200" initialSize="200" autowarmCount="50"/> Set size and initialSize to the expected number of filters; each entry caches the IDs of the documents that match one filter. Set autowarmCount to one quarter of initialSize. Size this cache based on site requirements, because a large value occupies a large amount of memory. |
| Query result cache | <queryResultCache class="solr.FastLRUCache" size="3000" initialSize="3000" autowarmCount="750"/> Set size and initialSize based on the following formula: Number of distinct query and sort combinations x Number of fields queried each time x 2. Set autowarmCount to one quarter of initialSize. |
| Document cache | <documentCache class="solr.FastLRUCache" size="1000" initialSize="1000"/> Set size and initialSize based on the following formula: Maximum number of documents returned by one query x Number of concurrent queries on the instance. |
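The sizing rules above can be checked with a short calculation. The Python sketch below applies them to the example workload figures listed earlier (200 filters, 100 documents per query, 500 query and sort combinations, 3 fields, 10 concurrent queries) and renders the resulting cache elements. The Workload class and the function names are hypothetical, and taking the filter cache size directly from the number of filters is an assumption inferred from the example values.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Expected workload figures (defaults are the example values above)."""
    filters: int = 200
    max_docs_per_query: int = 100
    distinct_queries_and_sorts: int = 500
    fields_per_query: int = 3
    concurrent_queries: int = 10


def cache_sizes(w: Workload) -> dict:
    """Apply the sizing formulas; autowarmCount is one quarter of the size."""
    filter_size = w.filters  # assumption: one cache entry per filter
    query_size = w.distinct_queries_and_sorts * w.fields_per_query * 2
    document_size = w.max_docs_per_query * w.concurrent_queries
    return {
        "filterCache": {"size": filter_size, "autowarmCount": filter_size // 4},
        "queryResultCache": {"size": query_size, "autowarmCount": query_size // 4},
        "documentCache": {"size": document_size},
    }


def to_xml(name: str, settings: dict) -> str:
    """Render one cache element in the form used in solrconfig.xml."""
    size = settings["size"]
    attrs = f'class="solr.FastLRUCache" size="{size}" initialSize="{size}"'
    if "autowarmCount" in settings:
        autowarm = settings["autowarmCount"]
        attrs += f' autowarmCount="{autowarm}"'
    return f"<{name} {attrs}/>"


sizes = cache_sizes(Workload())
for cache_name, cache_settings in sizes.items():
    print(to_xml(cache_name, cache_settings))
```

With the default figures, the computed sizes match the example elements in the table, so the same functions can be rerun with site-specific numbers before editing solrconfig.xml.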