When the data volume is large, sorting at query time increases the response time of queries and consumes system resources. If the stored data is pre-sorted by a field, queries that sort on that field return results much faster, especially on large volumes of data. This topic describes how to use the pre-sorting feature of LindormSearch.

Procedure

  1. Modify MergePolicy in the solrconfig.xml file. For more information, see Customizing merge policies.
  2. Set the segmentTerminateEarly parameter to true when you query data. The following example shows how to configure MergePolicy as described in step 1:
    <mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
      <str name="sort">timestamp desc</str>
      <str name="wrapped.prefix">inner</str>
      <str name="inner.class">org.apache.solr.index.TieredMergePolicyFactory</str>
      <int name="inner.maxMergeAtOnce">10</int>
      <int name="inner.segmentsPerTier">10</int>
    </mergePolicyFactory> 

    After this configuration takes effect, the stored data is sorted in descending order by the timestamp field. To query the data with early termination enabled, run the following command:

    curl "http://localhost:8983/solr/testcollection/query?q=*:*&sort=timestamp+desc&rows=10&segmentTerminateEarly=true" 

    When the segmentTerminateEarly parameter is set to true, the response time of queries is significantly reduced, especially for queries on terabytes of data.
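
    The same request can also be assembled in a client program. The following Python snippet is a minimal sketch, not an official client: the function name build_presorted_query_url is hypothetical, and the host, port, and collection name are simply taken from the curl example above. It only builds the URL and does not send the request.

```python
from urllib.parse import urlencode


def build_presorted_query_url(base_url, collection, sort, rows=10):
    """Build a query URL that asks each segment to terminate early.

    Hypothetical helper for illustration; it only formats the URL.
    """
    params = {
        "q": "*:*",
        "sort": sort,  # must equal the sort in the MergePolicy configuration
        "rows": rows,
        "segmentTerminateEarly": "true",
    }
    # Keep '*' and ':' unescaped for readability; spaces become '+'.
    return f"{base_url}/{collection}/query?{urlencode(params, safe='*:')}"


url = build_presorted_query_url(
    "http://localhost:8983/solr", "testcollection", "timestamp desc"
)
print(url)
```

    The printed URL can be passed to curl or to any HTTP client.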

    Note
    • The value of the sort parameter in the query must be the same as the value of the sort parameter in the MergePolicy configuration. Otherwise, the pre-sorting feature does not take effect.
    • You must set the segmentTerminateEarly parameter to true in the query. Otherwise, all the data is still sorted at query time.
    • If pre-sorting is used, the value returned for the numFound parameter is inaccurate, because the query stops scanning a segment before all matching documents are counted.
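
    Because a mismatch between the two sort values silently disables pre-sorting, it can help to check them before a query is sent. The following Python snippet is a sketch under stated assumptions: the helper names are hypothetical, the XML is the sample configuration from this topic rather than a full solrconfig.xml file, and the comparison only ignores differences in whitespace.

```python
import xml.etree.ElementTree as ET

# Sample MergePolicy configuration copied from this topic.
MERGE_POLICY_XML = """
<mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
  <str name="sort">timestamp desc</str>
  <str name="wrapped.prefix">inner</str>
  <str name="inner.class">org.apache.solr.index.TieredMergePolicyFactory</str>
  <int name="inner.maxMergeAtOnce">10</int>
  <int name="inner.segmentsPerTier">10</int>
</mergePolicyFactory>
"""


def merge_policy_sort(xml_text):
    """Return the configured sort string, or None if it is missing."""
    root = ET.fromstring(xml_text.strip())
    node = root.find("./str[@name='sort']")
    if node is None or not node.text:
        return None
    return node.text.strip()


def sorts_match(query_sort, xml_text):
    """Compare the query sort with the configured sort, ignoring extra spaces."""
    configured = merge_policy_sort(xml_text)
    if configured is None:
        return False
    return " ".join(query_sort.split()) == " ".join(configured.split())


print(sorts_match("timestamp desc", MERGE_POLICY_XML))
print(sorts_match("timestamp asc", MERGE_POLICY_XML))
```

    The first call compares identical sort values and the second uses a different sort direction, so only the first query would benefit from pre-sorting.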