OpenSearch: Demo code for implementing scroll queries

Last Updated: Aug 31, 2023

Configure environment variables

Configure the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables.

Important
  • The AccessKey pair of an Alibaba Cloud account can be used to access all API operations. We recommend that you use a Resource Access Management (RAM) user to call API operations or perform routine O&M. For information about how to use a RAM user, see Create a RAM user.

  • For information about how to create an AccessKey pair, see Create an AccessKey pair.

  • If you use the AccessKey pair of a RAM user, make sure that the required permissions are granted to the AliyunServiceRoleForOpenSearch role by using your Alibaba Cloud account. For more information, see AliyunServiceRoleForOpenSearch and Access authorization rules.

  • We recommend that you do not include your AccessKey pair in materials that are easily accessible to others, such as the project code. Otherwise, your AccessKey pair may be leaked and the resources in your account may be compromised.

  • Linux and macOS

    Run the following commands. Replace <access_key_id> and <access_key_secret> with the AccessKey ID and AccessKey secret of the RAM user that you use.

    export ALIBABA_CLOUD_ACCESS_KEY_ID=<access_key_id>
    export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<access_key_secret>

  • Windows

    1. Create an environment variable file, add the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables to the file, and then set the environment variables to your AccessKey ID and AccessKey secret.

    2. Restart Windows for the AccessKey pair to take effect.
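Before you run the demo, it can help to confirm that both environment variables are visible to the JVM. The following is a minimal sketch, not part of OpenSearch SDK for Java: the AccessKeyEnv class and its require method are hypothetical names introduced here for illustration. The sketch only checks that the two variables are present and does not print their values.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of OpenSearch SDK for Java.
final class AccessKeyEnv {
    static final String ID_VAR = "ALIBABA_CLOUD_ACCESS_KEY_ID";
    static final String SECRET_VAR = "ALIBABA_CLOUD_ACCESS_KEY_SECRET";

    // Returns the value of the named variable, or throws if it is missing or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            require(env, ID_VAR);
            require(env, SECRET_VAR);
            // Report only that the pair exists; never print the secret itself.
            System.out.println("AccessKey pair found in the environment.");
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment as a Map keeps the check easy to exercise without changing real environment variables.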

Demo code for implementing scroll queries by using OpenSearch SDK for Java V4.0.0

Function and applicable scenarios

A regular query can return at most 5,000 documents. If more than 5,000 documents match your query, use scroll queries to obtain all matched results.

Usage notes

  • Scroll queries are used to obtain all matched results and do not support the aggregate, distinct, or rank clause.

  • The start parameter that you specify in the config clause does not take effect for scroll queries. The default value 0 is always used, so you cannot jump to a specific result page. Each scroll query can return at most 500 documents.

  • When you run the first scroll query, a scroll ID is returned. To obtain document data, use this scroll ID to run the scroll query again.

Note: Determine whether an error has occurred based on the error code and message instead of the status information. For more information about errors, see Error codes.
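The 500-document limit per result set determines how many scroll requests are needed to read a full result. The following sketch works through that arithmetic. The ScrollPlan class and the dataRequests method are hypothetical names used only for illustration, and the count covers only the requests that return documents. It assumes that the initial request that yields the first scroll ID and any final empty page are counted separately.

```java
// Hypothetical helper for illustration; not part of OpenSearch SDK for Java.
final class ScrollPlan {
    // Upper limit on documents per result set, taken from the usage notes above.
    static final int MAX_HITS = 500;

    // Number of scroll requests that return documents: ceil(totalDocs / hits).
    static long dataRequests(long totalDocs, int hits) {
        if (hits < 1 || hits > MAX_HITS) {
            throw new IllegalArgumentException("hits must be between 1 and " + MAX_HITS);
        }
        if (totalDocs < 0) {
            throw new IllegalArgumentException("totalDocs must not be negative");
        }
        return (totalDocs + hits - 1) / hits;
    }

    public static void main(String[] args) {
        System.out.println(dataRequests(25, 5));       // prints 5
        System.out.println(dataRequests(12000, 500));  // prints 24
    }
}
```

With 25 matched documents and 5 documents per page, five requests return data, which is consistent with the demo below, where the sixth page is empty.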

Demo code provided by OpenSearch SDK for Java

package com.aliyun.opensearch;

import com.aliyun.opensearch.OpenSearchClient;
import com.aliyun.opensearch.SearcherClient;
import com.aliyun.opensearch.sdk.dependencies.com.google.common.collect.Lists;
import com.aliyun.opensearch.sdk.dependencies.org.json.JSONObject;
import com.aliyun.opensearch.sdk.generated.OpenSearch;
import com.aliyun.opensearch.sdk.generated.commons.OpenSearchClientException;
import com.aliyun.opensearch.sdk.generated.commons.OpenSearchException;
import com.aliyun.opensearch.sdk.generated.search.*;
import com.aliyun.opensearch.sdk.generated.search.general.SearchResult;
import com.aliyun.opensearch.search.SearchParamsBuilder;

import java.nio.charset.Charset;

public class testScroll {

    // Due to engine performance limits, scroll queries do not support the aggregate, distinct, or rank clause, and support sorting only based on a single field.
    private static String appName = "Name of the OpenSearch application that you want to manage";
    private static String host = "Endpoint of the OpenSearch API in your region";

    public static void main(String[] args) {
      
        // Specify your AccessKey pair.
        // Obtain the AccessKey ID and AccessKey secret from the environment variables.
        // You must configure the environment variables before you run the sample code.
        String accesskey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
        String secret = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");

        // Print the file encoding and the default charset of the JVM.
        System.out.println(String.format("file.encoding: %s", System.getProperty("file.encoding")));
        System.out.println(String.format("defaultCharset: %s", Charset.defaultCharset().name()));


        // Create an OpenSearch object.
        OpenSearch openSearch = new OpenSearch(accesskey, secret, host);

        // Use the OpenSearch object as a parameter to create an OpenSearchClient object.
        OpenSearchClient serviceClient = new OpenSearchClient(openSearch);

        // Use the OpenSearchClient object as a parameter to create a SearcherClient object.
        SearcherClient searcherClient = new SearcherClient(serviceClient);

        // Create a Config object and use the config clause to configure parameters such as the application name, paging-related parameters, and data format of returned results.
        Config config = new Config(Lists.newArrayList(appName));

        // The start parameter that you specify in the config clause does not take effect for scroll queries. The default value 0 is used.
        config.setStart(0);
        // Specify the number of documents to be displayed on each page. In this example, the number is set to 5.
        config.setHits(5);

        // Specify the data format of returned results. Supported formats are JSON and FULLJSON. In this example, the data format is set to FULLJSON.
        config.setSearchFormat(SearchFormat.FULLJSON);

        // Specify the fields to be returned in search results.
        config.setFetchFields(Lists.newArrayList("id", "name", "phone", "int_arr", "literal_arr", "float_arr", "cate_id"));
        // Note: The rerank_size parameter of the config clause is set by using the setReRankSize method of the Rank class, not the Config class.

        // Create a SearchParams object.
        SearchParams searchParams = new SearchParams(config);

        // Specify the query clause. You can specify multiple keywords to perform a query based on multiple index fields. In this case, you must specify the index fields in one setQuery call. If you specify each index field in a separate setQuery call, the last clause overwrites the preceding clauses.
        searchParams.setQuery("name:'opensearch'");

        // Specify filter conditions.
        searchParams.setFilter("cate_id<=3"); // You can also set a filter condition by using the SearchParamsBuilder class.
				
        // Specify a sorting condition.
        Sort sorter = new Sort();
        // Specify a field based on which documents are to be sorted, and a sorting method.
        // In this example, documents are sorted based on the id field in descending order.
        sorter.addToSortFields(new SortField("id", Order.DECREASE));

        // Add the Sort object as a query parameter.
        searchParams.setSort(sorter);
      	
        // Create a DeepPaging object to implement iterative scroll queries.
        DeepPaging deep = new DeepPaging();
        // Specify a validity period for the scroll ID to be used by the next scroll query, in minutes. Default value: 1m. In this example, the value is set to 3m.
        deep.setScrollExpire("3m");

        // Add the DeepPaging object as a query parameter.
        searchParams.setDeepPaging(deep);

        // Create a SearchParamsBuilder object. As the utility class of SearchParams, the SearchParamsBuilder class allows you to configure query-related parameters with ease.
        SearchParamsBuilder paramsBuilder = SearchParamsBuilder.create(searchParams);

        // Alternatively, specify filter conditions by using the SearchParamsBuilder object.
        // paramsBuilder.addFilter("cate_id<=0", "AND");

        // Run the query and return the results. Determine whether an error has occurred based on the error code and message instead of the status information. For more information about errors, see the "Error codes" topic. 
        SearchResult searchResult;
        try {
            searchResult = searcherClient.execute(paramsBuilder);
            String result = searchResult.getResult();
            JSONObject obj = new JSONObject(result);

            // If the returned results contain 25 documents and the number of documents displayed on each page is set to 5, the sixth page of returned results is empty.
            for (int i = 1; i <= 6; i++) {

                // When you run the first scroll query, a scroll ID is returned. To obtain document data, use this scroll ID to run the scroll query again.
                deep.setScrollId(new JSONObject(obj.get("result").toString()).get("scroll_id").toString());
                // Specify a validity period for the scroll ID to be used by the next scroll query, in minutes. Default value: 1m. In this example, the value is set to 3m.
                // If you do not want to use the default value, you must set a validity period each time before you run a scroll query.
                deep.setScrollExpire("3m");
                searchResult = searcherClient.execute(paramsBuilder);
                result = searchResult.getResult();
                obj = new JSONObject(result);

                // Display the search results.
                System.out.println("Results for Query No." + i + ":" + obj.get("result"));

                // Pause the thread for one second to prevent errors caused by excessive queries per second (QPS).
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }


        } catch (OpenSearchException e) {
            e.printStackTrace();
        } catch (OpenSearchClientException e) {
            e.printStackTrace();
        }

    }
}
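The demo above runs a fixed number of scroll queries. When the number of matched documents is not known in advance, one option is to keep scrolling until a page comes back empty. The following is a minimal sketch of that control flow only. It does not call OpenSearch SDK for Java: ScrollSource, Page, InMemoryScrollSource, and ScrollLoop are hypothetical types introduced for illustration, and the in-memory source merely imitates a service that returns a scroll ID first and document pages afterward.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over one scroll session; not an SDK type.
interface ScrollSource {
    String open();               // first request: yields a scroll ID
    Page next(String scrollId);  // later requests: yield documents and the next scroll ID
}

// One page of results together with the scroll ID for the following request.
final class Page {
    final List<String> docs;
    final String nextScrollId;

    Page(List<String> docs, String nextScrollId) {
        this.docs = docs;
        this.nextScrollId = nextScrollId;
    }
}

// In-memory stand-in for the search service. The scroll ID is simply an offset.
final class InMemoryScrollSource implements ScrollSource {
    private final List<String> all;
    private final int hits;

    InMemoryScrollSource(List<String> all, int hits) {
        this.all = all;
        this.hits = hits;
    }

    public String open() {
        return "0";
    }

    public Page next(String scrollId) {
        int from = Integer.parseInt(scrollId);
        int to = Math.min(from + hits, all.size());
        return new Page(new ArrayList<>(all.subList(from, to)), String.valueOf(to));
    }
}

final class ScrollLoop {
    // Collects documents page by page and stops at the first empty page.
    static List<String> fetchAll(ScrollSource source) {
        List<String> collected = new ArrayList<>();
        String scrollId = source.open();
        while (true) {
            Page page = source.next(scrollId);
            if (page.docs.isEmpty()) {
                break;
            }
            collected.addAll(page.docs);
            scrollId = page.nextScrollId;
        }
        return collected;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            docs.add("doc-" + i);
        }
        List<String> result = fetchAll(new InMemoryScrollSource(docs, 5));
        System.out.println("Fetched " + result.size() + " documents"); // prints "Fetched 25 documents"
    }
}
```

In a real client, the body of next would wrap the setScrollId, setScrollExpire, and execute calls shown in the demo, and the one-second pause between requests would still apply.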