This topic describes aggregation operations in search index-based queries.

Types

  • Group by field values
    • Definition: Group query results based on values for a field. Same values are grouped together. The value of each group and the number of corresponding values are returned. Errors may occur when the number of values in a group is large.
    • Parameters
      Parameter Description
      groupByName The name given to the aggregation to distinguish it from other aggregation operations. You can find the required aggregation results based on its name.
      fieldName The name of the field to be aggregated. Only fields of the keyword, long, double, and bool types are supported. [DO NOT TRANSLATE]
      groupBySorter Add sorting rules for items in a group. By default, group items are sorted in descending order. When you set multiple sorting rules, items are sorted in the order in which the rules were added. Supported parameters are as follows:
      • Sort by parameter keys in alphabetical order.
      • Sort by parameter keys in reverse alphabetical order.
      • Sort by the number of rows in ascending order.
      • Sort by the number of rows in descending order.
      • Sort by the values of sub-aggregations in ascending order.
      • Sort by the values of sub-aggregations in descending order.
      size The number of returned groups.
      subAggregation and subGroupBy Add sub-aggregations. Sub-aggregations perform additional aggregation operations on the group content. Example:
      • Scenario

        Query the number of products in each category and the maximum and minimum product prices in each category.

      • Method

        Group query results by product categories, and then add two sub-aggregations to obtain the maximum and minimum product prices in each category.

      • Result
        • Fruits: 5. The minimum price is CNY 3, and the maximum price is CNY 15.
        • Toiletries: 10. The minimum price is CNY 1, and the maximum price is CNY 98.
        • Electronic devices: 3. The minimum price is CNY 2,300, and the maximum price is CNY 8,699.
        • Other products: 15. The minimum price is CNY 80, and the maximum price is CNY 1,000.
    • Java example
      /**
       * Query the number of products in each category and the maximum and minimum product prices in each category.
       * Example of returned results: "Fruits: 5. The minimum price is CNY 3, and the maximum price is CNY 15. Toiletries: 10. The minimum price is CNY 1, and the maximum price is CNY 98. Electronic devices: 3. The minimum price is CNY 2,300, and the maximum price is CNY 8,699. Other products: 15. The minimum price is CNY 80,
       and the maximum price is CNY 1,000.
       */
      public void groupByField(SyncClient client) {
          // Create a query
          SearchRequest searchRequest = SearchRequest.newBuilder()
              .tableName("tableName")
              .indexName("indexName")
              .searchQuery(
                  SearchQuery.newBuilder()
                      .query(QueryBuilders.matchAll())
                      .limit(0)
                      .addGroupBy(GroupByBuilders
                          .groupByField("name1", "column_type")
                          .addSubAggregation(AggregationBuilders.min("subName1", "column_price"))
                          .addSubAggregation(AggregationBuilders.max("subName2", "column_price"))
                      )
                      .build())
              .build();
          // Execute the query
          SearchResponse resp = client.search(searchRequest);
          // Obtain the aggregation results
          for (GroupByFieldResultItem item : resp.getGroupByResults().getAsGroupByFieldResult("name1").getGroupByFieldResultItems()) {
              // Value
              System.out.println(item.getKey());
              // Count
              System.out.println(item.getRowCount());
              // Minimum price
              System.out.println(item.getSubAggregationResults().getAsMinAggregationResult("subName1").getValue());
              // Maximum price
              System.out.println(item.getSubAggregationResults().getAsMaxAggregationResult("subName2").getValue());
          }
      }
  • Group by range
    • Definition: Group query results based on value ranges of a field. Field values that fall within a certain range are grouped together. The number of items in each range is returned. For example, group the sales volume of a product by [0, 1000), [1000, 10000), and [10000, ∞) ranges to obtain the sales volume of each range.
    • Parameters
      Parameter Description
      groupByName The name given to the aggregation to distinguish it from other aggregation operations. You can find the required aggregation results based on its name.
      fieldName The name of the field to be aggregated. Only fields of the long and double types are supported.
      range[double_from, double_to) Add ranges for grouping. The range can start from the Double.MIN_VALUE and end at the Double.MAX_VALUE.
      subAggregation and subGroupBy Add sub-aggregations. Sub-aggregations perform additional aggregation operations on the group content. For example, after you group query results by sales volume and then by province, you can obtain which province has the largest proportion in a certain range of sales volume. You must add a GroupByField sub-aggregation in the GroupByRange aggregation to execute this query.
    • Java example
      /**
       * Group sales volume by [0, 1000), [1000, 5000), and [5000, ∞) ranges to obtain the sales volume in each range.
       */
      public void groupByRange(SyncClient client) {
          // Create a query
          SearchRequest searchRequest = SearchRequest.newBuilder()
              .tableName("tableName")
              .indexName("indexName")
              .searchQuery(
                  SearchQuery.newBuilder()
                      .query(QueryBuilders.matchAll())
                      .limit(0)
                      .addGroupBy(GroupByBuilders.groupByField()
                          .groupByRange("name1", "column_number")
                          .addRange[0, 1000)
                          .addRange[1000, 5000)
                          .addRange[5000, Double.MAX_VALUE)
                      )
                      .build())
              .build();
          // Execute the query
          SearchResponse resp = client.search(searchRequest);
          // Obtain the aggregation results
          for (GroupByRangeResultItem item : resp.getGroupByResults().getAsGroupByRangeResult("name1").getGroupByRangeResultItems()) {
              // Range
              System.out.println(item.getKey());
              // Count
              System.out.println(item.getRowCount());
          }
      }
  • Group by geographical distance
    • Definition: Group query results based on their geographical distances to a central point. Distance differences that fall within a certain range are grouped together. The number of items in each range is returned. For example, group people based on their geographical distances to a Wanda Plaza to obtain the number of people in each distance range. Group the distance differences by [0, 1000m) [1000m, 5000m), and [5000m, ∞) ranges.
    • Parameters
      Parameter Description
      groupByName The name given to the aggregation to distinguish it from other aggregation operations. You can find the required aggregation results based on its name.
      fieldName The name of the field to be aggregated. Only fields of the geo_point type are supported.
      origin(double lat, double lon) lat indicates the latitude of the central point, and lon indicates the longitude of the central point.
      range[double_from, double_to) Add ranges for grouping. Unit: m. The range can start from the Double.MIN_VALUE and end at the Double.MAX_VALUE.
      subAggregation and subGroupBy Add sub-aggregations. Sub-aggregations perform additional aggregation operations on the group content.
    • Java example
      /**
       * Group people based on their geographical distances to a Wanda Plaza to obtain the number of people in each distance range. Group the distance differences by [0,1000m) [1000m, 5000m), and [5000m, ∞) ranges.
       */
      public void groupByGeoDistance(SyncClient client) {
          // Create a query
          SearchRequest searchRequest = SearchRequest.newBuilder()
              .tableName("tableName")
              .indexName("indexName")
              .searchQuery(
                  SearchQuery.newBuilder()
                      .query(QueryBuilders.matchAll())
                      .limit(0)
                      .addGroupBy(GroupByBuilders
                          .groupByGeoDistance("name1", "column_geo_point")
                          .origin(3.1, 6.5)
                          .addRange[0, 1000)
                          .addRange[1000, 5000)
                          .addRange[5000, Double.MAX_VALUE)
                      )
                      .build())
              .build();
          // Execute the query
          SearchResponse resp = client.search(searchRequest);
          // Obtain the aggregation results
          for (GroupByGeoDistanceResultItem item : resp.getGroupByResults().getAsGroupByGeoDistanceResult("name1").getGroupByGeoDistanceResultItems()) {
              // Range
              System.out.println(item.getKey());
              // Count
              System.out.println(item.getRowCount());
          }
      }
  • Group by filter
    • Definition: Filter the query results and group them together to obtain the number of items matching each filter. Results are returned in the order in which the filters are added. For example, you can add the following three filters to obtain the number of items matching each filter: sales volume exceeds 100, production place is Zhejiang, and description contains Hangzhou.
    • Parameters
      Parameter Description
      groupByName The name given to the aggregation to distinguish it from other aggregation operations. You can find the required aggregation results based on its name.
      filter The added filters of a query. Filters are added using QueryBuilders.
      subAggregation and subGroupBy Add sub-aggregations. Sub-aggregations perform additional aggregation operations on the group content.
    • Java example
      /**
       * Filter the query results. For example, add the following three filters to obtain the number of items matching each filter: sales volume exceeds 100, production place is Zhejiang, and description contains Hangzhou.
       */
      public void groupByFilter(SyncClient client) {
          // Create a query
          SearchRequest searchRequest = SearchRequest.newBuilder()
              .tableName("tableName")
              .indexName("indexName")
              .searchQuery(
                  SearchQuery.newBuilder()
                      .query(QueryBuilders.matchAll())
                      .limit(0)
                      .addGroupBy(GroupByBuilders
                          .groupByFilter("name1")
                          .addFilter(QueryBuilders.range("number").greaterThanOrEqual(100))
                          .addFilter(QueryBuilders.term("place","Zhejiang"))
                          .addFilter(QueryBuilders.match("text","Hangzhou"))
                      )
                      .build())
              .build();
          // Execute the query
          SearchResponse resp = client.search(searchRequest);
          // Obtain the aggregation results in the order of addFilter.
          for (GroupByFilterResultItem item : resp.getGroupByResults().getAsGroupByFilterResult("name1").getGroupByFilterResultItems()) {
              // Count
              System.out.println(item.getRowCount());
          }
      }

Multiple aggregations

A request supports multiple aggregations.

Note Implementing multiple complicated aggregations at the same time may cause a long response time.

Java example:

public void multipleGroupBy(SyncClient client) {
    // Create a query
    SearchRequest searchRequest = SearchRequest.newBuilder()
        .tableName("tableName")
        .indexName("indexName")
        .searchQuery(
            SearchQuery.newBuilder()
                .query(QueryBuilders.matchAll())
                .limit(0)
                .addAggregation(AggregationBuilders.min("name1", "long"))
                .addAggregation(AggregationBuilders.sum("name2", "long"))
                .addAggregation(AggregationBuilders.distinctCount("name3", "long"))
                .addGroupBy(GroupByBuilders.groupByField("name4", "type"))
                .addGroupBy(GroupByBuilders.groupByRange("name5", "long").addRange(1, 15))
                .build())
        .build();
    // Execute the query
    SearchResponse resp = client.search(searchRequest);
    //Obtain the results of the first aggregation
    System.out.println(resp.getAggregationResults().getAsMinAggregationResult("name1").getValue());
    //Obtain the results of the second aggregation
    System.out.println(resp.getAggregationResults().getAsSumAggregationResult("name2").getValue());
    //Obtain the results of the third aggregation
    System.out.println(resp.getAggregationResults().getAsDistinctCountAggregationResult("name3").getValue());
    //Obtain the results of the fourth aggregation
    for (GroupByFieldResultItem item : resp.getGroupByResults().getAsGroupByFieldResult("name4").getGroupByFieldResultItems()) {
        // Key
        System.out.println(item.getKey());
        // Count
        System.out.println(item.getRowCount());
    }
    // Obtain the results of the fifth aggregation
    for (GroupByRangeResultItem item : resp.getGroupByResults().getAsGroupByRangeResult("name4").getGroupByRangeResultItems()) {
        // Count
        System.out.println(item.getRowCount());
    }
}

Nesting

GroupBy type aggregations support nesting. You can add sub-aggregations and GroupBy type sub-aggregations to a GroupBy type aggregation. GroupBy type aggregations can contain endless levels for nesting. However, to ensure the performance and reduce the complexity of aggregations, you are allowed only to set a certain number of levels for nesting.

Nested aggregations are used to perform additional aggregation operations within a group. A two-level nested aggregation is used as an example:
  • GroupBy + SubGroupBy: Group items by province and then by city to obtain data for each city in each province.
  • GroupBy + SubAggregation: Group items by province to obtain the maximum value of an indicator for each province.

Java example:

/**
 * Nested aggregations example: Two aggregations and one GroupByField are added to the outermost level, and two aggregations and one GroupByRange are added to the GroupByField.
 */
public void subGroupBy(SyncClient client) {
    // Create a query
    SearchRequest searchRequest = SearchRequest.newBuilder()
        .indexName("index_name")
        .tableName("table_name")
        .returnAllColumns(true)
        .searchQuery(
            SearchQuery.newBuilder()
                .query(QueryBuilders.match("textField", "hello"))
                .limit(10)
                .addAggregation(AggregationBuilders.min("name1", "fieldName1"))
                .addAggregation(AggregationBuilders.max("name2", "fieldName2"))
                .addGroupBy(GroupByBuilders
                    .groupByField("name3", "fieldName3")
                    .addSubAggregation(AggregationBuilders.max("subName1", "fieldName4"))
                    .addSubAggregation(AggregationBuilders.sum("subName2", "fieldName5"))
                    .addSubGroupBy(GroupByBuilders
                        .groupByRange("subName3", "fieldName6")
                        .addRange(12, 90)
                        .addRange(100, 900)
                    ))
                .build())
        .build();
    // Execute the query
    SearchResponse resp = client.search(searchRequest);
    // The aggregation results of the first level
    AggregationResults aggResults = resp.getAggregationResults();
    System.out.println(aggResults.getAsMinAggregationResult("name1").getValue());
    System.out.println(aggResults.getAsMaxAggregationResult("name2").getValue());

    // Retrieve the GroupByField results of the first level and the results of the aggregations nested in the GroupByField.
    GroupByFieldResult results = resp.getGroupByResults().getAsGroupByFieldResult("someName1");
    for (GroupByFieldResultItem item : results.getGroupByFieldResultItems()) {
        System.out.println("Count" + item.getRowCount());
        System.out.println("key:" + item.getKey());

        //Retrieve the results of sub-aggregations
        //The results of the SubAggregation: min
        System.out.println(item.getSubAggregationResults().getAsMaxAggregationResult("subName1"));
        //The results of the SubAggregation: max
        System.out.println(item.getSubAggregationResults().getAsSumAggregationResult("subName2"));
        //The results of the SubGroupBy: GroupByRange
        GroupByRangeResult subResults = resp.getGroupByResults().getAsGroupByRangeResult("subName3");
        for (GroupByRangeResultItem subItem : subResults.getGroupByRangeResultItems()) {
            System.out.println("Count:" + subItem.getRowCount());
            System.out.println("key:" + subItem.getKey());
        }
    }
}