The collapse (distinct) feature deduplicates search results by a specified column, returning one result per unique value to increase the diversity of returned results.
Prerequisites
- An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
- A data table is created and data is written to the data table. For more information, see Create data tables and Write data.
- A search index is created for the data table. For more information, see Create search indexes.
Notes
- The collapse feature supports pagination only with the Offset+Limit method. Token-based pagination is not supported.
- If you use aggregation together with the collapse feature, the aggregation is computed on the result set before it is collapsed.
- With the collapse feature enabled, the number of groups that can be returned is determined by the sum of Offset and Limit. At most 100,000 groups can be returned.
- The total row count in the response is the number of matching rows before the collapse. The total number of groups after the collapse cannot be retrieved.
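The notes on aggregation and on the total row count can be illustrated with a small in-memory sketch. It does not call Tablestore: the `Hit` record, the sample rows, and the choice to keep the first row of each group are assumptions made for illustration only, since the service itself decides which row represents a group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollapseSemanticsSketch
{
    // Hypothetical row shape: Pk0 is the collapse column, Price is an aggregated column.
    public record Hit(string Pk0, double Price);

    // Keeps one row per distinct Pk0 value (here: the first one encountered).
    public static List<Hit> Collapse(IEnumerable<Hit> hits) =>
        hits.GroupBy(h => h.Pk0).Select(g => g.First()).ToList();

    public static void Main()
    {
        var matched = new List<Hit>
        {
            new("a", 10), new("a", 20), new("b", 5), new("c", 7), new("c", 1),
        };

        // Aggregation is computed over every matching row, not over the collapsed groups.
        double sum = matched.Sum(h => h.Price);

        // The reported total is likewise the match count before the collapse.
        int totalCount = matched.Count;

        // The service does not report the group count; it is computed locally
        // here only to show how it differs from the total row count.
        List<Hit> collapsed = Collapse(matched);

        Console.WriteLine($"sum={sum}, totalCount={totalCount}, groups={collapsed.Count}");
    }
}
```

In this sketch the sum and the total count cover all five matching rows, while only three rows (one per distinct `Pk0` value) would be returned.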
Parameters
| Parameter | Description |
| --- | --- |
| TableName | The name of the data table. |
| IndexName | The name of the search index. |
| Query | Can be any query type. |
| Collapse | The collapse parameter settings, which include FieldName. FieldName: The name of the column to collapse the result set on. This feature supports only integer, floating-point number, and Keyword type columns. It does not support array type columns. |
| Offset | The starting position for the query. |
| Limit | The maximum number of items to return for the query. If you only need the row count and not the data, set Limit to 0. This returns no rows. |
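Because collapsed results can be paged only with Offset and Limit, the client has to compute each page window itself. The helper below is a minimal sketch of that arithmetic, assuming the 100,000-group cap applies to Offset + Limit. `CollapsePaging`, `PageWindows`, `MaxGroups`, and the page size are hypothetical names and values, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;

public static class CollapsePaging
{
    // Assumed cap: Offset + Limit must not reach past 100,000 groups.
    public const int MaxGroups = 100_000;

    // Yields (Offset, Limit) pairs covering the first `wanted` groups,
    // trimming the range so that Offset + Limit never exceeds MaxGroups.
    public static IEnumerable<(int Offset, int Limit)> PageWindows(int wanted, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        int end = Math.Min(Math.Max(wanted, 0), MaxGroups);
        for (int offset = 0; offset < end; offset += pageSize)
        {
            yield return (offset, Math.Min(pageSize, end - offset));
        }
    }

    public static void Main()
    {
        foreach (var (offset, limit) in PageWindows(wanted: 250, pageSize: 100))
        {
            // Each pair would be applied to the query's Offset and Limit settings.
            Console.WriteLine($"Offset={offset}, Limit={limit}");
        }
    }
}
```

For 250 wanted groups and a page size of 100, the sketch produces three windows, the last one shortened to 50. Requests that would go beyond the cap are trimmed rather than sent.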
Examples
The following example queries all rows in a table and collapses the result set by the pk0 column so that only one row per unique pk0 value is returned.
/// <summary>
/// Collapses the result set by the pk0 column.
/// </summary>
/// <param name="otsClient">An initialized OTSClient instance.</param>
public static void UseCollapse(OTSClient otsClient)
{
    // Match every row in the table.
    MatchAllQuery matchAllQuery = new MatchAllQuery();

    // Collapse on pk0 so that each distinct pk0 value appears at most once.
    Collapse collapse = new Collapse();
    collapse.FieldName = "pk0";

    SearchQuery searchQuery = new SearchQuery();
    searchQuery.Query = matchAllQuery;
    searchQuery.Collapse = collapse;

    // TableName and IndexName are assumed to be defined elsewhere in the enclosing class.
    SearchRequest searchRequest = new SearchRequest(TableName, IndexName, searchQuery);
    SearchResponse searchResponse = otsClient.Search(searchRequest);

    // Print each returned row as JSON (JsonConvert comes from Newtonsoft.Json).
    foreach (Row row in searchResponse.Rows)
    {
        Console.WriteLine(JsonConvert.SerializeObject(row));
    }
}
References
When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, Boolean query, geo query, nested query, and exists query. You can use different query methods to query data from multiple dimensions based on your business requirements.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Sorting and paging.
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.