Use the collapse feature to deduplicate search results by a specific column. When many rows in the query results share the same value in that column, collapse returns that value only once, which keeps the returned results diverse.
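To make the idea concrete, the following is a minimal client-side sketch of what collapsing by a column means. It is an illustration only: Tablestore performs collapse on the server, and the function name `collapseByColumn` and the sample rows are hypothetical. The sketch assumes every row contains the collapse column.

```php
<?php
// Illustration only: keep the first row seen for each distinct value of $column.
// This is not how Tablestore implements collapse; it only shows the effect.
function collapseByColumn(array $rows, string $column): array
{
    $seen = array();
    $result = array();
    foreach ($rows as $row) {
        // var_export keeps values of different types distinct (1 vs '1').
        $key = var_export($row[$column], true);
        if (isset($seen[$key])) {
            continue; // a row with this value was already kept
        }
        $seen[$key] = true;
        $result[] = $row;
    }
    return $result;
}

// Hypothetical sample rows: two share the same 'keyword' value.
$rows = array(
    array('keyword' => 'phone', 'col1' => 'a'),
    array('keyword' => 'phone', 'col1' => 'b'),
    array('keyword' => 'laptop', 'col1' => 'c'),
);
$collapsed = collapseByColumn($rows, 'keyword');
```

Here `$collapsed` holds one row per distinct `keyword` value, in the order the values first appeared.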
Prerequisites
- An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
- A data table is created. Data is written to the table. For more information, see Create a data table and Write data.
- A search index is created for the data table. For more information, see Create a search index.
Notes
- The `collapse` feature supports pagination only with `offset` and `limit`. It does not support token-based pagination.
- When you use statistical aggregation and `collapse` on a result set, the aggregation is applied to the result set before it is collapsed.
- After you use the `collapse` feature, the total number of groups returned is limited by the `offset` and `limit` parameters. The feature can return a maximum of 100,000 groups.
- The total row count returned in the results is the number of matching rows before the `collapse` operation. You cannot retrieve the total number of groups after the collapse.
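Because collapse supports only `offset` and `limit` pagination, paging through collapsed results comes down to computing the offset for each page. The following is a minimal sketch under stated assumptions: the helper `buildCollapsedPageQuery` is hypothetical, and it treats the 100,000-group cap as a bound on `offset + limit`, which is an interpretation of the note above rather than documented behavior.

```php
<?php
// Cap on collapsed groups, as stated in the notes.
const MAX_COLLAPSE_GROUPS = 100000;

// Hypothetical helper: builds the 'search_query' part of a search request
// for one zero-based page of collapsed results.
function buildCollapsedPageQuery(array $query, string $collapseField, int $page, int $pageSize): array
{
    if ($page < 0 || $pageSize < 1) {
        throw new InvalidArgumentException('page must be >= 0 and pageSize must be >= 1');
    }
    $offset = $page * $pageSize;
    // Assumption: the group cap is enforced here as offset + limit <= 100,000.
    if ($offset + $pageSize > MAX_COLLAPSE_GROUPS) {
        throw new OutOfRangeException('offset + limit exceeds the collapse group cap');
    }
    return array(
        'offset' => $offset,
        'limit' => $pageSize,
        'get_total_count' => true,
        'collapse' => array('field_name' => $collapseField),
        'query' => $query,
        'token' => null, // token-based pagination is not supported with collapse
    );
}

// Placeholder query array; with the SDK, use QueryTypeConst::MATCH_ALL_QUERY
// as the 'query_type' value instead of this stand-in string.
$matchAll = array('query_type' => 'MATCH_ALL_QUERY_PLACEHOLDER');
$thirdPage = buildCollapsedPageQuery($matchAll, 'keyword', 2, 10);
```

The returned array can be used as the `search_query` value of a search request. Validating the range on the client avoids sending a request that the notes indicate cannot be served.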
Parameters
| Parameter | Description |
| --- | --- |
| table_name | The name of the data table. |
| index_name | The name of the search index. |
| query | The query type. Set this to any supported query type. |
| collapse | Collapses the result set by the specified column. Only INTEGER, FLOATING-POINT, and KEYWORD columns are supported. |
| offset | The starting position for the query. |
| limit | The maximum number of rows to return. |
Examples
The following sample code uses a search index to collapse results by the keyword column of the php_sdk_test table. It returns up to 10 rows with distinct keyword values and includes only the col1 and col2 columns.
```php
// Constant classes from the Tablestore PHP SDK.
use Aliyun\OTS\Consts\ColumnReturnTypeConst;
use Aliyun\OTS\Consts\QueryTypeConst;

$request = array(
    'table_name' => 'php_sdk_test',
    'index_name' => 'php_sdk_test_search_index',
    'search_query' => array(
        'offset' => 0,
        'limit' => 10, // Return at most 10 collapsed rows.
        'get_total_count' => true,
        'collapse' => array(
            'field_name' => 'keyword' // Column to collapse by.
        ),
        'query' => array(
            'query_type' => QueryTypeConst::MATCH_ALL_QUERY
        ),
        // 'sort' => array( // Specify a sorting method if required.
        //     array(
        //         'field_sort' => array(
        //             'field_name' => 'keyword',
        //             'order' => SortOrderConst::SORT_ORDER_ASC
        //         )
        //     ),
        // ),
        'token' => null, // Collapse does not support token-based pagination.
    ),
    'columns_to_get' => array(
        'return_type' => ColumnReturnTypeConst::RETURN_SPECIFIED,
        'return_names' => array('col1', 'col2')
    )
);
$response = $otsClient->search($request);
```
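When reading the response, keep in mind that the total row count reflects matches before collapse, not the number of groups. The following sketch shows one way to keep the two numbers apart. It assumes the response is an array with `total_hits` and `rows` keys; verify these key names against your SDK version. The helper `summarizeCollapsedResponse` and the mock response are hypothetical.

```php
<?php
// Hypothetical helper: summarizes a search response that used collapse.
// Assumption: the response array exposes 'total_hits' and 'rows'.
function summarizeCollapsedResponse(array $response): array
{
    $rows = isset($response['rows']) ? $response['rows'] : array();
    return array(
        // Matching rows before collapse; this is not the number of groups.
        'matched_rows_before_collapse' =>
            isset($response['total_hits']) ? $response['total_hits'] : null,
        // Collapsed rows on this page only; the total group count is unavailable.
        'groups_on_this_page' => count($rows),
    );
}

// Mock response for illustration; a real one comes from $otsClient->search($request).
$mockResponse = array(
    'total_hits' => 250,
    'rows' => array(array(), array(), array()),
);
$summary = summarizeCollapsedResponse($mockResponse);
```

With the mock data, the summary reports 250 matching rows before collapse and 3 collapsed rows on the current page.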
References
When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, Boolean query, geo query, nested query, and exists query. You can use different query methods to query data from multiple dimensions based on your business requirements.
If you want to sort or paginate the rows that meet the query conditions, you can use the sorting and paging feature. For more information, see Sorting and paging.
If you want to collapse the result set based on a specific column, you can use the collapse (distinct) feature. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, such as obtaining the extreme values, sum, and total number of rows, you can perform aggregation operations or execute SQL statements. For more information, see Aggregation and SQL query.
If you want to quickly obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.