All Products
Search
Document Center

Tablestore:Filter

Last Updated:Apr 07, 2024

Tablestore filters query results on the server and then returns the filtered results. Only the rows that meet the filter conditions are returned.

Prerequisites

  • An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.

  • A data table is created, and data is written to the data table.

Procedure

When you call the GetRow, BatchGetRow, or GetRange operation to query data, you can use a filter to return only the rows that meet the filter conditions.

Filters include SingleColumnCondition and CompositeColumnCondition.

  • SingleColumnCondition: determines whether to return a row based only on the value of a reference column.

  • CompositeColumnCondition: determines whether to return a row based on a combination of filter conditions for values of multiple reference columns.

Limits

  • Filter conditions support relational operators (=, !=, >, >=, <, and <=) and logical operators (NOT, AND, and OR). A filter condition can include up to 10 subconditions.

  • The reference columns that are used by a filter must be included in the returned columns. Otherwise, the filter cannot query the values of the reference columns.

  • When you use filters for the GetRow, BatchGetRow, and GetRange operations, the native semantics or limits of these operations are not affected.

    When you call the GetRange operation, up to 5,000 rows or 4 MB of data can be scanned at the same time.

    If the scanned 5,000 rows or 4 MB of data does not meet the filter conditions, the rows in the response are empty. However, the next_start_primary_key parameter may not be empty. In this case, you must use the next_start_primary_key parameter to continue reading data until the parameter value is empty.

Parameters

Parameter

Description

column_name

The name of the reference column used by the filter.

column_value

The value of the reference column used by the filter.

ComparatorType

The relational operators used by the filter. For more information about relational operators, see ComparatorType.

The following relational operators are supported: EQUAL (=), NOT_EQUAL (!=), GREATER_THAN (>), GREATER_EQUAL (>=), LESS_THAN (<), and LESS_EQUAL (<=).

LogicOperator

The logical operators used by the filter. For information about logical operators, see LogicalOperator.

The following logical operators are supported: NOT, AND, and OR.

pass_if_missing

Specifies whether to return a row if a reference column does not exist in the row. The data type of this parameter is Boolean. The default value is True, which specifies that a row is returned if the reference column does not exist in the row.

If you set the pass_if_missing parameter to False and the reference column does not exist in a row, the row is not returned.

latest_version_only

Specifies whether to use only the latest version of data in a reference column for comparison when the reference column contains multiple versions of data. The data type of this parameter is Boolean. The default value is True, which specifies that only the latest version of data is used for comparison if the reference column contains multiple versions of data.

If you set the latest_version_only parameter to False and a reference column contains multiple versions of data, all versions of data in the column are used for comparison. In this case, the row is returned when at least one version of data meets the filter conditions.

Examples

Use SingleColumnCondition to filter data

The following sample code provides an example on how to read a row from a data table. In this example, the latest version of data is read and data is filtered based on the value of the name column.

def get_row_with_condition(client):
    # Specify the primary key of the row that you want to read. 
    primary_key = [('uid',1), ('gid',101)]
    # Specify the columns that you want to return. If you do not specify the columns that you want to return, all columns are returned. 
    columns_to_get = [] 
    # Specify that the row is returned when the value of the name column is Hangzhou. 
    cond = SingleColumnCondition("name", 'Hangzhou', ComparatorType.EQUAL, pass_if_missing = True)
    consumed, return_row, next_token = client.get_row('Sampletable', primary_key, columns_to_get, cond, 1)

    print('Read succeed, consume %s read cu.' % consumed.read)

    print('Value of primary key: %s' % return_row.primary_key)
    print('Value of attribute: %s' % return_row.attribute_columns)
    for att in return_row.attribute_columns:
        print('name:%s\tvalue:%s\ttimestamp:%d' % (att[0], att[1], att[2]))

Use CompositeColumnCondition to filter data

The following sample code provides an example on how to read a row from a data table. In this example, data is filtered based on a combination of filter conditions for the values of the growth and name columns.

def get_row_with_composite_condition(client):
    # Specify the primary key of the row that you want to read. 
    primary_key = [('uid',1), ('gid',101)]
    # Specify the columns that you want to return. If you do not specify the columns that you want to return, all columns are returned. 
    columns_to_get = [] 
    # Specify that the row is returned when the value of the growth column is 0.9 and the value of the name column is Hangzhou. 
    cond = CompositeColumnCondition(LogicalOperator.AND)
    cond.add_sub_condition(SingleColumnCondition("growth", 0.9, ComparatorType.EQUAL))
    cond.add_sub_condition(SingleColumnCondition("name", 'Hangzhou', ComparatorType.EQUAL))

    consumed, return_row, next_token = client.get_row('Sampletable', primary_key, columns_to_get, cond, 1)

    print('Read succeed, consume %s read cu.' % consumed.read)

    print('Value of primary key: %s' % return_row.primary_key)
    print('Value of attribute: %s' % return_row.attribute_columns)
    for att in return_row.attribute_columns:
        print('name:%s\tvalue:%s\ttimestamp:%d' % (att[0], att[1], att[2]))

References

  • If an application needs to use different attribute columns as query conditions to query data, you can specify the attribute columns as the primary key columns of a secondary index. This way, you can use the secondary index to query data based on the attribute columns. This accelerates data queries. For more information, see Secondary Index.

  • If your business requires multi-dimensional queries and data analysis, you can create a search index and specify the required attributes as the fields of the search index. Then, you can query and analyze data by using the search index. For example, you can use a search index to perform queries based on non-primary key columns, Boolean queries, and fuzzy queries. You can also use a search index to obtain the maximum and minimum values, collect statistics about the number of rows, and group query results. For more information, see Search index.