All Products
Search
Document Center

Tablestore:Configure a filter

Last Updated:Mar 07, 2024

After you configure a filter, Tablestore filters query results on the server. Only rows that match the filter conditions are returned.

Prerequisites

  • An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.

  • A data table is created, and data is written to the data table.

Usage notes

When you call the GetRow, BatchGetRow, or GetRange operation to query data, you can use a filter to return only the rows that match the filter conditions.

The available filters are SingleColumnValueFilter, SingleColumnValueRegexFilter, and CompositeColumnValueFilter.

  • SingleColumnValueFilter determines whether to filter a row based on the value of a reference column.

  • SingleColumnValueRegexFilter uses regular expressions to match column values of the String type and extract matched substrings. Then, this filter converts the data type of the extracted substrings to String, Integer, or Double and filters the values after conversion.

  • CompositeColumnValueFilter determines whether to filter a row based on a logical combination of the matching results for the values of multiple reference columns.

Note

For more information about filters, see Configure a filter.

Precautions

  • Filter conditions support relational operators of =, !=, >, >=, <, and <= and logical operators of NOT, AND, and OR. A filter condition can inlude up to 10 subconditions.

  • The reference columns that are used by a filter must be included in the query results. If the specified columns from which data is read do not include reference columns, the filter cannot query the values of reference columns.

  • When you use the GetRange operation, up to 5,000 rows or 4 MB of data can be scanned at a time.

    If no data matches the filter conditions in the range of the scan, the returned rows are empty. However, NextStartPrimaryKey may not be empty. If NextStartPrimaryKey is not empty, use the parameter value to continue scanning until the return value of NextStartPrimaryKey is empty.

Parameters

Parameter

Description

ColumnName

The name of the reference column used by a filter.

ColumnValue

The value of the reference column used by a filter.

CompareOperator

The relational operator used by a filter.

The following relational operators are supported: EQUAL (=), NOT_EQUAL (!=), GREATER_THAN (>), GREATER_EQUAL (>=), LESS_THAN (<), and LESS_EQUAL (<=).

LogicOperator

The logical operator used by a filter.

The following logical operators are supported: NOT, AND, and OR.

PassIfMissing

Specifies whether to return a row when a reference column does not exist in the row. Valid values:

  • true (default): If a reference column does not exist in a row, the row is returned.

  • false: If a reference column does not exist in a row, the row is not returned.

LatestVersionsOnly

Specifies whether to use only the latest version of data in a reference column for comparison when the reference column contains data of multiple versions. Valid values:

  • true (default): If a reference column contains data of multiple versions, only the latest version of data is used for comparison.

  • false: If a reference column contains data of multiple versions, all versions of data are used for comparison. If one version of data in the reference column meets the filter conditions, the row is returned.

Regex

The regular expression used to match subfield values. The regular expression must meet the following conditions:

  • A regular expression can be up to 256 bytes in length.

  • The syntax of regular expressions in Perl is supported.

  • Single-byte regular expressions are supported.

  • Regular expression matching in Chinese is not supported.

  • The full matching mode and partial matching mode of a regular expression are supported.

    In the partial matching mode, the subfield values to match are in a pair of parentheses () in a regular expression.

    If the full matching mode is used, the first matching result is returned. If the partial matching mode is used, the first submatch result is returned. For example, if the column value is 1aaa51bbb5 and the regular expression is 1[a-z]+5, the return value is 1aaa5. If the regular expression is 1([a-z]+)5, the return value is aaa.

VariantType

The data type of the subfield values after conversion when you use a regular expression to match the subfield values. Valid values: VT_INTEGER (integer type), VT_STRING (string type), and VT_DOUBLE (double-precision floating-point type).

Examples

Use SingleColumnValueFilter to filter data

The following sample code provides an example on how to read data of the latest version from a row in a data table and use a filter to filter data based on the value of the Col0 column.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);

    // Configure a filter to return a row in which the value of the Col0 column is 0. 
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    // If the Col0 column does not exist, the row is not returned. 
    singleColumnValueFilter.setPassIfMissing(false);
    criteria.setFilter(singleColumnValueFilter);

    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}

Use SingleColumnValueRegexFilter to filter data based on regular expression matching

The following sample code provides an example on how to read data whose primary key values are in the range of ["pk:2020-01-01.log", "pk:2021-01-01.log") from the Col1 column and use a regular expression to filter data in the Col1 column.

private static void getRange(SyncClient client) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria criteria = new RangeRowQueryCriteria("<TABLE_NAME>");
 
    // Specify ["pk:2020-01-01.log", "pk:2021-01-01.log") as the range of the primary key of the data that you want to read. The range is a left-closed and right-open interval. 
    PrimaryKey pk0 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2020-01-01.log"))
        .build();
    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2021-01-01.log"))
        .build();
    criteria.setInclusiveStartPrimaryKey(pk0);
    criteria.setExclusiveEndPrimaryKey(pk1);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter =  new SingleColumnValueRegexFilter("Col1",
        regexRule,SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN,ColumnValue.fromLong(100));
    criteria.setFilter(filter);

    while (true) {
        GetRangeResponse resp = client.getRange(new GetRangeRequest(criteria));
        for (Row row : resp.getRows()) {
            // do something
            System.out.println(row);
        }
        if (resp.getNextStartPrimaryKey() != null) {
            criteria.setInclusiveStartPrimaryKey(resp.getNextStartPrimaryKey());
        } else {
            break;
        }
   }
}

Use CompositeColumnValueFilter to filter data

The following sample code is used to return the rows whose primary key is within the ["a","h") range. In addition, the value of Col0 is 0 and either the value of Col1 is greater than 100 or the value of Col2 is less than or equal to 10.

private static void getRange(SyncClient client) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria criteria = new RangeRowQueryCriteria("<TABLE_NAME>");

    // Specify the primary key range. The primary key range is a left-closed and right-open interval. 
    PrimaryKey pk0 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
            .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("a"))
            .build();
    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
            .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("h"))
            .build();
    criteria.setInclusiveStartPrimaryKey(pk0);
    criteria.setExclusiveEndPrimaryKey(pk1);

    // Set the maxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);

    // Set the composite1 condition to (Col0 == 0) AND (Col1 > 100). 
    CompositeColumnValueFilter composite1 = new CompositeColumnValueFilter(CompositeColumnValueFilter.LogicOperator.AND);
    SingleColumnValueFilter single1 = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    SingleColumnValueFilter single2 = new SingleColumnValueFilter("Col1",
            SingleColumnValueFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    composite1.addFilter(single1);
    composite1.addFilter(single2);

    // Set the composite2 condition to ( (Col0 == 0) AND (Col1 > 100) ) OR (Col2 <= 10). 
    CompositeColumnValueFilter composite2 = new CompositeColumnValueFilter(CompositeColumnValueFilter.LogicOperator.OR);
    SingleColumnValueFilter single3 = new SingleColumnValueFilter("Col2",
            SingleColumnValueFilter.CompareOperator.LESS_EQUAL, ColumnValue.fromLong(10));
    composite2.addFilter(composite1);
    composite2.addFilter(single3);
    criteria.setFilter(composite2);

    while (true) {
        GetRangeResponse resp = client.getRange(new GetRangeRequest(criteria));
        for (Row row : resp.getRows()) {
            // do something
            System.out.println(row);
        }
        if (resp.getNextStartPrimaryKey() != null) {
            criteria.setInclusiveStartPrimaryKey(resp.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

References

  • If an application needs to use different attributes as query conditions to query data, you can specify these attributes as the primary key columns of a secondary index table to speed up the query. For more information, see Secondary index.

  • If your business requires multi-dimensional queries and data analysis, you can create a search index and specify the required attributes as the fields of the search index. Then, you can query and analyze the data by using the search index. The multi-dimensional queries include queries based on non-primary key columns, Boolean queries, and fuzzy queries. The data analysis requirements include obtaining extreme values, counting rows, and grouping data. For more information, see Search Index.