Tablestore:Read data

Last Updated:Mar 14, 2025

Tablestore provides various operations that you can call to read data from data tables. For example, you can read a single row of data, read multiple rows of data at the same time, read data whose primary key values are in the specified range, read data by using an iterator, and concurrently read data.

Query methods

Tablestore provides the GetRow, BatchGetRow, and GetRange operations that you can call to read data. Before you read data, select an appropriate query method based on the actual query scenario.

Important

If you want to read data from a table that contains an auto-increment primary key column, make sure that you obtain the values of all primary key columns, including the auto-increment primary key column. For more information, see Configure an auto-increment primary key column. If no value is recorded for the auto-increment primary key column, you can call the GetRange operation to determine the primary key range of data that you want to read based on the values of the first primary key column.

Method

Description

Scenario

Read a single row of data

You can call the GetRow operation to read a single row of data.

This method is suitable for scenarios in which the values of all primary key columns of the row that you want to read can be determined and the number of rows that you want to read is small.

Read multiple rows of data at the same time

You can call the BatchGetRow operation to read multiple rows of data from one or more tables at the same time.

The BatchGetRow operation consists of multiple GetRow operations. Each GetRow operation in a BatchGetRow request is constructed in the same way as a standalone GetRow request.

This operation is suitable for scenarios in which the values of all primary key columns of the rows that you want to read can be determined and the number of rows that you want to read is large or you want to read data from multiple tables.

Read data whose primary key values are in the specified range

You can call the GetRange operation to read data whose primary key values are in the specified range.

The GetRange operation reads data whose primary key values are in the specified range in the forward or backward direction. You can also specify the maximum number of rows to read. If the range is large and the number of scanned rows or the volume of scanned data exceeds the upper limit, the scan stops, and Tablestore returns the rows that were read together with the primary key of the next row. To read the remaining rows, send another request that starts from the primary key of the next row returned by the previous request.

This method is suitable for scenarios in which the range of primary key values or the prefix of primary key values of the rows that you want to read can be determined.

Important

If you cannot determine the prefix of the primary key values of the rows that you want to read, you can set the start primary key to INF_MIN and the end primary key to INF_MAX to scan all data in the table. This consumes a large amount of computing resources. Proceed with caution.

Read data whose primary key values are in the specified range by using an iterator

You can call the createRangeIterator operation to read data whose primary key values are in the specified range by using an iterator.

This method is suitable for scenarios in which the range of primary key values or the prefix of primary key values of the rows that you want to read can be determined and an iterator is required to read data.

Concurrently read data

Tablestore SDK for Java provides the TableStoreReader class that encapsulates the BatchGetRow operation. You can use this class to concurrently query data in a data table. TableStoreReader also supports multi-table queries, statistics on query status, row-level callback, and custom configurations.

This operation is suitable for scenarios in which the values of all primary key columns of the rows that you want to read can be determined and the number of rows that you want to read is large or you want to read data from multiple tables.

Prerequisites

  • An OTSClient instance is initialized. For more information, see Initialize a client.

  • A data table is created, and data is written to the data table.

Read a single row of data

You can call the GetRow operation to read a single row of data. After you call the GetRow operation, one of the following results may be returned:

  • If the row exists, the primary key columns and attribute columns of the row are returned.

  • If the row does not exist, no row is returned and no error is reported.

Parameters

Parameter

Description

tableName

The name of the data table.

primaryKey

The primary key of the row. The value of this parameter consists of the name, type, and value of each primary key column.

Important

The number and types of primary key columns that you specify must be the same as the actual number and types of primary key columns in the data table.

columnsToGet

The columns that you want to return. You can specify the names of primary key columns or attribute columns.

  • If you do not specify a column, all data in the row is returned.

  • If you specify columns but the row does not contain the specified columns, the return value is null. If the row contains some of the specified columns, the data in some of the specified columns of the row is returned.

Note
  • By default, Tablestore returns the data from all columns of a row when you query the row. You can configure the columnsToGet parameter to return specific columns. For example, if col0 and col1 are specified for the columnsToGet parameter, only the values of the col0 and col1 columns are returned.

  • If you configure the columnsToGet and filter parameters, Tablestore queries the columns that are specified by the columnsToGet parameter, and then returns the rows that meet the filter conditions.

maxVersions

The maximum number of data versions that can be returned.

Important

You must configure at least one of the following parameters: maxVersions and timeRange.

  • If you configure only the maxVersions parameter, data of the specified number of versions is returned from the most recent version to the earliest version.

  • If you configure only the timeRange parameter, all data whose versions are in the specified time range or data of the specified version is returned.

  • If you configure the maxVersions and timeRange parameters, data of the specified number of versions in the specified time range is returned from the most recent version to the earliest version.

timeRange

The range of versions or a specific version that you want to return. For more information, see TimeRange.

  • To query data whose versions are in the specified time range, you must configure the start and end parameters. The start parameter specifies the start timestamp. The end parameter specifies the end timestamp. The specified range is a left-closed, right-open interval that is in the [start, end) format.

  • To query data of a specific version, you must configure the timestamp parameter. The timestamp parameter specifies a specific timestamp.

You need to configure only one of timestamp and [start, end).

Valid values of the timeRange parameter: 0 to Long.MAX_VALUE. Unit: millisecond.

filter

The filter that you want to use to filter the query results on the server side. Only rows that meet the filter conditions are returned. For more information, see Configure a filter.

Note

If you configure the columnsToGet and filter parameters, Tablestore queries the columns that are specified by the columnsToGet parameter, and then returns the rows that meet the filter conditions.
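The way maxVersions and timeRange combine can be illustrated with a small in-memory model. The following sketch does not use the Tablestore SDK. The class name VersionSelectionSketch and the method select are hypothetical, and the versions of a column are modeled as a map from version timestamp (in milliseconds) to value. It assumes the behavior described above: versions are returned from the most recent to the earliest, and the time range is the left-closed, right-open interval [start, end).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class VersionSelectionSketch {
    /**
     * Returns the selected version timestamps, newest first.
     * Pass null for maxVersions to model "only timeRange is configured".
     * Pass null for both start and end to model "only maxVersions is configured".
     */
    static List<Long> select(TreeMap<Long, String> versions, Integer maxVersions, Long start, Long end) {
        if ((start == null) != (end == null)) {
            throw new IllegalArgumentException("start and end must be set together");
        }
        if (maxVersions == null && start == null) {
            throw new IllegalArgumentException("Configure at least one of maxVersions and timeRange");
        }
        List<Long> selected = new ArrayList<>();
        // Walk from the most recent version to the earliest version.
        for (Long ts : versions.descendingKeySet()) {
            boolean inRange = start == null || (ts >= start && ts < end);
            if (!inRange) {
                continue;
            }
            if (maxVersions != null && selected.size() >= maxVersions) {
                break;
            }
            selected.add(ts);
        }
        return selected;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "v1");
        versions.put(200L, "v2");
        versions.put(300L, "v3");
        System.out.println(select(versions, 2, null, null));    // [300, 200]
        System.out.println(select(versions, null, 100L, 300L)); // [200, 100]
        System.out.println(select(versions, 1, 100L, 300L));    // [200]
    }
}
```

In the second call, the version with timestamp 300 is excluded because the end of the range is exclusive.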

Examples

You can specify the data version and the columns that you want to read, and filter data by using a filter or a regular expression.

Read data of the latest version from the specified columns of a row

The following sample code provides an example on how to read data of the latest version from the specified columns of a row in a data table.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);

    // Specify the columns that you want to read. 
    criteria.addColumnsToGet("Col0");
    getRowResponse = client.getRow(new GetRowRequest(criteria));
    row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
} 

Use a filter to filter data that is read

The following sample code provides an example on how to read data of the latest version from a row in a data table and use a filter to filter data based on the value of the Col0 column.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);

    // Configure a filter to return a row in which the value of the Col0 column is 0. 
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    // If the Col0 column does not exist, the row is not returned. 
    singleColumnValueFilter.setPassIfMissing(false);
    criteria.setFilter(singleColumnValueFilter);

    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}

Use a regular expression to filter data that is read

The following sample code provides an example on how to read the data of the Col1 column from a row in a data table and use a regular expression to filter data in the column.

private static void getRow(SyncClient client, String pkValue) {
    // Specify the name of the data table. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>");
 
    // Construct the primary key. 
    PrimaryKey primaryKey = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue))
        .build();
    criteria.setPrimaryKey(primaryKey);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);
 
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}
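To make the condition cast<int>(regex(Col1)) > 100 more concrete, the following sketch reproduces the idea locally with java.util.regex. It does not call Tablestore, and the class name RegexFilterSketch and the method passes are hypothetical. It assumes that the filter extracts the first capture group of the pattern, casts it to an integer, and compares the result with the threshold. Treating a value that does not match the pattern as not passing the filter is also an assumption made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexFilterSketch {
    // Same pattern as in the sample above: captures the digits between "t1:" and ",".
    private static final Pattern RULE = Pattern.compile("t1:([0-9]+),");

    /** Returns true if the captured number is greater than the threshold. */
    static boolean passes(String columnValue, long threshold) {
        Matcher matcher = RULE.matcher(columnValue);
        if (!matcher.find()) {
            return false; // Assumption: a value without a match does not pass.
        }
        long extracted = Long.parseLong(matcher.group(1)); // Models the VT_INTEGER cast.
        return extracted > threshold;
    }

    public static void main(String[] args) {
        System.out.println(passes("t1:150,t2:7", 100)); // true
        System.out.println(passes("t1:42,t2:7", 100));  // false
        System.out.println(passes("t2:500,", 100));     // false
    }
}
```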

Read multiple rows of data at the same time

You can call the BatchGetRow operation to read multiple rows of data from one or more tables at the same time. The BatchGetRow operation consists of multiple GetRow operations. Each GetRow operation in a BatchGetRow request is constructed in the same way as a standalone GetRow request.

If you call the BatchGetRow operation, each GetRow operation is separately performed. Tablestore separately returns the response to each GetRow operation.

Usage notes

  • The BatchGetRow operation uses the same parameter configurations for all rows. For example, if the columnsToGet parameter is set to [colA], only the value of the colA column is read from all rows.

  • When you call the BatchGetRow operation to read multiple rows at the same time, some rows may fail to be read. In this case, Tablestore does not return exceptions but returns BatchGetRowResponse, which includes error messages for the failed rows. Therefore, when you call the BatchGetRow operation, you must check the return values. You can use the isAllSucceed method of BatchGetRowResponse to check whether all rows are read or use the getFailedRows method of BatchGetRowResponse to obtain information about the failed rows.

  • You can call the BatchGetRow operation to read up to 100 rows at the same time.
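Because a single BatchGetRow call reads at most 100 rows, a larger set of primary keys has to be split into several requests. The following sketch shows only the splitting step, in plain Java. The class name BatchSplitSketch and the method split are hypothetical, and primary keys are modeled as strings. Each resulting batch would then be sent as a separate BatchGetRow request.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitSketch {
    // Row limit per BatchGetRow call, as stated in the usage notes above.
    static final int MAX_ROWS_PER_BATCH = 100;

    /** Splits the keys into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> split(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be greater than 0");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("pk" + i);
        }
        List<List<String>> batches = split(keys, MAX_ROWS_PER_BATCH);
        System.out.println(batches.size());        // 3
        System.out.println(batches.get(2).size()); // 50
    }
}
```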

Parameters

For more information, see the Parameters for reading a single row of data section of this topic.

Examples

The following sample code provides an example on how to configure the version conditions, columns to read, and filters to read 10 rows of data.

private static void batchGetRow(SyncClient client) {
    // Specify the name of the data table. 
    MultiRowQueryCriteria multiRowQueryCriteria = new MultiRowQueryCriteria("<TABLE_NAME>");
    // Specify 10 rows that you want to read. 
    for (int i = 0; i < 10; i++) {
        PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("pk" + i));
        PrimaryKey primaryKey = primaryKeyBuilder.build();
        multiRowQueryCriteria.addRow(primaryKey);
    }
    // Add conditions. 
    multiRowQueryCriteria.setMaxVersions(1);
    multiRowQueryCriteria.addColumnsToGet("Col0");
    multiRowQueryCriteria.addColumnsToGet("Col1");
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    singleColumnValueFilter.setPassIfMissing(false);
    multiRowQueryCriteria.setFilter(singleColumnValueFilter);

    BatchGetRowRequest batchGetRowRequest = new BatchGetRowRequest();
    // The BatchGetRow operation allows you to read data from multiple tables. Each multiRowQueryCriteria parameter specifies query conditions for one table. You can add multiple multiRowQueryCriteria parameters to read data from multiple tables. 
    batchGetRowRequest.addMultiRowQueryCriteria(multiRowQueryCriteria);

    BatchGetRowResponse batchGetRowResponse = client.batchGetRow(batchGetRowRequest);

    System.out.println("Whether all operations are successful:" + batchGetRowResponse.isAllSucceed());
    System.out.println("Read complete. Result:");
    for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getSucceedRows()) {
        System.out.println(rowResult.getRow());
    }
    if (!batchGetRowResponse.isAllSucceed()) {
        for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getFailedRows()) {
            System.out.println("Failed rows:" + batchGetRowRequest.getPrimaryKey(rowResult.getTableName(), rowResult.getIndex()));
            System.out.println("Cause of failures:" + rowResult.getError());
        }

        /**
         * You can use the createRequestForRetry method to construct another request to retry the operations on failed rows. In this example, only the retry request is constructed. 
         * We recommend that you use the custom retry policy in Tablestore SDKs as the retry method. This way, you can retry failed rows after batch operations are performed. After you specify the retry policy, you do not need to add retry code to call the operation. 
         */
        BatchGetRowRequest retryRequest = batchGetRowRequest.createRequestForRetry(batchGetRowResponse.getFailedRows());
    }
}

For the detailed sample code, visit BatchGetRow@GitHub.

Read data whose primary key values are in the specified range

You can call the GetRange operation to read data whose primary key values are in the specified range.

The GetRange operation reads data whose primary key values are in the specified range in the forward or backward direction. You can also specify the maximum number of rows to read. If the range is large and the number of scanned rows or the volume of scanned data exceeds the upper limit, the scan stops, and Tablestore returns the rows that were read together with the primary key of the next row. To read the remaining rows, send another request that starts from the primary key of the next row returned by the previous request.

Note

In Tablestore tables, all rows are sorted by primary key. The primary key of a table sequentially consists of all primary key columns. Therefore, the rows are not sorted based on a specific primary key column.

Usage notes

The GetRange operation follows the leftmost matching principle. To determine whether a row is in the specified range, Tablestore compares primary key values column by column, from the first primary key column to the last primary key column. For example, the primary key of a data table consists of the following primary key columns: PK1, PK2, and PK3. When data is read, Tablestore first compares the PK1 value of a row with the PK1 values of the start primary key and the end primary key. If the PK1 value of the row is strictly between the two boundary values, the row is in the range, and Tablestore does not compare the values of the other primary key columns. If the PK1 value of the row is equal to a boundary value, Tablestore compares PK2 against that boundary in the same manner, and then PK3 if necessary. If the PK1 value of the row is outside the boundary values, the row is not in the range.
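One way to read the leftmost matching principle is as a column-by-column (lexicographic) comparison of primary keys. The following sketch models this reading in plain Java. It is an illustration under that assumption, not Tablestore code: the class name PrimaryKeyRangeSketch and its methods are hypothetical, and all three primary key columns are modeled as integers.

```java
import java.util.Arrays;
import java.util.List;

class PrimaryKeyRangeSketch {
    /** Compares two primary keys column by column, from the first column to the last. */
    static int compare(List<Integer> a, List<Integer> b) {
        for (int i = 0; i < a.size(); i++) {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c; // The first column that differs decides the order.
            }
        }
        return 0;
    }

    /** Returns true if start <= row < end (left-closed, right-open range). */
    static boolean inRange(List<Integer> row, List<Integer> start, List<Integer> end) {
        return compare(row, start) >= 0 && compare(row, end) < 0;
    }

    public static void main(String[] args) {
        List<Integer> start = Arrays.asList(1, 5, 0);
        List<Integer> end = Arrays.asList(3, 2, 0);
        // PK1 is strictly between the boundaries, so PK2 and PK3 do not matter.
        System.out.println(inRange(Arrays.asList(2, 9, 9), start, end)); // true
        // PK1 equals the start boundary, so PK2 decides: 4 < 5.
        System.out.println(inRange(Arrays.asList(1, 4, 0), start, end)); // false
        // PK1 equals the end boundary, so PK2 decides: 1 < 2.
        System.out.println(inRange(Arrays.asList(3, 1, 7), start, end)); // true
        // The end primary key itself is excluded.
        System.out.println(inRange(Arrays.asList(3, 2, 0), start, end)); // false
    }
}
```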

If one of the following conditions is met, the GetRange operation may stop and return data:

  • The amount of scanned data reaches 4 MB.

  • The number of scanned rows reaches 5,000.

  • The number of returned rows reaches the upper limit.

  • The read throughput is insufficient to read the next row of data because all reserved read throughput is consumed.

Each GetRange call scans data once. If the size of data that you want to scan by calling the GetRange operation is large, the scanning stops when the number of scanned rows reaches 5,000 or the size of scanned data reaches 4 MB. Tablestore does not return the remaining data that meets the query conditions. You can use the paging method to obtain the remaining data that meets the query conditions.

Parameters

Parameter

Description

tableName

The name of the data table.

direction

The order in which you want to sort the rows in the response.

  • If you set this parameter to FORWARD, the value of the inclusiveStartPrimaryKey parameter must be less than the value of the exclusiveEndPrimaryKey parameter, and the rows in the response are sorted in ascending order of primary key values.

  • If you set this parameter to BACKWARD, the value of the inclusiveStartPrimaryKey parameter must be greater than the value of the exclusiveEndPrimaryKey parameter, and the rows in the response are sorted in descending order of primary key values.

For example, a table has two primary key values A and B, and Value A is less than Value B. If you set the direction parameter to FORWARD and specify the [A, B) range for the table, Tablestore returns rows whose primary key value is greater than or equal to Value A but less than Value B in ascending order from Value A to Value B. If you set the direction parameter to BACKWARD and specify the [B, A) range for the table, Tablestore returns rows whose primary key value is less than or equal to Value B and greater than Value A in descending order from Value B to Value A.

inclusiveStartPrimaryKey and exclusiveEndPrimaryKey

The start primary key and end primary key of the range that you want to read. The start primary key and end primary key must be valid primary keys or virtual points that contain data of the INF_MIN type or INF_MAX type. The number of columns for each virtual point must be the same as the number of columns of each primary key.

INF_MIN specifies an infinitely small value. All values of other types are greater than a value of the INF_MIN type. INF_MAX specifies an infinitely large value. All values of other types are smaller than a value of the INF_MAX type.

  • The inclusiveStartPrimaryKey parameter specifies the start primary key. If a row that contains the start primary key exists, the row of data is returned.

  • The exclusiveEndPrimaryKey parameter specifies the end primary key. Regardless of whether a row that contains the end primary key exists, the row of data is not returned.

The rows in a data table are sorted in ascending order based on primary key values. The range that is used to read data is a left-closed, right-open interval. If data is read in the forward direction, the rows whose primary key values are greater than or equal to the start primary key value but less than the end primary key value are returned.

limit

The maximum number of rows that you want to return. The value of this parameter must be greater than 0.

Tablestore stops an operation when the maximum number of rows that can be returned in the forward or backward direction is reached, even if some rows in the specified range are not returned. You can use the value of the nextStartPrimaryKey parameter returned in the response to read data in the next request.

columnsToGet

The columns that you want to return. You can specify the names of primary key columns or attribute columns.

  • If you do not specify a column, all data in the row is returned.

  • If you specify columns but the row does not contain the specified columns, the return value is null. If the row contains some of the specified columns, the data in some of the specified columns of the row is returned.

Note
  • By default, Tablestore returns the data from all columns of a row when you query the row. You can configure the columnsToGet parameter to return specific columns. For example, if col0 and col1 are specified for the columnsToGet parameter, only the values of the col0 and col1 columns are returned.

  • If a row is in the specified range that you want to read based on primary key values but does not contain the specified columns that you want to return, the response excludes the row.

  • If you configure the columnsToGet and filter parameters, Tablestore queries the columns that are specified by the columnsToGet parameter, and then returns the rows that meet the filter conditions.

maxVersions

The maximum number of data versions that can be returned.

Important

You must configure at least one of the following parameters: maxVersions and timeRange.

  • If you configure only the maxVersions parameter, data of the specified number of versions is returned from the most recent version to the earliest version.

  • If you configure only the timeRange parameter, all data whose versions are in the specified time range or data of the specified version is returned.

  • If you configure the maxVersions and timeRange parameters, data of the specified number of versions in the specified time range is returned from the most recent version to the earliest version.

timeRange

The range of versions or a specific version that you want to return. For more information, see TimeRange.

  • To query data whose versions are in the specified time range, you must configure the start and end parameters. The start parameter specifies the start timestamp. The end parameter specifies the end timestamp. The specified range is a left-closed, right-open interval that is in the [start, end) format.

  • To query data of a specific version, you must configure the timestamp parameter. The timestamp parameter specifies a specific timestamp.

You need to configure only one of timestamp and [start, end).

Valid values of the timeRange parameter: 0 to Long.MAX_VALUE. Unit: millisecond.

filter

The filter that you want to use to filter the query results on the server side. Only rows that meet the filter conditions are returned. For more information, see Configure a filter.

Note

If you configure the columnsToGet and filter parameters, Tablestore queries the columns that are specified by the columnsToGet parameter, and then returns the rows that meet the filter conditions.

nextStartPrimaryKey

The start primary key value of the next read request. The value of the nextStartPrimaryKey parameter can be used to determine whether all data is read.

  • If the value of the nextStartPrimaryKey parameter is not empty in the response, the value can be used as the start primary key value for the next GetRange operation.

  • If the value of the nextStartPrimaryKey parameter is empty in the response, all data in the specified range is returned.
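The interplay of direction, limit, and nextStartPrimaryKey can be simulated with an in-memory sorted map. The following sketch does not use the Tablestore SDK. The class RangePagingSketch, the Page type, and the methods getRange and readAll are hypothetical names, and the table is modeled as a TreeMap with a single integer primary key column. Only the limit parameter is modeled as a reason to stop early. The real service can also stop when the scanned data reaches 4 MB or 5,000 rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class RangePagingSketch {
    /** One response: the keys that were read and the key to resume from (null when done). */
    static final class Page {
        final List<Integer> keys = new ArrayList<>();
        Integer nextStart; // Models nextStartPrimaryKey.
    }

    /** Reads at most limit keys from [start, end). For backward reads, start must be greater than end. */
    static Page getRange(TreeMap<Integer, String> table, int start, int end, boolean forward, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be greater than 0");
        }
        NavigableMap<Integer, String> view = forward
                ? table.subMap(start, true, end, false)
                : table.subMap(end, false, start, true).descendingMap();
        Page page = new Page();
        for (Integer key : view.keySet()) {
            if (page.keys.size() == limit) {
                page.nextStart = key; // First row in the range that was not returned.
                break;
            }
            page.keys.add(key);
        }
        return page;
    }

    /** Repeats getRange until nextStart is null, as in the paging loop of the samples. */
    static List<Integer> readAll(TreeMap<Integer, String> table, int start, int end, boolean forward, int limit) {
        List<Integer> all = new ArrayList<>();
        Integer next = start;
        while (next != null) {
            Page page = getRange(table, next, end, forward, limit);
            all.addAll(page.keys);
            next = page.nextStart;
        }
        return all;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> table = new TreeMap<>();
        for (int i = 1; i <= 7; i++) {
            table.put(i, "row" + i);
        }
        System.out.println(readAll(table, 2, 6, true, 2));  // [2, 3, 4, 5]
        System.out.println(readAll(table, 6, 2, false, 2)); // [6, 5, 4, 3]
    }
}
```

In both directions the start key is included and the end key is excluded, which matches the [A, B) and [B, A) ranges in the description of the direction parameter.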

Examples

Read data whose primary key values are in the specified range

The following sample code provides an example on how to read data whose primary key values are in the specified range in the forward direction. If the value of the nextStartPrimaryKey parameter is empty in the response, all data whose primary key values are in the specified range is read. Otherwise, you must initiate another request until all data whose primary key values are in the specified range is returned.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");

    // Specify the start primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(startPkValue));
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(endPkValue));
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}         

Read data within the range determined by the value of the first primary key column

The following sample code provides an example on how to read data within the range determined by the value of the first primary key column in the forward direction. In this example, the start value of the second primary key column is set to INF_MIN, and the end value of the second primary key column is set to INF_MAX. If the value of the nextStartPrimaryKey parameter is null in the response, all data in the specified range is read. Otherwise, you must initiate another request until all data within the range determined by the value of the first primary key column is returned.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");
    // Specify the start primary key. In this example, two primary key columns are used. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(startPkValue));// Set the value of the first primary key column to a specific value. 
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MIN);// Set the value of the second primary key column to an infinitely small value. 
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(endPkValue));// Set the value of the first primary key column to a specific value. 
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MAX);// Set the value of the second primary key column to an infinitely large value. 
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}    
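The effect of the INF_MIN and INF_MAX virtual points can be checked with a small model of a two-column primary key. The following sketch is plain Java and is not Tablestore code. The class VirtualPointSketch and the Col type are hypothetical, both primary key columns are modeled as integers, and primary keys are assumed to be compared column by column. Under that assumption, the range from (2, INF_MIN) to (2, INF_MAX) covers exactly the rows whose first primary key column is 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VirtualPointSketch {
    /** A primary key column value: a real integer, or the virtual point INF_MIN or INF_MAX. */
    static final class Col implements Comparable<Col> {
        static final Col INF_MIN = new Col(-1, 0);
        static final Col INF_MAX = new Col(1, 0);
        final int kind;   // -1 = INF_MIN, 0 = real value, 1 = INF_MAX
        final long value; // Only meaningful when kind == 0.

        private Col(int kind, long value) {
            this.kind = kind;
            this.value = value;
        }

        static Col of(long value) {
            return new Col(0, value);
        }

        @Override
        public int compareTo(Col other) {
            if (kind != other.kind) {
                return Integer.compare(kind, other.kind); // INF_MIN < any real value < INF_MAX
            }
            return kind == 0 ? Long.compare(value, other.value) : 0;
        }
    }

    /** Compares two primary keys column by column. */
    static int compare(Col[] a, Col[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /** Returns the rows whose primary key is in [start, end). */
    static List<Col[]> scan(List<Col[]> rows, Col[] start, Col[] end) {
        List<Col[]> result = new ArrayList<>();
        for (Col[] row : rows) {
            if (compare(row, start) >= 0 && compare(row, end) < 0) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Col[]> rows = Arrays.asList(
                new Col[] {Col.of(1), Col.of(10)},
                new Col[] {Col.of(2), Col.of(5)},
                new Col[] {Col.of(2), Col.of(8)},
                new Col[] {Col.of(3), Col.of(1)});
        Col[] start = {Col.of(2), Col.INF_MIN};
        Col[] end = {Col.of(2), Col.INF_MAX};
        System.out.println(scan(rows, start, end).size()); // 2
    }
}
```

By the same reasoning, in the sample above the end primary key (endPkValue, INF_MAX) sorts after every row whose pk1 value equals endPkValue, so those rows fall inside the range even though the end primary key is exclusive.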

Read data whose primary key values are in the specified range and use a regular expression to filter data in the specified column

The following sample code provides an example on how to read data whose pk primary key column values are in the range of ["2020-01-01.log", "2021-01-01.log") and use a regular expression to filter data in the Col1 column.

private static void getRange(SyncClient client) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria criteria = new RangeRowQueryCriteria("<TABLE_NAME>");
 
    // Specify ["pk:2020-01-01.log", "pk:2021-01-01.log") as the range of the primary key of the data that you want to read. The range is a left-closed, right-open interval. 
    PrimaryKey pk0 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2020-01-01.log"))
        .build();
    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2021-01-01.log"))
        .build();
    criteria.setInclusiveStartPrimaryKey(pk0);
    criteria.setExclusiveEndPrimaryKey(pk1);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);

    while (true) {
        GetRangeResponse resp = client.getRange(new GetRangeRequest(criteria));
        for (Row row : resp.getRows()) {
            // do something
            System.out.println(row);
        }
        if (resp.getNextStartPrimaryKey() != null) {
            criteria.setInclusiveStartPrimaryKey(resp.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

For the detailed sample code, visit GetRange@GitHub.
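
To make the filter condition cast<int>(regex(Col1)) > 100 more concrete, the following sketch evaluates the same rule locally by using java.util.regex. It only illustrates the intended logic and does not show how the server evaluates the filter. The class name, the method name, and the behavior for values that do not match the pattern are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexFilterSketch {
    // Same pattern as in the RegexRule above: capture the digits after "t1:".
    private static final Pattern RULE = Pattern.compile("t1:([0-9]+),");

    // Returns true when the captured number is greater than 100.
    // Assumption: a value that does not match the pattern is filtered out.
    static boolean passes(String col1Value) {
        Matcher m = RULE.matcher(col1Value);
        if (!m.find()) {
            return false;
        }
        try {
            return Long.parseLong(m.group(1)) > 100;
        } catch (NumberFormatException e) {
            // The captured digits do not fit into a long.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(passes("t1:150,t2:7,")); // prints true
        System.out.println(passes("t1:100,t2:7,")); // prints false
        System.out.println(passes("t2:500,"));      // prints false
    }
}
```

Because the comparison operator is GREATER_THAN, a captured value of exactly 100 does not pass.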

Read data whose primary key values are in the specified range by using an iterator

The following sample code provides an example on how to call the createRangeIterator operation to read data whose primary key values are in the specified range by using an iterator.

private static void getRangeByIterator(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeIteratorParameter rangeIteratorParameter = new RangeIteratorParameter("<TABLE_NAME>");

    // Specify the start primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(startPkValue));
    rangeIteratorParameter.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(endPkValue));
    rangeIteratorParameter.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeIteratorParameter.setMaxVersions(1);

    Iterator<Row> iterator = client.createRangeIterator(rangeIteratorParameter);

    System.out.println("Results obtained when an iterator is used in a GetRange operation:");
    while (iterator.hasNext()) {
        Row row = iterator.next();
        System.out.println(row);
    }
}           

For the detailed sample code, visit GetRangeByIterator@GitHub.
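
An iterator of this kind hides the paging loop that the GetRange samples write by hand. The following sketch shows the general pattern with an in-memory NavigableMap as a stand-in for the table: the iterator fetches one page at a time and uses the next start key to resume. It is not the implementation of createRangeIterator, and the class name, the pageSize parameter, and the fetches counter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NoSuchElementException;
import java.util.TreeMap;

class PagingIteratorSketch implements Iterator<String> {
    private final NavigableMap<String, String> table;
    private final String endExclusive;
    private final int pageSize;
    private String nextStart;                       // null when no pages remain
    private Iterator<String> page = Collections.emptyIterator();
    int fetches = 0;                                // number of simulated requests

    PagingIteratorSketch(NavigableMap<String, String> table,
                         String startInclusive, String endExclusive, int pageSize) {
        this.table = table;
        this.nextStart = startInclusive;
        this.endExclusive = endExclusive;
        this.pageSize = pageSize;
    }

    // Simulates one range request: reads up to pageSize rows from nextStart
    // and remembers the key at which the next request must resume.
    private void fetchPage() {
        fetches++;
        List<String> rows = new ArrayList<>();
        String resume = null;
        for (Map.Entry<String, String> e
                : table.subMap(nextStart, true, endExclusive, false).entrySet()) {
            if (rows.size() == pageSize) {
                resume = e.getKey();
                break;
            }
            rows.add(e.getKey() + "=" + e.getValue());
        }
        nextStart = resume;
        page = rows.iterator();
    }

    @Override
    public boolean hasNext() {
        while (!page.hasNext() && nextStart != null) {
            fetchPage();
        }
        return page.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.next();
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("a", "0");
        table.put("b", "1");
        table.put("c", "2");
        table.put("d", "3");
        table.put("e", "4");
        Iterator<String> it = new PagingIteratorSketch(table, "b", "e", 2);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

The caller sees a plain Iterator, and pages are requested only when the current page is exhausted.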

Concurrently read data

Tablestore SDK for Java provides the TableStoreReader class that encapsulates the BatchGetRow operation. You can use this class to concurrently query data in a data table. TableStoreReader also supports multi-table queries, statistics on query status, row-level callback, and custom configurations.

Important

TableStoreReader is supported by Tablestore SDK for Java V5.16.1 and later. Make sure that you use a valid SDK version. For information about the version history of Tablestore SDK for Java, see Version history of Tablestore SDK for Java.

Getting started

  1. Construct the TableStoreReader.

    // Specify the name of the instance.
    String instanceName = "yourInstanceName";
    // Specify the endpoint of the instance.
    String endpoint = "yourEndpoint";
    // Obtain the AccessKey ID and AccessKey secret from the environment variables.
    String accessKeyId = System.getenv("TABLESTORE_ACCESS_KEY_ID");
    String accessKeySecret = System.getenv("TABLESTORE_ACCESS_KEY_SECRET");
    
    AsyncClientInterface client = new AsyncClient(endpoint, accessKeyId, accessKeySecret, instanceName);
    TableStoreReaderConfig config = new TableStoreReaderConfig();
    ThreadPoolExecutor executor = new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(1024));
    
    TableStoreReader reader = new DefaultTableStoreReader(client, config, executor, null);
  2. Construct a query request.

    Cache the primary keys of the rows that you want to query in the memory. You can add the primary keys of one or more rows at the same time.

    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
            .addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromLong(0))
            .addPrimaryKeyColumn("pk2", PrimaryKeyValue.fromLong(0))
            .build();
    // Add a primary key whose value is pk1 to query the attribute columns of the row. 
    Future<ReaderResult> readerResult = reader.addPrimaryKeyWithFuture("<TABLE_NAME1>", pk1);
    // You can also pass a list to add multiple primary keys at the same time. A separate variable is used because readerResult is already declared. 
    List<PrimaryKey> primaryKeyList = new ArrayList<PrimaryKey>();
    Future<ReaderResult> readerResults = reader.addPrimaryKeysWithFuture("<TABLE_NAME2>", primaryKeyList);
  3. Query data.

    Send a request to query the data that is cached in the memory. You can query data in synchronous or asynchronous mode.

    • Query data in synchronous mode

      reader.flush();
    • Query data in asynchronous mode

      reader.send();
  4. Obtain the query result.

    // Display information about successful and failed queries. 
    for (RowReadResult success : readerResult.get().getSucceedRows()) {
        System.out.println(success.getRowResult());
    }
    
    for (RowReadResult fail : readerResult.get().getFailedRows()) {
        System.out.println(fail.getRowResult());
    }
  5. Close the TableStoreReader.

    reader.close();
    // Close the client and executor based on your business requirements. 
    client.shutdown();
    executor.shutdown();

Parameters

You can modify TableStoreReaderConfig to specify custom configurations for the TableStoreReader.

Parameter

Description

checkTableMeta

Specifies whether to check the schema of the table when you add the primary key values of the rows that you want to query. Default value: true.

If you do not want to check the schema of the table when you add the primary key values of the rows that you want to query, set this parameter to false.

bucketCount

The number of cache buckets in the memory of the TableStoreReader. Default value: 4.

bufferSize

The size of the RingBuffer for each bucket. Default value: 1024.

concurrency

The maximum concurrency that is allowed for the BatchGetRow operation. Default value: 10.

maxBatchRowsCount

The maximum number of rows that can be queried in a single BatchGetRow request. Default value: 100. Maximum value: 100.

defaultMaxVersions

The maximum number of data versions that can be returned for each row. Default value: 1.

flushInterval

The interval at which the data cached in the memory is automatically flushed. Default value: 10000. Unit: milliseconds.

logInterval

The interval at which the status of tasks is automatically printed. Default value: 10000. Unit: milliseconds.
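
To illustrate the effect of maxBatchRowsCount, the following sketch splits a list of buffered primary keys into batches that do not exceed a per-request row cap. This topic does not describe how TableStoreReader groups rows internally, so the sketch is only a conceptual model. The class name and the split method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitSketch {
    // Splits the buffered keys into consecutive batches of at most maxBatchRowsCount rows.
    static <T> List<List<T>> split(List<T> keys, int maxBatchRowsCount) {
        if (maxBatchRowsCount <= 0) {
            throw new IllegalArgumentException("maxBatchRowsCount must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += maxBatchRowsCount) {
            int end = Math.min(i + maxBatchRowsCount, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add(i);
        }
        for (List<Integer> batch : split(keys, 100)) {
            System.out.println(batch.size()); // prints 100, 100, 50 on separate lines
        }
    }
}
```

In this model, 250 buffered rows with a cap of 100 rows per request result in three requests.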

Specify query conditions

You can specify table-level parameters to query data, such as the maximum number of data versions, the columns that you want to query, and the time range within which you want to query data.

// Query data of up to 10 versions in the col1 column of the specified table within the previous 60 seconds. 
// Specify the name of the data table. 
RowQueryCriteria criteria = new RowQueryCriteria("<TABLE_NAME>");
// Specify the columns that you want to return. 
criteria.addColumnsToGet("col1");
// Specify the maximum number of versions that you want to return. 
criteria.setMaxVersions(10);
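// Specify the time range of the data versions that you want to return. In this example, the range covers the previous 60 seconds. 
long now = System.currentTimeMillis();
criteria.setTimeRange(new TimeRange(now - 60 * 1000, now));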
reader.setRowQueryCriteria(criteria);
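
The following sketch models how a time range and a maximum number of versions could jointly select the versions of a single column. It assumes that the time range is a left-closed, right-open interval of version timestamps and that the newest versions are returned first. The class and method names are hypothetical, and the code runs against an in-memory map rather than a Tablestore table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class VersionSelectSketch {
    // versions maps a version timestamp (in milliseconds) to the column value.
    // Returns at most maxVersions values in [startInclusive, endExclusive), newest first.
    static List<String> select(NavigableMap<Long, String> versions,
                               long startInclusive, long endExclusive, int maxVersions) {
        List<String> out = new ArrayList<>();
        for (String value : versions.subMap(startInclusive, true, endExclusive, false)
                .descendingMap().values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "v1");
        versions.put(200L, "v2");
        versions.put(300L, "v3");
        versions.put(400L, "v4");
        System.out.println(select(versions, 150L, 400L, 2)); // prints [v3, v2]
    }
}
```

In this model, the version with timestamp 400 is excluded because the end of the range is exclusive, and the version limit is applied after the time range.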

Complete sample code

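import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
// Also import the Tablestore SDK classes that are used below from the com.alicloud.openservices.tablestore packages.

public class TableStoreReaderDemo {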
    // Specify the name of the instance.
    private static final String instanceName = "yourInstanceName";
    // Specify the endpoint of the instance.
    private static final String endpoint = "yourEndpoint";
    // Obtain the AccessKey ID and AccessKey secret from the environment variables.
    private static final String accessKeyId = System.getenv("TABLESTORE_ACCESS_KEY_ID");
    private static final String accessKeySecret = System.getenv("TABLESTORE_ACCESS_KEY_SECRET");
    private static AsyncClientInterface client;
    private static ExecutorService executor;
    private static AtomicLong succeedRows = new AtomicLong();
    private static AtomicLong failedRows = new AtomicLong();

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        /**
         * Step 1: Construct the TableStoreReader. 
         */
        // Construct the AsyncClient. 
        client = new AsyncClient(endpoint, accessKeyId, accessKeySecret, instanceName);
        // Construct the configuration class of the TableStoreReader. 
        TableStoreReaderConfig config = new TableStoreReaderConfig();
        {
            // The following parameters have default values and can be left empty. 
            // Check the schema of the table before you add the primary key values of the rows that you want to query. 
            config.setCheckTableMeta(true);  
            // Specify the maximum number of rows that you can query by using a request. In this example, up to 100 rows can be queried by using a request. 
            config.setMaxBatchRowsCount(100);    
            // Specify the maximum number of versions that can be returned. 
            config.setDefaultMaxVersions(1);
            // Specify the maximum number of concurrent requests that can be sent. 
            config.setConcurrency(16); 
            // Specify the number of buckets in the memory. 
            config.setBucketCount(4);      
            // Specify the interval at which all cached data is flushed. 
            config.setFlushInterval(10000);      
            // Specify the interval at which the status of the TableStoreReader is recorded. 
            config.setLogInterval(10000);                   
        }
        // Construct an executor that is used to send the request. 
        ThreadFactory threadFactory = new ThreadFactory() {
            private final AtomicInteger counter = new AtomicInteger(1);

            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "reader-" + counter.getAndIncrement());
            }
        };
        executor = new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(1024), threadFactory, new ThreadPoolExecutor.CallerRunsPolicy());

        // Construct the callback function of the TableStoreReader. 
        TableStoreCallback<PrimaryKeyWithTable, RowReadResult> callback = new TableStoreCallback<PrimaryKeyWithTable, RowReadResult>() {
            @Override
            public void onCompleted(PrimaryKeyWithTable req, RowReadResult res) {
                succeedRows.incrementAndGet();
            }

            @Override
            public void onFailed(PrimaryKeyWithTable req, Exception ex) {
                failedRows.incrementAndGet();
            }
        };
        TableStoreReader reader = new DefaultTableStoreReader(client, config, executor, callback);
        /**
         * Step 2: Construct the request. 
         */
        // Add the primary key value of a row that you want to query to the memory. 
        PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
                .addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromLong(0))
                .addPrimaryKeyColumn("pk2", PrimaryKeyValue.fromLong(0))
                .build();
        reader.addPrimaryKey("<TABLE_NAME1>", pk1);

        // Add the primary key value of another row that you want to query to the memory and obtain a Future object. 
        PrimaryKey pk2 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
                .addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromLong(0))
                .addPrimaryKeyColumn("pk2", PrimaryKeyValue.fromLong(0))
                .build();
        Future<ReaderResult> readerResult = reader.addPrimaryKeyWithFuture("<TABLE_NAME2>", pk2);
        /**
         * Step 3: Query data. 
         */
        // Send data in the memory in asynchronous mode. 
        reader.send();
        /**
         * Step 4: Obtain the query result. 
         */
        // Display information about successful and failed queries. 
        for (RowReadResult success : readerResult.get().getSucceedRows()) {
            System.out.println(success.getRowResult());
        }
        for (RowReadResult fail : readerResult.get().getFailedRows()) {
            System.out.println(fail.getRowResult());
        }
        /**
         * Step 5: Close the TableStoreReader. 
         */
        reader.close();
        client.shutdown();
        executor.shutdown();
    }
}

References

  • If you want to use indexes to accelerate data queries, you can use the secondary index or search index feature. For more information, see Secondary index or Search index.

  • If you want to visualize data in a table, you can connect the table to DataV or Grafana. For more information, see Data visualization.

  • If you want to download data from a table to a local file, you can use DataX or the Tablestore CLI. For more information, see Download data in Tablestore to a local file.

  • If you want to compute and analyze data in a table, you can use the SQL query feature of Tablestore. For more information, see SQL query.

    Note

    You can also use compute engines, such as MaxCompute, Spark, Hive, HadoopMR, Function Compute, and Flink, to compute and analyze data in tables. For more information, see Overview.