Read data from a table - Tablestore - Alibaba Cloud Documentation Center

Note

When you read data from a table that has an auto-increment primary key column, you must provide the complete primary key, including the value of the auto-increment column. For more information, see Auto-increment primary key column.

Important

If you do not have the value of the auto-increment primary key column, you can perform a range query on the first primary key column to read the data.

Read a single row of data

You can call the GetRow operation to read a single row of data. This operation is useful if you know the complete primary key and need to read only a few rows.

When you read a single row, you can configure the following conditions to filter the data.

By default, all columns are returned. You can configure the operation to return only specific columns.
Use a filter to return rows that meet specific filter conditions. For more information, see Filters.
If you have configured multiple data versions for the table, you can specify the maximum number of versions to read. You can also read data from a specific time range or with a specific version number. For more information about data versions, see Data versions and lifecycle.

The read operation has two possible outcomes:

If the row exists, the operation returns the primary key columns and attribute columns of the row.
If the row does not exist, the result does not include the row, and no error is reported.

Batch read data

You can call the BatchGetRow operation to read multiple rows in a single request or to read data from multiple tables simultaneously. This operation is useful if you know the complete primary keys and need to read many rows or data from multiple tables.

A BatchGetRow operation consists of multiple GetRow sub-operations. The process of constructing a sub-operation is the same as that for the GetRow operation.

When you batch read data, you can configure the following conditions to filter the data.

Read data from multiple tables in a single request.

A single request can read a maximum of 100 rows.
By default, all columns are returned. You can configure the operation to return only specific columns.

All rows in a batch read use the same parameter conditions. For example, ColumnsToGet=[colA] means that only the colA column is retrieved for all specified rows.
Use a filter to retrieve rows that meet specific filter conditions. For more information, see Filters.
If you have configured multiple data versions for the table, you can specify the maximum number of versions to read. You can also read data from a specific time range or with a specific version number. For more information about data versions, see Data versions and lifecycle.

The sub-operations of a BatchGetRow operation are executed independently, and Tablestore returns the result for each sub-operation.
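
Because a single request can read at most 100 rows, a caller that holds more primary keys than that has to split them across several requests. The following is a minimal sketch of that splitting step only. The class name `BatchChunker` and its methods are hypothetical helpers, not part of the Tablestore SDK, and the keys are plain strings rather than SDK `PrimaryKey` objects.

```java
import java.util.ArrayList;
import java.util.List;

class BatchChunker {
    // Maximum number of rows per BatchGetRow request, as stated in this topic.
    static final int MAX_ROWS_PER_BATCH = 100;

    // Splits the keys into consecutive groups of at most maxPerBatch elements.
    static <T> List<List<T>> chunk(List<T> keys, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += maxPerBatch) {
            int end = Math.min(i + maxPerBatch, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("pk" + i);
        }
        // Each inner list would become the rows of one BatchGetRow request.
        List<List<String>> batches = chunk(keys, MAX_ROWS_PER_BATCH);
        System.out.println("Requests needed: " + batches.size());
    }
}
```

Each resulting group can then be added to its own request. Because the sub-operations are independent, a failure in one row does not affect the other rows in the same group.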

Read data in a range

You can call the GetRange operation to read data within a specified range. This operation is useful if you can determine a complete primary key range or a key prefix.

Note

In Tablestore tables, all rows are sorted by primary key. The primary key of a table sequentially consists of all primary key columns. Therefore, the rows are not sorted based on a specific primary key column.

The GetRange operation follows the leftmost matching principle. Tablestore compares values in sequence from the first primary key column to the last to decide whether the primary key of a row falls in the specified range. For example, the primary key of a data table consists of the primary key columns PK1, PK2, and PK3. When data is read, Tablestore first compares the PK1 value of a row with the start and end values specified for PK1. If the PK1 value lies strictly between them, the row is in the range, and Tablestore returns it without comparing PK2 or PK3. If the PK1 value equals the start or end value, Tablestore compares the PK2 value against that boundary in the same manner, and then PK3 if needed. If the PK1 value lies outside the start and end values, the row is not in the range.
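
One way to picture the leftmost matching principle is as an ordinary left-to-right comparison of whole primary keys, where the first column that differs decides the order. The sketch below is an illustration under simplifying assumptions: every primary key column is an integer, the range is left-closed and right-open, and the class `LeftmostMatch` is a hypothetical local model rather than SDK code.

```java
class LeftmostMatch {
    // Compares two primary keys column by column, from the first column to the last.
    // Both keys are assumed to have the same number of columns.
    static int compare(long[] a, long[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) {
                return c; // the first differing column decides; later columns are ignored
            }
        }
        return 0;
    }

    // Returns true if row falls in the range [startInclusive, endExclusive).
    static boolean inRange(long[] row, long[] startInclusive, long[] endExclusive) {
        return compare(row, startInclusive) >= 0 && compare(row, endExclusive) < 0;
    }

    public static void main(String[] args) {
        long[] start = {1, 5, 0};
        long[] end = {3, 2, 0};
        // PK1 = 2 lies strictly between 1 and 3, so PK2 and PK3 are never compared.
        System.out.println(inRange(new long[] {2, 999, 999}, start, end));
        // PK1 equals the start value, so PK2 decides: 4 < 5 places the row before the range.
        System.out.println(inRange(new long[] {1, 4, 0}, start, end));
    }
}
```

In this model, a row whose PK2 value looks "out of range" on its own, such as (2, 999, 999), is still returned, because only the combined primary key is compared against the range boundaries.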

When you read data in a range, you can configure the following conditions to filter the data.

Specify a key prefix and use the virtual points INF_MIN (infinity minimum) and INF_MAX (infinity maximum) for the other primary key columns. You can also specify a complete primary key range to read data.

Important
If you cannot determine the key prefix, you can perform a full table scan by setting the entire primary key range from INF_MIN to INF_MAX. This operation consumes a large amount of compute resources. Use it with caution.

If the range is large, the scan stops as soon as any of the following conditions is met. The operation returns the rows retrieved so far and the next primary key. You can use the returned next primary key to initiate another request that retrieves the remaining rows in the range.
- The amount of scanned data reaches 4 MB.
- The number of scanned rows reaches 5,000.
- The number of returned rows reaches the upper limit.
- The read throughput is insufficient to read the next row of data because all reserved read throughput is consumed.
Read a specified maximum number of rows in forward or reverse order. For example, you can read a maximum of 5 rows in forward order.
By default, all columns are returned. You can configure the operation to return only specific columns.
Use a filter to retrieve rows that meet specific filter conditions. For more information, see Filters.

All rows in a range read use the same parameter conditions. For example, ColumnsToGet=[colA] means that only the colA column is retrieved for all rows in the range.
If you have configured multiple data versions for the table, you can specify the maximum number of versions to read. You can also read data from a specific time range or with a specific version number. For more information about data versions, see Data versions and lifecycle.

When you use GetRange to scan a large volume of data, Tablestore performs only one scan per request. The scan stops when the number of scanned rows reaches 5,000 or the scanned data size reaches 4 MB, and rows beyond that point are not returned. Use the next primary key in the response to paginate through the remaining data.
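
The pagination pattern can be sketched without a Tablestore instance by modeling the table as a sorted in-memory map. Everything in this example is a hypothetical stand-in: `scanOnce` plays the role of one GetRange request, the `limit` argument stands in for the 5,000-row and 4 MB limits, and `nextStartKey` models the next primary key in the response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class PagedScan {
    // One page of results plus the key to resume from (null when the range is exhausted).
    static final class Page {
        final List<String> rows = new ArrayList<>();
        String nextStartKey;
    }

    // Stand-in for a single request: scans [start, end) and stops after limit rows.
    static Page scanOnce(NavigableMap<String, String> table, String start, String end, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        Page page = new Page();
        for (Map.Entry<String, String> e : table.subMap(start, true, end, false).entrySet()) {
            if (page.rows.size() == limit) {
                page.nextStartKey = e.getKey(); // first row that was not returned
                return page;
            }
            page.rows.add(e.getKey() + "=" + e.getValue());
        }
        return page;
    }

    // Repeats the scan from the returned key until no next key is reported.
    static List<String> readAll(NavigableMap<String, String> table, String start, String end, int limit) {
        List<String> all = new ArrayList<>();
        String cursor = start;
        while (cursor != null) {
            Page page = scanOnce(table, cursor, end, limit);
            all.addAll(page.rows);
            cursor = page.nextStartKey;
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            table.put(String.format("k%02d", i), "v" + i);
        }
        System.out.println(readAll(table, "k00", "k07", 3));
    }
}
```

The caller never computes the resume position itself. It always reuses the key reported by the previous page as the inclusive start of the next request.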

How to use

Use the console

You can use the console to query a single row or query data in a range.

Log on to the Tablestore console.
On the Overview page, find the instance and click Instance Management in the Actions column.
On the Instance Details tab, find the table in the Data Table List section and click Query/Search in the Actions column.
On the Data Management tab, click Query Data. Then, select whether to read a single row or read data in a range.
Read a single row of data
1. In the Query Data dialog box, set Query Range to Single Row Query and select the table to query.
2. By default, all columns are returned. To return only specific attribute columns, disable Get All Columns and enter the desired attribute columns.
  
  Separate multiple attribute columns with a comma (,).
3. Enter the Primary Key Value for the target row.
  
  The completeness and accuracy of the primary key value affect the query results.
4. Enter the Max Versions to specify the number of versions to return.
5. Click OK.
Read data in a range
1. In the Query Data dialog box, set Query Range to Range Query and select the table to query.
2. By default, all columns are returned. To return only specific attribute columns, disable Get All Columns and enter the desired attribute columns.
  
  Separate multiple attribute columns with a comma (,).
3. Enter the start and end primary key values.
4. Enter the Max Versions to specify the number of versions to return.
5. Set the sorting direction for the query result. You can select Forward Query or Backward Query.
6. Click OK.

Use the command-line interface (CLI)

You can use the command-line interface (CLI) to run the following commands to read data.

Run the get command to read a single row of data. For more information, see Read data.

The following example reads the row where the first primary key column has a value of "86" and the second primary key column has a value of 6771.
```
get --pk '["86",6771]'
```
Run the scan command to read data in a range. For more information, see Export data.

The following example reads data in reverse order from the primary key range between ["86",7000] and ["86",6770], and returns only the pid column.
```
scan --begin '["86",7000]' --end '["86",6770]' --backward --columns pid
```

Use an SDK

You can use the Java SDK, Go SDK, Python SDK, Node.js SDK, .NET SDK, or PHP SDK to read data. This section uses the Java SDK as an example.

Read a single row of data

When you read data, you can specify the data version, columns to read, filters, and regular expression filters.

Read the latest version of data and specific columns

The following sample code shows how to read the latest version of a row in a data table, first with all columns and then with only the specified columns.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);

    // Specify the columns that you want to read. 
    criteria.addColumnsToGet("Col0");
    getRowResponse = client.getRow(new GetRowRequest(criteria));
    row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}

Read data with a filter

The following sample code shows how to read the latest version of a row in a data table and use a filter so that the row is returned only if the value of the Col0 column is 0.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);

    // Configure a filter to return a row in which the value of the Col0 column is 0. 
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    // If the Col0 column does not exist, the row is not returned. 
    singleColumnValueFilter.setPassIfMissing(false);
    criteria.setFilter(singleColumnValueFilter);

    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}
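
The effect of setPassIfMissing can be easier to see in a local model of the filter decision. The sketch below is not SDK code: `PassIfMissingDemo` is a hypothetical class, the row is a plain map from column name to integer value, and the logic only mirrors the behavior described in the comments of the sample above (a row whose Col0 column is missing passes the filter only if passIfMissing is true).

```java
import java.util.HashMap;
import java.util.Map;

class PassIfMissingDemo {
    // Models an EQUAL filter on one column with a configurable passIfMissing flag.
    static boolean passes(Map<String, Long> row, String column, long expected, boolean passIfMissing) {
        Long actual = row.get(column);
        if (actual == null) {
            return passIfMissing; // the column does not exist in this row
        }
        return actual.longValue() == expected;
    }

    public static void main(String[] args) {
        Map<String, Long> rowWithoutCol0 = new HashMap<>();
        rowWithoutCol0.put("Col1", 7L);
        // With passIfMissing set to false, a row that lacks Col0 is filtered out.
        System.out.println(passes(rowWithoutCol0, "Col0", 0L, false));
    }
}
```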

Use a regular expression filter when reading data

The following sample code shows how to read a row in a data table and use a regular expression filter on the Col1 column to decide whether the row is returned.

private static void getRow(SyncClient client, String pkValue) {
    // Specify the name of the data table. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>");
 
    // Construct the primary key. 
    PrimaryKey primaryKey = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue))
        .build();
    criteria.setPrimaryKey(primaryKey);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);
 
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}
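
To make the expression cast<int>(regex(Col1)) > 100 more concrete, the following sketch evaluates the same rule locally with java.util.regex. It is only an approximation of the server-side filter: the class `RegexCastDemo` is hypothetical, the use of a substring search and the treatment of a non-matching value as "filtered out" are assumptions, and the sample column values are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCastDemo {
    // Same pattern as in the sample above: capture the digits that follow "t1:".
    static final Pattern RULE = Pattern.compile("t1:([0-9]+),");

    // Returns true if the captured number, cast to an integer, is greater than threshold.
    static boolean passes(String col1Value, long threshold) {
        Matcher m = RULE.matcher(col1Value);
        if (!m.find()) {
            return false; // assumption: a value that does not match is filtered out
        }
        // Long.parseLong throws NumberFormatException if the digits overflow a long.
        return Long.parseLong(m.group(1)) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(passes("t1:150,t2:3", 100));
        System.out.println(passes("t1:99,t2:500", 100));
    }
}
```

Only the first capture group is compared, so the value after t2 in the second call has no influence on the result.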

Batch read data

The following sample code shows how to read 10 rows of data in one request, with a version condition, a list of columns to read, and a filter.

private static void batchGetRow(SyncClient client) {
    // Specify the name of the data table. 
    MultiRowQueryCriteria multiRowQueryCriteria = new MultiRowQueryCriteria("<TABLE_NAME>");
    // Specify 10 rows that you want to read. 
    for (int i = 0; i < 10; i++) {
        PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("pk" + i));
        PrimaryKey primaryKey = primaryKeyBuilder.build();
        multiRowQueryCriteria.addRow(primaryKey);
    }
    // Add conditions. 
    multiRowQueryCriteria.setMaxVersions(1);
    multiRowQueryCriteria.addColumnsToGet("Col0");
    multiRowQueryCriteria.addColumnsToGet("Col1");
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    singleColumnValueFilter.setPassIfMissing(false);
    multiRowQueryCriteria.setFilter(singleColumnValueFilter);

    BatchGetRowRequest batchGetRowRequest = new BatchGetRowRequest();
    // The BatchGetRow operation allows you to read data from multiple tables. Each multiRowQueryCriteria parameter specifies query conditions for one table. You can add multiple multiRowQueryCriteria parameters to read data from multiple tables. 
    batchGetRowRequest.addMultiRowQueryCriteria(multiRowQueryCriteria);

    BatchGetRowResponse batchGetRowResponse = client.batchGetRow(batchGetRowRequest);

    System.out.println("Whether all operations are successful: " + batchGetRowResponse.isAllSucceed());
    System.out.println("Read complete. Result:");
    for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getSucceedRows()) {
        System.out.println(rowResult.getRow());
    }
    if (!batchGetRowResponse.isAllSucceed()) {
        for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getFailedRows()) {
            System.out.println("Failed row: " + batchGetRowRequest.getPrimaryKey(rowResult.getTableName(), rowResult.getIndex()));
            System.out.println("Cause of failure: " + rowResult.getError());
        }

        /**
         * You can use the createRequestForRetry method to construct another request to retry the operations on failed rows. In this example, only the retry request is constructed. 
         * We recommend that you use the custom retry policy in Tablestore SDKs as the retry method. This way, you can retry failed rows after batch operations are performed. After you specify the retry policy, you do not need to add retry code to call the operation. 
         */
        BatchGetRowRequest retryRequest = batchGetRowRequest.createRequestForRetry(batchGetRowResponse.getFailedRows());
    }
}

Read data in a range

Read data within a defined range

The following sample code shows how to read, in the forward direction, data whose primary key values are in the specified range. If nextStartPrimaryKey is null in the response, all data in the range has been read. Otherwise, send another request that starts from nextStartPrimaryKey, and repeat until it is null.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");

    // Specify the start primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(startPkValue));
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(endPkValue));
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Read data based on a range defined by the first primary key column

The following sample code shows how to read, in the forward direction, data within a range that is determined by the value of the first primary key column. In this example, the start value of the second primary key column is set to INF_MIN, and the end value of the second primary key column is set to INF_MAX. If nextStartPrimaryKey is null in the response, all data in the range has been read. Otherwise, send another request that starts from nextStartPrimaryKey, and repeat until it is null.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");
    // Specify the start primary key. In this example, two primary key columns are used. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    // Set the first primary key column to a specific value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(startPkValue));
    // Set the second primary key column to INF_MIN, the virtual point below every value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MIN);
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    // Set the first primary key column to a specific value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(endPkValue));
    // Set the second primary key column to INF_MAX, the virtual point above every value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MAX);
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Read data within a defined range and apply a regular expression filter to a specific column

The following sample code shows how to read rows whose pk value is in the range ["2020-01-01.log", "2021-01-01.log") and use a regular expression filter on the Col1 column to decide which rows are returned.

private static void getRange(SyncClient client) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria criteria = new RangeRowQueryCriteria("<TABLE_NAME>");
 
    // Read rows whose pk value is in ["2020-01-01.log", "2021-01-01.log"). The range is a left-closed, right-open interval. 
    PrimaryKey pk0 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2020-01-01.log"))
        .build();
    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2021-01-01.log"))
        .build();
    criteria.setInclusiveStartPrimaryKey(pk0);
    criteria.setExclusiveEndPrimaryKey(pk1);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);

    while (true) {
        GetRangeResponse resp = client.getRange(new GetRangeRequest(criteria));
        for (Row row : resp.getRows()) {
            // do something
            System.out.println(row);
        }
        if (resp.getNextStartPrimaryKey() != null) {
            criteria.setInclusiveStartPrimaryKey(resp.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Billing description

You are charged based on the number of Capacity Units (CUs) that an operation consumes. Charges for metered read and write CUs are separate from charges for reserved read and write CUs. The instance type determines whether metered or reserved CUs are consumed.

Note

For more information about instance types and CUs, see Instances and Read/write throughput.

Reading data consumes Read Capacity Units (RCUs) but not Write Capacity Units (WCUs). The consumed RCUs are calculated as follows:

RCUs consumed by a GetRow operation

The number of consumed RCUs is based on the total size of the read data, which is the sum of the data sizes of the primary key and the retrieved attribute columns. This total size is divided by 4 KB and rounded up to the nearest integer. If the specified row does not exist, the operation consumes one RCU.
RCUs consumed by a BatchGetRow operation

The RCUs for a BatchGetRow operation are calculated by treating each RowInBatchGetRowRequest as a separate GetRow operation.
RCUs consumed by a GetRange operation

The number of consumed RCUs is based on the total size of all scanned data, which is the sum of the data sizes of the primary keys and the scanned attribute columns for all rows in the range. This total size is divided by 4 KB and rounded up to the nearest integer. For example, if a scan covers 10 rows and the combined size of the primary key and the scanned attribute columns for each row is 330 bytes, the total data size is approximately 3.3 KB (10 rows × 330 bytes). Because 3.3 KB divided by 4 KB rounds up to 1, the operation consumes 1 RCU.
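
The rounding rule can be written as a small calculation. This is a rough estimator under stated assumptions, not a billing tool: the class `RcuEstimate` is hypothetical, 1 KB is taken to be 1,024 bytes, and the minimum of one RCU is taken from the missing-row case of GetRow and applied to every read, which is an assumption.

```java
class RcuEstimate {
    // Size of one capacity unit in bytes, assuming 1 KB = 1,024 bytes.
    static final long CU_UNIT_BYTES = 4L * 1024L;

    // Divides the total size by 4 KB and rounds up, with a minimum of one RCU.
    static long rcus(long totalBytes) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must not be negative");
        }
        long roundedUp = (totalBytes + CU_UNIT_BYTES - 1) / CU_UNIT_BYTES;
        return Math.max(1, roundedUp);
    }

    public static void main(String[] args) {
        // The GetRange example above: 10 rows of 330 bytes each.
        System.out.println(rcus(10 * 330)); // prints 1
    }
}
```

For a BatchGetRow operation, the same function would be applied to each row separately and the results added up, because each row is billed as its own GetRow operation.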