How do I read the data of a Wide Column model? - Tablestore

Tablestore provides multiple operations for you to read data. You can call the GetRow operation to read a single row of data, the BatchGetRow operation to read multiple rows of data at a time, and the GetRange operation to read data whose primary key values are in the specified range.

Note

Rows are the basic units of tables. Rows consist of primary key columns and attribute columns. Primary key columns are required for each row. Rows in a table contain primary key columns of the same names and same data types. Attribute columns are optional for each row. Rows in a table can contain different attribute columns. For more information, see Overview.

Usage notes

If you want to read data from a table that contains an auto-increment primary key column, make sure that you have obtained the values of all primary key columns, including the value of the auto-increment primary key column. For more information, see Configure an auto-increment primary key column.

Important

If you did not record the value of the auto-increment primary key column, you can call the GetRange operation and specify a range based on the value of the first primary key column to read the data.

Read a single row of data

You can call the GetRow operation to read a single row of data. This method is applicable to scenarios in which the values of all primary key columns of the row are known and the number of rows to be read is small.

When you read a single row of data, you can use the following methods to filter the data based on your business requirements:

Specify the columns that you want to read. If you do not specify a column, all columns are returned.
Use a filter to obtain the data that matches the filter conditions. For more information, see Configure a filter.
Specify the maximum number or time range of data versions that you want to read or the required version numbers if the max versions feature is enabled for the data table. For more information, see Data versions and TTL.

After you call the GetRow operation, one of the following results may be returned:

If the row exists, the primary key columns and attribute columns of the row are returned.
If the row does not exist, no row is returned and no error is reported.

Read multiple rows of data at a time

You can call the BatchGetRow operation to read multiple rows of data from one or more tables at a time. This method is applicable to scenarios in which the values of all primary key columns of each row are known and the number of rows to be read is large or data is to be read from multiple tables.

The BatchGetRow operation consists of multiple GetRow operations. You construct each of these GetRow operations in the same way as you construct a standalone GetRow request.

When you read multiple rows of data at a time, you can use the following methods to filter the data based on your business requirements:

Read data from multiple tables at a time.
You can call the BatchGetRow operation to read a maximum of 100 rows at a time.
Specify the columns that you want to read. If you do not specify a column, all columns are returned.
The BatchGetRow operation uses the same parameter settings for all rows. For example, if the ColumnsToGet parameter is set to [colA], only the value of the colA column is read from all rows.
Use a filter to obtain the data that matches the filter conditions. For more information, see Configure a filter.
Specify the maximum number or time range of data versions that you want to read or the required version numbers if the max versions feature is enabled for the data table. For more information, see Data versions and TTL.

If you call the BatchGetRow operation, each GetRow operation is separately performed, and Tablestore separately returns the response to each GetRow operation.
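
Because a single BatchGetRow call reads at most 100 rows, a larger set of primary keys has to be spread across several calls. The following is a minimal sketch of that splitting step only. The class name KeyBatcher, the partition method, and the use of plain strings as stand-ins for primary keys are assumptions made for illustration and are not part of Tablestore SDKs.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyBatcher {
    // Maximum number of rows per BatchGetRow call, as stated in this topic.
    static final int MAX_ROWS_PER_BATCH = 100;

    // Splits keys into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            keys.add("pk" + i);
        }
        // Each sublist would become the row list of one BatchGetRow request.
        for (List<String> batch : partition(keys, MAX_ROWS_PER_BATCH)) {
            System.out.println("batch of " + batch.size() + " rows, first key " + batch.get(0));
        }
    }
}
```

Each sublist maps to one request. Because Tablestore returns a separate result for each row, the caller still has to check the result of every row in every batch.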

Read data whose primary key values are in the specified range

You can call the GetRange operation to read data whose primary key values are in the specified range. This method is applicable to scenarios in which the range of all primary key columns of a table or the prefix of primary key columns can be determined.

Note

In Tablestore tables, all rows are sorted by the primary key. The primary key of a table sequentially consists of all primary key columns. Therefore, rows are sorted by the full primary key, not by any single primary key column.

The GetRange operation follows the leftmost matching principle. Tablestore compares values in sequence from the first primary key column to the last primary key column to determine whether the primary key of a row is in the specified range. For example, the primary key of a data table consists of the following primary key columns: PK1, PK2, and PK3. When data is read, Tablestore first compares the PK1 value of a row with the PK1 values of the start and end primary keys. If the PK1 value of the row is strictly between the two boundary values, the row is in the range and Tablestore returns the row without comparing PK2 and PK3. If the PK1 value of the row is equal to a boundary value, Tablestore continues to compare the PK2 value, and then the PK3 value if necessary, with that boundary in the same manner. If the PK1 value of the row is outside the two boundary values, the row is not returned.
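
The column-by-column comparison can be pictured as an ordinary tuple comparison. The following sketch is an illustration under stated assumptions, not Tablestore code: primary keys are modeled as arrays of long values, the range is treated as a left-closed and right-open interval, and the names PrimaryKeyRange, compare, and contains are hypothetical.

```java
final class PrimaryKeyRange {
    // Compares two primary keys column by column, starting from the first column.
    static int compare(long[] a, long[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("keys must have the same number of columns");
        }
        for (int i = 0; i < a.length; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) {
                return c; // the first column that differs decides the order
            }
        }
        return 0;
    }

    // Returns true if start <= key < end in column-by-column order.
    static boolean contains(long[] start, long[] end, long[] key) {
        return compare(start, key) <= 0 && compare(key, end) < 0;
    }

    public static void main(String[] args) {
        long[] start = {1, 5, 0};
        long[] end = {3, 2, 0};
        // PK1 lies strictly between the boundaries, so PK2 and PK3 are not examined.
        System.out.println(contains(start, end, new long[] {2, 99, 99}));
        // PK1 equals the start boundary, so PK2 decides.
        System.out.println(contains(start, end, new long[] {1, 4, 0}));
        // PK1 equals the end boundary, so PK2 decides.
        System.out.println(contains(start, end, new long[] {3, 1, 7}));
    }
}
```

In this model, the values of later columns matter only when every earlier column equals the corresponding boundary value.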

When you read data whose primary key values are in the specified range, you can use the following methods to filter the data based on your business requirements:

If the prefix of primary key columns can be determined, use virtual columns whose data is of the INF_MIN type and INF_MAX type to specify the range of the remaining primary key columns, or specify the range of all primary key columns to read data.
Important
If you cannot determine the prefix of primary key columns, you can specify the start primary key column whose data is of the INF_MIN type and the end primary key column whose data is of the INF_MAX type to determine the range of all primary key columns of a table. This operation scans all data in the table and consumes a large amount of computing resources. Proceed with caution.
If the range is large and the number of scanned rows or the volume of scanned data exceeds the upper limit, the scan stops, and the rows that are read and information about the primary key of the next row are returned. You can initiate a request to start from where the last operation left off and read the remaining rows based on the information about the primary key of the next row returned by the previous operation.
If one of the following conditions is met, the GetRange operation may stop and return data:
- The amount of scanned data reaches 4 MB.
- The number of scanned rows reaches 5,000.
- The number of returned rows reaches the upper limit.
- The read throughput is insufficient to read the next row of data because all reserved read throughput is consumed.
Specify the maximum number of rows that can be returned in the forward or backward direction. For example, specify that up to five rows of data are returned in the forward direction.
Specify the columns that you want to read. If you do not specify a column, all columns are returned.
Use a filter to obtain the data that matches the filter conditions. For more information, see Configure a filter.
The GetRange operation uses the same parameter settings for all rows. For example, if the ColumnsToGet parameter is set to [colA], only the value of the colA column is read from all rows.
Specify the maximum number or time range of data versions that you want to read or the required version numbers if the max versions feature is enabled for the data table. For more information, see Data versions and TTL.
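
To illustrate how INF_MIN and INF_MAX turn a known prefix into a range, the following sketch models a two-column primary key in which the first column is fixed and the second column is bounded by the two virtual values. It is a simplified model, not SDK code: the PrefixRange class, its Col type, and the use of long values for both columns are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixRange {
    // Kind of a column value. Declaration order gives INF_MIN < VALUE < INF_MAX.
    enum Kind { INF_MIN, VALUE, INF_MAX }

    // A column value: a concrete long, or one of the two virtual bounds.
    static final class Col {
        static final Col MIN = new Col(Kind.INF_MIN, 0);
        static final Col MAX = new Col(Kind.INF_MAX, 0);
        final Kind kind;
        final long value;

        private Col(Kind kind, long value) {
            this.kind = kind;
            this.value = value;
        }

        static Col of(long value) {
            return new Col(Kind.VALUE, value);
        }
    }

    static int compareCol(Col a, Col b) {
        if (a.kind != b.kind) {
            return a.kind.compareTo(b.kind);
        }
        return a.kind == Kind.VALUE ? Long.compare(a.value, b.value) : 0;
    }

    // Column-by-column comparison of two keys with the same number of columns.
    static int compare(Col[] a, Col[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = compareCol(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    static Col[] key(long pk1, long pk2) {
        return new Col[] {Col.of(pk1), Col.of(pk2)};
    }

    // Returns the rows whose first primary key column equals pk1.
    static List<Col[]> scanPrefix(List<Col[]> rows, long pk1) {
        Col[] start = {Col.of(pk1), Col.MIN}; // smaller than any real (pk1, x)
        Col[] end = {Col.of(pk1), Col.MAX};   // greater than any real (pk1, x)
        List<Col[]> hits = new ArrayList<>();
        for (Col[] row : rows) {
            if (compare(start, row) <= 0 && compare(row, end) < 0) {
                hits.add(row);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Col[]> rows = new ArrayList<>();
        rows.add(key(85, 1));
        rows.add(key(86, 6771));
        rows.add(key(87, 0));
        System.out.println(scanPrefix(rows, 86).size() + " row(s) share the prefix 86");
    }
}
```

Because the virtual bounds sort below and above every concrete value, the caller does not need to know the smallest or largest value that the second column can take.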

Each GetRange call performs a single scan. If the range covers a large amount of data, the scan stops when the number of scanned rows reaches 5,000 or the size of scanned data reaches 4 MB, and the remaining data that meets the query conditions is not returned by that call. To obtain the remaining data, use the paging method: initiate further calls that start from the primary key of the next row returned by the previous call.
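
The paging method can be sketched without the SDK by simulating a table with a sorted map. In the following illustration, scanOnce is a hypothetical stand-in for one GetRange call, a small scanLimit replaces the real limits of 5,000 rows or 4 MB, and keys are plain long values. All names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class PagedRangeRead {
    // One page of results plus the key to resume from (null when the range is exhausted).
    static final class Page {
        final List<String> rows = new ArrayList<>();
        Long nextStart;
    }

    // Stand-in for one GetRange call: reads at most scanLimit rows from [start, end).
    static Page scanOnce(NavigableMap<Long, String> table, long start, long end, int scanLimit) {
        if (scanLimit < 1) {
            throw new IllegalArgumentException("scanLimit must be at least 1");
        }
        Page page = new Page();
        for (Map.Entry<Long, String> e : table.subMap(start, true, end, false).entrySet()) {
            if (page.rows.size() == scanLimit) {
                page.nextStart = e.getKey(); // first row that was not read
                return page;
            }
            page.rows.add(e.getValue());
        }
        return page;
    }

    // Repeats the call from the returned key until no next key is reported.
    static List<String> readAll(NavigableMap<Long, String> table, long start, long end, int scanLimit) {
        List<String> all = new ArrayList<>();
        Long cursor = start;
        while (cursor != null) {
            Page page = scanOnce(table, cursor, end, scanLimit);
            all.addAll(page.rows);
            cursor = page.nextStart;
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> table = new TreeMap<>();
        for (long k = 0; k < 12; k++) {
            table.put(k, "row" + k);
        }
        System.out.println(readAll(table, 2, 10, 3));
    }
}
```

The loop has the same shape as the SDK samples later in this topic: the key returned by one call becomes the inclusive start of the next call, and a null key ends the loop.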

Usage methods

Read data by using the Tablestore console

You can use the Tablestore console to read a single row of data or read data whose primary key values are in the specified range.

Log on to the Tablestore console.
On the Overview page, find the instance that you want to manage and click Manage Instance in the Actions column.
In the Tables section of the Instance Details tab, find the data table whose data you want to read and click Query in the Actions column.
On the Query Data tab, click Search and perform the following operations based on your business requirements.
Read a single row of data
1. In the Search dialog box, set the Modes parameter to Get Row and select the table whose data you want to query.
2. By default, all columns are returned. If you want to specify the attribute columns that you want to read, turn off All Columns and enter the names of the attribute columns that you want to read.
  Separate multiple attribute columns with commas (,).
3. Configure the Primary Key Value parameter of the row that you want to query.
  Specify complete and accurate primary key values. Incomplete or incorrect values affect the query results.
4. Configure the Max Versions parameter to specify the maximum number of versions to return.
5. Click OK.
Read data whose primary key values are in the specified range
1. In the Search dialog box, set the Modes parameter to Range Search and select the table whose data you want to query.
2. By default, all columns are returned. If you want to specify the attribute columns that you want to read, turn off All Columns and enter the names of the attribute columns that you want to read.
  Separate multiple attribute columns with commas (,).
3. Configure the Start Primary Key Column and End Primary Key Column parameters.
4. Configure the Max Versions parameter to specify the maximum number of versions to return.
5. Set the Sequence parameter to Forward Search or Backward Search.
6. Click OK.

Read data by using the Tablestore CLI

You can run the following commands to read data by using the Tablestore CLI.

Run the get command to read a single row of data. For more information, see the "Read data" section of the Operations on data topic.
The following sample code provides an example on how to read the row in which the value of the first primary key column is "86" and the value of the second primary key column is 6771.
```
get --pk '["86",6771]'
```
Run the scan command to read data whose primary key values are in the specified range. For more information, see the "Export data" section of the Operations on data topic.
The following sample code provides an example on how to read data whose primary key values are in the range from ["86",7000] to ["86",6770] in the backward direction and return only data in the pid column.
```
scan --begin '["86",7000]' --end '["86",6770]' --backward --columns pid
```

Read data by using Tablestore SDKs

You can use Tablestore SDK for Java, Tablestore SDK for Go, Tablestore SDK for Python, Tablestore SDK for Node.js, Tablestore SDK for .NET, and Tablestore SDK for PHP to read data. In this example, Tablestore SDK for Java is used to read data.

Read a single row of data

You can specify the data version and the columns that you want to read, and filter the data by using a filter or a regular expression.

Read data of the latest version from the specified columns of a row

The following sample code provides an example on how to read data of the latest version from the specified columns of a row in a data table.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);

    // Specify the columns that you want to read. 
    criteria.addColumnsToGet("Col0");
    getRowResponse = client.getRow(new GetRowRequest(criteria));
    row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}

Use a filter to filter data that is read

The following sample code provides an example on how to read data of the latest version from a row in a data table and use a filter to filter data based on the value of the Col0 column.

private static void getRow(SyncClient client, String pkValue) {
    // Construct the primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue));
    PrimaryKey primaryKey = primaryKeyBuilder.build();

    // Specify the table name and primary key to read a row of data. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>", primaryKey);
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);

    // Configure a filter to return a row in which the value of the Col0 column is 0. 
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    // If the Col0 column does not exist, the row is not returned. 
    singleColumnValueFilter.setPassIfMissing(false);
    criteria.setFilter(singleColumnValueFilter);

    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}
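
The effect of the passIfMissing setting in the preceding sample can be pictured with a small model. The following sketch is not SDK code: it represents a row as a map from column names to long values, and the class PassIfMissingSketch and its passes method are hypothetical names used only to show the decision that the filter makes.

```java
import java.util.HashMap;
import java.util.Map;

final class PassIfMissingSketch {
    // Returns true if the row would be returned by an equality filter on the given column.
    static boolean passes(Map<String, Long> row, String column, long expected, boolean passIfMissing) {
        Long actual = row.get(column);
        if (actual == null) {
            // The column does not exist in this row, so the setting alone decides.
            return passIfMissing;
        }
        return actual.longValue() == expected;
    }

    public static void main(String[] args) {
        Map<String, Long> withCol0 = new HashMap<>();
        withCol0.put("Col0", 0L);
        Map<String, Long> withoutCol0 = new HashMap<>();
        withoutCol0.put("Col1", 7L);

        System.out.println(passes(withCol0, "Col0", 0, false));
        System.out.println(passes(withoutCol0, "Col0", 0, false));
        System.out.println(passes(withoutCol0, "Col0", 0, true));
    }
}
```

With passIfMissing set to false, as in the sample, a row that lacks the Col0 column is treated the same as a row whose Col0 value does not match.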

Use a regular expression to filter data that is read

The following sample code provides an example on how to read the data of the Col1 column from a row in a data table and use a regular expression to filter data in the column.

private static void getRow(SyncClient client, String pkValue) {
    // Specify the name of the data table. 
    SingleRowQueryCriteria criteria = new SingleRowQueryCriteria("<TABLE_NAME>");
 
    // Construct the primary key. 
    PrimaryKey primaryKey = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(pkValue))
        .build();
    criteria.setPrimaryKey(primaryKey);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);
 
    GetRowResponse getRowResponse = client.getRow(new GetRowRequest(criteria));
    Row row = getRowResponse.getRow();

    System.out.println("Read complete. Result:");
    System.out.println(row);
}
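
To see what the condition cast<int>(regex(Col1)) > 100 evaluates, the same extraction can be reproduced locally with java.util.regex. The following sketch is an approximation for illustration only: the class RegexFilterSketch is a hypothetical name, the sample column values are invented, and treating a value that does not match the pattern as a failed condition is an assumption rather than documented server behavior.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexFilterSketch {
    // Same pattern as in the sample: captures the digits between "t1:" and ",".
    private static final Pattern T1 = Pattern.compile("t1:([0-9]+),");

    // Extracts the captured number, or returns empty if the pattern is not found.
    // The captured digits are assumed to fit in a long.
    static OptionalLong extractT1(String columnValue) {
        Matcher m = T1.matcher(columnValue);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(m.group(1)));
    }

    // Mirrors the GREATER_THAN comparison against a threshold.
    static boolean passes(String columnValue, long threshold) {
        OptionalLong value = extractT1(columnValue);
        return value.isPresent() && value.getAsLong() > threshold;
    }

    public static void main(String[] args) {
        System.out.println(passes("t1:120,t2:5", 100));
        System.out.println(passes("t1:100,t2:5", 100));
        System.out.println(passes("t2:500,", 100));
    }
}
```

Note that the pattern requires the trailing comma, so a value such as "t1:120" with no comma after the digits does not match in this sketch.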

Read multiple rows of data at a time

The following sample code provides an example on how to configure the version conditions, columns to read, and filters to read 10 rows of data.

private static void batchGetRow(SyncClient client) {
    // Specify the name of the data table. 
    MultiRowQueryCriteria multiRowQueryCriteria = new MultiRowQueryCriteria("<TABLE_NAME>");
    // Specify 10 rows that you want to read. 
    for (int i = 0; i < 10; i++) {
        PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
        primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("pk" + i));
        PrimaryKey primaryKey = primaryKeyBuilder.build();
        multiRowQueryCriteria.addRow(primaryKey);
    }
    // Add conditions. 
    multiRowQueryCriteria.setMaxVersions(1);
    multiRowQueryCriteria.addColumnsToGet("Col0");
    multiRowQueryCriteria.addColumnsToGet("Col1");
    SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("Col0",
            SingleColumnValueFilter.CompareOperator.EQUAL, ColumnValue.fromLong(0));
    singleColumnValueFilter.setPassIfMissing(false);
    multiRowQueryCriteria.setFilter(singleColumnValueFilter);

    BatchGetRowRequest batchGetRowRequest = new BatchGetRowRequest();
    // BatchGetRow allows you to read data from multiple tables. Each multiRowQueryCriteria parameter specifies query conditions for one table. You can add multiple multiRowQueryCriteria parameters to read data from multiple tables. 
    batchGetRowRequest.addMultiRowQueryCriteria(multiRowQueryCriteria);

    BatchGetRowResponse batchGetRowResponse = client.batchGetRow(batchGetRowRequest);

    System.out.println("Whether all operations are successful:" + batchGetRowResponse.isAllSucceed());
    System.out.println("Read complete. Result:");
    for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getSucceedRows()) {
        System.out.println(rowResult.getRow());
    }
    if (!batchGetRowResponse.isAllSucceed()) {
        for (BatchGetRowResponse.RowResult rowResult : batchGetRowResponse.getFailedRows()) {
            System.out.println("Failed rows:" + batchGetRowRequest.getPrimaryKey(rowResult.getTableName(), rowResult.getIndex()));
            System.out.println("Cause of failures:" + rowResult.getError());
        }

        /**
         * You can use the createRequestForRetry method to construct another request to retry the operations on failed rows. Only the retry request is constructed here. 
         * We recommend that you use the custom retry policy in Tablestore SDKs as the retry method. This feature allows you to retry failed rows after batch operations. After you set the retry policy, you do not need to add retry code to call the operation. 
         */
        BatchGetRowRequest retryRequest = batchGetRowRequest.createRequestForRetry(batchGetRowResponse.getFailedRows());
    }
}

Read data whose primary key values are in the specified range

The following sample code provides an example on how to read data whose primary key values are in the specified range in the forward direction. If the value of the nextStartPrimaryKey parameter is null in the response, all data whose primary key values are in the specified range is read.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");

    // Specify the start primary key. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(startPkValue));
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString(endPkValue));
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Read data within the range determined by the value of the first primary key column

The following sample code provides an example on how to read data in the forward direction within the range determined by the value of the first primary key column. The start value of the second primary key column is of the INF_MIN type. The end value of the second primary key column is of the INF_MAX type. If the value of the nextStartPrimaryKey parameter is null in the response, all data in the specified range is read.

private static void getRange(SyncClient client, String startPkValue, String endPkValue) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria rangeRowQueryCriteria = new RangeRowQueryCriteria("<TABLE_NAME>");
    // Specify the start primary key. In this example, two primary key columns are used. 
    PrimaryKeyBuilder primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(startPkValue)); // Set the value of the first primary key column to a specific value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MIN); // Set the value of the second primary key column to an infinitely small value.
    rangeRowQueryCriteria.setInclusiveStartPrimaryKey(primaryKeyBuilder.build());

    // Specify the end primary key. 
    primaryKeyBuilder = PrimaryKeyBuilder.createPrimaryKeyBuilder();
    primaryKeyBuilder.addPrimaryKeyColumn("pk1", PrimaryKeyValue.fromString(endPkValue)); // Set the value of the first primary key column to a specific value.
    primaryKeyBuilder.addPrimaryKeyColumn("pk2", PrimaryKeyValue.INF_MAX); // Set the value of the second primary key column to an infinitely great value.
    rangeRowQueryCriteria.setExclusiveEndPrimaryKey(primaryKeyBuilder.build());

    rangeRowQueryCriteria.setMaxVersions(1);

    System.out.println("GetRange result:");
    while (true) {
        GetRangeResponse getRangeResponse = client.getRange(new GetRangeRequest(rangeRowQueryCriteria));
        for (Row row : getRangeResponse.getRows()) {
            System.out.println(row);
        }

        // If the value of the nextStartPrimaryKey parameter is not null, continue the read operation. 
        if (getRangeResponse.getNextStartPrimaryKey() != null) {
            rangeRowQueryCriteria.setInclusiveStartPrimaryKey(getRangeResponse.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Read data whose primary key values are in the specified range and use a regular expression to filter data in the specified column

The following sample code provides an example on how to read data whose values of the pk primary key column are in the range of ["2020-01-01.log", "2021-01-01.log") and use a regular expression to filter the data based on the Col1 column.

private static void getRange(SyncClient client) {
    // Specify the name of the data table. 
    RangeRowQueryCriteria criteria = new RangeRowQueryCriteria("<TABLE_NAME>");
 
    // Specify ["2020-01-01.log", "2021-01-01.log") as the range of the pk primary key column of the data that you want to read. The range is a left-closed and right-open interval.
    PrimaryKey pk0 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2020-01-01.log"))
        .build();
    PrimaryKey pk1 = PrimaryKeyBuilder.createPrimaryKeyBuilder()
        .addPrimaryKeyColumn("pk", PrimaryKeyValue.fromString("2021-01-01.log"))
        .build();
    criteria.setInclusiveStartPrimaryKey(pk0);
    criteria.setExclusiveEndPrimaryKey(pk1);
 
    // Set the MaxVersions parameter to 1 to read the latest version of data. 
    criteria.setMaxVersions(1);
 
    // Configure a filter. A row is returned when cast<int>(regex(Col1)) is greater than 100. 
    RegexRule regexRule = new RegexRule("t1:([0-9]+),", RegexRule.CastType.VT_INTEGER);
    SingleColumnValueRegexFilter filter = new SingleColumnValueRegexFilter("Col1",
        regexRule, SingleColumnValueRegexFilter.CompareOperator.GREATER_THAN, ColumnValue.fromLong(100));
    criteria.setFilter(filter);

    while (true) {
        GetRangeResponse resp = client.getRange(new GetRangeRequest(criteria));
        for (Row row : resp.getRows()) {
            // Process the row. In this example, the row is printed.
            System.out.println(row);
        }
        if (resp.getNextStartPrimaryKey() != null) {
            criteria.setInclusiveStartPrimaryKey(resp.getNextStartPrimaryKey());
        } else {
            break;
        }
    }
}

Billing

You are charged based on the number of capacity units (CUs) consumed by an operation. The consumed CUs may include the metered read and write CUs and the reserved read and write CUs based on the instance type.

Note

For more information about instance types and CUs, see Instance and Read and write throughput.

The read operations do not consume write CUs. The number of read CUs consumed varies based on the operations that you call.

Read CUs consumed by the GetRow operation
The number of read CUs consumed is rounded up from the calculation result of the following formula: Number of read CUs consumed = (Size of the data in all primary key columns of the row + Size of the data in the attribute columns that are actually read)/4 KB. If the specified row does not exist, one read CU is consumed by the operation.
Read CUs consumed by the BatchGetRow operation
Each RowInBatchGetRowRequest operation is considered as one GetRow operation. The number of read CUs consumed by the BatchGetRow operation is equal to the total number of read CUs consumed by all the GetRow operations that constitute the BatchGetRow operation.
Read CUs consumed by the GetRange operation
The number of read CUs consumed by the GetRange operation is calculated from the start point of the range to the start point of the next row that is unread. The number of read CUs consumed by the GetRange operation is rounded up from the calculation result of the following formula: Number of read CUs consumed = (Size of the data in all primary key columns of the rows that meet the query conditions + Size of the data in the attribute columns that are actually read)/4 KB. For example, if 10 rows that meet the query conditions are read and, for each row, the sum of the size of the data in all primary key columns and the size of the data in the attribute columns that are actually read is 330 bytes, the total size is about 3.3 KB and the number of read CUs consumed is rounded up from the calculation result of the following formula: Number of read CUs consumed = (3.3 KB/4 KB). In this case, the GetRange operation consumes one read CU.
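
The rounding rules above can be checked with a short calculation. The following sketch is an estimate under stated assumptions, not a billing tool: the class ReadCuEstimator and its method names are hypothetical, 4 KB is taken as 4,096 bytes, and the text above does not state what is charged when a range contains no rows, so that case is not modeled.

```java
final class ReadCuEstimator {
    // Assumption: 4 KB is taken as 4,096 bytes.
    static final long CU_UNIT_BYTES = 4L * 1024;

    // Rounds bytes / 4 KB up to the next whole number.
    static long ceilCu(long bytes) {
        return (bytes + CU_UNIT_BYTES - 1) / CU_UNIT_BYTES;
    }

    // GetRow: one read CU if the row does not exist, otherwise the rounded-up size.
    static long getRowCu(boolean rowExists, long primaryKeyBytes, long attributeBytesRead) {
        if (!rowExists) {
            return 1;
        }
        return ceilCu(primaryKeyBytes + attributeBytesRead);
    }

    // BatchGetRow: each row is metered as its own GetRow, then the results are summed.
    // Every entry is the total size in bytes read for one existing row.
    static long batchGetRowCu(long[] bytesPerExistingRow) {
        long total = 0;
        for (long bytes : bytesPerExistingRow) {
            total += ceilCu(bytes);
        }
        return total;
    }

    // GetRange: the sizes of all rows that are read are added first, then rounded up once.
    static long getRangeCu(long totalBytesRead) {
        return ceilCu(totalBytesRead);
    }

    public static void main(String[] args) {
        System.out.println("GetRange, 3,300 bytes in total: " + getRangeCu(3300));
        System.out.println("GetRow, 4,100 bytes: " + getRowCu(true, 100, 4000));
        System.out.println("GetRow, row missing: " + getRowCu(false, 0, 0));
    }
}
```

The difference between the two multi-row operations is where the rounding happens: BatchGetRow rounds up per row, whereas GetRange rounds up once for the whole call, so many small rows cost fewer read CUs when they are read by using GetRange.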