Read data in a range by using the Tablestore Python SDK

Prerequisites

Method

def get_range(self, table_name, direction,
              inclusive_start_primary_key,
              exclusive_end_primary_key,
              columns_to_get=None,
              limit=None,
              column_filter=None,
              max_version=1,
              time_range=None,
              start_column=None,
              end_column=None,
              token=None,
              transaction_id=None):

Parameters

Parameter	Type	Description
table_name (Required)	str	The name of the data table.
inclusive_start_primary_key (Required)	List[Tuple]	The start primary key of the range. The key includes the names and values of the primary key columns. The result includes the row that corresponds to the start primary key. The number and data types of the primary key columns must match the table schema. For a forward read, the start primary key must be less than the end primary key. For a backward read, the start primary key must be greater than the end primary key. `INF_MIN` represents negative infinity, and `INF_MAX` represents positive infinity.
exclusive_end_primary_key (Required)	List[Tuple]	The end primary key of the range. The key includes the names and values of the primary key columns. The result excludes the row that corresponds to the end primary key. The number and data types of the primary key columns must match the table schema. `INF_MIN` represents negative infinity, and `INF_MAX` represents positive infinity.
direction (Required)	Direction	The read direction. `FORWARD`: Performs a forward read, which reads data in ascending order of primary key values. `BACKWARD`: Performs a backward read, which reads data in descending order of primary key values. The parameter is required and has no default value in the method signature.
max_version (Optional)	int	The maximum number of versions to return for each column. The default value is 1. You must specify either `max_version` or `time_range`. If the number of matching data versions exceeds the maximum number of versions, Tablestore returns the newest versions up to that limit.
time_range (Optional)	Tuple	The version range of the data. You must specify either `max_version` or `time_range`. Each attribute column in a Tablestore table can have multiple versions. If you specify a time range, only data within that range is returned.
limit (Optional)	int	The maximum number of rows to return in a single request. This value must be greater than 0. If the number of matching rows exceeds this limit, the response includes the specified number of rows and the start primary key for the next request.
columns_to_get (Optional)	List[str]	The data columns to read. Specify primary key columns or attribute columns. If you omit this parameter, the method returns the entire row. If a returned row does not contain any of the specified data columns, the row is still returned, but its list of attribute columns is empty.
column_filter (Optional)	ColumnCondition	The filter condition. For more information, see Filter. If you specify both `columns_to_get` and `column_filter`, the system first retrieves the columns specified in `columns_to_get` and then applies the `column_filter` to the results.
transaction_id (Optional)	str	The local transaction ID. This ID uniquely identifies a local transaction. For more information, see Local transactions.
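
The ordering rule for the start and end primary keys can be illustrated with a small helper. This is only a sketch: `full_scan_bounds` is a hypothetical name, not part of the SDK, and the infinity markers are passed in as arguments so that the snippet runs without the SDK installed. In real code you would pass the SDK's `INF_MIN` and `INF_MAX`.

```python
def full_scan_bounds(pk_names, backward, inf_min, inf_max):
    """Return (start, end) primary keys that cover the whole table.

    Hypothetical helper, not an SDK function.
    A forward read runs from negative to positive infinity;
    a backward read runs from positive to negative infinity.
    """
    low, high = (inf_max, inf_min) if backward else (inf_min, inf_max)
    start = [(name, low) for name in pk_names]
    end = [(name, high) for name in pk_names]
    return start, end


# Placeholder strings stand in for INF_MIN / INF_MAX in this demo.
start, end = full_scan_bounds(['id'], backward=True, inf_min='MIN', inf_max='MAX')
print(start, end)
```

The returned pair would then be passed to `get_range` as `inclusive_start_primary_key` and `exclusive_end_primary_key`, together with `Direction.BACKWARD` or `Direction.FORWARD`.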

Examples

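The following example reads all rows from the test_table table whose primary key value is greater than or equal to row1. The start primary key is inclusive, so the row1 row itself is returned if it exists.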

from tablestore import Direction, INF_MAX

# `client` is assumed to be an initialized OTSClient instance.
try:
    # Set the start primary key for the query. The start primary key is inclusive.
    inclusive_start_primary_key = [('id', 'row1')]
    # Set the end primary key for the query. The end primary key is exclusive.
    exclusive_end_primary_key = [('id', INF_MAX)]

    # Call the get_range method to query data.
    consumed, next_start_primary_key, row_list, next_token = client.get_range(
        'test_table', Direction.FORWARD,
        inclusive_start_primary_key, exclusive_end_primary_key)

    # Process the results.
    print('* Read CU Cost: %s' % consumed.read)
    print('* Write CU Cost: %s' % consumed.write)
    print('* Rows Data:')
    for row in row_list:
        print(row.primary_key, row.attribute_columns)
except Exception as e:
    print("Range get failed with error: %s" % e)
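
When processing results, it is often convenient to turn each row into a plain dictionary. The following sketch assumes that `row.primary_key` is a list of `(name, value)` tuples and that `row.attribute_columns` is a list of `(name, value, timestamp)` tuples, and that only one version per column is returned (`max_version=1`). `row_to_dict` is a hypothetical helper, and the sample row is a stand-in object rather than a real SDK row.

```python
from types import SimpleNamespace


def row_to_dict(row):
    """Flatten one row into {column_name: value}.

    Hypothetical helper. Assumes one version per column; if several
    versions are present, the last one seen overwrites earlier ones.
    """
    record = {}
    for pk in row.primary_key:
        record[pk[0]] = pk[1]
    for column in row.attribute_columns:
        # column is assumed to be (name, value, timestamp); the timestamp is dropped.
        record[column[0]] = column[1]
    return record


# Stand-in for a row returned by get_range.
sample_row = SimpleNamespace(
    primary_key=[('id', 'row1')],
    attribute_columns=[('col1', 'val1', 1700000000000)],
)
print(row_to_dict(sample_row))
```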

A single range scan returns at most 5,000 rows or 4 MB of data. If the result set exceeds either limit, the response includes next_start_primary_key, which is the start primary key for the next batch. Pass this value as the start primary key of the next request to iterate through the results, as shown in the following example. When the whole range has been read, next_start_primary_key is empty and the loop ends.

while True:
    # Call the get_range method to query data.
    consumed, next_start_primary_key, row_list, next_token = client.get_range('test_table', Direction.FORWARD,
                                                                              inclusive_start_primary_key,
                                                                              exclusive_end_primary_key)

    # Process the results.
    print('* Read CU Cost: %s' % consumed.read)
    print('* Write CU Cost: %s' % consumed.write)
    print('* Rows Count: %s' % len(row_list))
    print('* Rows Data:')
    for row in row_list:
        print(row.primary_key, row.attribute_columns)

    # Set the start primary key for the next read.
    if next_start_primary_key:
        inclusive_start_primary_key = next_start_primary_key
    else:
        break
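
The same pagination pattern can be wrapped in a generator so that callers iterate over rows without handling `next_start_primary_key` themselves. This is a minimal sketch: `iter_range` is a hypothetical helper name, and it relies only on the four-value return of `get_range` shown above.

```python
def iter_range(client, table_name, direction, start_pk, end_pk, **options):
    """Yield every row in the range, one request at a time.

    Hypothetical helper. `options` is forwarded to get_range unchanged,
    for example limit=..., columns_to_get=... or time_range=...
    """
    next_pk = start_pk
    while next_pk:
        _consumed, next_pk, rows, _token = client.get_range(
            table_name, direction, next_pk, end_pk, **options)
        for row in rows:
            yield row
```

With an initialized client, usage would look like `for row in iter_range(client, 'test_table', Direction.FORWARD, inclusive_start_primary_key, exclusive_end_primary_key): ...`.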

Customize the query with the following options:

Set a version range to return only data within that range.

import time

# Set the version range for the query to the last 24 hours (timestamps in milliseconds).
now_ms = int(time.time() * 1000)
time_range = (now_ms - 86400 * 1000, now_ms)

consumed, next_start_primary_key, row_list, next_token = client.get_range(
    'test_table', Direction.FORWARD,
    inclusive_start_primary_key, exclusive_end_primary_key,
    time_range=time_range)
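
If several queries need different look-back windows, the tuple can be built by a small function. The sketch below is illustrative only: `last_hours_range` is a hypothetical helper, and it assumes, as in the example above, that version timestamps are expressed in milliseconds.

```python
import time


def last_hours_range(hours, now_ms=None):
    """Return (start_ms, end_ms) covering the last `hours` hours.

    Hypothetical helper. `now_ms` can be supplied to make the result
    reproducible; by default the current time is used.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - int(hours * 3600 * 1000), now_ms)


print(last_hours_range(24, now_ms=1700000000000))
# (1699913600000, 1700000000000)
```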

Specify the attribute columns to read.

columns_to_get = ['col1']

# Call the get_range method to query data.
consumed, next_start_primary_key, row_list, next_token = client.get_range(
    'test_table', Direction.FORWARD,
    inclusive_start_primary_key, exclusive_end_primary_key,
    columns_to_get=columns_to_get)

Set the maximum number of rows to return in a single request.

limit = 10

# Call the get_range method to query data.
consumed, next_start_primary_key, row_list, next_token = client.get_range('test_table', Direction.FORWARD,
                                                                          inclusive_start_primary_key, exclusive_end_primary_key,
                                                                          limit=limit)
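
Because `limit` applies to a single request, capping the total number of rows across several requests takes a little extra bookkeeping. The following sketch shows one possible approach, in which each request asks only for the rows still missing. `read_at_most` is a hypothetical helper, not an SDK method.

```python
def read_at_most(client, table_name, direction, start_pk, end_pk, total):
    """Collect up to `total` rows from the range.

    Hypothetical helper. Returns (rows, next_pk); next_pk is the primary
    key to resume from, or an empty value if the range is exhausted.
    """
    rows = []
    next_pk = start_pk
    while next_pk and len(rows) < total:
        _consumed, next_pk, page, _token = client.get_range(
            table_name, direction, next_pk, end_pk,
            limit=total - len(rows))  # request only what is still needed
        rows.extend(page)
    return rows, next_pk
```

Returning the resume key alongside the rows lets the caller continue the scan later from the point where it stopped.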

References

Batch read data