Use Tablestore SDK for Python to write data - Tablestore

Tablestore allows you to write a single row of data, update a single row of data, and write multiple rows of data at a time by calling different operations. To write data to a data table, you must specify the complete primary key information and the attribute columns that you want to add, delete, or modify. To write data to a highly concurrent application, you can configure row existence conditions or column conditions to update data based on the specified conditions.

Operations

Tablestore provides the PutRow, UpdateRow, and BatchWriteRow operations to allow you to write data. Select an operation to write data based on your business requirements.

Operation	Description	Scenario
Insert a single row of data	You can call the PutRow operation to insert a single row of data. If the row exists, Tablestore deletes all versions of data in all columns from the existing row and writes new data.	This operation is suitable for scenarios in which you want to write a small amount of data.
Update a single row of data	You can call the UpdateRow operation to update a single row of data. You can add attribute columns to a row, remove attribute columns from a row, delete a specific version of data from an attribute column, or update the value of an attribute column. If the row does not exist, a new row is inserted.	This operation is suitable for scenarios in which you want to update the existing data. For example, you want to remove an attribute column, delete a specific version of data, or update the value of an attribute column.
Write multiple rows of data at the same time	You can call the BatchWriteRow operation to write multiple rows of data to one or more tables at the same time. The BatchWriteRow operation consists of multiple PutRow, UpdateRow, and DeleteRow operations. When you call the BatchWriteRow operation, the process of constructing a suboperation is the same as the process of calling the PutRow, UpdateRow, or DeleteRow operation.	This operation is suitable for scenarios in which you want to write, delete, or update a large amount of data and scenarios in which you want to write, delete, and update data at the same time.

Prerequisites

An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
A data table is created and data is written to the data table.

Insert a single row of data

You can call the PutRow operation to write a row of data. If the row exists, Tablestore deletes all versions of data in all columns from the existing row and writes new data.

API operation

"""
Description: This operation writes a single row of data. The number of capacity units (CUs) that are consumed by the operation is returned. 
table_name: the name of the table. 
row: the row that you want to write to the data table, including primary key columns and attribute columns. 
condition: the condition that must be met to perform the operation. After you specify a condition, Tablestore checks whether the specified condition is met before Tablestore performs the operation. The operation is performed only if the condition is met. The condition parameter is an instance of the tablestore.metadata.Condition class. 
Row existence conditions and conditions based on column values are supported. If you want to specify a row existence condition, you can set the condition parameter to IGNORE, EXPECT_EXIST, or EXPECT_NOT_EXIST based on your business requirements. 
return_type: the type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature. 
Response: the number of CUs that are consumed by the operation and the row data that is returned. 
consumed: the number of CUs consumed by the operation. The consumed parameter is an instance of the tablestore.metadata.CapacityUnit class. 
return_row: the row data that is returned, which may include primary key columns and attribute columns. 
Example:
    primary_key = [('gid',1), ('uid',101)]
    attribute_columns = [('name','Mary'), ('mobile',111111111), ('address','City A in China'), ('age',20)]
    row = Row(primary_key, attribute_columns)
    condition = Condition('EXPECT_NOT_EXIST')
    consumed, return_row = client.put_row('myTable', row, condition)
"""
def put_row(self, table_name, row, condition = None, return_type = None, transaction_id = None):

Parameters

Parameter	Description
table_name	The name of the data table.
row	The row of data. This parameter includes the following configuration items: `primary_key`: the primary key information about the row. The primary key information includes the name, type, and value of the primary key column. Important The number and types of primary key columns that you specify must be the same as the actual number and types of primary key columns in the table. If a primary key column is an auto-increment primary key column, you need to only set the value of the auto-increment primary key column to a placeholder. For more information, see Configure an auto-increment primary key column. `attribute_columns`: the attribute columns of the row. An attribute column is specified by parameters in the following sequence: the attribute column name, attribute column value (ColumnValue), attribute column type (ColumnType), and timestamp. The attribute column type and timestamp are optional. The attribute column name is the name of the attribute column, and the attribute column type is the data type of the attribute column. For more information, see Naming conventions and data types. You can set the ColumnType parameter to ColumnType.INTEGER that specifies an INTEGER value, ColumnType.STRING that specifies a UTF-8 encoded STRING, ColumnType.BINARY that specifies a BINARY value, ColumnType.BOOLEAN that specifies a BOOLEAN value, or ColumnType.DOUBLE that specifies a DOUBLE value. If the attribute column type is BINARY, you must set the ColumnType parameter to ColumnType.BINARY. In other cases, you can leave the ColumnType parameter empty. The timestamp is a data version number. For more information, see Data versions and TTL. You can use the data version number that is automatically generated by the system or specify a custom data version number. If you do not specify a data version number, the data version number that is automatically generated by the system is used. The data version number generated by Tablestore is the number of milliseconds that have elapsed since 00:00:00 UTC on January 1, 1970. If you specify a custom version number, make sure that the version number is a 64-bit timestamp that is accurate to milliseconds and is in the valid version range.
condition	The condition that must be met to perform the operation. You can specify a row existence condition or a condition based on column values. For more information, see Configure conditional update. Note RowExistenceExpectation.IGNORE specifies that new data is inserted into a row regardless of whether the specified row exists. If the specified row exists, the existing data is overwritten. RowExistenceExpectation.EXPECT_EXIST specifies that new data is inserted only if the specified row exists. The existing data is overwritten. RowExistenceExpectation.EXPECT_NOT_EXIST specifies that data is inserted only if the specified row does not exist.
return_type	The type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature.
transaction_id	The ID of the local transaction. If you want to use the local transaction feature to delete data, you must configure this parameter.

Examples

The following sample code provides an example on how to insert a row of data:

Note

In the following example, the data version in the age attribute column is 1498184687000, which indicates June 23, 2017. If the difference between the current time and the value of the max_time_deviation parameter is greater than 1498184687000, the PutRow operation is prohibited. The value of the max_time_deviation parameter is specified when you create the table.

# Specify the name of the data table. 
table_name = '<TABLE_NAME>'

# The first primary key column is gid and the value is the integer 1. The second primary key column is uid and the value is the integer 101. 
primary_key = [('gid',1), ('uid',101)]
# The row contains five attribute columns:
# The first attribute column is name and the value is the string John. The data version number of the first attribute column is not specified. The system uses the current time as the data version number. 
# The second attribute column is mobile and the value is the integer 1390000****. The data version number of the second attribute column is not specified. The system uses the current time as the data version number. 
# The third attribute column is address and the value is the binary value China. The data version number of the third attribute column is not specified. The system uses the current time as the data version number. 
# The fourth attribute column is female and the value is the Boolean value False. The data version number of the fourth attribute column is not specified. The system uses the current time as the data version number. 
# The fifth attribute column is age and the value is 29.7. 1498184687000 is specified as the data version number of the fifth attribute column. 
attribute_columns = [('name','John'), ('mobile',1390000****),('address', bytearray('China', encoding='utf8')),('female', False), ('age', 29.7, 1498184687000)]
# Construct a row by using primary_key and attribute_columns. 
row = Row(primary_key, attribute_columns)

# Specify a row existence condition that expects the specified row to not exist. If the specified row exists, the Condition Update Failed error occurs. 
condition = Condition(RowExistenceExpectation.EXPECT_NOT_EXIST)
try:
    # If you do not specify the ReturnType parameter when you call the put_row method, the value of the return_row parameter is None. 
    consumed, return_row = client.put_row(table_name, row, condition)

    # Display the number of write CUs that are consumed by the request. 
    print('put row succeed, consume %s write cu.' % consumed.write)
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("put row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("put row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))

To view the detailed sample code, visit PutRow@GitHub.

Update a single row of data

You can call the UpdateRow operation to update the data in a row. You can add attribute columns to a row, remove attribute columns from a row, delete a specific version of data from an attribute column, or update the value of an attribute column. If the row does not exist, a new row is inserted.

Note

If you call the UpdateRow operation only to remove columns from a row and the row does not exist, no row is inserted into the table.

API operation

"""
Description: This operation updates a single row of data. 
table_name: the name of the table. 
row: the row that you want to update, including primary key columns and attribute columns. The primary key columns are in the list structure and the attribute columns are in the dict structure. 
condition: the condition that must be met to perform the operation. After you specify a condition, Tablestore checks whether the specified condition is met before Tablestore performs the operation. The operation is performed only if the condition is met. The condition parameter is an instance of the tablestore.metadata.Condition class. 
Row existence conditions and conditions based on column values are supported. If you want to specify a row existence condition, you can set the condition parameter to IGNORE, EXPECT_EXIST, or EXPECT_NOT_EXIST based on your business requirements. 
return_type: the type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature. 
Response: the number of CUs that are consumed by the operation and the row data that is specified by return_row.
consumed: the number of CUs consumed by the operation. The consumed parameter is an instance of the tablestore.metadata.CapacityUnit class. 
return_row: the row data that is returned. 
Example:
    primary_key = [('gid',1), ('uid',101)]
    update_of_attribute_columns = {
        'put' : [('name','Jack'), ('address','City A in China')],
        'delete' : [('mobile', 1493725896147)],
        'delete_all' : [('age')],
        'increment' : [('counter', 1)]
    }
    row = Row(primary_key, update_of_attribute_columns)
    condition = Condition('EXPECT_EXIST')
    consumed = client.update_row('myTable', row, condition)
"""
def update_row(self, table_name, row, condition, return_type = None, transaction_id = None):

Parameters

Parameter	Description
table_name	The name of the data table.
row	The row of data. This parameter includes the following configuration items: `primary_key`: the primary key information about the row. The primary key information includes the name, type, and value of the primary key column. Important The number and types of primary key columns that you specify must be the same as the actual number and types of primary key columns in the table. If a primary key column is an auto-increment primary key column, you need to only set the value of the auto-increment primary key column to a placeholder. For more information, see Configure an auto-increment primary key column. `update_of_attribute_columns`: the attribute columns that you want to update. When you add or modify an attribute column, you must specify the attribute column name and attribute column value. The attribute column value type and timestamp are optional. The attribute column name is the name of the attribute column, and the attribute column value type is the data type of the attribute column. For more information, see Naming conventions and data types. A timestamp is a data version number. You can use the data version number that is automatically generated by the system or specify a custom data version number. By default, if you do not specify a data version number, the data version number that is automatically generated by the system is used. For more information, see Data versions and TTL. By default, the system uses the current UNIX timestamp as a data version number. A UNIX timestamp represents the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC. If you specify a custom data version number, make sure that the version number is a 64-bit timestamp that is accurate to milliseconds and is in the valid version range. To delete a specified version of data from an attribute column, you need to only specify the attribute column name and timestamp. The timestamp is a 64-bit integer in units of milliseconds, which specifies a version of data. To remove an attribute column, you need to only specify the attribute column name. Note After you remove all attribute columns from a row, the row still exists. To delete a row, use the DeleteRow operation.
condition	The condition that must be met to perform the operation. You can specify a row existence condition or a condition based on column values. For more information, see Configure conditional update.
return_type	The type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature.
transaction_id	The ID of the local transaction. If you want to use the local transaction feature to delete data, you must configure this parameter.

Examples

The following sample code provides an example on how to update a row of data:

# Specify the name of the data table. 
table_name = '<TABLE_NAME>'

# The first primary key column is gid and the value is the integer 1. The second primary key column is uid and the value is the integer 101. 
primary_key = [('gid',1), ('uid',101)]
# The update types include PUT, DELETE, and DELETE_ALL. 
# PUT: adds an attribute column or updates the existing value in an attribute column. In this example, two attribute columns are added. The first column is name and the value is David. The second column is address and the value is Hongkong. 
# DELETE: deletes a specific version (timestamp) of data. In this example, data whose data version number is 1488436949003 in the address column is deleted. 
# DELETE_ALL: deletes the column. In this example, all version of data in the mobile and age columns are deleted. 
update_of_attribute_columns = {
    'PUT' : [('name','David'), ('address','Hongkong')],
    'DELETE' : [('address', None, 1488436949003)],
    'DELETE_ALL' : [('mobile'), ('age')],
}
row = Row(primary_key, update_of_attribute_columns)

# In this example, the Condition parameter is set to RowExistenceExpectation.IGNORE, which specifies that the row is updated regardless of whether the row exists. 
condition = Condition(RowExistenceExpectation.IGNORE, SingleColumnCondition("age", 20, ComparatorType.EQUAL)) # update row only when this row is exist
try:
    consumed, return_row = client.update_row(table_name, row, condition)
    
    # Display the number of write CUs that are consumed by the request. 
    print('put row succeed, consume %s write cu.' % consumed.write)
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("update row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("update row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))

To view the detailed sample code, visit UpdateRow@GitHub.

Write multiple rows of data at the same time

You can call the BatchWriteRow operation to write multiple rows of data to one or more tables at a time.

When you call the BatchWriteRow operation, each PutRow, UpdateRow, or DeleteRow operation is separately performed and the response to each operation is separately returned by Tablestore.

Usage notes

When you call the BatchWriteRow operation to write multiple rows of data at the same time, some rows may fail to be written. In this case, Tablestore does not return exceptions. Tablestore returns BatchWriteRowResponse in which the indexes and error messages of the failed rows are included. When you call the BatchWriteRow operation, make sure that you check the return values to determine whether all rows are written. If you do not check the return values, the rows that fail to be written may be ignored.

If the server detects that invalid parameters exist in some operations, an error message may return before the operations of the request are performed.

API operation

"""
Description: This operation writes multiple rows of data at the same time. 
request = MiltiTableInBatchWriteRowItem()
request.add(TableInBatchWriteRowItem(table0, row_items))
request.add(TableInBatchWriteRowItem(table1, row_items))
response = client.batch_write_row(request)
response: the returned result. The type of the response parameter is tablestore.metadata.BatchWriteRowResponse. 
"""
def batch_write_row(self, request):

Examples

The following sample code provides an example on how to write multiple rows of data at the same time:

put_row_items = []
# Call the PutRow operation to insert a row. 
for i in range(0, 20):
    primary_key = [('gid',i), ('uid',i+1)]
    attribute_columns = [('name','somebody'+str(i)), ('address','somewhere'+str(i)), ('age',i)]
    row = Row(primary_key, attribute_columns)
    condition = Condition(RowExistenceExpectation.IGNORE)
    item = PutRowItem(row, condition)
    put_row_items.append(item)

# Call the UpdateRow operation to update a row. 
for i in range(0, 10):
    primary_key = [('gid',i), ('uid',i+1)]
    attribute_columns = {'put': [('name','somebody'+str(i)), ('address','somewhere'+str(i)), ('age',i)]}
    row = Row(primary_key, attribute_columns)
    condition = Condition(RowExistenceExpectation.IGNORE, SingleColumnCondition("age", i, ComparatorType.EQUAL))
    item = UpdateRowItem(row, condition)
    put_row_items.append(item)

# Call the DeleteRow operation to delete a row. 
delete_row_items = []
for i in range(10, 20):
    primary_key = [('gid',i), ('uid',i+1)]
    row = Row(primary_key)
    condition = Condition(RowExistenceExpectation.IGNORE)
    item = DeleteRowItem(row, condition)
    delete_row_items.append(item)

# Construct a request to write multiple rows of data at the same time. 
request = BatchWriteRowRequest()
request.add(TableInBatchWriteRowItem('<TABLE_NAME>', put_row_items))
request.add(TableInBatchWriteRowItem('<DELETE_TABLE_NAME>', delete_row_items))

# When you call the batch_write_row method to write multiple rows of data at the same time, exceptions may occur if errors such as request parameter errors occur. If the operation fails for part of the rows, an exception may not occur but the internal items may fail. 
try:
    result = client.batch_write_row(request)
    print('Result status: %s'%(result.is_all_succeed()))

    # Check the result of the PutRow operation. 
    print('check first table\'s put results:')
    succ, fail = result.get_put()
    for item in succ:
        print('Put succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Put failed, error code: %s, error message: %s' % (item.error_code, item.error_message))

    # Check the result of the UpdateRow operation. 
    print('check first table\'s update results:')
    succ, fail = result.get_update()
    for item in succ:
        print('Update succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Update failed, error code: %s, error message: %s' % (item.error_code, item.error_message))

    # Check the result of the DeleteRow operation. 
    print('check second table\'s delete results:')
    succ, fail = result.get_delete()
    for item in succ:
        print('Delete succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Delete failed, error code: %s, error message: %s' % (item.error_code, item.error_message)) 
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("get row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("get row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))

To view the detailed sample code, visit BatchWriteRow@GitHub.

FAQ

References

To update data in a highly concurrent application based on the specified conditions, you can use the conditional update feature. For more information, see Configure conditional update.
To collect real-time statistics about online applications, such as the number of page views (PVs) on various topics, you can use the atomic counter feature. For more information, see Use the atomic counter feature.
To perform atomic operations to write one or more rows of data, you can use the local transaction feature. For more information, see Use the local transaction feature.
After you write data to a table, you can read or delete the data in the table based on your business requirements. For more information, see Read data and Delete data.