All Products
Search
Document Center

Tablestore:Write data

Last Updated:Apr 11, 2024

Tablestore allows you to write a single row of data, update a single row of data, and write multiple rows of data at a time by calling different operations. To write data to a data table, you must specify the complete primary key information and the attribute columns that you want to add, delete, or modify. To write data to a highly concurrent application, you can configure row existence conditions or column conditions to update data based on the specified conditions.

Operations

Tablestore provides the PutRow, UpdateRow, and BatchWriteRow operations to allow you to write data. Select an operation to write data based on your business requirements.

Operation

Description

Scenario

Insert a single row of data

You can call the PutRow operation to insert a single row of data. If the row exists, Tablestore deletes all versions of data in all columns from the existing row and writes new data.

This operation is suitable for scenarios in which you want to write a small amount of data.

Update a single row of data

You can call the UpdateRow operation to update a single row of data. You can add attribute columns to a row, remove attribute columns from a row, delete a specific version of data from an attribute column, or update the value of an attribute column. If the row does not exist, a new row is inserted.

This operation is suitable for scenarios in which you want to update the existing data. For example, you want to remove an attribute column, delete a specific version of data, or update the value of an attribute column.

Write multiple rows of data at the same time

You can call the BatchWriteRow operation to write multiple rows of data to one or more tables at the same time.

The BatchWriteRow operation consists of multiple PutRow, UpdateRow, and DeleteRow operations. When you call the BatchWriteRow operation, the process of constructing a suboperation is the same as the process of calling the PutRow, UpdateRow, or DeleteRow operation.

This operation is suitable for scenarios in which you want to write, delete, or update a large amount of data and scenarios in which you want to write, delete, and update data at the same time.

Prerequisites

Insert a single row of data

You can call the PutRow operation to write a row of data. If the row exists, Tablestore deletes all versions of data in all columns from the existing row and writes new data.

API operation

"""
Description: This operation writes a single row of data. The number of capacity units (CUs) that are consumed by the operation is returned. 
table_name: the name of the data table. 
row: the row that you want to write to the data table, including primary key columns and attribute columns. 
condition: the condition that must be met to perform the operation. After you specify a condition, Tablestore checks whether the specified condition is met before Tablestore performs the operation. The operation is performed only if the condition is met. The condition parameter is an instance of the tablestore.metadata.Condition class. You can specify a row existence condition or a condition based on column values. If you specify a row existence condition, you can set the condition parameter to IGNORE, EXPECT_EXIST, or EXPECT_NOT_EXIST. 
return_type: the type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature. 

Response: the number of CUs that are consumed by the operation and the row data that is returned. 
consumed: the number of capacity units (CUs) that are consumed by the operation. The consumed parameter is an instance of the tablestore.metadata.CapacityUnit class. 
return_row: the row data that is returned, which may include primary key columns and attribute columns. 
"""
def put_row(self, table_name, row, condition = None, return_type = None)

Parameters

Parameter

Description

table_name

The name of the data table.

primary_key

The primary key information about the row. The primary key information includes the name, type, and value of the primary key column.

Important
  • The number and types of primary key columns that you specify must be the same as the actual number and types of primary key columns in the table.

  • If a primary key column is an auto-increment primary key column, you need to only set the value of the auto-increment primary key column to a placeholder. For more information, see Configure an auto-increment primary key column.

attribute_columns

The attribute columns of the row. An attribute column is specified by parameters in the following sequence: the attribute column name, attribute column value (ColumnValue), attribute column type (ColumnType), and timestamp. The attribute column type and timestamp are optional.

  • The attribute column name is the name of the attribute column, and the attribute column type is the data type of the attribute column. For more information, see Naming conventions and data types.

    You can set the ColumnType parameter to ColumnType.INTEGER that specifies an INTEGER value, ColumnType.STRING that specifies a UTF-8 encoded STRING, ColumnType.BINARY that specifies a BINARY value, ColumnType.BOOLEAN hat specifies a BOOLEAN value, or ColumnType.DOUBLE that specifies a DOUBLE value. If the attribute column type is BINARY, you must set the ColumnType parameter to ColumnType.BINARY. In other cases, you can leave the ColumnType parameter empty.

  • The timestamp is a data version number. For more information, see Data versions and TTL.

    You can use the data version number that is automatically generated by the system or specify a custom data version number. If you do not specify a data version number, the data version number that is automatically generated by the system is used.

    • The data version number generated by Tablestore is the number of milliseconds that have elapsed since 00:00:00 UTC on January 1, 1970.

    • If you specify a data version number, make sure that the data version number is a 64-bit timestamp that is accurate to milliseconds and is in the valid version range.

condition

The conditions that must be met to perform the operation. You can specify a row existence condition or a condition based on column values. For more information, see Configure conditional update.

Note
  • RowExistenceExpectation.IGNORE specifies that new data is inserted into a row regardless of whether the specified row exists. If the specified row exists, the existing data is overwritten.

  • RowExistenceExpectation.EXPECT_EXIST specifies that new data is inserted only if the specified row exists. The existing data is overwritten.

  • RowExistenceExpectation.EXPECT_NOT_EXIST specifies that data is inserted only if the specified row does not exist.

Examples

The following sample code provides an example on how to insert a row of data:

Note

In the following example, the data version in the age attribute column is 1498184687000, which indicates June 23, 2017. If the difference between the current time and the value of the max_time_deviation parameter is greater than 1498184687000, the PutRow operation is prohibited. The value of max_time_deviation is specified when you create the table.

# The first primary key column is gid and the value is the integer 1. The second primary key column is uid and the value is the integer 101. 
primary_key = [('gid',1), ('uid',101)]

# The row contains five attribute columns:
# The first attribute column is name and the value is the string John. The data version number of the first attribute column is not specified. The system uses the current time as the data version number. 
# The second attribute column is mobile and the value is the integer 1390000****. The data version number of the second attribute column is not specified. The system uses the current time as the data version number. 
# The third attribute column is address and the value is the binary value China. The data version number of the third attribute column is not specified. The system uses the current time as the data version number. 
# The fourth attribute column is female and the value is the Boolean value False. The data version number of the fourth attribute column is not specified. The system uses the current time as the data version number. 
# The fifth attribute column is age and the value is 29.7. 1498184687000 is specified as the data version number of the fifth attribute column. 
attribute_columns = [('name','John'), ('mobile',1390000****),('address', bytearray('China')),('female', False), ('age', 29.7, 1498184687000)]

# Construct a row by using primary_key and attribute_columns. 
row = Row(primary_key, attribute_columns)

# Specify a row existence condition that expects the specified row to not exist. If the specified row exists, the Condition Update Failed error occurs. 
condition = Condition(RowExistenceExpectation.EXPECT_NOT_EXIST)

try:
    # If you do not specify the ReturnType parameter when you call the put_row method, the value of the return_row parameter is None. 
    consumed, return_row = client.put_row(table_name, row, condition)

    # Display the number of write CUs that are consumed by the request. 
    print('put row succeed, consume %s write cu.' % consumed.write)
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("put row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("put row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))                    

To view the detailed sample code, visit PutRow@GitHub.

Update a single row of data

You can call the UpdateRow operation to update the data in a row. You can add attribute columns to a row, remove attribute columns from a row, delete a specific version of data from an attribute column, or update the value of an attribute column. If the row does not exist, a new row is inserted.

Note

If you call the UpdateRow operation only to remove columns from a row and the row does not exist, no row is inserted into the table.

API operation

"""
Description: This operation updates a single row of data. 
table_name: the name of the data table. 
row: the row that you want to update, including primary key columns and attribute columns of the LIST type. 
condition: the condition that must be met to perform the operation. After you specify a condition, Tablestore checks whether the specified condition is met before Tablestore performs the operation. The operation is performed only if the condition is met. The condition parameter is an instance of the tablestore.metadata.Condition class. You can specify a row existence condition or a condition based on column values. If you specify a row existence condition, you can set the condition parameter to IGNORE, EXPECT_EXIST, or EXPECT_NOT_EXIST. 
return_type: the type of data that you want to return. The return_type parameter is an instance of the tablestore.metadata.ReturnType class. Only the primary key can be returned. In most cases, this parameter is used by the auto-increment primary key column feature. 
Response: the number of CUs consumed by the operation and the row data that is specified by return_row to return. 
consumed: the number of CUs that are consumed by the operation. The consumed parameter is an instance of the tablestore.metadata.CapacityUnit class. 
return_row: the row data that is returned. 
"""
def update_row(self, table_name, row, condition, return_type = None)                    

Parameters

Parameter

Description

table_name

The name of the data table.

primary_key

The primary key information about the row. The primary key information includes the name, type, and value of the primary key column.

Important

The number and types of primary key columns that you specify must be the same as the actual number and types of primary key columns in the table.

update_of_attribute_columns

The attribute columns that you want to update.

  • When you add or modify an attribute column, you must specify the attribute column name and attribute column value. The attribute column value type and timestamp are optional.

    The attribute column name is the name of the attribute column, and the attribute column value type is the data type of the attribute column. For more information, see Naming conventions and data types.

    A timestamp is a data version number. You can use the data version number that is automatically generated by the system or specify a custom data version number. By default, if you do not specify a data version number, the data version number that is automatically generated by the system is used. For more information, see Data versions and TTL.

    • By default, the system uses the current UNIX timestamp as a data version number. A UNIX timestamp represents the number of milliseconds that have elapsed since January 1, 1970, 00:00:00 UTC.

    • If you specify a custom data version number, make sure that the version number is a 64-bit timestamp that is accurate to milliseconds and is in the valid version range.

  • To delete a specified version of data from an attribute column, you need to only specify the attribute column name and timestamp.

    The timestamp is a 64-bit integer in units of milliseconds, which specifies a version of data.

  • To remove an attribute column, you need to only specify the attribute column name.

    Note

    After you remove all attribute columns from a row, the row still exists. To delete a row, use the DeleteRow operation.

condition

The conditions that must be met to perform the operation. You can specify a row existence condition or a condition based on column values. For more information, see Configure conditional update.

Examples

The following sample code provides an example on how to update a row of data:

# The first primary key column is gid and the value is the integer 1. The second primary key column is uid and the value is the integer 101. 
primary_key = [('gid',1), ('uid',101)]

# The update types include PUT, DELETE, and DELETE_ALL. 
# PUT: adds a value to the column or updates the existing value in the column. In this example, two columns are added. The first column is name and the value is David. The second column is address and the value is Hongkong. 
# DELETE: deletes a specific version (timestamp) of data. In this example, data whose data version number is 1488436949003 in the address column is deleted. 
# DELETE_ALL: deletes the column. In this example, all version of data in the mobile and age columns are deleted. 
update_of_attribute_columns = {
    'PUT' : [('name','David'), ('address','Hongkong')],
    'DELETE' : [('address', None, 1488436949003)],
    'DELETE_ALL' : [('mobile'), ('age')],
}
row = Row(primary_key, update_of_attribute_columns)

# In this example, the Condition parameter is set to RowExistenceExpectation.IGNORE, which specifies that the row is updated regardless of whether the row exists. 
condition = Condition(RowExistenceExpectation.IGNORE, SingleColumnCondition("age", 20, ComparatorType.EQUAL)) # update row only when this row is exist

try:
    consumed, return_row = client.update_row(table_name, row, condition)
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("update row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("update row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))

To view the detailed sample code, visit UpdateRow@GitHub.

Write multiple rows of data at the same time

You can call the BatchWriteRow operation to write multiple rows of data to one or more tables at a time.

When you call the BatchWriteRow operation, each PutRow, UpdateRow, or DeleteRow operation is separately performed and the response to each operation is separately returned by Tablestore.

Usage notes

When you call the BatchWriteRow operation to write multiple rows of data at the same time, some rows may fail to be written. In this case, Tablestore does not return exceptions. Tablestore returns BatchWriteRowResponse in which the indexes and error messages of the failed rows are included. When you call the BatchWriteRow operation, make sure that you check the return values to determine whether all rows are written. If you do not check the return values, the rows that fail to be written may be ignored.

If the server detects that invalid parameters exist in some operations, an error message may return before the operations of the request are performed.

API operation

"""
Description: This operation writes multiple rows of data at the same time. 
request = MiltiTableInBatchWriteRowItem()
request.add(TableInBatchWriteRowItem(table0, row_items))
request.add(TableInBatchWriteRowItem(table1, row_items))
response = client.batch_write_row(request)
response: the returned result. The type of the response parameter is tablestore.metadata.BatchWriteRowResponse. 
"""
def batch_write_row(self, request):                   

Examples

The following sample code provides an example on how to write multiple rows of data at the same time:

put_row_items = []
# Call the PutRow operation to insert a row. 
for i in range(0, 10):
    primary_key = [('gid',i), ('uid',i+1)]
    attribute_columns = [('name','somebody'+str(i)), ('address','somewhere'+str(i)), ('age',i)]
    row = Row(primary_key, attribute_columns)
    condition = Condition(RowExistenceExpectation.IGNORE)
    item = PutRowItem(row, condition)
    put_row_items.append(item)

# Call the UpdateRow operation to update a row. 
for i in range(10, 20):
    primary_key = [('gid',i), ('uid',i+1)]
    attribute_columns = {'put': [('name','somebody'+str(i)), ('address','somewhere'+str(i)), ('age',i)]}
    row = Row(primary_key, attribute_columns)
    condition = Condition(RowExistenceExpectation.IGNORE, SingleColumnCondition("age", i, ComparatorType.EQUAL))
    item = UpdateRowItem(row, condition)
    put_row_items.append(item)

# Call the DeleteRow operation to delete a row. 
delete_row_items = []
for i in range(10, 20):
    primary_key = [('gid',i), ('uid',i+1)]
    row = Row(primary_key)
    condition = Condition(RowExistenceExpectation.IGNORE)
    item = DeleteRowItem(row, condition)
    delete_row_items.append(item)

# Construct a request to write multiple rows of data at the same time. 
request = BatchWriteRowRequest()
request.add(TableInBatchWriteRowItem(table_name, put_row_items))
request.add(TableInBatchWriteRowItem('notExistTable', delete_row_items))

# When you call the batch_write_row method to write multiple rows of data at the same time, exceptions may occur if errors such as request parameter errors occur. If the operation fails for part of the rows, an exception may not occur but the internal items may fail. 
try:
    result = client.batch_write_row(request)
    print('Result status: %s'%(result.is_all_succeed()))

    # Check the result of the PutRow operation. 
    print('check first table\'s put results:')
    succ, fail = result.get_put()
    for item in succ:
        print('Put succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Put failed, error code: %s, error message: %s' % (item.error_code, item.error_message))

    # Check the result of the UpdateRow operation. 
    print('check first table\'s update results:')
    succ, fail = result.get_update()
    for item in succ:
        print('Update succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Update failed, error code: %s, error message: %s' % (item.error_code, item.error_message))

    # Check the result of the DeleteRow operation. 
    print('check second table\'s delete results:')
    succ, fail = result.get_delete()
    for item in succ:
        print('Delete succeed, consume %s write cu.' % item.consumed.write)
    for item in fail:
        print('Delete failed, error code: %s, error message: %s' % (item.error_code, item.error_message)) 
# In most cases, client exceptions are caused by parameter errors or network exceptions. 
except OTSClientError as e:
    print("get row failed, http_status:%d, error_message:%s" % (e.get_http_status(), e.get_error_message()))
# In most cases, server exceptions are caused by parameter or throttling errors. 
except OTSServiceError as e:
    print("get row failed, http_status:%d, error_code:%s, error_message:%s, request_id:%s" % (e.get_http_status(), e.get_error_code(), e.get_error_message(), e.get_request_id()))

To view the detailed sample code, visit BatchWriteRow@GitHub.

FAQ

References

  • To update data in a highly concurrent application based on the specified conditions, you can use the conditional update feature. For more information, see Configure conditional update.

  • To collect real-time statistics about online applications, such as the number of page views (PVs) on various topics, you can use the atomic counter feature. For more information, see Configure atomic counter.

  • To perform atomic operations to write one or more rows of data, you can use the local transaction feature. For more information, see Configure local transaction.

  • After you write data to a table, you can read or delete the data in the table based on your business requirements. For more information, see Read data and Delete data.