You can use Function Compute to perform real-time computing on incremental data in Tablestore.
Background information
Function Compute (FC) is an event-driven, fully managed computing service. It lets you focus on coding without procuring or managing infrastructure resources such as servers. You only need to upload your code or image. Function Compute prepares computing resources for you, runs tasks elastically and reliably, and provides features such as log queries, performance monitoring, and alerting.
Tablestore Stream is a data tunnel that is used to obtain incremental data in Tablestore tables. By creating a Tablestore trigger, you can automatically connect Stream with Function Compute, allowing custom program logic in computing functions to automatically process data modifications in Tablestore tables.
Scenarios
The following list describes typical tasks that you can perform by using Function Compute.
Data synchronization: You can synchronize real-time data in Tablestore to data caches, search engines, or other database instances.
Data archiving: You can use Function Compute to archive incremental data that is stored in Tablestore to OSS for cold backup.
Event-driven application: You can create triggers to trigger functions to call API operations that are provided by IoT Hub and cloud applications. You can also create triggers to send notifications.
Prerequisites
You have created a Tablestore instance and a data table.
You have activated Function Compute.
Considerations
Tablestore triggers are supported in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Shenzhen), Japan (Tokyo), Singapore, Germany (Frankfurt), and China (Hong Kong).
When you create a Tablestore trigger, you can select only Tablestore instances and their data tables that reside in the same region as the current Function Compute service. Therefore, make sure that the Function Compute service and the selected data table reside in the same region.
If you want to access Tablestore over an internal network in a function that corresponds to a Tablestore trigger, we recommend that you use a virtual private cloud (VPC) endpoint of Tablestore. For more information, see What is a VPC? and Obtain endpoints.
When you write function code for Tablestore, make sure not to use the following logic: Function B is invoked by a trigger for Table A and then Function B updates the data in Table A. This logic creates an infinite loop of function invocations.
The execution duration of a function that is invoked by a trigger cannot exceed one minute.
If an exception occurs during function execution, the function is retried an indefinite number of times until the log data in Tablestore expires.
Note: A function execution exception occurs in one of the following scenarios:
A function instance is started but the function code does not run as expected. In this case, fees are generated for the instance.
A function instance fails to start due to reasons such as startup command errors. In this case, fees are not generated for the instance.
If a function execution exception occurs, you can disable the Stream feature for the data table to prevent the function from being retried for an indefinite number of times. Before you disable the Stream feature for the data table, make sure that no other triggers use the data table to prevent exceptions in other triggers.
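Because a failed execution is retried until the log data expires, a single record that always fails can hold up processing for a long time. The following is a minimal sketch, not a prescribed pattern, of a handler body that isolates per-record failures: it logs the failing record and continues, so the function itself returns normally. The names process_records and process_one are hypothetical, and skipping a record is appropriate only if your application can tolerate losing that change.

```python
import logging

logger = logging.getLogger(__name__)


def process_records(records, process_one):
    """Apply process_one to each record and collect failures instead of raising.

    Returns a tuple (succeeded_count, failed_records).
    """
    succeeded = 0
    failed = []
    for record in records:
        try:
            process_one(record)
            succeeded += 1
        except Exception:
            # Deliberately broad: any failure is logged and the record is skipped,
            # so the invocation as a whole does not fail and is not retried.
            logger.exception("Skipping record that failed: %r", record)
            failed.append(record)
    return succeeded, failed
```

In a real function, the failed records could be written to a separate store for later inspection instead of being dropped.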
Step 1: Enable the Stream feature for the data table
Before you create a trigger, you must enable the Stream feature for the data table in the Tablestore console to allow the function to process incremental data that is written to the table.
Log on to the Tablestore console.
In the top navigation bar, select a region.
On the Overview page, click the instance alias or click Manage Instance in the Actions column.
On the Instance Details tab, click the Data Tables tab. Then, click the name of the data table and select the Stream tab, or click the icon and select Stream.
On the Stream tab, click Enable next to Stream Information.
In the Enable Stream dialog box, set the Log Expiration Time parameter and click Enable.
The value of the Log Expiration Time parameter must be a non-zero integer. Unit: hours. Maximum value: 168.
Important: The Log Expiration Time parameter cannot be modified after it is specified. Proceed with caution.
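Because the Log Expiration Time cannot be changed later, it can help to check the value before you submit it. The following sketch encodes the constraints stated above (a non-zero integer, in hours, at most 168). The function name is hypothetical, and rejecting negative values is an assumption, since the text only says the value must be non-zero.

```python
MAX_LOG_EXPIRATION_HOURS = 168  # maximum value stated in the step above


def is_valid_log_expiration(hours):
    """Return True if hours is an integer from 1 to 168, inclusive."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int):
        return False
    # Assumption: negative values are not meaningful and are rejected.
    return 1 <= hours <= MAX_LOG_EXPIRATION_HOURS
```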
Step 2: Create a function and a Tablestore trigger
Create a function.
Log on to the Function Compute console.
Optional: In the upper-right corner of the page, click Try Function Compute 3.0.
Note: Function Compute 3.0 provides various enhanced features. In this example, Function Compute 3.0 is used.
If you have accessed the new console page (the button in the upper-right corner of the page is Back To Function Compute 2.0), skip this step.
In the left-side navigation pane, click Functions.
In the top navigation bar, select a region. Then, on the Functions page, click Create Function.
On the Create Function page, select a method to create the function, configure the following parameters, and then click Create.
In this example, an Event Function is created to perform real-time computing on data modifications in Tablestore.
Note: When you use Function Compute, you can create an Event Function, HTTP Function, or Task Function to process data in Tablestore. For more information, see Function selection.
If you want data changes in Tablestore to automatically trigger data processing, create an Event Function. For more information, see Create an event function.
If you want to trigger data processing by using specific HTTP requests, create an HTTP Function. For more information, see Create an HTTP function.
If you want to trigger data processing at scheduled intervals or asynchronously, create a Task Function. For more information, see Create a task function.
Basic Settings: Set Function Name.
Function Code: Configure the runtime environment and code-related information for the function.
Parameter
Description
Example
Runtime
Select a runtime that you prefer, such as Python, Java, PHP, Node.js, or Custom Container Image.
In this example, Python 3.9 is selected.
Code Upload Method
Specify how to upload code to Function Compute.
Use Sample Code: This is the default method. You can select sample code that is provided by Function Compute to create a function based on your business requirements.
Upload Code By Using ZIP Package: Select and upload a ZIP package that contains function code.
Upload Code By Using Folder: Select and upload a folder that contains function code.
Upload Code By Using OSS: Select the Bucket Name and File Name of the function code that you want to upload.
In this example, select Use Sample Code and then select Hello, World! Sample from the sample code list.
Advanced Configuration: Configure instance-related information and the timeout period for the function.
Parameter
Description
Example
Specification
Select or manually enter a reasonable combination of VCPU Specification and Memory Specification based on your business requirements. For more information about the billing of resource usage, see Billing overview.
Note: The ratio of vCPU specification to memory capacity (in GB) must range from 1:1 to 1:4.
0.35 vCPU, 512 MB
Temporary Disk Size
Specify the size of the disk used to temporarily store files based on your business requirements.
Valid values:
512 MB: the default value. You are not charged for using a temporary disk of this size. Function Compute provides you with a free disk space of 512 MB.
10 GB: You are charged based on 9.5 GB.
Note: Data can be written to all directories in the temporary hard disk. The directories share the space of the temporary hard disk.
The temporary hard disk is consistent with the lifecycle of the underlying instance. After the instance is recycled by the system, the data on the hard disk is cleared. To store files persistently, you can mount a File Storage NAS (NAS) system or Object Storage Service (OSS) bucket to your function. For more information, see Configure NAS file systems and Configure OSS.
512 MB
Timeout Period
Specify the timeout period for function execution. The default value of Timeout Period is 180 seconds. The maximum value is 86400 seconds.
180
Handler
Specify the handler of the function. The Function Compute runtime loads and invokes the handler to process requests. If you select HTTP Function as the method to create the function, you do not need to configure this parameter.
Note: If you select Use Sample Code for Code Upload Method, you do not need to modify the Handler parameter. If you select another code upload method, you must modify the Handler parameter based on your actual requirements. Otherwise, an error occurs when the function is executed.
index.handler
Time Zone
Select the time zone of your function. After you set the time zone for the function, an environment variable named TZ is automatically added to the function. The value of this environment variable is the time zone that you set.
UTC
Function Role
Function Compute uses this RAM role to generate temporary keys that are used to access your Alibaba Cloud resources and passes the keys to your code.
Important: You must grant the function role the permissions to access Tablestore. For more information, see Appendix: Grant Function Compute the permissions to access Tablestore.
AliyunFCDefaultRole
Allow Access To VPC
Specify whether to allow the function to access VPC resources. For more information, see Configure networks.
Yes
VPC
This parameter is required if you select Yes for Allow Access To VPC. Create a VPC or select the ID of an existing VPC that you want to access from the drop-down list.
fc.auto.create.vpc.1632317****
VSwitch
This parameter is required if you select Yes for Allow Access To VPC. Create a vSwitch or select the ID of an existing vSwitch from the drop-down list.
fc.auto.create.vswitch.vpc-bp1p8248****
Security Group
This parameter is required if you select Yes for Allow Access To VPC. Create a security group or select an existing security group from the drop-down list.
fc.auto.create.SecurityGroup.vsw-bp15ftbbbbd****
Allow Function Default NIC To Access Internet
Specify whether to allow the function to access the Internet through the default network interface controller (NIC) of Function Compute. If you select No, functions in the current service cannot access the Internet through the default NIC of Function Compute.
Important: If you want to use the static public IP address feature, you must select No for Allow Function Default NIC To Access Internet. Otherwise, the configured static public IP address does not take effect. For more information, see Configure static public IP addresses.
Yes
Logging
Specify whether to integrate with Simple Log Service. Valid values:
Enable: The execution logs of the function are persistently stored in Simple Log Service. You can use these logs to debug code, analyze failures, and analyze data.
Note: After you enable the logging feature, logs that are printed to standard output (stdout) are collected by Simple Log Service. Then, you can use these logs to debug code, analyze failures, and analyze data.
Click Configure logs to view more information.
You are charged for the Simple Log Service resources that Function Compute creates for you in the background. For more information, see Billing items in the pay-as-you-go billing method.
Disable: The execution logs of the function cannot be stored or queried by using Simple Log Service.
Enable
(Optional) Environment Variables: Set environment variables for the function runtime environment. For more information, see Configure environment variables.
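The ratio rule for the Specification parameter (vCPU to memory in GB between 1:1 and 1:4) can be checked with a few lines of arithmetic. The following sketch is illustrative only: the function name is hypothetical, and it assumes that 1 GB equals 1024 MB, which the text above does not state.

```python
def is_valid_spec(vcpu, memory_mb):
    """Return True if memory (GB) per vCPU lies between 1 and 4, inclusive."""
    if vcpu <= 0 or memory_mb <= 0:
        return False
    memory_gb = memory_mb / 1024  # assumption: 1 GB = 1024 MB
    gb_per_vcpu = memory_gb / vcpu
    eps = 1e-9  # small tolerance for floating-point rounding at the boundaries
    return 1 - eps <= gb_per_vcpu <= 4 + eps
```

For the example values in the table, 0.35 vCPU with 512 MB gives about 1.43 GB per vCPU, which falls inside the allowed range.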
Create a Tablestore trigger.
On the Function Details tab, click the Configure tab. In the left-side navigation pane, click Triggers, and then click Create Trigger.
In the Create Trigger panel, enter the required information and click OK.
Parameter | Operation | Example
Trigger Type | Select Tablestore. | Tablestore
Name | Enter a name for the trigger. | Tablestore-trigger
Version Or Alias | The default value is LATEST. If you want to create a trigger for another version or alias, you must first select the version or alias from the Version Or Alias drop-down list on the function details page. For more information about versions and aliases, see Version management and Alias management. | LATEST
Instance | Select a Tablestore instance from the drop-down list. | d00dd8xm****
Table | Select a table from the drop-down list. | mytable
Role Name | Select AliyunTableStoreStreamNotificationRole. Note: If you create this type of trigger for the first time, you must click OK and then click Authorize Now in the dialog box that appears. | AliyunTableStoreStreamNotificationRole
After the trigger is created, it is displayed on the Triggers tab. To modify or delete a trigger, see Trigger management.
Step 3: Configure test parameters for the function
On the Function Details tab, click the Code tab. Click the Test Function icon and select Configure Test Parameters from the drop-down list.
In the Configure Test Parameters dialog box, click the Create New Test Event tab, select Tablestore from the Event Template drop-down list, and then enter the event name and event content. Click OK.
Note: If you have created a Tablestore test event, you can click the Edit Existing Test Event tab and select an existing event name.
Tablestore triggers use the CBOR format to encode incremental data and form the event for Function Compute. The following code provides an example of the event content:
{
    "Version": "Sync-v1",
    "Records": [
        {
            "Type": "PutRow",
            "Info": {
                "Timestamp": 1506416585740836
            },
            "PrimaryKey": [
                {
                    "ColumnName": "pk_0",
                    "Value": 1506416585881590900
                },
                {
                    "ColumnName": "pk_1",
                    "Value": "2017-09-26 17:03:05.8815909 +0800 CST"
                },
                {
                    "ColumnName": "pk_2",
                    "Value": 1506416585741000
                }
            ],
            "Columns": [
                {
                    "Type": "Put",
                    "ColumnName": "attr_0",
                    "Value": "hello_table_store",
                    "Timestamp": 1506416585741
                },
                {
                    "Type": "Put",
                    "ColumnName": "attr_1",
                    "Value": 1506416585881590900,
                    "Timestamp": 1506416585741
                }
            ]
        }
    ]
}
The following table describes the parameters in the event content.
Parameter
Description
Version
The version of the payload. Example: Sync-v1. The value is of the STRING type.
Records
The array that stores the rows of incremental data in the data table. Each element contains the following parameters:
Type: the type of the operation that is performed on the row. Valid values: PutRow, UpdateRow, and DeleteRow. The value is of the STRING type.
Info: the information about the row, including the Timestamp parameter, which specifies the time when the row was last modified. The value of Timestamp must be in UTC. The value is of the INT64 type.
PrimaryKey
The array that stores the primary key columns. Each element contains the following parameters:
ColumnName: the name of the primary key column. The value is of the STRING type.
Value: the value of the primary key column. The value is of the formated_value type. Valid values: INTEGER, STRING, and BLOB.
Columns
The array that stores the attribute columns. Each element contains the following parameters:
Type: the type of the operation that is performed on the attribute column. Valid values: Put, DeleteOneVersion, and DeleteAllVersions. The value is of the STRING type.
ColumnName: the name of the attribute column. The value is of the STRING type.
Value: the value of the attribute column. The value is of the formated_value type. Valid values: INTEGER, BOOLEAN, DOUBLE, STRING, and BLOB.
Timestamp: the time when the attribute column was last modified. The value must be in UTC. The value is of the INT64 type.
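To make the event structure concrete, the following sketch applies decoded records to an in-memory dictionary, roughly as a data synchronization function might apply them to a cache. It is an illustration under stated assumptions, not the behavior of any Tablestore SDK: the function names are hypothetical, PutRow is assumed to replace the whole row, UpdateRow is assumed to modify the existing row, and both DeleteOneVersion and DeleteAllVersions simply drop the column because this cache keeps a single version.

```python
def primary_key(record):
    """Return the primary key as a tuple of (column name, value) pairs, in order."""
    return tuple((pk["ColumnName"], pk["Value"]) for pk in record["PrimaryKey"])


def apply_record(cache, record):
    """Apply one record to cache, a dict that maps a primary key to a row dict."""
    key = primary_key(record)
    op = record["Type"]
    if op == "DeleteRow":
        cache.pop(key, None)
        return
    # Assumption: PutRow starts from an empty row, UpdateRow edits the current one.
    row = {} if op == "PutRow" else cache.setdefault(key, {})
    for col in record.get("Columns", []):
        if col["Type"] == "Put":
            row[col["ColumnName"]] = col["Value"]
        else:
            # DeleteOneVersion or DeleteAllVersions: a single-version cache
            # just removes the column (a simplification).
            row.pop(col["ColumnName"], None)
    cache[key] = row
```

A handler could call apply_record for each element of the Records array after decoding the event.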
Step 4: Write and test the function
After you create the Tablestore trigger, you can write function code and test the function to verify whether the code is correct. Functions are automatically invoked by triggers when the data in Tablestore is updated.
On the Function Details tab, click the Code tab. Use the code editor to write code, and then click Deploy Code.
In this example, the function code is written in Python. For more code examples in other runtimes, see Examples of Tablestore triggers for Function Compute.
import json
import logging

import cbor


def get_attribute_value(record, column):
    # Return the value of the named attribute column, or None if it is absent.
    for attr in record['Columns']:
        if attr['ColumnName'] == column:
            return attr['Value']
    return None


def get_pk_value(record, column):
    # Return the value of the named primary key column, or None if it is absent.
    for pk in record['PrimaryKey']:
        if pk['ColumnName'] == column:
            return pk['Value']
    return None


def handler(event, context):
    logger = logging.getLogger()
    logger.info("Begin to handle event")
    # Test events configured in the console are JSON. Events delivered by a
    # Tablestore trigger are CBOR-encoded; use the commented line for those.
    # records = cbor.loads(event)
    records = json.loads(event)
    for record in records['Records']:
        logger.info("Handle record: %s", record)
        pk_0 = get_pk_value(record, "pk_0")
        attr_0 = get_attribute_value(record, "attr_0")
    return 'OK'
Click Test Function.
After the execution is complete, you can view the execution result at the top of the Function Code tab.
Modify and deploy the code.
After the test with records = json.loads(event) succeeds, change that line to records = cbor.loads(event), because events delivered by a Tablestore trigger are CBOR-encoded. Then, click Deploy Code.
Functions are automatically invoked by triggers when data is written to Tablestore.
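If you prefer not to edit the code between console tests and real invocations, one option is a small decoder that accepts both encodings. The following sketch is an assumption-laden alternative rather than the documented procedure: decode_event is a hypothetical helper, and trying JSON first and falling back to CBOR is a choice made here for illustration. It relies on the same third-party cbor package that the example function imports.

```python
import json

try:
    import cbor  # third-party package, the same one used by the example function
except ImportError:
    cbor = None  # lets the JSON path work where cbor is not installed


def decode_event(event):
    """Decode an event payload, trying JSON first and CBOR second."""
    try:
        return json.loads(event)
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError
        # subclasses, so this covers payloads that are not valid JSON text.
        if cbor is None:
            raise
        return cbor.loads(event)
```

In the handler, records = decode_event(event) would then replace the line that calls json.loads or cbor.loads directly.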
FAQ
If you cannot create a Tablestore trigger in a region, check the regions that support Tablestore triggers. For more information, see Considerations.
If you cannot find a created Tablestore data table when you create a Tablestore trigger, check whether the data table resides in the same region as the associated service in Function Compute.
In most cases, if an error that indicates a client cancels invocation is repeatedly reported when you use a Tablestore trigger, the timeout period configured for function execution on the client is shorter than the function execution duration. We recommend that you increase the timeout period on the client. For more information, see What do I do if a client disconnects and an error that indicates "Invocation canceled by client" is reported?
If data is added to a Tablestore table but the associated Tablestore trigger is not triggered, you can troubleshoot the issue by performing the following steps. For more information about how to troubleshoot issues related to triggers that cannot be triggered, see What do I do if a trigger cannot trigger a function?
Check whether the Stream feature is enabled for the data table. For more information, see Enable the Stream feature for the data table.
Check whether the role that you configured when you created the trigger is correct. You can use the default trigger role AliyunTableStoreStreamNotificationRole. For more information, see Create a Tablestore trigger.
View the function execution logs to check whether the function failed to be executed. If a function fails to be executed, the function is retried until the log data in Tablestore expires.
Appendix: Grant Function Compute the permissions to access Tablestore
To process data in Tablestore, a function must be able to access Tablestore, so you must grant the required permissions to the function. For coarse-grained authorization, you can select the default service role AliyunFCDefaultRole that is provided by Function Compute. For fine-grained authorization, you must grant other roles and the corresponding access policies to the function.
Use the default RAM role
Grant the AliyunOTSFullAccess permission (the permission to manage Tablestore) to the AliyunFCDefaultRole role. For more information, see Grant permissions to a RAM role.
Note: AliyunFCDefaultRole is the default service role of Function Compute, but it does not include the permissions to access Tablestore.
The first time you use the RAM role, you must grant the RAM role the permissions to manage Tablestore.
If the RAM role already has the permissions to manage Tablestore, skip this step.
Use a custom RAM role
For more information, see Example: Grant Function Compute the permissions to access OSS.