This topic describes how to use Function Compute to perform real-time computing on the incremental data of Tablestore.

Background information

Alibaba Cloud Function Compute is an event-driven computing service that allows you to focus on writing and uploading code without the need to manage servers. Function Compute prepares computing resources and runs your code in an elastic and reliable way. You are charged only for the resources that are consumed when the code runs. For more information, see What is Function Compute?. For examples of how to use Function Compute with Tablestore, see Examples on how to use Function Compute in Tablestore.

A Tablestore stream is a data channel that is used to obtain incremental data from Tablestore data tables. After you create a Tablestore trigger, the Tablestore stream is automatically connected to a function in Function Compute. This way, the custom logic in the function automatically processes the modifications that are made to the data in the Tablestore data table.

Scenarios

The following figure shows the tasks that you can perform by using Function Compute.

  • Data synchronization: You can use Function Compute to synchronize real-time data stored in Tablestore to data cache, search engines, or other database instances.
  • Data archiving: You can use Function Compute to incrementally archive data stored in Tablestore to OSS for cold archiving.
  • Event-driven application: You can create triggers to trigger functions to call API operations provided by IoT Hub. You can also create triggers to send notifications.
fig_fuc001

Configure a Tablestore trigger

You can create a Tablestore trigger in the Tablestore console to process the real-time data stream generated by the incremental data in a Tablestore data table.

  1. Create a data table and enable the stream feature for the data table.
    1. Log on to the Tablestore console and create an instance.
    2. Create a data table in the created instance and enable the stream feature for the table.
  2. Create a function.
    1. Log on to the Function Compute console.
    2. In the left-side navigation pane, click Services and Functions.
    3. On the Services and Functions page, click Create Function.
    4. On the Create Function page, click Event Function and then click Configure and Deploy.
    5. Configure the parameters based on your requirements, and then click Create.
      fig_newfunction
  3. Configure the service role.
    1. In the left-side navigation pane, click Services and Functions.
    2. On the Services and Functions page, click the Service Configurations tab.
    3. On the Service Configurations tab, configure the role used to authorize the function to collect logs and access other resources. For more information, see Grant permissions to a RAM user by using an Alibaba Cloud account.
  4. Create and test a Tablestore trigger.
    Note The first time you use Function Compute in the Tablestore console, authorize Tablestore to send event notifications in the previous version of the Tablestore console. On the Overview page of the Tablestore console, click Old Version to go to the previous version of the Tablestore console.
    1. On the Trigger tab of the table, click Use Existing Function Compute.
    2. In the Create Trigger dialog box, select Function Compute and the function, and enter the name of the trigger.
      Note The first time you create a trigger in the previous version of the Tablestore console, click Grant Tablestore the permission to send event notifications.

      After Tablestore is granted the permissions to send event notifications, you can view the AliyunTableStoreStreamNotificationRole role that is automatically created in the RAM console.

    3. Click OK.

Process data

  • Data formats

    A Tablestore trigger encodes the incremental data in the CBOR format to create a Function Compute event. The following example shows the format of the incremental data:

    {
        "Version": "string",
        "Records": [
            {
                "Type": "string",
                "Info": {
                    "Timestamp": int64
                },
                "PrimaryKey": [
                    {
                        "ColumnName": "string",
                        "Value": formated_value
                    }
                ],
                "Columns": [
                    {
                        "Type": "string",
                        "ColumnName": "string",
                        "Value": formated_value,
                        "Timestamp": int64
                    }
                ]
            }
        ]
    }
  • Elements

    The following table describes the elements included in the preceding format.

    Element Description
    Version The version of the payload, which is Sync-v1. Data type: string.
    Records The array that contains the rows of incremental data in the data table. Each record includes the following members:
    • Type: the type of the operation performed on the row. Valid values: PutRow, UpdateRow, and DeleteRow. Data type: string.
    • Info: includes the Timestamp member, which indicates the time when the row was last modified. The value of Timestamp is in UTC. Data type: int64.
    PrimaryKey The array that stores the primary key columns. Each member includes the following fields:
    • ColumnName: the name of the primary key column. Data type: string.
    • Value: the content of the primary key column. Data type: formated_value. Valid values: integer, string, and blob.
    Columns The array that stores the attribute columns. Each member includes the following fields:
    • Type: the type of the operation performed on the attribute column. Valid values: Put, DeleteOneVersion, and DeleteAllVersions. Data type: string.
    • ColumnName: the name of the attribute column. Data type: string.
    • Value: the content of the attribute column. Data type: formated_value. Valid values: integer, boolean, double, string, and blob.
    • Timestamp: the time when the attribute column was last modified. The value of Timestamp is in UTC. Data type: int64.
  • Sample data
    {
        "Version": "Sync-v1",
        "Records": [
            {
                "Type": "PutRow",
                "Info": {
                    "Timestamp": 1506416585740836
                },
                "PrimaryKey": [
                    {
                        "ColumnName": "pk_0",
                        "Value": 1506416585881590900
                    },
                    {
                        "ColumnName": "pk_1",
                        "Value": "2017-09-26 17:03:05.8815909 +0800 CST"
                    },
                    {
                        "ColumnName": "pk_2",
                        "Value": 1506416585741000
                    }
                ],
                "Columns": [
                    {
                        "Type": "Put",
                        "ColumnName": "attr_0",
                        "Value": "hello_table_store",
                        "Timestamp": 1506416585741
                    },
                    {
                        "Type": "Put",
                        "ColumnName": "attr_1",
                        "Value": 1506416585881590900,
                        "Timestamp": 1506416585741
                    }
                ]
            }
        ]
    }
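
    Once the event has been decoded, the elements described above can be consumed with plain dictionary access. The following sketch is illustrative and not part of any Tablestore SDK: it summarizes one record as an operation type, a primary key dict, and a flat change set, and converts the row modification time. The sample values suggest that Info.Timestamp is in microseconds while the per-column Timestamp is in milliseconds; treat that as an observation from the sample data, not a documented guarantee.

```python
from datetime import datetime, timezone

def summarize_record(record):
    # Branch on the operation types described above: PutRow/UpdateRow/DeleteRow
    # for the row, and Put/DeleteOneVersion/DeleteAllVersions for each column.
    op = record["Type"]
    pk = {p["ColumnName"]: p["Value"] for p in record["PrimaryKey"]}
    changes = {}
    for col in record.get("Columns", []):
        if col["Type"] == "Put":
            changes[col["ColumnName"]] = col["Value"]
        else:  # DeleteOneVersion / DeleteAllVersions: value is removed
            changes[col["ColumnName"]] = None
    # Assumption: the sample's Info.Timestamp magnitude suggests microseconds.
    modified_at = datetime.fromtimestamp(
        record["Info"]["Timestamp"] / 1_000_000, tz=timezone.utc)
    return op, pk, changes, modified_at

# A record shaped like the sample data above.
record = {
    "Type": "PutRow",
    "Info": {"Timestamp": 1506416585740836},
    "PrimaryKey": [{"ColumnName": "pk_0", "Value": 1506416585881590900}],
    "Columns": [
        {"Type": "Put", "ColumnName": "attr_0",
         "Value": "hello_table_store", "Timestamp": 1506416585741},
    ],
}
op, pk, changes, modified_at = summarize_record(record)
print(op, pk, changes, modified_at.date())
```

    This kind of flattening is a common first step before synchronizing the change to a cache, search engine, or another database.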

Debug functions online

Function Compute allows you to debug functions online. You can create an event to trigger a function and test whether the function logic is implemented as expected.

Tablestore events that trigger Function Compute are in the CBOR format, which is a JSON-like binary format. Therefore, you can debug a function online by using the following methods:

  1. Add both "import cbor" and "import json" to the code.
  2. On the Services and Functions page, click the name of the function that you want to debug.
  3. On the details page of the function, click the Code tab.
  4. On the Code tab, click Event. In the Test Event panel that appears, select Custom. Copy the preceding sample data to the editor and modify the data based on your requirements. Click OK.
  5. Test the function.
    1. Add records = json.loads(event) to the code to process the custom event. Click Save and Invoke. Check whether the results are returned as expected.
    2. After you finish testing with records = json.loads(event), change this line of code back to records = cbor.loads(event) and click Save. This way, the function logic runs on the real CBOR events that are generated when data is written to the Tablestore data table.
    fig_functionname_001
    Sample code:
    import logging
    import json

    import cbor

    def get_attribute_value(record, column):
        # Return the value of the specified attribute column, or None if it is absent.
        for x in record['Columns']:
            if x['ColumnName'] == column:
                return x['Value']

    def get_pk_value(record, column):
        # Return the value of the specified primary key column, or None if it is absent.
        for x in record['PrimaryKey']:
            if x['ColumnName'] == column:
                return x['Value']

    def handler(event, context):
        logger = logging.getLogger()
        logger.info("Begin to handle event")
        # Use cbor.loads(event) for real trigger events, and json.loads(event)
        # when testing with a custom JSON event in the console.
        # records = cbor.loads(event)
        records = json.loads(event)
        for record in records['Records']:
            logger.info("Handle record: %s", record)
            pk_0 = get_pk_value(record, "pk_0")
            attr_0 = get_attribute_value(record, "attr_0")
        return 'OK'
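
    Because step 5 above switches between json.loads and cbor.loads, the same decoding logic can be smoke-tested locally without the Function Compute runtime: encode a custom event as a JSON string and decode it the way the function does. The following sketch is self-contained and illustrative; the event shape follows the sample data, and the values are placeholders.

```python
import json

# Build a custom test event as a JSON string, mirroring the trigger payload.
event = json.dumps({
    "Version": "Sync-v1",
    "Records": [{
        "Type": "PutRow",
        "Info": {"Timestamp": 1506416585740836},
        "PrimaryKey": [{"ColumnName": "pk_0", "Value": 1}],
        "Columns": [{"Type": "Put", "ColumnName": "attr_0",
                     "Value": "hello", "Timestamp": 1506416585741}],
    }],
})

# Decode the event the same way the handler does in testing mode.
records = json.loads(event)
for record in records["Records"]:
    pk_0 = next(c["Value"] for c in record["PrimaryKey"]
                if c["ColumnName"] == "pk_0")
    attr_0 = next(c["Value"] for c in record["Columns"]
                  if c["ColumnName"] == "attr_0")
print(pk_0, attr_0)
```

    If this prints the expected values, switching records = json.loads(event) back to records = cbor.loads(event) is the only change needed before deploying the function against real trigger events.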