When you use Function Compute to consume log data, you can use a function template provided by Log Service or a custom function. This topic describes how to create a custom function.

Event

The event of a function is a string serialized from a JSON object. The event contains the parameters that are required to run the function.

  • Parameters
    • jobName: the name of the extract, transform, load (ETL) job in Log Service. An ETL job in Log Service corresponds to a trigger in Function Compute.
    • taskId: the identifier of a function call within the ETL job.
    • cursorTime: the UNIX timestamp of the last log entry received by Log Service in a function call.
    • source: the range of data to be consumed in a function call. Log Service generates the value of this parameter for each function call. The value contains the following fields:
      • endpoint: the endpoint of the region where the project resides. For more information, see Endpoints.
      • projectName: the name of the project.
      • logstoreName: the name of the Logstore.
      • shardId: the ID of a shard in the Logstore.
      • beginCursor: the position from which Function Compute starts to consume data in the specified shard.
      • endCursor: the position at which Function Compute stops consuming data in the specified shard.
        Note The consumption interval is left-closed and right-open, in the format of [beginCursor,endCursor).
    • parameter: the value of the Function Configuration parameter that you specify when you create the trigger. The value is a JSON object and is parsed when the custom ETL function is called. For more information, see Create a trigger.
  • Example
    {
        "source": {
            "endpoint": "http://cn-shanghai-intranet.log.aliyuncs.com", 
            "projectName": "fc-****************", 
            "logstoreName": "demo", 
            "shardId": 0, 
            "beginCursor": "MTUwNTM5MDI3NTY1ODcwNzU2Ng==", 
            "endCursor": "MTUwNTM5MDI3NTY1ODcwNzU2OA=="
        }, 
        "parameter": {
            ...
        }, 
        "jobName": "fedad35f51a2a97b466da57fd71f315f539d2234", 
        "taskId": "9bc06c96-e364-4f41-85eb-b6e579214ae4",
        "cursorTime": 1511429883
    }

    If you want to debug a custom function, you can call the GetCursor operation to obtain cursors. Then, you can create an event in the preceding format to debug the function.
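
    The following minimal sketch shows one way to build such a debugging event with the Log Service SDK for Java. The endpoint, AccessKey pair, project, Logstore, and timestamps are placeholders, and the jobName and taskId values are hypothetical debugging labels.

    import com.aliyun.openservices.log.Client;
    import com.aliyun.openservices.log.exception.LogException;

    public class BuildDebugEvent {
        public static void main(String[] args) throws LogException {
            // Placeholders: replace with your endpoint, AccessKey pair, project, and Logstore.
            Client client = new Client("http://cn-shanghai.log.aliyuncs.com",
                    "<yourAccessKeyId>", "<yourAccessKeySecret>");
            String project = "fc-****************";
            String logstore = "demo";
            int shardId = 0;

            // Call GetCursor to obtain the cursors that bound the data you want to replay,
            // for example the data received between two UNIX timestamps.
            String beginCursor = client.GetCursor(project, logstore, shardId, 1511429283).GetCursor();
            String endCursor = client.GetCursor(project, logstore, shardId, 1511429883).GetCursor();

            // Assemble an event in the preceding format and use it as the test payload
            // when you invoke the function manually.
            String event = "{"
                    + "\"source\": {"
                    + "\"endpoint\": \"http://cn-shanghai-intranet.log.aliyuncs.com\","
                    + "\"projectName\": \"" + project + "\","
                    + "\"logstoreName\": \"" + logstore + "\","
                    + "\"shardId\": " + shardId + ","
                    + "\"beginCursor\": \"" + beginCursor + "\","
                    + "\"endCursor\": \"" + endCursor + "\"},"
                    + "\"parameter\": {},"
                    + "\"jobName\": \"debug-job\","
                    + "\"taskId\": \"debug-task\","
                    + "\"cursorTime\": 1511429883}";
            System.out.println(event);
        }
    }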

Function development

Log Service provides SDKs for various programming languages, such as Java, Python, and Node.js. You can use these SDKs to develop custom functions. For more information, see SDK reference.

The following example uses Java Runtime Environment 8 (JRE 8) to show how to develop an ETL function for Log Service. For more information about how to develop functions in JRE 8, see Java programming guide for Function Compute.

  • Java function template

    Log Service provides a custom function template that is based on JRE 8. You can use the template to develop functions. For more information, see Custom function template. A simplified sketch of the handler logic is provided after the notes at the end of this list.

    The function template has the following features:
    • Parses the source, taskId, and jobName fields of events.
    • Uses Log Service SDK for Java to obtain data based on the source parameter of events and calls the processData method to process the data. For more information, see Log Service SDK for Java.
    You can also use the template to perform the following operations:
    • Use the UserDefinedFunctionParameter.java file to parse the parameter field of events.
    • Use the processData method in the UserDefinedFunction.java file to define the business logic.
    • Replace UserDefinedFunction with a name that can help you identify the ETL function.
  • processData method

    You can use the processData method to consume, transform, and ship data. For example, in the LogstoreReplication.java file, the method is used to read data from a Logstore and write the data to another Logstore. A simplified sketch of this pattern follows the notes below.

    Note
    • If the processData method processes the data successfully, it returns true. If the method fails to process the data, it returns false. In this case, the ETL function continues to run, Log Service considers the ETL task successful, and the data that failed to be processed is ignored.
    • If a serious error or logic exception occurs and the method throws an exception, the ETL function stops running. In this case, Log Service considers that an exception occurs in the ETL function and calls the function again based on the rule that is set in the corresponding ETL job.
    • If data is written to or read from shards at a high speed, allocate sufficient memory resources to run the function. This way, out-of-memory (OOM) errors do not occur when the function is running.
    • If the function execution is time-consuming or data is written to or read from shards at a high speed, specify a short call interval and a long timeout period for the function.
    • Grant the required permissions to Function Compute. For example, if Function Compute needs to write data to Object Storage Service (OSS), you must grant the write permissions on OSS to Function Compute.
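
    The following minimal sketch outlines the handler logic that the function template implements: it reads the event from the input stream, parses the jobName, taskId, source, and parameter fields, and then writes an execution result to the output stream. The class name is hypothetical, and the fastjson library is only one possible choice for JSON parsing.

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.charset.StandardCharsets;

    import com.alibaba.fastjson.JSON;
    import com.alibaba.fastjson.JSONObject;
    import com.aliyun.fc.runtime.Context;
    import com.aliyun.fc.runtime.StreamRequestHandler;

    public class EtlHandlerSketch implements StreamRequestHandler {

        @Override
        public void handleRequest(InputStream input, OutputStream output, Context context)
                throws IOException {
            // Read the event string that the Log Service trigger passes to the function.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int len;
            while ((len = input.read(chunk)) > 0) {
                buffer.write(chunk, 0, len);
            }
            JSONObject event = JSON.parseObject(new String(buffer.toByteArray(), StandardCharsets.UTF_8));

            // Parse the fields described in the Event section.
            String jobName = event.getString("jobName");
            String taskId = event.getString("taskId");
            JSONObject parameter = event.getJSONObject("parameter");
            JSONObject source = event.getJSONObject("source");
            String endpoint = source.getString("endpoint");
            String project = source.getString("projectName");
            String logstore = source.getString("logstoreName");
            int shardId = source.getIntValue("shardId");
            String beginCursor = source.getString("beginCursor");
            String endCursor = source.getString("endCursor");

            context.getLogger().info("job=" + jobName + ", task=" + taskId + ", endpoint=" + endpoint
                    + ", logstore=" + project + "/" + logstore + ", shard=" + shardId
                    + ", range=[" + beginCursor + "," + endCursor + "), parameter=" + parameter);

            // Pull the data in [beginCursor, endCursor) with the Log Service SDK and pass it
            // to processData; see the next sketch. Then write the execution result to the
            // output stream so that Log Service records it in the ETL scheduling log.
            output.write("{\"status\":\"success\"}".getBytes(StandardCharsets.UTF_8));
        }
    }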
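
    The next sketch shows a processData implementation in the spirit of LogstoreReplication.java: it pulls the log groups in [beginCursor, endCursor) from the source shard and writes them to another Logstore. The method signature, the client parameters, and the target project and Logstore names are assumptions for illustration; the exact SDK method overloads may differ across versions.

    import java.util.ArrayList;
    import java.util.List;

    import com.aliyun.openservices.log.Client;
    import com.aliyun.openservices.log.common.FastLog;
    import com.aliyun.openservices.log.common.FastLogContent;
    import com.aliyun.openservices.log.common.FastLogGroup;
    import com.aliyun.openservices.log.common.LogGroupData;
    import com.aliyun.openservices.log.common.LogItem;
    import com.aliyun.openservices.log.exception.LogException;
    import com.aliyun.openservices.log.response.BatchGetLogResponse;

    public class LogstoreReplicationSketch {

        public boolean processData(Client sourceClient, String project, String logstore,
                                   int shardId, String beginCursor, String endCursor,
                                   Client targetClient, String targetProject, String targetLogstore) {
            String cursor = beginCursor;
            try {
                // Consume the left-closed, right-open range [beginCursor, endCursor).
                while (!cursor.equals(endCursor)) {
                    BatchGetLogResponse response =
                            sourceClient.BatchGetLog(project, logstore, shardId, 1000, cursor, endCursor);

                    // Unpack the pulled log groups into LogItem objects for rewriting.
                    List<LogItem> items = new ArrayList<LogItem>();
                    for (LogGroupData data : response.GetLogGroups()) {
                        FastLogGroup group = data.GetFastLogGroup();
                        for (int i = 0; i < group.getLogsCount(); i++) {
                            FastLog log = group.getLogs(i);
                            LogItem item = new LogItem(log.getTime());
                            for (int j = 0; j < log.getContentsCount(); j++) {
                                FastLogContent content = log.getContents(j);
                                item.PushBack(content.getKey(), content.getValue());
                            }
                            items.add(item);
                        }
                    }
                    if (!items.isEmpty()) {
                        targetClient.PutLogs(targetProject, targetLogstore, "", items, "");
                    }

                    String nextCursor = response.GetNextCursor();
                    if (nextCursor.equals(cursor)) {
                        break;                    // no progress; avoid an infinite loop
                    }
                    cursor = nextCursor;
                }
                return true;                      // the data was processed successfully
            } catch (LogException e) {
                // Returning false makes the ETL function continue and the failed data is ignored.
                // Throw a RuntimeException instead if Log Service must retry the task.
                return false;
            }
        }
    }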

ETL logs

  • ETL scheduling logs

    ETL scheduling logs record the start time and end time of an ETL task, whether the task is successful, and the information returned when the task is successful. If an error occurs, an ETL error log entry is generated and an email or text message is sent to notify the system administrator. When you create a trigger, specify a Logstore to store trigger logs, enable the indexing feature, and configure indexes for the Logstore. For more information, see Configure indexes.

    An ETL function returns its execution result by writing to its output. For example, an ETL function that runs on JRE 8 writes its execution result to the outputStream parameter of the handler. The execution result of an ETL function that is developed based on the function template of Log Service is a string serialized from a JSON object. This string is recorded in the ETL scheduling log, where you can search for and analyze it.

  • ETL processing logs

    ETL processing logs record the key information and errors of each step in an execution, including the start time, end time, initialization status, and errors of each step. These logs help you monitor the execution of the ETL function and locate errors at the earliest opportunity.

    You can use the context.getLogger() method to record ETL processing logs and store the logs in a Logstore of a specified project. We recommend that you enable the indexing feature for the Logstore.
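
    The following minimal sketch shows how ETL processing logs can be recorded with context.getLogger(); the class name, step names, and messages are illustrative.

    import com.aliyun.fc.runtime.Context;
    import com.aliyun.fc.runtime.FunctionComputeLogger;

    public class EtlProcessingLogSketch {

        public void logSteps(Context context) {
            FunctionComputeLogger logger = context.getLogger();
            // Record the initialization status and the key information of each step.
            logger.info("init: Log Service clients created");
            logger.info("step: pulled 1000 log groups from shard 0");
            // Record recoverable issues and errors so that they can be located quickly.
            logger.warn("step: PutLogs throttled, retrying");
            logger.error("step: PutLogs failed after 3 retries");
        }
    }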