You can use Function Compute to consume log data. Depending on your scenario, you can consume data with a function template provided by Log Service or with a custom function. This topic describes how to create a custom function.

Function event

A function event is a string serialized from a JSON object. It includes parameters used to run functions.

  • Parameters
    • jobName: the name of an ETL job in Log Service. An ETL job in Log Service corresponds to a trigger in Function Compute.
    • taskId: the identifier of a function call.
    • cursorTime: the UNIX timestamp of the last log entry that Log Service received for a function call.
    • source: the range of data to consume in a function call. Log Service generates this field on a regular basis for each function call. It contains the following fields:
      • endpoint: the endpoint of the region where the project resides. For more information, see Endpoints.
      • projectName: the name of the project.
      • logstoreName: the name of the Logstore.
      • shardId: the ID of a shard in the Logstore.
      • beginCursor: the position at which data consumption starts.
      • endCursor: the position at which data consumption ends.
        Note The consumption interval is left-closed and right-open: [beginCursor, endCursor).
    • parameter: the configurations in the Function Configuration field of the trigger. The value is a JSON object. This field is parsed when the ETL function is invoked. For more information, see Create a trigger.
  • Examples
    {
        "source": {
            "endpoint": "http://cn-shanghai-intranet.log.aliyuncs.com", 
            "projectName": "fc-****************", 
            "logstoreName": "demo", 
            "shardId": 0, 
            "beginCursor": "MTUwNTM5MDI3NTY1ODcwNzU2Ng==", 
            "endCursor": "MTUwNTM5MDI3NTY1ODcwNzU2OA=="
        }, 
        "parameter": {
            ...
        }, 
        "jobName": "fedad35f51a2a97b466da57fd71f315f539d2234", 
        "taskId": "9bc06c96-e364-4f41-85eb-b6e579214ae4",
        "cursorTime": 1511429883
    }

    If you want to debug a custom function, you can call the GetCursor operation to obtain the begin cursor and end cursor. Then you can create a function event in the preceding format to debug the function.
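
The following is a minimal Python sketch of building and parsing a function event in the preceding format for local debugging. The helper names build_debug_event and parse_event, and the placeholder defaults for jobName, taskId, and cursorTime, are assumptions made for illustration; they are not part of any Log Service SDK. The cursors are copied from the example above, and in practice you would obtain them by calling GetCursor.

```python
import json

# Fields that the source object of a function event is expected to carry.
SOURCE_FIELDS = ("endpoint", "projectName", "logstoreName",
                 "shardId", "beginCursor", "endCursor")


def build_debug_event(source, parameter=None, job_name="debug-job",
                      task_id="debug-task", cursor_time=0):
    """Serialize a function event for local debugging.

    The job_name, task_id, and cursor_time defaults are placeholders.
    """
    missing = [name for name in SOURCE_FIELDS if name not in source]
    if missing:
        raise ValueError("source is missing: " + ", ".join(missing))
    return json.dumps({
        "source": source,
        "parameter": parameter or {},
        "jobName": job_name,
        "taskId": task_id,
        "cursorTime": cursor_time,
    })


def parse_event(event):
    """Deserialize a function event passed as str or bytes."""
    if isinstance(event, bytes):
        event = event.decode("utf-8")
    data = json.loads(event)
    source = data["source"]
    return {
        "source": source,
        # Data to consume lies in [beginCursor, endCursor).
        "cursor_range": (source["beginCursor"], source["endCursor"]),
        "parameter": data.get("parameter", {}),
        "job_name": data.get("jobName"),
        "task_id": data.get("taskId"),
        "cursor_time": data.get("cursorTime"),
    }


event = build_debug_event({
    "endpoint": "http://cn-shanghai-intranet.log.aliyuncs.com",
    "projectName": "fc-****************",
    "logstoreName": "demo",
    "shardId": 0,
    "beginCursor": "MTUwNTM5MDI3NTY1ODcwNzU2Ng==",
    "endCursor": "MTUwNTM5MDI3NTY1ODcwNzU2OA==",
})
parsed = parse_event(event.encode("utf-8"))
print(parsed["source"]["logstoreName"], parsed["cursor_range"])
```

The serialized string returned by build_debug_event can be passed to the function under test as its event payload.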

Function development

You can use Java, Python, Node.js, and other programming languages to develop custom functions. Log Service provides SDKs for the corresponding runtime environments, and you can use these SDKs to develop functions. For more information, see SDK reference.

The following example uses Java Runtime Environment 8 (JRE 8) to show how to develop ETL functions for Log Service. For more information about how to use Java 8 to develop functions, see Java programming guide for Function Compute.

  • Java function template

    Log Service provides a custom function template that is based on JRE 8. You can use the template to develop functions.

    The function template has the following features:
    • Parses the source, taskId, and jobName fields of function events.
    • Uses the Log Service SDK for Java to obtain data based on the source parameter of function events and invokes the processData method to process the data.
    You can also use the template to perform the following operations:
    • Use the UserDefinedFunctionParameter.java file to parse the parameter field of function events.
    • Use the processData method in the UserDefinedFunction.java file to process data.
    • Replace UserDefinedFunction with a descriptive name for your ETL function.
  • processData method

    You can use the processData method to consume, transform and ship data. For example, in the LogstoreReplication.java file, the method is used to read data from a Logstore and write the data to another Logstore.

    Note
    • The processData method returns true if data is processed and false if data fails to be processed. If false is returned, the ETL function continues to run and Log Service still considers the ETL task successful. The data that failed to be processed is ignored.
    • If the function throws an exception for a serious error or rule exception (for example, by using throw Exception), the ETL function stops running. In this case, Log Service considers that the ETL function has failed and invokes the function again based on the rule set in the corresponding ETL job.
    • If data is written to or read from shards at a high rate, allocate sufficient memory resources to run the function. This prevents out-of-memory (OOM) errors when the function is running.
    • If the function takes a long time to run or data is written to or read from shards at a high rate, set a relatively short invocation interval and a relatively long timeout period for the function.
    • Grant relevant permissions to Function Compute. For example, if Function Compute needs to write data to Object Storage Service (OSS), you must grant write permissions on OSS to Function Compute.
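
The difference between returning false and throwing an exception, as described in the preceding notes, can be sketched in Python as follows. This is an illustration only, not the Java template: process_data, handle_task, FatalEtlError, and the "message" field are hypothetical names, and a plain list stands in for the destination Logstore.

```python
class FatalEtlError(Exception):
    """Hypothetical exception for errors that should stop the ETL function."""


def process_data(logs, target):
    """Hypothetical counterpart of processData.

    Returns True if every log was shipped and False if some logs were skipped.
    Raises FatalEtlError when processing cannot continue at all.
    """
    if target is None:
        # Serious error: stop the function so that the task is reported as failed.
        raise FatalEtlError("no destination configured")
    all_ok = True
    for log in logs:
        if "message" not in log:
            # Recoverable problem: skip the log and report False to the caller.
            all_ok = False
            continue
        target.append(log)
    return all_ok


def handle_task(logs, target):
    """Run one task; a False result does not fail the task."""
    ok = process_data(logs, target)
    # Returning normally means the task counts as successful even if ok is False.
    # An exception raised by process_data propagates and fails the task instead.
    return {"processDataResult": ok, "shipped": len(target)}


destination = []
summary = handle_task([{"message": "a"}, {"level": "warn"}], destination)
print(summary)
```

In this sketch, malformed logs are dropped silently apart from the False result, which mirrors the note that exception data is ignored. Raise an exception only when rerunning the whole task is the desired outcome.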

ETL logs

  • ETL scheduling logs

    ETL scheduling logs record the start time and end time of an ETL task, whether the task is successful, and the information returned when the task is successful. If an error occurs, an ETL error log entry is generated and an alert email or SMS message is sent to the system administrator. When you create a trigger, specify a Logstore to store trigger logs, enable the indexing feature, and configure indexes for the Logstore. For more information, see Enable and configure the index feature for a Logstore.

    An ETL function can return its execution results. For example, an ETL function that is based on JRE 8 can write its execution results to outputStream. The execution result of an ETL function developed from the function template of Log Service is a string serialized from a JSON object. This string is recorded in the ETL task scheduling log, where you can search and analyze it efficiently.
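
As a rough illustration of returning a result as a string serialized from a JSON object, consider the following Python sketch. The build_result helper and the field names taskId, ingestLines, shipLines, success, and error are assumptions chosen for this example; they are not the fields that the Log Service template actually emits.

```python
import json


def build_result(task_id, ingest_lines, ship_lines, error=None):
    """Serialize a task summary to a JSON string (hypothetical field names)."""
    result = {
        "taskId": task_id,
        "ingestLines": ingest_lines,
        "shipLines": ship_lines,
        "success": error is None,
    }
    if error is not None:
        result["error"] = str(error)
    # Sorted keys keep the output stable, which makes log entries easier to compare.
    return json.dumps(result, sort_keys=True)


result_string = build_result("9bc06c96-e364-4f41-85eb-b6e579214ae4", 120, 118)
print(result_string)
```

Because the result is structured, each field can be indexed separately in the Logstore that stores the trigger logs.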

  • ETL processing logs

    ETL processing logs record the key information and errors of each execution step. The logs record the start time, end time, initialization status, and errors of each step. You can use ETL processing logs to monitor ETL function execution and locate errors in a timely manner.

    You can use the context.getLogger() method to record ETL processing logs and store the logs in a Logstore of a specified project. We recommend that you enable the indexing feature for the Logstore.
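
One way to record the start, end, and errors of each step is sketched below in Python. It assumes that the standard logging module plays the role that context.getLogger() plays in the Java runtime. The logged_step helper, the logger name "etl", and the key=value message layout are hypothetical choices for this example.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl")  # hypothetical logger name


@contextmanager
def logged_step(name):
    """Log the start, the end with elapsed time, or the error of one step."""
    start = time.time()
    logger.info("step=%s status=start", name)
    try:
        yield
    except Exception as exc:
        logger.error("step=%s status=error error=%s", name, exc)
        raise  # let the caller decide whether the task fails
    else:
        elapsed_ms = int((time.time() - start) * 1000)
        logger.info("step=%s status=end elapsed_ms=%d", name, elapsed_ms)


with logged_step("pull_logs"):
    pulled = ["log-1", "log-2"]  # stand-in for reading data from a shard
```

Consistent key=value pairs such as step and status make these entries straightforward to query after indexing is enabled for the Logstore.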