Simple Log Service: Create a custom function

Function event

When you use Function Compute to process log data, you must configure the function's entry parameter (the function event) in Step 2. The event is a serialized JSON string.

Parameters

Parameter	Description
jobName	The name of the Log Service ETL job. Each Log Service trigger in Function Compute corresponds to an ETL job in Log Service.
taskId	For an ETL job, the `taskId` uniquely identifies a specific function invocation.
cursorTime	The UNIX timestamp, in seconds, at which the last log record in the current function invocation arrived at the Log Service server.
source	Generated by Log Service. Log Service periodically triggers function execution based on the task interval defined in the ETL job. The `source` field is a critical component of the function event and defines the consumption scope for the current function invocation. It contains the following fields:
- endpoint: The endpoint for the region where the Project is located. For more information, see Endpoints.
- projectName: The name of the Project.
- logstoreName: The name of the LogStore.
- shardId: A specific shard within the LogStore.
- beginCursor: The starting position for data consumption in the shard.
- endCursor: The ending position for data consumption in the shard.
Note: The data range `[beginCursor, endCursor)` is inclusive of `beginCursor` and exclusive of `endCursor`.
parameter	A JSON object that you set in the Function Configurations section when you create a trigger. Your custom Log Service ETL function can parse this field at runtime to obtain required parameters. For more information, see Log Service triggers.

Example

{
    "source": {
        "endpoint": "http://cn-shanghai-intranet.log.aliyuncs.com", 
        "projectName": "fc-****************", 
        "logstoreName": "demo", 
        "shardId": 0, 
        "beginCursor": "MTUwNTM5MDI3NTY1ODcwNzU2Ng==", 
        "endCursor": "MTUwNTM5MDI3NTY1ODcwNzU2OA=="
    }, 
    "parameter": {
        ...
    }, 
    "jobName": "fedad35f51a2a97b466da57fd71f315f539d2234", 
    "taskId": "9bc06c96-e364-4f41-85eb-b6e579214ae4",
    "cursorTime": 1511429883
}

When you debug a function, you can call the GetCursorByTime operation to obtain a cursor and build a function event for testing based on the preceding example.
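
A minimal sketch of that debugging workflow, written in Python (one of the supported runtimes) so that it needs only the standard library. It assumes the two cursor strings were already obtained separately, for example from GetCursorByTime. The helper names `build_test_event` and `parse_event`, the project name `my-project`, and the placeholder `jobName`, `taskId`, and `cursorTime` values are illustrative assumptions, not part of Log Service.

```python
import json


def build_test_event(endpoint, project, logstore, shard_id,
                     begin_cursor, end_cursor, parameter=None):
    """Assemble a function event shaped like the example above, serialized as JSON."""
    event = {
        "source": {
            "endpoint": endpoint,
            "projectName": project,
            "logstoreName": logstore,
            "shardId": shard_id,
            "beginCursor": begin_cursor,  # inclusive
            "endCursor": end_cursor,      # exclusive
        },
        "parameter": parameter or {},
        # Placeholder identifiers for local testing only (hypothetical values).
        "jobName": "local-test-job",
        "taskId": "local-test-task",
        "cursorTime": 0,
    }
    return json.dumps(event)


def parse_event(raw_event):
    """Decode the serialized event (str or bytes) and extract the main fields."""
    event = json.loads(raw_event)
    source = event["source"]
    return {
        "job_name": event["jobName"],
        "task_id": event["taskId"],
        "cursor_time": event["cursorTime"],
        "shard": (source["projectName"], source["logstoreName"], source["shardId"]),
        "cursor_range": (source["beginCursor"], source["endCursor"]),
        "parameter": event.get("parameter", {}),
    }


if __name__ == "__main__":
    raw = build_test_event(
        "http://cn-shanghai-intranet.log.aliyuncs.com", "my-project", "demo", 0,
        "MTUwNTM5MDI3NTY1ODcwNzU2Ng==", "MTUwNTM5MDI3NTY1ODcwNzU2OA==",
    )
    print(parse_event(raw))
```

The serialized string returned by `build_test_event` can be pasted into a test invocation, and `parse_event` shows the fields a function would typically read first.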

Function development

You can develop functions in various languages, such as Java, Python, and Node.js. Log Service provides SDKs for the corresponding runtimes to simplify integration. For more information, see SDK reference.

The following section uses the Java 8 runtime to demonstrate how to develop a Log Service ETL function. For more information about programming functions in Java 8, see the Function Compute Java programming guide.

Java function template

Log Service provides a custom data template based on the Java 8 runtime. You can adapt this template to meet your requirements.

The template implements the following features:
- Parses the source, taskId, and jobName fields from the function event.
- Fetches data from the source using the Log Service Java SDK and calls the processData interface to process each data batch.

You must implement the following in the template:
- Parse the parameter field from the function event. Implement this logic in UserDefinedFunctionParameter.java.
- Implement your data processing logic in the processData interface of UserDefinedFunction.java.
- Replace UserDefinedFunction with a descriptive name for your function.

Implement the processData interface

Within the processData interface, you can consume, transform, and deliver a batch of data. For example, the LogStoreReplication sample reads data from one LogStore and writes it to another.
Note
- processData returns true if it processes the data successfully, and false if it encounters a non-retriable error. When it returns false, the function continues to run, and Log Service considers the ETL task successful and ignores the data that failed to be processed.
- If a fatal error occurs or your business logic requires early termination, throw an exception to exit the function. Log Service detects the function failure and reinvokes the function according to the retry policy of the ETL job.
- If a shard has high throughput, configure a sufficient memory size for your function to prevent unexpected termination due to Out-of-Memory (OOM) errors.
- If your function performs time-consuming operations or a shard has high throughput, set a short function trigger interval and a long function execution timeout.
- Grant the function the necessary permissions. For example, if the function needs to write data to OSS, it must have OSS write permissions.
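
The return-value and exception rules in the note can be sketched as follows. This is an illustrative Python analogue of the contract, not the Java template itself: `process_data`, `run`, and `FatalProcessingError` are hypothetical names, and the `int(record["status"])` call merely stands in for a real transformation.

```python
class FatalProcessingError(Exception):
    """Hypothetical exception type: raised to abort the invocation so it is retried."""


def process_data(batch):
    """Handle one batch. Return True on success, False on a non-retriable error."""
    if batch is None:
        # Input the function cannot even inspect: treat as fatal and exit.
        raise FatalProcessingError("batch is missing")
    try:
        for record in batch:
            int(record["status"])  # stand-in for a real transformation
    except (KeyError, ValueError, TypeError):
        # Malformed data will not improve on retry, so report failure and move on.
        return False
    return True


def run(batches):
    """Drive process_data over all batches and tally the outcomes."""
    stats = {"processed": 0, "skipped": 0}
    for batch in batches:
        if process_data(batch):
            stats["processed"] += 1
        else:
            # The function keeps running; this batch is ignored.
            stats["skipped"] += 1
    return stats
```

Returning False keeps the invocation alive and skips only the failing batch, whereas raising the exception ends the invocation so that the retry policy of the ETL job can take over.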

ETL logs

ETL scheduling logs

Scheduling logs record the start time, end time, and success status of an ETL job, along with any information returned on success. If an ETL job fails, an error log is generated and an alert is sent to the system administrator. When you create a trigger, specify a LogStore for its logs and then enable and configure an index for that LogStore. For more information, see Create an index.

A function can return execution statistics. For example, in a Java 8 runtime function, you can write statistics to the outputStream. The function template provided by Log Service writes a serialized JSON object as a string. This string is recorded in the ETL scheduling logs, allowing for statistical analysis and queries.
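
As a rough illustration in Python, the sketch below returns statistics as a serialized JSON string. It assumes the handler's return value plays the same role as the outputStream in the Java 8 runtime, and the counter names `ingestLines` and `shipLines` are hypothetical placeholders rather than a schema required by Log Service.

```python
import json
import time


def handler(event, context):
    """Entry point sketch: return execution statistics as a serialized JSON string."""
    started = time.time()
    request = json.loads(event)
    # Hypothetical counters; a real function would update them while processing.
    stats = {
        "taskId": request.get("taskId"),
        "ingestLines": 0,
        "shipLines": 0,
        "elapsedSeconds": round(time.time() - started, 3),
    }
    return json.dumps(stats)
```

Keeping the statistics in a flat JSON object makes the individual counters easier to query once the LogStore is indexed.
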

ETL process logs

These logs record key checkpoints and error information for each step during an ETL execution, including step start and end times, initialization status, and module errors. Process logs help you monitor the ETL job's runtime status and quickly diagnose errors.

Use context.getLogger() to write process logs. The logs are stored in the LogStore that you specify in a Log Service Project. We recommend that you enable index-based queries for this LogStore.
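
The checkpoint pattern described above might look like the following Python sketch. It uses the standard `logging` module in place of the Java `context.getLogger()` call, on the assumption that the runtime forwards these records to the configured LogStore. The `run_step` helper, the `etl` logger name, and the `step=... status=...` message layout are illustrative choices.

```python
import logging
import time

logger = logging.getLogger("etl")


def run_step(name, step):
    """Run one step and log its start, end, duration, and any error."""
    logger.info("step=%s status=start", name)
    started = time.time()
    try:
        result = step()
    except Exception:
        # Record the traceback, then re-raise so the failure is not swallowed.
        logger.exception("step=%s status=error", name)
        raise
    logger.info("step=%s status=end elapsed=%.3fs", name, time.time() - started)
    return result
```

Writing each checkpoint as `key=value` pairs keeps the messages easy to filter by step name or status when the LogStore is queried.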