When you use Function Compute to process log data, you can use either the function templates provided by Log Service or your own custom functions. This topic shows how to build a custom function.
Function event
When you use Function Compute to process log data, you must configure the function's entry parameter (the function event) in Step 2. The event is a serialized JSON string.
Parameters
- jobName: The name of the Log Service ETL job. Each Log Service trigger in Function Compute corresponds to an ETL job in Log Service.
- taskId: For an ETL job, the taskId uniquely identifies a specific function invocation.
- cursorTime: The UNIX timestamp of when the last log record in the current function invocation arrived at the Log Service server.
- source: This field is generated by Log Service. Log Service periodically triggers function execution based on the task interval defined in the ETL job. The source field is a critical component of the function event and defines the consumption scope for the current function invocation.
  - endpoint: The endpoint for the region where the Project is located. For more information, see Endpoints.
  - projectName: The name of the Project.
  - logstoreName: The name of the LogStore.
  - shardId: A specific shard within the LogStore.
  - beginCursor: The starting position for data consumption in the shard.
  - endCursor: The ending position for data consumption in the shard.
  Note: The data range [beginCursor, endCursor) is inclusive of beginCursor and exclusive of endCursor.
- parameter: A JSON object that you set in the Function Configurations section when you create a trigger. Your custom Log Service ETL function can parse this field at runtime to obtain required parameters. For more information, see Log Service triggers.
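To make the half-open range concrete, the following sketch decodes the two sample cursors that appear in the example event in this topic. It assumes that a cursor is the Base64 encoding of a decimal offset. That holds for these sample values but is not a documented guarantee, so treat cursors as opaque strings in production code. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustration only. The class and method names are hypothetical, and the
// "Base64 of a decimal number" layout is an assumption drawn from the sample
// values, not a documented contract of Log Service.
final class CursorRangeSketch {

    // Decodes a Base64 cursor into the decimal offset it appears to encode.
    static long decodeCursor(String cursor) {
        byte[] raw = Base64.getDecoder().decode(cursor);
        return Long.parseLong(new String(raw, StandardCharsets.US_ASCII));
    }

    // Width of the half-open range [beginCursor, endCursor).
    static long rangeSize(String beginCursor, String endCursor) {
        return decodeCursor(endCursor) - decodeCursor(beginCursor);
    }

    // True if the offset lies in [beginCursor, endCursor): the begin offset
    // is included, the end offset is excluded.
    static boolean inRange(long offset, String beginCursor, String endCursor) {
        return offset >= decodeCursor(beginCursor) && offset < decodeCursor(endCursor);
    }

    public static void main(String[] args) {
        String begin = "MTUwNTM5MDI3NTY1ODcwNzU2Ng==";
        String end = "MTUwNTM5MDI3NTY1ODcwNzU2OA==";
        System.out.println(decodeCursor(begin)); // 1505390275658707566
        System.out.println(decodeCursor(end));   // 1505390275658707568
        System.out.println(rangeSize(begin, end)); // 2
    }
}
```

Under that assumption, the decoded offsets of the sample cursors differ by 2, and the offset encoded by endCursor itself is not part of the invocation's range.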
Example
{
  "source": {
    "endpoint": "http://cn-shanghai-intranet.log.aliyuncs.com",
    "projectName": "fc-****************",
    "logstoreName": "demo",
    "shardId": 0,
    "beginCursor": "MTUwNTM5MDI3NTY1ODcwNzU2Ng==",
    "endCursor": "MTUwNTM5MDI3NTY1ODcwNzU2OA=="
  },
  "parameter": { ... },
  "jobName": "fedad35f51a2a97b466da57fd71f315f539d2234",
  "taskId": "9bc06c96-e364-4f41-85eb-b6e579214ae4",
  "cursorTime": 1511429883
}
When you debug a function, you can call the GetCursorByTime operation to obtain a cursor and build a function event for testing based on the preceding example.
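One way to assemble such a test event is sketched below. The helper only formats the fields described in this topic into a JSON string. The cursor values are expected to come from GetCursorByTime calls that you make separately (not shown here). The class name, method names, and placeholder values such as my-project and test-job are hypothetical and are not part of any SDK.

```java
// Hypothetical helper, not part of any Log Service or Function Compute SDK.
// It only formats the event fields described in this topic into a JSON string.
final class TestEventBuilder {

    // Wraps a value in double quotes, escaping backslashes and double quotes.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // parameterJson must already be valid JSON, for example "{}".
    static String buildEvent(String endpoint, String projectName, String logstoreName,
                             int shardId, String beginCursor, String endCursor,
                             String parameterJson, String jobName, String taskId,
                             long cursorTime) {
        return "{\"source\":{"
                + "\"endpoint\":" + quote(endpoint)
                + ",\"projectName\":" + quote(projectName)
                + ",\"logstoreName\":" + quote(logstoreName)
                + ",\"shardId\":" + shardId
                + ",\"beginCursor\":" + quote(beginCursor)
                + ",\"endCursor\":" + quote(endCursor)
                + "},\"parameter\":" + parameterJson
                + ",\"jobName\":" + quote(jobName)
                + ",\"taskId\":" + quote(taskId)
                + ",\"cursorTime\":" + cursorTime
                + "}";
    }

    public static void main(String[] args) {
        // Placeholder values. Replace the cursors with ones returned by
        // GetCursorByTime for the shard that you want to test against.
        System.out.println(buildEvent(
                "http://cn-shanghai-intranet.log.aliyuncs.com", "my-project", "demo", 0,
                "MTUwNTM5MDI3NTY1ODcwNzU2Ng==", "MTUwNTM5MDI3NTY1ODcwNzU2OA==",
                "{}", "test-job", "test-task", 1511429883L));
    }
}
```

Building the string by hand keeps the sketch free of dependencies. In a real project, a JSON library is a safer choice because it handles all escaping rules.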
Function development
You can develop functions in various languages, such as Java, Python, and Node.js. Log Service provides SDKs for the corresponding runtimes to simplify integration. For more information, see SDK reference.
The following section uses the Java 8 runtime to demonstrate how to develop a Log Service ETL function. For more information about programming functions in Java 8, see the Function Compute Java programming guide.
Java function template
Log Service provides a custom data template based on the Java 8 runtime. You can adapt this template to meet your requirements.
The template implements the following features:
- Parses the source, taskId, and jobName fields from the function event.
- Fetches data from the source using the Log Service Java SDK and calls the processData interface to process each data batch.
You must implement the following in the template:
- Parse the parameter field from the function event. Implement this logic in UserDefinedFunctionParameter.java.
- Implement your data processing logic in the processData interface of UserDefinedFunction.java.
- Replace UserDefinedFunction with a descriptive name for your function.
Implement the processData interface
Within the processData interface, you can consume, transform, and deliver a batch of data. For example, the LogStoreReplication sample reads data from one LogStore and writes it to another.
Note
- Return true from processData when the batch is processed successfully, and false when it hits a non-retriable error. When false is returned, the function continues to run, and Log Service considers the ETL task successful but ignores the unprocessed data.
- If a fatal error occurs or your business logic requires early termination, throw an exception to exit the function. Log Service detects the function failure and reinvokes the function according to the retry policy of the ETL job.
- If a shard has high throughput, configure a sufficient memory size for your function to prevent unexpected termination due to out-of-memory (OOM) errors.
- If your function performs time-consuming operations or a shard has high throughput, set a short function trigger interval and a long function execution timeout.
- Grant the function the necessary permissions. For example, if the function needs to write data to OSS, it must have OSS write permissions.
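The return-value and exception contract described in the preceding notes can be sketched as follows. All types here are hypothetical stand-ins: the real template's class names and the processData signature may differ, and the driver loop only imitates how a caller might react to each outcome.

```java
import java.util.List;

// Hypothetical types that mirror the contract in the notes above. They are
// not the classes shipped in the Log Service template.
final class ProcessDataSketch {

    // Stand-in for the template's processData interface.
    interface BatchProcessor {
        // Returns true if the batch was processed, false if it hit a
        // non-retriable error. Throws to abort the whole invocation.
        boolean processData(List<String> batch);
    }

    // Outcome counters for one invocation.
    static final class Stats {
        int processedBatches;
        int skippedBatches;
    }

    // Runs every batch. A false result is counted and the loop continues.
    // An exception propagates to the caller, which stands in for Log Service
    // detecting the failure and retrying the task.
    static Stats run(List<List<String>> batches, BatchProcessor processor) {
        Stats stats = new Stats();
        for (List<String> batch : batches) {
            if (processor.processData(batch)) {
                stats.processedBatches++;
            } else {
                stats.skippedBatches++;
            }
        }
        return stats;
    }

    // Example processor with made-up rules: a batch that contains an empty
    // record is treated as non-retriable, and a "FATAL" record aborts the
    // invocation.
    static boolean exampleProcess(List<String> batch) {
        if (batch.contains("FATAL")) {
            throw new IllegalStateException("fatal error, exit so the task is retried");
        }
        return !batch.contains("");
    }
}
```

Calling `ProcessDataSketch.run(batches, ProcessDataSketch::exampleProcess)` continues past batches that return false and stops at the first batch that throws.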
ETL logs
ETL scheduling logs
Scheduling logs record the start time, end time, and success status of an ETL job, along with any information returned on success. If an ETL job fails, an error log is generated and an alert is sent to the system administrator. When you create a trigger, specify a LogStore for its logs and then enable and configure an index for that LogStore. For more information, see Create an index.
A function can return execution statistics. For example, in a Java 8 runtime function, you can write statistics to the outputStream. The function template provided by Log Service writes a serialized JSON object as a string. This string is recorded in the ETL scheduling logs, allowing for statistical analysis and queries.
ETL process logs
These logs record key checkpoints and error information for each step during an ETL execution, including step start and end times, initialization status, and module errors. Process logs help you monitor the ETL job's runtime status and quickly diagnose errors.
Use context.getLogger() to write process logs and store them in a specified LogStore within a Log Service Project. We recommend that you enable index-based queries for this LogStore.
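The statistics output described under ETL scheduling logs could look like the following sketch. It writes a one-line JSON object to an OutputStream, which stands in for the outputStream that the Java 8 runtime passes to the handler. The class name and the field names (processedLines, skippedLines, elapsedMillis) are illustrative assumptions and are not the ones used by the Log Service template.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper. The field names are made up for illustration and do
// not reflect the statistics format of the Log Service template.
final class EtlStatsWriter {

    // Serializes the counters as a single-line JSON object.
    static String toJson(long processedLines, long skippedLines, long elapsedMillis) {
        return "{\"processedLines\":" + processedLines
                + ",\"skippedLines\":" + skippedLines
                + ",\"elapsedMillis\":" + elapsedMillis + "}";
    }

    // Writes the statistics as UTF-8 to the given stream and flushes it.
    static void write(OutputStream outputStream, long processedLines,
                      long skippedLines, long elapsedMillis) throws IOException {
        outputStream.write(
                toJson(processedLines, skippedLines, elapsedMillis)
                        .getBytes(StandardCharsets.UTF_8));
        outputStream.flush();
    }
}
```

Keeping the output a flat JSON object with numeric fields makes it straightforward to index and query once it is recorded in the scheduling logs.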