Develop C++ custom processors for EAS Model Serving

PAI Elastic Algorithm Service (EAS) lets you extend the inference runtime with a custom processor written in C or C++. A custom processor is a shared object (.so) file that exports two functions — initialize() and process() — which EAS calls at specific points in the service lifecycle. Use this approach when the built-in processors do not support your model format or inference logic.

Quick start demo

Download the pai-prediction-example project, which includes two ready-to-run custom processors:

echo: Returns the user's input unchanged, along with a list of files in the model directory.
image_classification: Classifies MNIST images in JPG format and returns the predicted category.

For compilation instructions, see the README in the project root. For local debugging instructions, see the README in each processor's directory.

Interface definition

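A custom processor must export two functions with C linkage: initialize() and process(). Both are required, and EAS calls them at different points in the service lifecycle. In C++ source, define them inside an extern "C" block so that their exported names are not mangled.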

For a complete working example of these functions, see Sample code.

initialize()

EAS calls initialize() once when the service starts, before handling any requests. Use this function to load your model into memory.

Function signature:

void *initialize(const char *model_entry, const char *model_config, int *state)

Parameter	Direction	Description
`model_entry`	Input	Entry file of the model package. Corresponds to the `model_entry` field in the JSON deployment config. Accepts a file name (for example, `randomforest.pmml`) or a directory path (for example, `./model`). For details, see Parameters for JSON deployment.
`model_config`	Input	Custom configuration for the model. Corresponds to the `model_config` field in the JSON deployment config. For details, see Parameters for JSON deployment.
`state`	Output	Model loading status. Set to `0` on success; any other value indicates a loading failure.
Return value		Pointer to the loaded model in memory. EAS passes this pointer to every subsequent `process()` call as `model_buf`.
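
As a rough illustration of the contract above, the following sketch loads the bytes of the entry file into memory and reports the result through `state`. The `Model` struct is hypothetical and not part of the EAS interface. The sketch also assumes that `model_entry` names a regular file rather than a directory, and that any nonzero `state` value is acceptable for reporting a failure.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical in-memory model: the raw bytes of the entry file plus the
// model_config string. A real processor would parse these into its own format.
struct Model {
    std::vector<char> bytes;
    std::string config;
};

extern "C" void *initialize(const char *model_entry, const char *model_config, int *state)
{
    assert(state != nullptr);  // the caller must supply a place to report status
    try {
        if (model_entry == nullptr) {
            *state = 1;  // nonzero: loading failed
            return nullptr;
        }
        std::ifstream in(model_entry, std::ios::binary);
        if (!in) {
            *state = 1;  // file missing or unreadable
            return nullptr;
        }
        Model *model = new Model();
        model->bytes.assign(std::istreambuf_iterator<char>(in),
                            std::istreambuf_iterator<char>());
        model->config = (model_config != nullptr) ? model_config : "";
        *state = 0;    // success
        return model;  // handed back to process() as model_buf
    } catch (...) {
        // Never let a C++ exception cross the C boundary.
        *state = 1;
        return nullptr;
    }
}
```

Catching every exception inside the function is a deliberate choice here: the function has C linkage, so a failure is reported through `state` instead of being thrown to the caller.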

process()

EAS calls process() on every inference request. Use this function to run inference and write the response.

Function signature:

int process(void *model_buf, const void *input_data, int input_size,
            void **output_data, int *output_size)

Parameter	Direction	Description
`model_buf`	Input	Model memory address returned by `initialize()`.
`input_data`	Input	Request payload. Can be a string or binary data.
`input_size`	Input	Length of the input data.
`output_data`	Output	Response payload. Allocate on the heap — the model releases this memory as configured.
`output_size`	Output	Length of the output data.
Return value		HTTP status code for the response. Return `0` or `200` for success.

Return values that are not recognized HTTP status codes are automatically converted to HTTP 400. Always use a standard HTTP status code (for example, 400 or 500) when returning an error.
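
One possible way to keep the output and status rules consistent is to route every response through a single helper. In the sketch below, `write_response` is a hypothetical helper, the "inference" step is a placeholder that only reports the payload size, and the buffer is allocated with `malloc` on the assumption that the runtime releases it with `free`.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: copies `body` into a malloc'ed buffer, fills the
// out-parameters, and returns the HTTP status code to hand back to EAS.
static int write_response(const std::string &body, int status,
                          void **output_data, int *output_size)
{
    assert(output_data != nullptr && output_size != nullptr);
    void *buf = std::malloc(body.size() + 1);
    if (buf == nullptr) {
        *output_data = nullptr;
        *output_size = 0;
        return 500;  // allocation failed
    }
    std::memcpy(buf, body.c_str(), body.size() + 1);  // includes the trailing NUL
    *output_data = buf;
    *output_size = static_cast<int>(body.size());
    return status;
}

extern "C" int process(void *model_buf, const void *input_data, int input_size,
                       void **output_data, int *output_size)
{
    if (model_buf == nullptr) {
        return write_response("model is not loaded", 500, output_data, output_size);
    }
    if (input_data == nullptr || input_size <= 0) {
        return write_response("input data should not be empty", 400,
                              output_data, output_size);
    }
    // Placeholder for real inference: report how many bytes were received.
    const std::string reply = "received " + std::to_string(input_size) + " bytes";
    return write_response(reply, 200, output_data, output_size);
}
```

Every return path sets both output parameters and returns a standard status code (200, 400, or 500), so no path relies on the automatic conversion to HTTP 400.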

Implementation requirements

Before deploying your processor, verify that it meets the following requirements:

Requirement	Details
Both functions exported	`initialize()` and `process()` must both be present and exported from the shared library.
Heap allocation for output	`output_data` must point to heap-allocated memory. The model releases this memory as configured. Do not use stack memory or static buffers.
Valid HTTP status codes only	Return `0` or `200` for success. For errors, use standard HTTP status codes (`400`, `500`, and so on). Unrecognized codes are converted to `400`.
Shared library format	Compile to a `.so` file with position-independent code (`-fPIC`).
processor_type set correctly	Set `processor_type` to `"cpp"` in the JSON deployment config.

Sample code

The following example implements an echo processor: it validates the input and returns it unchanged. No model is loaded.

Step 1: Write the processor code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

extern "C" {
    // Called once at service startup. No model to load, so state is set to 0 (success).
    void *initialize(const char *model_entry, const char *model_config, int *state)
    {
        *state = 0;
        return NULL;
    }

    // Called on every inference request. Returns the input as-is.
    int process(void *model_buf, const void *input_data, int input_size,
                void **output_data, int *output_size)
    {
        if (input_data == NULL || input_size <= 0) {
            const char *errmsg = "input data should not be empty";
            *output_data = strdup(errmsg);
            *output_size = (int)strlen(errmsg);
            return 400;  // HTTP 400: bad request
        }

        // Allocate output on the heap. Copy exactly input_size bytes with memcpy:
        // the payload may be binary and is not guaranteed to be NUL-terminated,
        // so strdup() is not safe for it.
        void *buf = malloc(input_size);
        if (buf == NULL) {
            *output_data = NULL;
            *output_size = 0;
            return 500;  // HTTP 500: allocation failed
        }
        memcpy(buf, input_data, input_size);
        *output_data = buf;
        *output_size = input_size;
        return 200;
    }
}
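
Before packaging, it can help to exercise the two functions from a small local driver. The following is a minimal sketch of such a harness, not the debugging tool shipped with the example project: `call_processor` and `CallResult` are hypothetical names, the `"./model"` and `"{}"` arguments are placeholder values, and the output buffer is released with `free` on the assumption that the processor allocated it with `malloc` or `strdup`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Function pointer types matching the two processor entry points.
using initialize_fn = void *(*)(const char *, const char *, int *);
using process_fn = int (*)(void *, const void *, int, void **, int *);

// Hypothetical result type: the status code and the response body.
struct CallResult {
    int status;
    std::string body;
};

// Calls init once, then proc once with `payload`, mimicking one request.
// Returns status -1 if initialization reports a failure.
CallResult call_processor(initialize_fn init, process_fn proc, const std::string &payload)
{
    assert(init != nullptr && proc != nullptr);
    int state = -1;
    void *model = init("./model", "{}", &state);
    if (state != 0) {
        return CallResult{-1, "initialize() failed"};
    }

    void *out = nullptr;
    int out_size = 0;
    const int status = proc(model, payload.data(), static_cast<int>(payload.size()),
                            &out, &out_size);

    CallResult result{status, std::string()};
    if (out != nullptr && out_size > 0) {
        // Use the reported size, not strlen, so binary responses survive intact.
        result.body.assign(static_cast<char *>(out), out_size);
    }
    std::free(out);  // assumption: the buffer came from malloc/strdup
    return result;
}
```

For example, `call_processor(initialize, process, std::string("a\0b", 3))` against an echo processor should yield status 200 and the same three bytes back, which is a quick way to confirm that payloads containing NUL bytes are not truncated.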

Step 2: Compile as a shared object (.so) file

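Use the following Makefile to compile the processor into a shared object file named libpredictor.so. The pattern rule builds processor.o from processor.cc, so save the code from Step 1 under that file name: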

CC=g++
CCFLAGS=-I./ -D_GNU_SOURCE -Wall -g -fPIC
LDFLAGS= -shared -Wl,-rpath=./
OBJS=processor.o
TARGET=libpredictor.so

all: $(TARGET)

$(TARGET): $(OBJS)
	$(CC) -o $(TARGET) $(OBJS) $(LDFLAGS) -L./

%.o: %.cc
	$(CC) $(CCFLAGS) -c $< -o $@

clean:
	rm -f $(TARGET) $(OBJS)

Step 3: Deploy the EAS service

Create a JSON deployment config that references your processor. Set processor_entry to the name of the compiled .so file and processor_type to "cpp".

{
    "name": "test_echo",
    "model_path": "http://*****.oss-cn-shanghai.aliyuncs.com/****/saved_model.tar.gz",
    "processor_path": "oss://path/to/echo_processor_release.tar.gz",
    "processor_entry": "libpredictor.so",
    "processor_type": "cpp",
    "metadata": {
        "instance": 1
    }
}

Field	Description
`processor_path`	OSS path to the compressed processor package (`.tar.gz`).
`processor_entry`	Main shared object file inside the package. EAS loads this file as the processor entry point.
`processor_type`	Must be set to `"cpp"` for C/C++ processors.