Artificial Intelligence Recommendation: Java SDK

Last Updated: Mar 23, 2026

To integrate Feature Generator (FG) into a Java service, use the Java software development kit (SDK) to perform feature transformations. This topic describes how to use the Java SDK for the Feature Generator tool.

Limits

  • The Java SDK runs only on the linux-x86_64 and macosx-arm64 platforms.
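A service can check this limit at startup instead of failing later when the native library is loaded. The following is a minimal sketch, not part of the SDK: FgPlatform is a hypothetical helper, and the mapping from the JVM's os.name and os.arch values (for example amd64 on Linux, aarch64 on Apple silicon) to the two platform names is an assumption based on common JVM behavior.

```java
import java.util.Locale;

/** Hypothetical helper (not an SDK class) that maps JVM properties to the two supported platform names. */
final class FgPlatform {
    private FgPlatform() {}

    /** Returns "linux-x86_64" or "macosx-arm64"; throws for any other combination. */
    static String classifier(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        // Assumption: these are the os.arch spellings reported by common JVMs.
        boolean x86_64 = arch.equals("amd64") || arch.equals("x86_64");
        boolean arm64 = arch.equals("aarch64") || arch.equals("arm64");
        if (os.contains("linux") && x86_64) {
            return "linux-x86_64";
        }
        if (os.contains("mac") && arm64) {
            return "macosx-arm64";
        }
        throw new UnsupportedOperationException(
                "FG Java SDK is not available for " + osName + "/" + osArch);
    }

    /** Convenience overload that inspects the running JVM. */
    static String current() {
        return classifier(System.getProperty("os.name"), System.getProperty("os.arch"));
    }
}
```

Calling FgPlatform.current() once during service initialization turns an unsupported platform into an explicit error message.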

Using the Java SDK

This section uses a Maven project to demonstrate how to use the Java SDK.

1. Download the FG SDK JAR package to a local location, such as /path/to/feature_generator-${version}-${platform}.jar.

Platform        FG SDK JAR package
linux-x86_64    fg-linux-x86_64.jar
macosx-arm64    fg-macosx-arm64.jar

Note: FG currently uses std::hash for hash bucketing by default. Because the std::hash implementation in the C++ standard library differs across platforms, the same value can be assigned to different buckets on different platforms.

If you need cross-platform consistency for bucketing results, set the environment variable USE_FARM_HASH_TO_BUCKETIZE=true. For more information, see the FG global configuration documentation.
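If features are generated on one platform and consumed on another, it can help to fail fast when the variable is missing. The sketch below is illustrative only: FgHashConfig is a hypothetical helper, and it accepts only the exact value true because this topic does not say whether other spellings (such as TRUE or 1) are recognized.

```java
import java.util.Map;

/** Hypothetical startup check (not an SDK class) for the farm-hash environment variable. */
final class FgHashConfig {
    static final String FARM_HASH_ENV = "USE_FARM_HASH_TO_BUCKETIZE";

    private FgHashConfig() {}

    /** True only when the variable is set to exactly "true" (strict by assumption). */
    static boolean farmHashEnabled(Map<String, String> env) {
        return "true".equals(env.get(FARM_HASH_ENV));
    }

    /** Throws when cross-platform bucketing is required but the variable is not set. */
    static void requireFarmHash(Map<String, String> env) {
        if (!farmHashEnabled(env)) {
            throw new IllegalStateException(FARM_HASH_ENV + "=true is not set; "
                    + "bucketing results may differ between platforms");
        }
    }
}
```

A service would typically call FgHashConfig.requireFarmHash(System.getenv()) before it constructs the handler.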

2. Use the mvn install:install-file command to install the package to your local repository.

mvn install:install-file \
  -Dfile=/path/to/feature_generator-${version}-${platform}.jar \
  -DgroupId=com.aliyun.pai \
  -DartifactId=feature_generator \
  -Dversion=${version} \
  -Dclassifier=${platform} \
  -Dpackaging=jar

Note: Replace ${version} in the command with the actual version number, and ${platform} with the actual platform name.

3. Add the dependency to your pom.xml file.

<dependencies>
    <dependency>
        <groupId>com.aliyun.pai</groupId>
        <artifactId>feature_generator</artifactId>
        <version>${version}</version>
        <classifier>linux-x86_64</classifier>
    </dependency>
</dependencies>

4. Create the fg.json file.

Example:

{
  "features": [
    {
      "feature_name": "query_word",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "user:query_word",
      "default_value": "",
      "combiner": "mean",
      "need_prefix": false,
      "is_multi": true
    },
    {
      "feature_name": "query_match",
      "feature_type": "lookup_feature",
      "map": "user:query_token",
      "key": "item:title",
      "needDiscrete": false,
      "needKey": false,
      "default_value": "0",
      "combiner": "sum",
      "need_prefix": false,
      "value_type": "double"
    },
    {
      "feature_name": "goods_id",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "item:goods_id",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "filter_type",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "item:filter_type",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "day_h",
      "feature_type": "id_feature",
      "value_type": "int64",
      "expression": "user:day_h",
      "default_value": "0",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "week_day",
      "feature_type": "id_feature",
      "value_type": "int64",
      "expression": "user:week_day",
      "default_value": "0",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "city",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "user:city",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "province",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "user:province",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "country",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "user:country",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "is_new_user",
      "feature_type": "id_feature",
      "value_type": "int64",
      "expression": "user:is_new_user",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "value_dimension": 1
    },
    {
      "feature_name": "focus_author",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "user:focus_author",
      "separator": ",",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "is_multi": true
    },
    {
      "feature_name": "title",
      "feature_type": "id_feature",
      "value_type": "String",
      "expression": "item:title",
      "default_value": "-1024",
      "combiner": "mean",
      "need_prefix": false,
      "is_multi": true
    }
  ],
  "reserves": [
    "request_id",
    "user_id",
    "is_click",
    "is_pay"
  ]
}

5. Use the Java API as shown in the following example.

package org.example;

import com.aliyun.pai.fg.*;
import java.net.URL;
import java.util.*;

public class Main {
    public static void main(String[] args) {
        String filePath = "/path/to/fg.json";
        FgHandler handler = new FgHandler(filePath, 4, false);

        List<String> outputs = new ArrayList<>();
        outputs.add("goods_id");
        outputs.add("is_new_user");
        outputs.add("day_h");
        outputs.add("query_match"); // set output feature_name
        outputs.add("title");
        outputs.add("filter_type");
        outputs.add("city");
        outputs.add("province");
        outputs.add("country");
        outputs.add("focus_author");
        handler.setOutputs(outputs.toArray(new String[0]));

        List<String> expectGoods = Arrays.asList("218687106", "1142068348", "1142068347");
        VariantVectorMap inputs = new VariantVectorMap.Builder()
                .putOptionalString("goods_id", expectGoods)
                .putOptionalInt32("is_new_user", Arrays.asList(0, 1, 0))
                .putOptionalInt32("day_h", Arrays.asList(6, 8, 11))
                .putListString("title", Arrays.asList(
                        Arrays.asList("k2", "k3", "k5"),
                        Arrays.asList("k1", "k2", "k3"),
                        Arrays.asList("k2", "k4")))
                .putMapStringFloat("query_token", Arrays.asList(
                        new HashMap<String, Float>() {{
                            put("k2", 0.8f);
                            put("k3", 0.5f);
                        }},
                        new HashMap<String, Float>() {{
                            put("k1", 0.9f);
                            put("k4", 0.2f);
                        }},
                        new HashMap<String, Float>() {{
                            put("k1", 0.7f);
                            put("k2", 0.3f);
                            put("k4", 0.6f);
                        }}))
                .putOptionalString("filter_type", Arrays.asList(null, "f1", null))
                .putOptionalString("city", Arrays.asList("hangzhou"))
                .putOptionalString("province", Arrays.asList("zhejiang"))
                .putOptionalString("country", Arrays.asList("china"))
                .putOptionalString("focus_author", Arrays.asList("2255010511022,14164467", "10511022,24164467", "550105110,34164467"))
                .build();

        VariantVectorMap results = handler.Process(inputs);
        inputs.close();  // Release native memory; the inputs are no longer needed.
        if (results == null || results.isNull()) {
            System.out.println("fg result is null");
            handler.close();  // Release resources before returning early.
            return;
        }
        
        System.out.println("result size=" + results.size());
        List<String> features = results.getKeys();
        System.out.println("result features=" + features);
        List<String> goodsIds = results.getList("goods_id");
        System.out.println("goods_ids=" + String.join(", ", goodsIds));

        List<List<String>> titles = results.getList("title");
        System.out.println("titles=" + titles);

        List<Long> dayHours = results.getList("day_h");
        System.out.println("day_h=" + dayHours);

        List<String> filters = results.getList("filter_type");
        System.out.println("filter_type=" + String.join(", ", filters));

        List<List<String>> focus = results.getList("focus_author");
        System.out.println("focus_author=" + focus);

        List<String> citys = results.getList("city");
        System.out.println("city=" + String.join(", ", citys));

        List<String> provinces = results.getList("province");
        System.out.println("provinces=" + String.join(", ", provinces));

        List<String> countrys = results.getList("country");
        System.out.println("country=" + String.join(", ", countrys));

        List<Long> isNewUsers = results.getList("is_new_user");
        System.out.println("is_new_user=" + isNewUsers);

        List<Double> queryMatch = results.getList("query_match");
        System.out.println("query_match=" + queryMatch);
        System.out.println("===========================================================");

        Set<String> itemInputs = handler.GetItemInputNames();
        System.out.println("item side inputs=" + itemInputs);

        Set<String> userInputs = handler.GetUserInputNames();
        System.out.println("user side inputs=" + userInputs);

        Set<String> ctxInputs = handler.GetContextInputNames();
        System.out.println("context side inputs=" + ctxInputs);

        Set<String> reserved = handler.GetReserveColumns();
        System.out.println("reserved columns =" + reserved);

        Set<String> qminputs = handler.GetFeatureInputs("query_match");
        System.out.println("inputs of query_match =" + qminputs);

        String defaultVal = handler.DefaultFeatureValue("query_match");
        System.out.println("default feature value of query_match =" + defaultVal);

        List<String> allFeatures = handler.GetAllFeatureNames();
        System.out.println("all feature names:" + allFeatures);

        Map<String, String> schema = handler.GetTableSchema();
        System.out.println("table schema:" + schema);
        
        results.close();  // Release native memory.
        handler.close();  // Release resources.
    }
}

Java SDK documentation

1. Class overview

The FgHandler class processes input features based on a configuration file (config_json) and returns the processed results.

Key capabilities:

  • Initializes a feature processor from a configuration. You can specify parameters, such as the number of threads and whether to perform only the bucketize operation.

  • Processes input features using the Process(VariantVectorMap inputs) method, which returns a VariantVectorMap.

  • Sets the output feature collection using the setOutputs(...) method.

  • Queries metadata from the configuration, such as the names, schema, default values, and dimensions of inputs, outputs, and features.

  • Provides the close() method to explicitly release native resources. Call it when the handler is no longer needed; otherwise, native memory leaks.

2. Getting started

2.1 Initialization

Four constructors are available. All of them call the native allocate function:

// 1) Configuration only
FgHandler handler = new FgHandler(configJsonOrPath);
// 2) Configuration and thread count
FgHandler handler = new FgHandler(configJsonOrPath, threadNum);
// 3) Configuration, thread count, and bucketize only flag
FgHandler handler = new FgHandler(configJsonOrPath, threadNum, bucketizeOnly);
// 4) Configuration, thread count, bucketize only flag, and config is file path flag
FgHandler handler = new FgHandler(configJsonOrPath, threadNum, bucketizeOnly, isCfgPath);

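Parameter descriptions (listed by their native names; the Java examples above use the camelCase equivalents configJsonOrPath, threadNum, bucketizeOnly, and isCfgPath):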

  • config_json: The configuration content or the path to the configuration file. The is_cfg_path parameter determines which of these is used.

  • thread_num: The number of internal processing threads for the native layer (size_t).

  • bucketize_only: Specifies whether to perform only the bucketize operation. You can check this setting using IsOnlyBucketize().

  • is_cfg_path: If true, the config_json parameter is a file path. If false, it is the JSON configuration content.

2.2 Set output features

You can set the outputs in one of two ways:

handler.setOutputs("feature_a", "feature_b");
// Or, use a String[] array as the parameter.
List<String> outputs = new ArrayList<>();
outputs.add("goods_id");
outputs.add("is_new_user");
outputs.add("title");
outputs.add("city");
outputs.add("province");
outputs.add("country");
handler.setOutputs(outputs.toArray(new String[0]));

Alternatively, you can use the low-level API:

StrVector outs = new StrVector();
outs.push_back("feature_a");
outs.push_back("feature_b");
handler.setOutputs(outs);

Note: If you do not set any outputs, all features in the configuration are output by default.
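This topic does not say how setOutputs reacts to a name that is not defined in the configuration, so it can be useful to check the requested names first. The following sketch uses only plain Java collections; OutputNames is a hypothetical helper, and in a real service the allFeatures argument would come from handler.GetAllFeatureNames().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not an SDK class) that validates output names before setOutputs is called. */
final class OutputNames {
    private OutputNames() {}

    /** Returns the requested names that are absent from allFeatures, in request order. */
    static List<String> unknown(Collection<String> allFeatures, String... requested) {
        Set<String> known = new HashSet<>(allFeatures);
        List<String> missing = new ArrayList<>();
        for (String name : requested) {
            if (!known.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

If the returned list is not empty, the service can log the misspelled names and refuse to start instead of discovering the problem at request time.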

2.3 Execute processing

VariantVectorMap inputs = new VariantVectorMap();
// TODO: Fill inputs based on your VariantVectorMap conventions.
VariantVectorMap outputs = handler.Process(inputs);
if (outputs == null || outputs.isNull()) {
   System.out.println("fg output is null");
   inputs.close();
   return;
}
// TODO: Read results from outputs.

The FgHandler.Process(inputs) method processes a batch of samples at a time. The method takes a VariantVectorMap as input and returns a VariantVectorMap. The data types involved are:

  • VariantVectorMap: A Map<String, VariantVector> where the key is the field or feature name.

  • VariantVector: A batch data container for column-oriented storage, also known as a column vector. It supports the following nested forms:

    • Optional scalar: A List<T> whose length is the batch size. It can contain null values.

    • List sequence: A List<List<T>> where the outer list has a length equal to the batch size.

    • Map feature: A List<Map<K,V>> where the outer list has a length equal to the batch size.

    • Matrix feature: A List<List<List<T>>> where the outer list has a length equal to the batch size. Each sample is a 2D matrix.

  • The Process method may return null, which indicates that a feature transformation failed. Check whether the result is null before you use it.
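The column-oriented layout described above can be checked before the inputs are handed to the SDK. The sketch below models columns as plain Java lists rather than VariantVector objects; BatchShapes is a hypothetical helper. Accepting columns of length 1 is an assumption inferred from the example in this topic, where user-side inputs such as city carry a single value for a batch of three items.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check (not an SDK class) for column-oriented batch inputs. */
final class BatchShapes {
    private BatchShapes() {}

    /**
     * Returns the batch size implied by the columns, that is, the largest outer length.
     * Columns of length 1 are accepted as shared values (assumption); any other
     * length that differs from the batch size is rejected.
     */
    static int batchSize(Map<String, List<?>> columns) {
        int batch = 0;
        for (List<?> column : columns.values()) {
            batch = Math.max(batch, column.size());
        }
        for (Map.Entry<String, List<?>> entry : columns.entrySet()) {
            int rows = entry.getValue().size();
            if (rows != batch && rows != 1) {
                throw new IllegalArgumentException("column '" + entry.getKey() + "' has "
                        + rows + " rows, expected " + batch + " or 1");
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        Map<String, List<?>> columns = new LinkedHashMap<>();
        columns.put("goods_id", Arrays.asList("218687106", "1142068348", "1142068347"));
        columns.put("title", Arrays.asList(
                Arrays.asList("k2", "k3", "k5"),
                Arrays.asList("k1", "k2", "k3"),
                Arrays.asList("k2", "k4")));
        columns.put("city", Arrays.asList("hangzhou")); // one value shared by the batch
        System.out.println("batch size = " + batchSize(columns));
    }
}
```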

3. Input and output data structures

3.1 VariantVectorMap (Input/output container)

3.1.1 Basic API

  • Construction:

    • new VariantVectorMap()

    • VariantVectorMap.fromJavaMap(Map<String, VariantVector>)

    • new VariantVectorMap.Builder()...build()

  • Writing:

    • put(String key, VariantVector value)

    • putAll(Map<String, VariantVector>)

  • Reading:

    • VariantVector get(String key)

    • boolean contains(String key)

    • List<String> getKeys()

    • long size()

    • Map<String, VariantVector> toJavaMap() (Converts the key-value pairs in the current map to a Java Map.)

  • Release:

    • clear(): Clears all elements.

    • close(): Releases native memory. You must call this method to prevent memory leaks.

3.1.2 Recommended: Build inputs with Builder

The VariantVectorMap.Builder provides multiple putXXX methods.

Common (optional scalar, for one batch):

VariantVectorMap inputs = new VariantVectorMap.Builder()
    .putOptionalInt64("user_id", Arrays.asList(1001L, 1002L))
    .putOptionalString("item_id", Arrays.asList("i1", "i2"))
    .putOptionalDouble("score", Arrays.asList(0.1, null)) // Supports null.
    .build();

Single value convenience (batch=1):

VariantVectorMap inputs = new VariantVectorMap.Builder()
    .putSingleOptionalInt64("user_id", 1001L)
    .putSingleOptionalString("item_id", "i1")
    .build();

Sequence (List type, one list per sample in the batch):

VariantVectorMap inputs = new VariantVectorMap.Builder()
    .putListInt64("clicked_item_ids",
        Arrays.asList(
            Arrays.asList(11L, 12L),
            Arrays.asList(21L)
        )
    )
    .build();

Map feature (one map per sample in the batch):

VariantVectorMap inputs = new VariantVectorMap.Builder()
    .putMapStringFloat("user_dense",
        Arrays.asList(
            Map.of("age", 18.0f, "lvl", 3.0f),
            Map.of("age", 25.0f)
        )
    )
    .build();

Matrix (one 2D matrix per sample):

VariantVectorMap inputs = new VariantVectorMap.Builder()
    .putMatrixFloat("image_emb",
        Arrays.asList(
            Arrays.asList( // sample1 matrix
                Arrays.asList(0.1f, 0.2f),
                Arrays.asList(0.3f, 0.4f)
            ),
            Arrays.asList( // sample2 matrix
                Arrays.asList(1.0f, 2.0f)
            )
        )
    )
    .build();

3.1.3 getList: Read data in a type-safe way (common for outputs)

The VariantVectorMap.getList(key) method automatically selects a registered TypeReference for conversion based on the return value of VariantVector.getType().

Example: To read the output feature_x, assuming it is OPTIONAL_FLOAT:

List<Float> xs = out.getList("feature_x");   // Actually returns List<Float>.

To read a LIST_INT64, which returns List<List<Long>>:

List<List<Long>> seq = out.getList("seq_feature");

Note:

  • The return type of getList is <T> List<T>, which cannot be strongly constrained at compile time. A type mismatch at runtime throws an IllegalArgumentException("Type mismatch...").

  • TYPE_REGISTRY contains pre-registered mappings from all defined type constants to TypeReference.

Recommended practice: First, retrieve the schema of the output table. Then, determine the return value type of the getList function based on the schema.

Map<String, String> schema = handler.GetTableSchema();
System.out.println("table schema:" + schema);
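Because the element type of getList is not enforced by the compiler, a defensive wrapper can verify the elements once, right after reading. The sketch below is plain Java and does not depend on the SDK; CheckedLists is a hypothetical helper, and the key parameter is used only in the error message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an SDK class) that verifies element types of a decoded column. */
final class CheckedLists {
    private CheckedLists() {}

    /**
     * Copies raw into a List<T> after checking every non-null element.
     * Null elements are kept because optional columns may contain them.
     */
    static <T> List<T> checked(List<?> raw, Class<T> elementType, String key) {
        List<T> out = new ArrayList<>(raw.size());
        for (Object value : raw) {
            if (value != null && !elementType.isInstance(value)) {
                throw new IllegalArgumentException("Type mismatch for '" + key + "': expected "
                        + elementType.getSimpleName() + " but found "
                        + value.getClass().getSimpleName());
            }
            out.add(elementType.cast(value)); // cast(null) returns null
        }
        return out;
    }
}
```

For example, CheckedLists.checked(results.getList("day_h"), Long.class, "day_h") fails at the read site rather than later, when an element is first used.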

3.2 VariantVector (Column vector and type system)

A VariantVector represents single-column data. You must set its type and organize the internal data according to that type.

3.2.1 Supported types (by constant)

  • Optional (allows null, outer layer is the batch):

    • OPTIONAL_STRING / OPTIONAL_INT32 / OPTIONAL_INT64 / OPTIONAL_FLOAT / OPTIONAL_DOUBLE

    • Java representation: List<T> (length = batch size. Elements can be null).

  • List (a sequence for each sample):

    • LIST_STRING / LIST_INT32 / LIST_INT64 / LIST_FLOAT / LIST_DOUBLE

    • Java representation: List<List<T>> (outer list length = batch size).

  • Map (one map for each sample):

    • MAP_STRING_*: List<Map<String, V>>

    • MAP_INT32_*: List<Map<Integer, V>>

    • MAP_INT64_*: List<Map<Long, V>>

  • Matrix (one 2D matrix for each sample):

    • MATRIX_FLOAT / MATRIX_INT64 / MATRIX_STRING

    • Java representation: List<List<List<T>>>

3.2.2 Recommended construction method: Static fromXXX methods

You do not need to call the low-level Builder(type).withXXXData() methods manually. Instead, you can use the following static methods:

  • VariantVector.fromOptionalInt64(List<Long> values)

  • VariantVector.fromListFloat(List<List<Float>> values)

  • VariantVector.fromMapStringInt32(List<Map<String,Integer>> maps)

  • VariantVector.fromMatrixFloat(List<List<List<Float>>> matrices), and more.

Example (manually constructing and then putting into a map):

VariantVector userIds = VariantVector.fromOptionalInt64(Arrays.asList(1L, 2L, null));
VariantVectorMap inputs = new VariantVectorMap();
inputs.put("user_id", userIds);

Objects created this way own their native memory. You must call close() to release the native memory after use.

Another type of object uses a borrowed memory model. For example:

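// results owns the native memory for all of its columns.
VariantVectorMap results = handler.Process(inputs);
// v is a borrowed view into results; it does not own any memory.
VariantVector v = results.get("goods_id");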

This type of object is a view. Its underlying memory is shared with the VariantVectorMap object that returned it. When that VariantVectorMap is released, the view becomes a dangling pointer and can no longer be used.

You do not need to call close() on a borrowed VariantVector. Do not store it separately for long periods, because its lifecycle depends on the VariantVectorMap object that owns it.
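The owned and borrowed lifecycles can be illustrated with a toy model in plain Java. OwnedColumns and BorrowedView below are hypothetical stand-ins, not SDK classes, and a Java list stands in for native memory. One deliberate difference: a stale view in this toy throws an exception, whereas a real dangling view gives no such guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for an owning container such as a result map; not an SDK class. */
final class OwnedColumns implements AutoCloseable {
    private List<String> data; // stands in for native memory

    OwnedColumns(List<String> values) { this.data = new ArrayList<>(values); }

    /** Returns a view that shares this owner's storage. */
    BorrowedView view() { return new BorrowedView(this); }

    List<String> storage() {
        if (data == null) {
            throw new IllegalStateException("owner already closed");
        }
        return data;
    }

    @Override
    public void close() { data = null; } // releasing twice is harmless
}

/** Toy stand-in for a borrowed column; it has no close() because it owns nothing. */
final class BorrowedView {
    private final OwnedColumns owner;

    BorrowedView(OwnedColumns owner) { this.owner = owner; }

    /** Decodes into a plain Java list that outlives the owner. */
    List<String> toList() { return new ArrayList<>(owner.storage()); }
}

final class BorrowDemo {
    public static void main(String[] args) {
        List<String> decoded;
        BorrowedView view;
        try (OwnedColumns results = new OwnedColumns(Arrays.asList("a", "b"))) {
            view = results.view();
            decoded = view.toList(); // decode while the owner is still open
        }
        System.out.println(decoded); // the decoded copy is still valid
        try {
            view.toList(); // the view is stale once the owner is closed
        } catch (IllegalStateException e) {
            System.out.println("view is no longer usable");
        }
    }
}
```

The practical rule is the same as in the toy: decode borrowed data into plain Java objects inside the scope of the owning map.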

3.2.3 Read data: toXXXList

After you retrieve a VariantVector vv, you can call a conversion method based on its type:

VariantVector vv = out.get("feature_x");
if (vv.getType() == VariantVector.OPTIONAL_DOUBLE) {
    List<Double> vals = vv.toOptionalDoubleList();
}

You can also use VariantVectorMap.getList(key) for automatic dispatch. For more information, see section 3.1.3.

3.2.4 Helper methods: size, isEmpty, and totalElementCount

  • vv.size():

    • Optional: Returns the batch size, which is the length of nullFlags.

    • Others: Returns sizes[0], which is the number of outer elements and is usually the batch size.

  • vv.isEmpty(): Checks if the underlying data is empty based on the type.

  • vv.totalElementCount(): Calculates the total number of elements after expanding nested lists, maps, or matrices. This method is mainly used for debugging and verification.
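The difference between the outer size and the total element count can be seen on plain Java collections. The following sketch is an illustration of the idea, not the SDK's implementation: ElementCounts is a hypothetical helper, and counting one element per map entry and one per null scalar are assumptions.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not an SDK class) that counts elements of nested Java collections. */
final class ElementCounts {
    private ElementCounts() {}

    /** Expands lists recursively; a map contributes one element per entry (assumption). */
    static long total(Object value) {
        if (value instanceof List) {
            long sum = 0;
            for (Object child : (List<?>) value) {
                sum += total(child);
            }
            return sum;
        }
        if (value instanceof Map) {
            return ((Map<?, ?>) value).size();
        }
        return 1; // scalar leaf; a null scalar is counted as well (assumption)
    }
}
```

For the title column of the earlier example, which holds the lists (k2, k3, k5), (k1, k2, k3), and (k2, k4), the outer size is 3 while this count is 8.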

4. Metadata query API

These APIs are mainly used to inspect the configuration content and feature dependencies.

4.1 Query input field collections

Set<String> itemInputs = handler.GetItemInputNames();
Set<String> userInputs = handler.GetUserInputNames();
Set<String> ctxInputs  = handler.GetContextInputNames();

4.2 Querying feature names, reserved columns, and special collections

List<String> allFeatures = handler.GetAllFeatureNames();
Set<String> reserveCols  = handler.GetReserveColumns();
Set<String> userSide     = handler.GetUserSideFeatures();
Set<String> seqFeatures  = handler.GetSequenceFeatures();

4.3 Querying feature dependencies, default values, dimensions, and bucketizers

Set<String> deps = handler.GetFeatureInputs("feature_x");
String defaultVal = handler.DefaultFeatureValue("feature_x");
long defaultBucket = handler.DefaultBucketizedFeatureValue("feature_x");
long dim = handler.GetFeatureValueDim("feature_x");
boolean hasBucketizer = handler.HasBucketizer("feature_x");
boolean onlyBucketize = handler.IsOnlyBucketize();

Scenarios

  • Generate input preparation logic to identify the raw inputs required for a feature.

  • Fill in or validate missing inputs using default values.

  • Handle cases where a downstream model requires a fixed dimension (dim).

  • Determine whether additional bucketizer processing is required (hasBucketizer/onlyBucketize).
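The first two scenarios amount to comparing the names that the configuration requires with the fields that a request actually carries. A minimal sketch in plain Java follows; InputChecks is a hypothetical helper, and the exact form of the names returned by GetFeatureInputs or GetUserInputNames (for example, with or without a side prefix) is not specified in this topic, so the comparison assumes both sets use the same naming.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not an SDK class) that reports required inputs missing from a request. */
final class InputChecks {
    private InputChecks() {}

    /** Returns the required names that are absent from provided, sorted for stable log output. */
    static Set<String> missing(Set<String> required, Set<String> provided) {
        Set<String> result = new TreeSet<>(required);
        result.removeAll(provided);
        return result;
    }
}
```

The caller can then decide, per missing name, whether to fill in a default value or reject the request.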

4.4 Querying the output table schema

Usage

Map<String, String> schema = handler.GetTableSchema();
String type = schema.get("feature_x");

5. Resource release (must-read)

The FgHandler object holds a pointer to a native object. You must release this object when it is no longer in use to prevent memory leaks.

This is a heavyweight object. In general:

  • Initialize it once.

  • Reuse it multiple times.

  • Call close() when the service exits.

Best Practices

  • Do not create a new FgHandler for each request.

  • Use one long-term, reusable handler for each configuration.

5.1 When the FgHandler is no longer used, you must call the close() function to release resources. This function has the following attributes:

  • Thread-safe: The method is synchronized.

  • Idempotent: Calling the method multiple times does not release a resource more than once.

Important: After calling close(), do not call any other native methods (such as Process or query interfaces). Otherwise, the application may crash due to a use-after-free error.

FgHandler handler = new FgHandler(cfg, 4);
try {
    ...
} finally {
    handler.close();
}
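In a multi-threaded service, one way to rule out calls after close() is to put a small guard in front of the native object. The sketch below is a generic pattern, not part of the SDK: GuardedResource is a hypothetical wrapper that works with any AutoCloseable and uses a read-write lock so that close() waits for in-flight calls.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Hypothetical wrapper (not an SDK class) that rejects use after close. */
final class GuardedResource<T extends AutoCloseable> implements AutoCloseable {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final T resource;
    private boolean closed; // guarded by lock

    GuardedResource(T resource) { this.resource = resource; }

    /** Runs action under the read lock so that close() cannot run in the middle of a call. */
    <R> R use(Function<? super T, ? extends R> action) {
        lock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("resource already closed");
            }
            return action.apply(resource);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Waits for in-flight calls, then closes the resource exactly once. */
    @Override
    public void close() throws Exception {
        lock.writeLock().lock();
        try {
            if (!closed) {
                closed = true;
                resource.close();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With a handler wrapped this way, a request would run as guarded.use(h -> h.Process(inputs)), and a late request receives an exception instead of touching freed memory. Do not call close() from inside use() on the same thread, because a read lock cannot be upgraded to a write lock.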

5.2 During the debugging phase, you can use a try-with-resources statement.

try (FgHandler handler = new FgHandler(cfg, 4)) {
    handler.setOutputs("f1", "f2");
    try (VariantVectorMap out = handler.Process(in)) {
        // Read results from out here; it is closed before the handler.
    }
}

5.3 Do not frequently create identical FgHandler objects in a production environment.

Important: When you use FgHandler within a service, do not create a new FgHandler object for each request. The initialization overhead for this object is high, and frequent creation degrades service performance.

Create a global object in the service's initializer function. Call the close() method in the service's exit function. Follow the same pattern when you need to create multiple different FgHandler objects.
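When a service works with several configurations, the one-handler-per-configuration rule can be kept with a small registry. The sketch below is a generic pattern under stated assumptions: HandlerRegistry is a hypothetical class, handlers are keyed by configuration path, and the factory passed to the constructor is where a real service would create the FgHandler.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical registry (not an SDK class) that keeps one long-lived handler per configuration. */
final class HandlerRegistry<H extends AutoCloseable> implements AutoCloseable {
    private final Map<String, H> handlers = new ConcurrentHashMap<>();
    private final Function<String, H> factory;

    HandlerRegistry(Function<String, H> factory) { this.factory = factory; }

    /** Returns the handler for configPath and creates it on first use only. */
    H get(String configPath) {
        return handlers.computeIfAbsent(configPath, factory);
    }

    /** Closes every handler; intended to be called from the service's exit function. */
    @Override
    public void close() {
        for (H handler : handlers.values()) {
            try {
                handler.close();
            } catch (Exception e) {
                System.err.println("failed to close handler: " + e);
            }
        }
        handlers.clear();
    }
}
```

A service could create the registry once, for example with a factory such as path -> new FgHandler(path, 4, false, true), and call close() on it during shutdown.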


Most Recommended Request Processing Pattern

Pattern A: Synchronous Requests, Most Stable

public Result handle(Request req, FgHandler handler) {
    try (VariantVectorMap inputs = buildInputs(req);
         VariantVectorMap results = handler.Process(inputs)) {
        Result r = new Result();
        r.goodsIds = results.get("goods_id").toOptionalStringList();
        r.dayHs = results.get("day_h").toOptionalInt32List();
        r.titles = results.get("title").toListStringList();
        return r;
    }
}

Features:

  • The scope of native resources is clear.

  • Resources are released when the request ends.

  • Less likely to cause an OOM error.

Pattern B: Service-Level Singleton Handler

public class FgService implements AutoCloseable {
    private final FgHandler handler;
    public FgService(String configPath) {
        this.handler = new FgHandler(configPath, 4, false, true);
    }
    public Result process(Request req) {
        try (VariantVectorMap inputs = buildInputs(req);
             VariantVectorMap results = handler.Process(inputs)) {
            return decode(results);
        }
    }
    @Override
    public void close() {
        handler.close();
    }
}

This is the most recommended approach in a production environment.

6. Common usage examples

Before processing, prepare the input fields according to the configuration.

FgHandler handler = new FgHandler(cfgJson, 8, false, false);
try {
    // 1) Set the required outputs.
    handler.setOutputs("feature_a", "feature_b");
    // 2) Find out which inputs to prepare.
    Set<String> itemInputs = handler.GetItemInputNames();
    Set<String> userInputs = handler.GetUserInputNames();
    Set<String> ctxInputs  = handler.GetContextInputNames();
    // 3) Construct inputs (example, depends on the VariantVectorMap API).
    VariantVectorMap inputs = new VariantVectorMap();
    // inputs.put("user_id", ...)
    // inputs.put("item_id", ...)
    // inputs.put("ts", ...)
    VariantVectorMap outputs = handler.Process(inputs);
    inputs.close();
    if (outputs != null) {  // Process may return null on failure.
        // Read outputs...
        outputs.close();
    }
} finally {
    handler.close();
}

For bucketization operations only:

FgHandler handler = new FgHandler(cfgJson, 4, true, false);
try {
    if (!handler.IsOnlyBucketize()) {
        // You can add a prompt if the configuration or constructor parameters do not match.
    }
    ...
} finally {
    handler.close();
}

Safe standard usage

try (FgHandler handler = new FgHandler(configPath, 4, false, true)) {
    handler.setOutputs("goods_id", "day_h", "title");
    try (VariantVectorMap inputs = new VariantVectorMap.Builder()
            .putOptionalString("goods_id", Arrays.asList("1", "2"))
            .putOptionalInt32("day_h", Arrays.asList(6, 8))
            .putListString("title", Arrays.asList(
                    Arrays.asList("a", "b"),
                    Arrays.asList("c", "d")))
            .build();
         VariantVectorMap results = handler.Process(inputs)) {
        List<String> goodsIds = results.get("goods_id").toOptionalStringList();
        List<Integer> dayHs = results.get("day_h").toOptionalInt32List();
        List<List<String>> titles = results.get("title").toListStringList();
        // Use immediately and do not keep results or borrowed VariantVector objects out of scope for long.
    }
}

7. Usage patterns most likely to cause OOM errors

1. Failure to close VariantVectorMap

Incorrect example

for (...) {
    VariantVectorMap inputs = new VariantVectorMap.Builder()...build();
    VariantVectorMap results = handler.Process(inputs);
    // Not closed
}

This leaks native memory for the following reasons:

  • The Java wrapper objects might be garbage-collected quickly.

  • However, garbage collection does not guarantee that the native memory behind them is released promptly.

  • Under high queries per second (QPS), many objects are allocated but not released. This can lead to an OOM error or cause the native resident set size (RSS) to exceed its limit.

Correct example

for (...) {
    try (VariantVectorMap inputs = new VariantVectorMap.Builder()...build();
         VariantVectorMap results = handler.Process(inputs)) {
        ...
    }
}

Resources are automatically released in a try-with-resources block.

2. Creating a new FgHandler for each request

Incorrect example

for (...) {
    FgHandler handler = new FgHandler(configPath, 4, false, true);
    ...
}

Constructing an FgHandler is typically a heavyweight operation. Doing it for each request:

  • Repeatedly loads the configuration.

  • Repeatedly initializes the thread pool and internal structures.

  • Repeatedly requests native resources.

Correct example

FgHandler handler = new FgHandler(configPath, 4, false, true);
try {
    for (...) {
        ...
    }
} finally {
    handler.close();
}

3. Caching the return value of results.get(key)

Incorrect example

VariantVector v = results.get("goods_id");
cache.add(v);   // Incorrect

Because the VariantVector returned by get(key) is borrowed:

  • Once results.close() is called, these objects become invalid.

  • If you do not close results, the memory is never released.

Correct example

Either decode it into a plain Java object immediately:

List<String> goodsIds = results.get("goods_id").toOptionalStringList();
cache.add(goodsIds);

Or, if you must save the native object, use getCopy() and ensure you close it later:

VariantVector copy = results.getCopy("goods_id");
try {
    ...
} finally {
    copy.close();
}

However, we do not recommend caching a native copy unless it is necessary.

4. Not releasing values after extensive use of toJavaMap()

If toJavaMap() returns owned copies, each call can create many native objects, and every one of them must be closed.

Incorrect example

Map<String, VariantVector> map = results.toJavaMap();
// Left open without closing.

Correct example

  • Do not do this unless it is necessary.

  • We recommend reading the data field by field into Java objects.

  • If you must do this:

Map<String, VariantVector> map = results.toJavaMap();
try {
    ...
} finally {
    for (VariantVector v : map.values()) {
        if (v != null) v.close();
    }
}
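The cleanup loop above stops at the first close() that throws, which would leave the remaining values open. A slightly more defensive variant is sketched below in plain Java; CloseAll is a hypothetical helper that works with any AutoCloseable, and whether VariantVector.close() can actually throw is not stated in this topic.

```java
import java.util.Collection;

/** Hypothetical helper (not an SDK class) that closes a whole collection of resources. */
final class CloseAll {
    private CloseAll() {}

    /** Closes every non-null element; later elements are closed even if an earlier close fails. */
    static void closeAll(Collection<? extends AutoCloseable> resources) {
        RuntimeException failure = null;
        for (AutoCloseable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (Exception e) {
                if (failure == null) {
                    failure = new IllegalStateException("failed to close one or more resources", e);
                } else {
                    failure.addSuppressed(e); // keep later failures for diagnostics
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

In the finally block, the loop could then be replaced by CloseAll.closeAll(map.values()).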

5. Leaving intermediate VariantVector objects from a Builder in a Java Map for a long time

For example:

VariantVectorMap.Builder builder = new VariantVectorMap.Builder();
builder.putOptionalString(...);
builder.putOptionalInt32(...);
// builder/data exists for a long time

The Builder internally holds many VariantVector objects, which are all owned native objects.

If the Builder is not released for a long time, it also consumes native memory.

Correct procedure

  • Keep the lifecycle of the Builder as short as possible. Use it only to build a single request.