Platform For AI:FeatureDB overview

Last Updated:Mar 10, 2026

Store and access online features with millisecond latency using FeatureDB, a distributed KV database for search, recommendation, and advertising.

What is FeatureDB

FeatureDB is a distributed database storing features in KV and KKV formats with native Array and Map support. Structured storage outperforms serialized strings for reads, writes, and inference. FeatureDB handles offline features, real-time features, and behavior sequences.

Activate FeatureDB

FeatureDB is activated when you create a FeatureDB data store. Follow the prompts in the console to complete activation.

Core capabilities

FeatureDB provides these capabilities:

  • Read and write KV and KKV features.

  • Read and write MaxCompute complex types (Array, Map).

  • Pull all feature data in a FeatureView.

  • Millisecond polling for real-time feature updates.

  • Second-level TTL with automatic expired data cleanup.

  • Pay-as-you-go billing based on actual operations.

FeatureDB shards the data within a FeatureView and adjusts the shard count based on performance needs. It supports replicas for data stability. The number of shards allocated depends on the Estimated Order of Magnitude setting:

  • Less than 10 million (default): 5 shards.

  • Between 10 million and 100 million: 10 shards.

  • More than 100 million: 20 shards.
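The tiers above amount to a simple threshold lookup. The following is a minimal Java sketch of that mapping. The class and method names (ShardPlanner, defaultShardCount) are hypothetical and not part of the FeatureStore SDK, and placing exactly 10 million and exactly 100 million in the middle tier is an assumption read from the wording "less than" and "more than".

```java
/** Hypothetical helper for illustration only; not part of the FeatureStore SDK. */
final class ShardPlanner {
    private static final long TEN_MILLION = 10_000_000L;
    private static final long HUNDRED_MILLION = 100_000_000L;

    private ShardPlanner() {}

    /**
     * Maps an Estimated Order of Magnitude value to the shard count listed in the tiers above.
     * Assumption: the boundary values 10 million and 100 million belong to the middle tier.
     */
    static int defaultShardCount(long estimatedMagnitude) {
        if (estimatedMagnitude < 0) {
            throw new IllegalArgumentException("estimatedMagnitude must be non-negative");
        }
        if (estimatedMagnitude < TEN_MILLION) {
            return 5;   // less than 10 million (default)
        }
        if (estimatedMagnitude <= HUNDRED_MILLION) {
            return 10;  // between 10 million and 100 million
        }
        return 20;      // more than 100 million
    }
}
```

Under this sketch, ShardPlanner.defaultShardCount(17_689_586L) falls in the middle tier and returns 10.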

Advantages

  • Lower costs

    Reduces expenses for smaller feature storage needs.

  • High-frequency updates

    Real-time statistical features can be updated every few seconds and propagated to multiple EasyRec Processor instances.

  • Complex type features

    Search, recommendation, and advertising workloads commonly use Array and Map features, behavior sequences, and SideInfo. Serializing these values to strings hurts read and write performance.

    FeatureDB stores complex types natively and synchronizes MaxCompute 2.0 complex data for high-performance reads.

  • Elastic scaling

    Increase shards per feature view to scale read and write performance for larger deployments.

  • Comprehensive monitoring

    Monitor read/write QPS, RT, data update latency, and storage at view-level granularity, eliminating blind spots when integrating third-party sources.

Usage

Configure VPC direct connection

FeatureDB provides VPC direct connection via PrivateLink. After configuration, use FeatureStore SDK in your VPC to access FeatureDB through a private connection, improving performance and reducing latency.

Configure VPC direct connection using either method:

  • Method 1: If no FeatureDB data store exists, click Create Source on the Store tab. Configure VPC, Zone and vSwitch, and Security Group Name in VPC Direct Connection Configurations. For details, see Online data store: FeatureDB.

  • Method 2: If a FeatureDB data store already exists, click feature_db on the Store tab, then click VPC Direct Connection Configurations. Specify VPC, Zone and vSwitch, and Security Group Name, then click OK.

Configuration notes

  • The VPC setting cannot be changed after it is configured. Select the VPC in which the online service that uses FeatureStore runs.

  • Deploy your service in these zones to avoid network latency:

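    | Area                | Region                         | Recommended zone |
    |---------------------|--------------------------------|------------------|
    | Asia Pacific        | China (Hangzhou)               | Zone G           |
    | Asia Pacific        | China (Shanghai)               | Zone L           |
    | Asia Pacific        | China (Beijing)                | Zone F           |
    | Asia Pacific        | China (Shenzhen)               | Zone F           |
    | Asia Pacific        | China (Hong Kong)              | Zone B           |
    | Asia Pacific        | Singapore                      | Zone C           |
    | Europe and Americas | Germany (Frankfurt)            | Zone A           |
    | Europe and Americas | United States (Silicon Valley) | Zone B           |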

  • Zone and vSwitch: Select a vSwitch in the zone where your online service instance resides. Select vSwitches in at least two zones for high availability.

  • After you confirm the configuration, it cannot be modified, except to add vSwitches in other zones.

Write data

For offline features, use FeatureStore Python SDK to run scheduled tasks through DataWorks, synchronizing data from MaxCompute to FeatureDB.

For real-time features, write feature data directly using Java SDK.

    // Requires java.util.ArrayList, java.util.HashMap, java.util.List, and java.util.Map,
    // plus the FeatureStore Java SDK classes used below.

    // Initialize the configuration with region, credentials, and project name
    Configuration configuration = new Configuration("cn-beijing",
            Constants.accessId, Constants.accessKey, "fs_demo_featuredb");

    // Set FeatureDB credentials
    configuration.setUsername(Constants.username);
    configuration.setPassword(Constants.password);

    // For public network access, set the domain (omit for VPC)
    // configuration.setDomain(Constants.host);

    ApiClient client = new ApiClient(configuration);

    // For public network access, pass usePublicAddress = true (omit for VPC)
    // FeatureStoreClient featureStoreClient = new FeatureStoreClient(client, Constants.usePublicAddress);
    FeatureStoreClient featureStoreClient = new FeatureStoreClient(client);

    // Get the project and the feature view
    Project project = featureStoreClient.getProject("fs_demo_featuredb");
    if (null == project) {
        throw new RuntimeException("project not found");
    }

    FeatureView featureView = project.getFeatureView("user_test_2");
    if (null == featureView) {
        throw new RuntimeException("featureview not found");
    }

    // Prepare sample data: 10 rows, one per user_id
    List<Map<String, Object>> writeData = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        Map<String, Object> data = new HashMap<>();
        data.put("user_id", i);
        data.put("string_field", String.format("test_%d", i));
        data.put("int32_field", i);
        data.put("int64_field", Long.valueOf(i));
        data.put("float_field", Float.valueOf(i));
        data.put("double_field", Double.valueOf(i));
        data.put("boolean_field", i % 2 == 0);
        writeData.add(data);
    }

    // Write the same 10-row batch 100 times
    for (int i = 0; i < 100; i++) {
        featureView.writeFeatures(writeData);
    }

    // Flush all writes (call once after all writes complete)
    featureView.writeFlush();

By default, a real-time feature write replaces the entire data row. If the written data contains only some of the fields, the fields that are not written become empty. To update only the written fields and merge them with the existing row, enable partial field writes in one of the following ways:

  • Use Java SDK: Specify InsertMode.PartialFieldWrite.

    for (int i = 0; i < 100; i++) {
        featureView.writeFeatures(writeData, InsertMode.PartialFieldWrite);
    }

  • Use Flink Connector: Set insert_mode to partial_field_write.
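The difference between the default full-row write and a partial field write can be illustrated with plain Java maps. This is a minimal sketch of the semantics described above, not the SDK implementation. The class WriteModeSketch and its Mode enum are hypothetical names, and a field that "becomes empty" is modeled here as a key that is absent from the resulting row.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the two write semantics; not part of the FeatureStore SDK. */
final class WriteModeSketch {
    enum Mode { FULL_ROW, PARTIAL_FIELD }

    private WriteModeSketch() {}

    /**
     * Returns the row as it would look after the write.
     * FULL_ROW: the incoming fields replace the stored row entirely.
     * PARTIAL_FIELD: the incoming fields are merged over the stored row.
     */
    static Map<String, Object> apply(Map<String, Object> stored,
                                     Map<String, Object> incoming,
                                     Mode mode) {
        Map<String, Object> result = new HashMap<>();
        if (mode == Mode.PARTIAL_FIELD && stored != null) {
            result.putAll(stored);   // keep fields that the write does not mention
        }
        result.putAll(incoming);     // written fields always take the new value
        return result;
    }
}
```

For a stored row {user_id=1, string_field="a", int32_field=7} and an incoming write {user_id=1, int32_field=8}, the FULL_ROW result has no string_field, while the PARTIAL_FIELD result keeps string_field="a" and updates int32_field to 8.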

Read data

Read features using FeatureStore SDK (Go/Java) or EasyRec Processor.

FeatureStore SDK (Go/Java) supports KV point queries for offline/real-time features. Specify JoinID (primary key) and feature name to complete key-value queries within milliseconds. It also supports KKV queries for behavior sequences by specifying UserID to query assembled sequences.

EasyRec Processor integrates FeatureStore C++ SDK, supporting full feature pulls from FeatureDB into memory with millisecond polling for real-time updates, achieving higher read performance.
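To make the two query shapes concrete, the following Java sketch models them with in-memory maps. It is an illustration of the KV and KKV access patterns only, not the FeatureStore SDK API. The class LookupModelSketch and all of its methods are hypothetical, and using the event timestamp as the secondary key of a behavior sequence is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of KV and KKV lookups; not part of the FeatureStore SDK. */
final class LookupModelSketch {
    // KV: JoinID -> (feature name -> value)
    private final Map<String, Map<String, Object>> kv = new HashMap<>();
    // KKV: UserID -> (secondary key -> item); TreeMap keeps secondary keys sorted.
    // Assumption: the secondary key is the event timestamp in milliseconds.
    private final Map<String, TreeMap<Long, String>> kkv = new HashMap<>();

    void putFeature(String joinId, String featureName, Object value) {
        kv.computeIfAbsent(joinId, k -> new HashMap<>()).put(featureName, value);
    }

    void putBehavior(String userId, long eventTimeMs, String itemId) {
        kkv.computeIfAbsent(userId, k -> new TreeMap<>()).put(eventTimeMs, itemId);
    }

    /** KV point query: one JoinID plus one feature name yields one value (or null). */
    Object pointQuery(String joinId, String featureName) {
        Map<String, Object> row = kv.get(joinId);
        return row == null ? null : row.get(featureName);
    }

    /** KKV query: one UserID yields the assembled sequence, ordered by secondary key. */
    List<String> sequenceQuery(String userId) {
        TreeMap<Long, String> events = kkv.get(userId);
        return events == null ? new ArrayList<>() : new ArrayList<>(events.values());
    }
}
```

In this model a point query needs both the primary key and a feature name, while a sequence query needs only the user ID and returns every stored entry for that user in order.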

Monitor metrics

When using FeatureDB as an online data store, you can view metrics such as read/write QPS and RT for each feature view. After creating a feature view, click Data Monitoring on the right side of the target view.

Real-time feature data flow

FeatureStore storage service consists of three components: Feature Service (access layer), MSMQ (DataHub), and FeatureDB.

For real-time features, write data to FeatureDB via FeatureStore Java SDK or Flink Connector. Data written through feature service also synchronizes to your MaxCompute table for real-time feature export and model training.

Read feature data from FeatureDB via FeatureStore Java/Go SDK, or pull all features through EasyRec Processor and store in local cache for higher performance. Real-time features provide millisecond updates.

Feature lifecycle management

When creating a real-time feature view, specify Feature Lifecycle for the FeatureDB table. Data rows reaching the lifecycle expire and are automatically cleaned within seconds.

Specify survival time using either method:

  • Method 1: Don't set an Event Time field. Survival time is calculated from data write time.

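  • Method 2: Enable Event Time for a feature field (unit: milliseconds). Let event_time be the value of the Event Time field, time_now the current time, ttl the configured Feature Lifecycle, and time_ttl = time_now - ttl the expiry cutoff: a row whose event_time is earlier than time_ttl has already outlived its lifecycle. Written feature data is handled as follows: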

    • When using PartialFieldWrite mode for partial field updates, survival time uses actual write time.

    • event_time > time_now + 15min: Data is not written (15-minute buffer prevents timestamp differences between systems).

    • time_ttl < event_time <= time_now + 15min: Data is written normally. Survival time starts from event_time, and data expires after reaching lifecycle.

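    • 0 < event_time < time_ttl: The row is already past its lifecycle and expires automatically after the write. Note: event_time must be in milliseconds. If your Event Time field uses seconds, the value is read as a much earlier timestamp, falls into this case, and the data is not written successfully.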

    • event_time <= 0: Survival time is calculated from actual write time.

    • Invalid value (cannot convert to integer): Data is not written.

    • Registered Event Time field but no value passed: Data is written normally. Survival time is calculated from actual write time.

    • No Event Time field: Data is written normally. Survival time is calculated from actual write time.

    • In FeatureDB, event_time becomes the timestamp (ts) for this row. To update data for a key, the Event Time value must equal or exceed the previous value. If new event_time < original event_time, data is not updated.
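The Event Time rules above can be summarized as a small decision function. The following Java sketch encodes them under stated assumptions: the names EventTimeRules, Outcome, classify, and acceptsUpdate are hypothetical and not part of FeatureDB or the SDK; a null input stands for "no Event Time value passed"; the PartialFieldWrite case is not modeled; and event_time exactly equal to time_ttl, which the rules do not cover, is treated here as expired.

```java
/** Hypothetical encoding of the Event Time rules; not part of FeatureDB or the SDK. */
final class EventTimeRules {
    enum Outcome {
        REJECTED,                     // data is not written
        WRITTEN_TTL_FROM_EVENT_TIME,  // written; survival time starts at event_time
        WRITTEN_TTL_FROM_WRITE_TIME,  // written; survival time starts at the write time
        WRITTEN_THEN_EXPIRED          // already past the lifecycle; expires after the write
    }

    /** 15-minute buffer for timestamp differences between systems. */
    static final long FUTURE_BUFFER_MS = 15L * 60L * 1000L;

    private EventTimeRules() {}

    static Outcome classify(String rawEventTime, long timeNowMs, long ttlMs) {
        if (rawEventTime == null) {
            return Outcome.WRITTEN_TTL_FROM_WRITE_TIME;   // no value passed
        }
        long eventTime;
        try {
            eventTime = Long.parseLong(rawEventTime.trim());
        } catch (NumberFormatException e) {
            return Outcome.REJECTED;                      // cannot convert to integer
        }
        if (eventTime <= 0) {
            return Outcome.WRITTEN_TTL_FROM_WRITE_TIME;
        }
        if (eventTime > timeNowMs + FUTURE_BUFFER_MS) {
            return Outcome.REJECTED;                      // too far in the future
        }
        long timeTtl = timeNowMs - ttlMs;
        if (eventTime > timeTtl) {
            return Outcome.WRITTEN_TTL_FROM_EVENT_TIME;
        }
        // Assumption: event_time == time_ttl is treated like event_time < time_ttl.
        return Outcome.WRITTEN_THEN_EXPIRED;
    }

    /** An update to an existing key is applied only if the new event_time is not older. */
    static boolean acceptsUpdate(long previousEventTime, long newEventTime) {
        return newEventTime >= previousEventTime;
    }
}
```

With time_now = 1700000000000 and a one-day lifecycle, passing the same instant in seconds ("1700000000") is classified as WRITTEN_THEN_EXPIRED, which matches the note about second-based timestamps.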

Performance benchmarks

These performance test results use FeatureStore Go SDK to read FeatureDB data. The feature table contains user-side data from a recommendation scenario with 17,689,586 rows. Test machine: 4 cores, 8 GiB memory. Results are for reference only.

  • VPC direct connection configured, online service in recommended zone:

    | Number of feature fields (columns) | Number of keys read (rows) | Average latency (ms) | TP95 (ms) | TP99 (ms) |
    |------------------------------------|----------------------------|----------------------|-----------|-----------|
    | 260                                | 1                          | 0.89                 | 1.20      | 1.45      |
    | 260                                | 10                         | 1.17                 | 1.52      | 1.87      |
    | 260                                | 50                         | 1.91                 | 2.56      | 2.92      |
    | 260                                | 100                        | 2.87                 | 3.58      | 3.93      |
    | 260                                | 200                        | 4.43                 | 5.25      | 5.80      |

  • VPC direct connection configured, online service in non-recommended zone:

    | Number of feature fields (columns) | Number of keys read (rows) | Average latency (ms) | TP95 (ms) | TP99 (ms) |
    |------------------------------------|----------------------------|----------------------|-----------|-----------|
    | 260                                | 1                          | 2.54                 | 2.86      | 3.15      |
    | 260                                | 10                         | 2.75                 | 3.12      | 3.56      |
    | 260                                | 50                         | 3.95                 | 4.75      | 5.19      |
    | 260                                | 100                        | 4.82                 | 5.66      | 6.21      |
    | 260                                | 200                        | 6.84                 | 7.75      | 8.25      |

  • VPC direct connection not configured:

    | Number of feature fields (columns) | Number of keys read (rows) | Average latency (ms) | TP95 (ms) | TP99 (ms) |
    |------------------------------------|----------------------------|----------------------|-----------|-----------|
    | 260                                | 1                          | 3.62                 | 3.83      | 4.27      |
    | 260                                | 10                         | 3.82                 | 4.11      | 4.61      |
    | 260                                | 50                         | 4.54                 | 5.19      | 5.60      |
    | 260                                | 100                        | 5.40                 | 6.13      | 6.56      |
    | 260                                | 200                        | 7.15                 | 7.93      | 8.47      |

Billing

For billing information, see Billing of FeatureStore.