Create and configure FeatureStore projects that connect offline and online data sources. Each project is isolated, but tables within a project are shared.
Prerequisites
- Create offline data sources and online stores. For more information, see Create a new data source.
- Store the label table in the offline store.
  Label tables contain training labels, including the target attribute and the Join ID that links to feature entities. In recommendation scenarios, derive label tables from behavior tables by grouping on fields such as user_id, item_id, or request_id.
Create a project
- Log on to the PAI console. In the left navigation pane, click . Select a workspace and click Enter FeatureStore.
- Click Create Project and configure the project parameters.
  Key parameters:
  | Parameter | Description |
  | Offline Store | Select an existing offline data source. |
  | Online Store | Select an existing online data source. |
  | Offline Table Lifecycle | Lifecycle of the tables that FeatureStore automatically creates and stores in MaxCompute. |
- Click Submit.
Create a feature entity
Feature entities group related feature tables. In a recommendation system, define two feature entities: user and item.
- Click the project name in the feature project list.
- On the Feature Entity tab, click Create Entity. In the dialog box, configure the feature entity parameters.
  Key parameters:
  | Parameter | Description |
  | Feature Entity Name | Enter a custom name. For recommendation scenarios, create two feature entities: user and item. |
  | Join Id | Join IDs are fields in feature tables that link feature views to feature entities. Each entity has one Join ID, which joins multiple feature views. Each feature view has a primary key (index key) that is used to fetch feature data. Index key names may differ from Join ID names. For recommendation scenarios, set the Join IDs to user_id and item_id (the primary keys of the user and item tables). |
- Click Submit.
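The relationship between a feature entity, its Join ID, and the index keys of its feature views can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the FeatureStore SDK: the classes `FeatureEntity` and `FeatureView`, the view names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureView:
    """Hypothetical stand-in for a feature view (not an SDK class)."""
    name: str
    index_key: str  # primary key column used to fetch rows from this view
    rows: dict = field(default_factory=dict)  # index key value -> feature dict


@dataclass
class FeatureEntity:
    """Hypothetical stand-in for a feature entity (not an SDK class)."""
    name: str
    join_id: str  # the single Join ID shared by all linked views
    views: list = field(default_factory=list)

    def link(self, view: FeatureView) -> None:
        self.views.append(view)

    def lookup(self, join_value):
        """Merge the features of every linked view for one Join ID value."""
        merged = {self.join_id: join_value}
        for view in self.views:
            merged.update(view.rows.get(join_value, {}))
        return merged


user = FeatureEntity(name="user", join_id="user_id")
user.link(FeatureView("user_profile", index_key="user_id",
                      rows={"u1": {"age": 31, "city": "Hangzhou"}}))
# The index key name ("uid") differs from the Join ID name ("user_id"),
# but both columns hold the same identifier values.
user.link(FeatureView("user_stats", index_key="uid",
                      rows={"u1": {"clicks_7d": 12}}))

features = user.lookup("u1")
```

Here one Join ID value fans out to two views, which is the sense in which a single entity joins multiple feature views.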
Create a feature view
Feature views hold features and their derived features. Each view represents a subset of a feature entity's full feature set and maps to both offline and online feature tables.
- On the project details page, go to the Feature View tab and click Create Feature View.
- Configure the view parameters and click Submit.
  - Create an offline feature view to register offline feature data with FeatureStore.
  - Create a real-time feature view to register real-time feature data with FeatureStore.
Create an offline feature view
Key parameters:
| Parameter | Description |
| Type | Select Offline to register offline feature data as a feature view in FeatureStore. |
| Write Mode | Field properties: |
| Synchronize Online Feature Table | Enable automatic sync to copy feature data from this view to the online data source. |
| Feature Entity | Select the feature entity to link to this view. Note: You can link multiple feature views to one feature entity. |
| Feature Lifecycle | Set the lifecycle for this feature view. Newly written real-time data uses this lifecycle. |
Create a real-time feature view
Key parameters:
| Parameter | Description |
| View Name | Follow the console prompts to configure this parameter. |
| Type | Select Real Time to register online feature data as a feature view in FeatureStore. |
| Feature Entity | Select the feature entity to link to this view. Note: You can link multiple feature views to one feature entity. |
| Write Mode | Real-time feature views support only Customize Table Schema. Define a new schema for this view. Add fields manually and configure their properties. Field properties: |
| Feature Field | Enter the number of fields you need. |
| Feature Lifecycle | Set a value greater than 1. The default is 30 days. |
| Advanced Settings | Configure advanced options in JSON format. |
Create a Label table
Label tables store training labels, including target attributes and join IDs linking to feature entities. For recommendation scenarios, generate label tables from behavior tables using operations such as GROUP BY user_id, item_id, or request_id.
- On the project details page, go to the Label Table tab and click Create Label Table.
- Select the data source and table name for the label table.
- Configure the label table fields and click Submit.
  | Field configuration | Description |
  | Feature Field | If the label table contains features, select the corresponding fields as feature fields. |
  | FG Reserved Fields | No configuration is required at this time. |
  | Event Time | Select the timestamp field that records when the behavior occurred. |
  | Label Field | Select the label field in the label table. |
  | Partition Field | Select the partition field in the label table. |
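Deriving a label table from a behavior table by grouping, as described at the start of this section, might look like the following sketch. It uses plain Python rather than MaxCompute SQL. The column names (`event`, `event_time`, `is_click`) and the rule "a group is positive if it contains a click" are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical behavior log rows; the schema is illustrative.
behavior_rows = [
    {"request_id": "r1", "user_id": "u1", "item_id": "i1",
     "event": "expose", "event_time": 1700000000},
    {"request_id": "r1", "user_id": "u1", "item_id": "i1",
     "event": "click", "event_time": 1700000005},
    {"request_id": "r1", "user_id": "u1", "item_id": "i2",
     "event": "expose", "event_time": 1700000000},
]


def build_label_rows(rows):
    """Group behavior rows by (request_id, user_id, item_id) and emit one label row per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["request_id"], row["user_id"], row["item_id"])].append(row)

    labels = []
    for (request_id, user_id, item_id), events in sorted(groups.items()):
        labels.append({
            "request_id": request_id,
            "user_id": user_id,   # Join ID of the user entity
            "item_id": item_id,   # Join ID of the item entity
            # Event time: earliest timestamp observed in the group.
            "event_time": min(e["event_time"] for e in events),
            # Label field: 1 if any event in the group is a click.
            "is_click": int(any(e["event"] == "click" for e in events)),
        })
    return labels


label_rows = build_label_rows(behavior_rows)
```

Each output row carries the Join IDs, an event time, and a label field, which correspond to the field roles configured in the console.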
Create model features
Model features are collections of features used during training and publishing. After you select features and create a model feature, FeatureStore creates a train set table in MaxCompute for offline training. Later, specify model features from EAS and FeatureStore in PAI-Rec to automatically retrieve feature data for inference.
- On the project details page, go to the Model Features tab and click Create Model Feature.
- Configure the model feature parameters and click Submit.
  | Parameter | Description |
  | Select Feature | Select features from the current offline view and assign aliases. |
  | Label Table Name | Select an existing label table name. |
  | Export Table Name | After you submit, FeatureStore creates a train set table in MaxCompute for offline training. |
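Conceptually, the train set table combines the label table with the selected features, looked up by Join ID and stored under their aliases. The sketch below shows that idea with in-memory dictionaries. It assumes a simple left join and does not reproduce how FeatureStore performs the export in MaxCompute. The function `build_train_set` and all sample data are hypothetical.

```python
def build_train_set(label_rows, selected):
    """Left-join selected features onto label rows.

    selected: list of (join_id, feature_table, feature_name, alias) tuples,
    where feature_table maps a Join ID value to a dict of features.
    """
    train_rows = []
    for label in label_rows:
        row = dict(label)  # keep every label table column
        for join_id, table, feature_name, alias in selected:
            feature_row = table.get(label[join_id], {})
            # A missing entity or feature yields None (left-join behavior).
            row[alias] = feature_row.get(feature_name)
        train_rows.append(row)
    return train_rows


# Hypothetical feature data keyed by Join ID value.
user_features = {"u1": {"age": 31}}
item_features = {"i1": {"price": 9.9}, "i2": {"price": 25.0}}

labels = [
    {"user_id": "u1", "item_id": "i1", "is_click": 1},
    {"user_id": "u1", "item_id": "i2", "is_click": 0},
    {"user_id": "u2", "item_id": "i1", "is_click": 0},
]

selected = [
    ("user_id", user_features, "age", "user_age"),
    ("item_id", item_features, "price", "item_price"),
]

train_set = build_train_set(labels, selected)
```

The aliases (`user_age`, `item_price`) keep feature names unambiguous when features from several views land in the same row.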
Real-time feature overview
Terms
Real-time features change rapidly, often within milliseconds. They are generated or updated quickly on the server side and used immediately for processing and decision-making. Real-time features are typically built and consumed in real-time data stream systems and require high timeliness and fast response.
Real-time features are extracted from data streams. Stream processing systems such as Flink compute and generate them to reflect current conditions. The entire pipeline delivers high performance and low latency. Real-time features update dynamically, and the system recalculates them continuously.
Use cases
Common use cases:
- Online advertising: Adjust ad content in real time based on user browsing behavior.
- Fraud detection: Detect suspicious financial transactions in real time and trigger alerts or block transactions.
- Personalized recommendations: Update recommendation lists in real time using current activity and historical data.
- IoT systems: Monitor and control devices in real time. Generate and use real-time features to respond to environmental changes.
Real-time features in recommendation and advertising systems
Real-time feature write process
After creating a real-time feature view in FeatureStore, FeatureStore automatically creates a matching table in the online data engine to store and serve real-time feature data. When using data sources such as FeatureDB, TableStore, or Hologres, your backend connects to Alibaba Cloud DataHub. DataHub forwards data to Flink. Flink processes and computes real-time features, then writes results to the online data source table. Find the exact table name on the real-time feature view details page.
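The write path above (backend events, stream computation, then a write to the online table) can be simulated in memory to make the data flow concrete. This sketch does not use DataHub, Flink, or any FeatureStore API. `OnlineTable`, `process_stream`, the running click count, and the expiry rule are assumptions chosen for illustration, and timestamps are passed in explicitly so the behavior is deterministic.

```python
class OnlineTable:
    """In-memory stand-in for the online table behind a real-time feature view."""

    def __init__(self, lifecycle_seconds):
        self.lifecycle_seconds = lifecycle_seconds
        self._rows = {}  # index key value -> (features, written_at)

    def write(self, key, features, now):
        self._rows[key] = (features, now)

    def read(self, key, now):
        entry = self._rows.get(key)
        if entry is None:
            return None
        features, written_at = entry
        if now - written_at > self.lifecycle_seconds:
            del self._rows[key]  # assumed expiry rule: drop rows past the lifecycle
            return None
        return features


def process_stream(events, table):
    """Stand-in for the stream job: maintain a running click count per user."""
    counts = {}
    for event in events:
        if event["event"] != "click":
            continue
        user_id = event["user_id"]
        counts[user_id] = counts.get(user_id, 0) + 1
        # Each update overwrites the user's row in the online table.
        table.write(user_id, {"click_cnt": counts[user_id]}, now=event["ts"])


THIRTY_DAYS = 30 * 24 * 3600
table = OnlineTable(lifecycle_seconds=THIRTY_DAYS)

events = [
    {"user_id": "u1", "event": "click", "ts": 100},
    {"user_id": "u1", "event": "expose", "ts": 101},
    {"user_id": "u1", "event": "click", "ts": 102},
]
process_stream(events, table)

fresh = table.read("u1", now=200)                    # within the lifecycle
stale = table.read("u1", now=102 + THIRTY_DAYS + 1)  # past the lifecycle
```

In a real deployment the stream job and the online store are separate services. The point here is only that each processed event overwrites the feature row that online reads will see.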
Read online features
When using the EasyRec Processor, its built-in FeatureStore C++ SDK analyzes your model feature name (fs_model) to identify real-time features and read them automatically. When using the FeatureStore Go SDK or Java SDK, configure feature reads as described in the SDK documentation.
Export offline samples
FeatureStore automatically joins the tables in the offline data engine that correspond to a feature view and exports the results. For real-time feature views, FeatureDB automatically writes online-mode data to the offline table in the offline data engine. If not using FeatureDB, create a task to write data to the offline table in the offline data engine. Alternatively, use the recommendation algorithm in PAI-Rec to generate simulated real-time data offline, which serves as the data source for the offline table of the corresponding real-time feature view.
Real-time feature views in FeatureStore
Real-time feature view workflow
Real-time feature views in FeatureStore handle features that change in real time. They use DataHub and Flink to write features to the online store. Then EasyRec Processor polls for features, or FeatureStore SDKs read them directly, letting downstream services detect millisecond-level feature changes.
Export operation
Select multiple real-time and offline feature views to create model features. After creating model features, export them. FeatureStore supports automatic export. The source of the offline table for a real-time feature view depends on your setup:
| Online data source | FeatureDB | Hologres/TableStore | Hologres/TableStore |
| Recommendation engine | Both | PAI-Rec (use recommendation algorithm customization) | Other |
| Export method | Export directly using FeatureStore. | Import simulated data from recommendation algorithm customization into the offline table for the real-time feature view. Then export using FeatureStore. | Manually export data from the offline table for the real-time feature view. Then export using FeatureStore. |
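The table above can be read as a small decision rule. The following sketch encodes it as a Python function. The function name `export_steps` and its string arguments are hypothetical, and the returned step descriptions are copied from the table rather than taken from any API.

```python
def export_steps(online_source, recommendation_engine):
    """Return the export steps for a real-time feature view, per the table above."""
    source = online_source.strip().lower()
    engine = recommendation_engine.strip().lower()

    if source == "featuredb":
        # Applies to both PAI-Rec and other recommendation engines.
        return ["Export directly using FeatureStore."]

    if source in {"hologres", "tablestore"}:
        if engine == "pai-rec":
            return [
                "Import simulated data from recommendation algorithm "
                "customization into the offline table for the real-time "
                "feature view.",
                "Export using FeatureStore.",
            ]
        return [
            "Manually export data from the offline table for the "
            "real-time feature view.",
            "Export using FeatureStore.",
        ]

    raise ValueError(f"Unsupported online data source: {online_source!r}")


steps_featuredb = export_steps("FeatureDB", "Other")
steps_hologres_pairec = export_steps("Hologres", "PAI-Rec")
steps_tablestore_other = export_steps("TableStore", "Other")
```

Only the FeatureDB case needs no preparation of the offline table before the export.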
Sync operation
Sync data in two ways:
- Write using SDKs. For more information, see FeatureStore SDK reference.
References
After configuring a FeatureStore project, learn how to use it. For more information, see Use FeatureStore to manage features in a recommendation system.