Artificial Intelligence Recommendation: Service activation and initialization

Last Updated: Dec 15, 2025

When you use PAI-Rec to build a recommendation system for the first time, purchase a PAI-Rec instance and configure the initial environment.

  • Selection guide

    Instance selection

    When you first use PAI-Rec, purchase a standard instance with the recommendation solution customization feature. After you become familiar with the service, you can purchase the operations tool feature.

    • Recommendation solution customization lets you customize feature engineering, recall strategies, and fine-ranking strategies to configure your recommendation system with greater flexibility and efficiency.

    • The operations tool improves operational efficiency and provides more control over recommendation results.

    Cloud product resource selection

    Building a PAI-Rec recommendation solution requires multiple cloud products. The specific cloud product resources required vary based on your business needs.

    Dependent cloud products

    Category: Modeling

    Object Storage Service (OSS)
      Function: Stores model checkpoints, saved model files, and model configuration files.
      Required cloud resources: Create an OSS bucket.
      Note: Do not enable the Versioning feature.

    MaxCompute
      Function: Used for data cleansing, feature engineering, and preparing training samples.
      Required cloud resources: Create a MaxCompute project. To use PAI-DLC to train models, activate Data Transmission Service.

    Platform for AI (PAI)
      Function: Serves as the entry point for the PAI-Rec developer platform. It includes features such as PAI-FeatureStore association, model training, model exporting, and model evaluation.
      Required cloud resources: Create a PAI workspace. You can also manually create a DataWorks workspace.
      Note: PAI and DataWorks workspaces are interconnected at the underlying layer. When you create a PAI workspace, a workspace with the same name is automatically generated in DataWorks.

    DataWorks
      Function: Used for data cleansing, feature engineering, model training and evaluation, model updates, and data synchronization with online stores. It also schedules all offline data production, model training, and evaluation tasks.

    Category: Engine

    Hologres instance ID and database
      Function: A real-time feature storage engine. It can be used with FeatureDB. For example, use Hologres to store vector recall data, user exposure data, and u2i2i trigger data. Use FeatureDB to store offline and real-time features of users and items.
      Required cloud resources: Purchase a Hologres instance and create a database.

    Use PAI-FeatureStore
      Function: A real-time feature storage engine.

    ApsaraDB for Redis instance ID
      Function: Stores fallback data. Can be replaced by FeatureDB in PAI-FeatureStore.
      Required cloud resources: Create an instance.

    PAI-EAS Resource Group
      Function: Deploys the recommendation system engine to orchestrate processes such as recall, filtering, coarse-ranking, fine-ranking, and reranking. It also deploys the user-side vector inference service for vector recall and the model scoring service for coarse-ranking and fine-ranking.
      Required cloud resources: Resource configuration.

    Category: Monitoring and others

    Simple Log Service (SLS)
      Function: Users can use SLS to manage request logs.
      Required cloud resources: Create a project.

    DataHub Project
      Function: Used for real-time log ingestion to continuously update user behavior for model training. We recommend that you prioritize using DataHub.
      Required cloud resources: Create a project.

    Message Queue for Apache Kafka instance ID and resource group
      Required cloud resources: Purchase and deploy an instance.

    Flink VVP Streaming Service
      Function: Processes real-time data and collects real-time feature statistics. The results can be written to a FeatureDB database.
      Required cloud resources: Activate Realtime Compute for Apache Flink.
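As a planning aid, the dependent cloud products listed above can be captured in a small data structure and turned into a per-category checklist. The following is a minimal sketch only: the dictionary layout, the shortened product names, and the `checklist` function are hypothetical and are not part of any PAI-Rec API or SDK.

```python
# Hypothetical planning aid. Names and structure are illustrative only;
# nothing here calls PAI-Rec or any other Alibaba Cloud service.
REQUIRED_RESOURCES = {
    "Modeling": {
        "Object Storage Service (OSS)": "Create an OSS bucket (do not enable Versioning).",
        "MaxCompute": "Create a MaxCompute project.",
        "Platform for AI (PAI)": "Create a PAI workspace.",
        "DataWorks": "Use the workspace generated with the PAI workspace, or create one manually.",
    },
    "Engine": {
        "Hologres": "Purchase a Hologres instance and create a database.",
        "ApsaraDB for Redis": "Create an instance.",
        "PAI-EAS": "Configure resources.",
    },
    "Monitoring and others": {
        "Simple Log Service (SLS)": "Create a project.",
        "DataHub": "Create a project.",
        "Message Queue for Apache Kafka": "Purchase and deploy an instance.",
        "Realtime Compute for Apache Flink": "Activate the service.",
    },
}


def checklist(categories):
    """Return (product, action) pairs for the chosen categories, in listed order."""
    unknown = [c for c in categories if c not in REQUIRED_RESOURCES]
    if unknown:
        raise KeyError(f"Unknown categories: {unknown}")
    return [
        (product, action)
        for category in categories
        for product, action in REQUIRED_RESOURCES[category].items()
    ]


if __name__ == "__main__":
    # Print the actions for one category.
    for product, action in checklist(["Modeling"]):
        print(f"{product}: {action}")
```

Which categories a given deployment needs depends on the business scenario, so the list passed to `checklist` is a choice the reader makes, not something this sketch decides.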

    Suggestions for solutions

    Suggestions based on recommendation system complexity

    Note

    The complexity of a recommendation system's recall, filtering, model, and reranking processes is closely related to business requirements. We divide system development into the following stages: initial, intermediate, performance improvement, and operational intervention.

    Initial stage
      Description: Use Customized Recommendation Solution to build the entire recommendation pipeline. For more information, see Best practices for customizing PAI-Rec recommendation algorithms.
      Recall model suggestions: Use collaborative filtering (etrec), the Swing algorithm tool, and group-based hot item retrieval. Use FeatureDB to store user exposure filter data, recall data, and feature data.
      Ranking and reranking suggestions: Use feature configuration (note the use of real-time sequence features) and sorting configuration to set up a single-objective multi-tower model. This model offers fast inference, good performance, and conserves PAI-EAS resources. Use a diversity reranking configuration.

    Intermediate stage
      Description: Add vector recall and a multi-objective ranking model.
      Recall model suggestions: Add vector recall. The item index does not need to be updated because the index is stored inside the processor. For more information, see the Faiss index section of the TorchEasyRec Processor documentation.
      Ranking and reranking suggestions: For multiple prediction targets such as clicks, purchases, and likes, use the DBMTL multi-objective ranking model.

    Business needs to quickly perceive item changes
      Description: Implement cold start for items. Provide real-time item feature feedback to the ranking model.
      Recall model suggestions: Use the item cold-start algorithm. For more information, see Recommendation cold-start solution.
      Ranking and reranking suggestions: Create a new recommendation solution customization. In the feature configuration, set up real-time statistics. Then, in PAI-FeatureStore, create a new feature view and a new model feature. Export the new training samples and train a new model.

    Operational intervention
      Description: Set exposure ratios for different users and item categories. Ensure a minimum number of exposures for new items.
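The initial-stage recall suggestion above names the Swing algorithm. For readers unfamiliar with it, the sketch below computes item-to-item similarity in the commonly published form of Swing: two items are similar when they share users, and each pair of shared users contributes less the more their overall histories overlap. This is an illustration under stated assumptions, not the PAI implementation: the function name, the input format, and the `alpha` smoothing default are hypothetical, and the PAI tool may apply additional user weighting or normalization.

```python
from collections import defaultdict
from itertools import combinations


def swing_similarity(user_items, alpha=1.0):
    """Compute Swing item-item scores.

    user_items: dict mapping a user ID to the set of item IDs the user interacted with.
    alpha: smoothing constant (assumed default, tune for real data).
    Returns a dict mapping (item_i, item_j) with item_i < item_j to a score.
    """
    # Invert the input: for each item, collect the users who interacted with it.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    scores = defaultdict(float)
    for i, j in combinations(sorted(item_users), 2):
        shared_users = sorted(item_users[i] & item_users[j])
        # Each pair of users who both touched i and j contributes
        # 1 / (alpha + size of the overlap of their full histories).
        for u, v in combinations(shared_users, 2):
            overlap = len(user_items[u] & user_items[v])
            scores[(i, j)] += 1.0 / (alpha + overlap)
    return dict(scores)


if __name__ == "__main__":
    # Toy interaction log with made-up IDs.
    log = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"a", "c"}}
    for pair, score in swing_similarity(log).items():
        print(pair, round(score, 4))
```

This brute-force version enumerates all user pairs per item pair, so it only suits small examples. Production tools typically sample or cap the users considered per item.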

    Other suggestions

    • PAI-EAS: Configure scheduled scale-out for peak hours and automatic scale-in to reduce resources during off-peak hours. Consider combining subscription resources with elastic scaling resources.
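The scaling suggestion above can be reasoned about as a fixed baseline plus extra capacity during peak windows. The sketch below only computes a target replica count per hour. It does not call any PAI-EAS API, and the function name, parameters, and example peak windows are all hypothetical.

```python
def desired_replicas(hour, baseline, elastic, peak_windows):
    """Return the target replica count for an hour of the day (0-23).

    baseline: replicas kept running all day (for example, on subscription resources).
    elastic: extra replicas added only during peak windows.
    peak_windows: list of (start_hour, end_hour) pairs, end exclusive.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    in_peak = any(start <= hour < end for start, end in peak_windows)
    return baseline + elastic if in_peak else baseline


if __name__ == "__main__":
    # Assumed peak windows: lunch (11:00-14:00) and evening (19:00-23:00).
    windows = [(11, 14), (19, 23)]
    plan = [desired_replicas(h, baseline=2, elastic=3, peak_windows=windows) for h in range(24)]
    print(plan)
```

Separating `baseline` from `elastic` mirrors the suggestion to combine subscription resources with elastic scaling resources: the baseline covers off-peak traffic, and only the elastic part changes with the schedule.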

Prerequisites

This topic uses an offline modeling scenario as an example. This scenario requires the following cloud product resources. For more information about other cloud product resources, see Cloud product resource selection.

Purchase a PAI-Rec instance and configure cloud products

  1. On the instance purchase page, set Region, Recommendation Solution Customization, Operations Tool, and Subscription Duration. Click Buy Now, confirm the order, and complete the payment.

  2. In the PAI-Rec management console, switch to the destination region. In the navigation pane on the left, choose System Configuration > Cloud Product Configuration.

  3. On the Modeling tab, click Edit, select the cloud product resources that you created, and then click Exit.

    The parameter settings on the Engine and Monitoring and Others tabs are similar. First, configure the corresponding cloud resources, and then associate them in the PAI-Rec console.

Why do I need to use an Alibaba Cloud account to access Cloud Product Configuration?

  • Only an Alibaba Cloud account (also known as a root account), not a Resource Access Management (RAM) user, can access System Configuration > Cloud Product Configuration. For more information about RAM users, see Quick Start: Create and authorize a RAM user. The root account is required because configuring cloud products involves three steps: activating the PAI-Rec-related products, such as PAI, DataWorks, MaxCompute, OSS, Flink, PAI-FeatureStore, and Data Transmission Service; creating projects or workspaces in these products; and adding the PAI-Rec service-linked role (`aliyunserviceroleforpairec`) to these projects and workspaces. If the role is not added correctly, subsequent operations fail due to insufficient permissions.
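Because a missing service-linked role only surfaces later as a permission error, it can help to keep a simple record of which projects and workspaces already have the role. The sketch below checks such a record. It is a hypothetical bookkeeping helper: it does not query RAM, MaxCompute, DataWorks, or any other service, and the resource names in the example are made up.

```python
SERVICE_ROLE = "aliyunserviceroleforpairec"


def resources_missing_role(resources, role=SERVICE_ROLE):
    """Return the sorted names of projects or workspaces that lack the role.

    resources: dict mapping a project or workspace name to an iterable of
    member or role names recorded for it. Names are compared case-insensitively.
    """
    wanted = role.lower()
    return sorted(
        name
        for name, members in resources.items()
        if wanted not in {member.lower() for member in members}
    )


if __name__ == "__main__":
    # Hypothetical record, maintained by hand while configuring cloud products.
    recorded = {
        "my_maxcompute_project": ["aliyunserviceroleforpairec"],
        "my_dataworks_workspace": [],
    }
    print(resources_missing_role(recorded))
```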