Activate and initialize PAI-Rec and related products - Artificial Intelligence Recommendation

When you use PAI-Rec to build a recommendation system for the first time, purchase a PAI-Rec instance and configure the initial environment.

Selection guide

Instance selection

When you first use PAI-Rec, purchase a standard instance with the recommendation solution customization feature. After you become familiar with the service, you can purchase the operations tool feature.

Recommendation solution customization lets you customize feature engineering, recall strategies, and fine-ranking strategies to configure your recommendation system with greater flexibility and efficiency.
The operations tool improves operational efficiency and provides more control over recommendation results.

Cloud product resource selection

Building a PAI-Rec recommendation solution requires multiple cloud products. The specific cloud product resources required vary based on your business needs.

Dependent cloud products

Cloud product	Function	Required cloud resources
Modeling
Object Storage Service (OSS)	Stores model checkpoints, saved model files, and model configuration files.	Create an OSS bucket. Note: Do not enable the Versioning feature.
MaxCompute	Used for data cleansing, feature engineering, and preparing training samples.	Create a MaxCompute project. To use PAI-DLC to train models, activate Data Transmission Service.
Platform for AI (PAI)	PAI serves as the entry point for the PAI-Rec developer platform. It includes features such as PAI-FeatureStore association, model training, model exporting, and model evaluation.	Create a PAI workspace. Note: PAI and DataWorks workspaces are interconnected at the underlying layer. When you create a PAI workspace, a workspace with the same name is automatically generated in DataWorks. You can also manually create a DataWorks workspace.
DataWorks	Used for data cleansing, feature engineering, model training and evaluation, model updates, and data synchronization with online stores. It also schedules all offline data production, model training, and evaluation tasks.	Create a workspace. We strongly recommend that you select basic mode. Basic mode simplifies the scheduling and execution of production tasks. The primary DataWorks data source must be attached to a MaxCompute computing resource. Add a scheduling resource group and attach it to a DataWorks workspace. Select a pay-as-you-go scheduling resource group. When you run Python and Shell script tasks, you can select the latest dataworks_pairec_task_pod image. The pod contains eascmd64 and the PAI-FeatureStore Python SDK.
Engine
Hologres instance ID and database	A real-time feature storage engine. It can be used with FeatureDB. For example, use Hologres to store vector recall data, user exposure data, and u2i2i trigger data. Use FeatureDB to store offline and real-time features of users and items.	Purchase an instance and create a database.
PAI-FeatureStore	A real-time feature storage engine.	Configure a FeatureStore project. Create an online store.
ApsaraDB for Redis instance ID	Stores fallback data. Can be replaced by FeatureDB in PAI-FeatureStore.	Create an instance.
PAI-EAS Resource Group	Deploys the recommendation system engine to orchestrate processes such as recall, filtering, coarse-ranking, fine-ranking, and reranking. It also deploys the user-side vector inference service for vector recall and the model scoring service for coarse-ranking and fine-ranking.	Resource configuration.
Monitoring and others
Simple Log Service (SLS)	Users can use SLS to manage request logs.	Create a project.
DataHub Project	Used for real-time log ingestion to continuously update user behavior for model training. We recommend that you prioritize using DataHub.	Create a project.
Message Queue for Apache Kafka instance ID and resource group		Purchase and deploy an instance.
Flink VVP Streaming Service	Processes real-time data and collects real-time feature statistics. The results can be written to a FeatureDB database.	Activate Realtime Compute for Apache Flink.
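
The table above can be condensed into a checklist to track which resources still need to be created. The following is a minimal sketch, not part of any PAI-Rec SDK: the `REQUIRED_RESOURCES` mapping and the `missing_resources` helper are hypothetical names introduced for illustration, and the entries simply restate rows of the table. Which products you actually need depends on your business requirements.

```python
# Hypothetical checklist that restates the table above; not a PAI-Rec API.
REQUIRED_RESOURCES = {
    "Modeling": {
        "OSS": "bucket with Versioning disabled",
        "MaxCompute": "project",
        "PAI": "workspace",
        "DataWorks": "workspace with a scheduling resource group",
    },
    "Engine": {
        "Hologres": "instance and database",
        "PAI-FeatureStore": "project and online store",
        "ApsaraDB for Redis": "instance",
        "PAI-EAS": "resource group",
    },
    "Monitoring and others": {
        "SLS": "project",
        "DataHub": "project",
        "Kafka": "instance",
        "Flink": "activated Realtime Compute for Apache Flink",
    },
}


def missing_resources(group, created):
    """Return the products in `group` that are not yet in `created`, in table order."""
    return [product for product in REQUIRED_RESOURCES[group] if product not in created]


if __name__ == "__main__":
    # Example: only OSS and PAI have been set up so far.
    for product in missing_resources("Modeling", {"OSS", "PAI"}):
        print(f"Still to create: {product} ({REQUIRED_RESOURCES['Modeling'][product]})")
```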

Suggestions for solutions

Suggestions based on recommendation system complexity

Note

The complexity of a recommendation system's recall, filtering, model, and reranking processes is closely related to business requirements. We divide system development into the following stages: initial, intermediate, performance improvement, and operational intervention.

Stage	Description	Recall model suggestions	Ranking and reranking suggestions
Initial stage	Use Customized Recommendation Solution to build the entire recommendation pipeline. For more information, see Best practices for customizing PAI-Rec recommendation algorithms.	Use collaborative filtering (etrec), the Swing algorithm tool, and group-based hot item retrieval. Use FeatureDB to store user exposure filter data, recall data, and feature data.	Use feature configuration (note the use of real-time sequence features) and sorting configuration to set up a single-objective multi-tower model. This model offers fast inference and good performance, and it conserves PAI-EAS resources. Use a diversity reranking configuration.
Intermediate stage	Add vector recall and a multi-objective ranking model.	Add vector recall. The item index does not need to be updated because the index is stored inside the processor. For more information, see the Faiss index section of the TorchEasyRec Processor documentation.	For multiple prediction targets such as clicks, purchases, and likes, use the DBMTL multi-objective ranking model.
Business needs to quickly perceive item changes	Implement cold start for items. Provide real-time item feature feedback to the ranking model.	Use the item cold-start algorithm. For more information, see Recommendation cold-start solution.	Create a new recommendation solution customization. In the feature configuration, set up real-time statistics. Then, in PAI-FeatureStore, create a new feature view and a new model feature. Export the new training samples and train a new model.
Operational intervention	Set exposure ratios for different users and item categories. Ensure a minimum number of exposures for new items.
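
To make the operational intervention row more concrete, the following is a minimal sketch of quota-based reranking. It is not how the PAI-Rec operations tool is implemented. It assumes that exposure ratios are enforced per page as a cap on each category's share, and that the minimum exposure for new items is a number of reserved slots per page. The function name `rerank_with_quotas` and the item fields `id`, `category`, `score`, and `is_new` are hypothetical.

```python
from collections import Counter


def rerank_with_quotas(items, page_size, max_ratio, min_new):
    """Pick `page_size` items by score under illustrative exposure rules.

    items:     list of dicts with 'id', 'category', 'score', 'is_new' (hypothetical schema)
    max_ratio: dict mapping category -> maximum share of the page (default 1.0)
    min_new:   number of slots reserved for new items
    """
    ranked = sorted(items, key=lambda it: it["score"], reverse=True)

    # Reserve slots for the highest-scoring new items first.
    # Reserved items count toward their category but are not blocked by the cap.
    page = [it for it in ranked if it["is_new"]][: min(min_new, page_size)]
    chosen = {it["id"] for it in page}
    counts = Counter(it["category"] for it in page)

    # Fill the remaining slots by score while respecting each category cap.
    for it in ranked:
        if len(page) >= page_size:
            break
        if it["id"] in chosen:
            continue
        cap = int(max_ratio.get(it["category"], 1.0) * page_size)
        if counts[it["category"]] >= cap:
            continue
        page.append(it)
        chosen.add(it["id"])
        counts[it["category"]] += 1

    # Present the final page in score order.
    return sorted(page, key=lambda it: it["score"], reverse=True)
```

In this sketch, a new item with a low score can still appear on the page because its slot is reserved before the score-ordered fill begins.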

Other suggestions

PAI-EAS: Configure scheduled scale-out for peak hours and automatic scale-in to reduce resources during off-peak hours. Consider combining subscription resources with elastic scaling resources.
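
The scaling suggestion above can be reasoned about with a small model. The following is a minimal sketch that only illustrates the idea of a fixed subscription baseline plus elastic replicas during peak hours. It does not use the PAI-EAS API or configuration format. The function name `target_replicas` and all parameters are hypothetical, and the peak window is assumed to fall within a single day.

```python
def target_replicas(hour, subscription_replicas, elastic_replicas, peak_hours):
    """Return the desired replica count for a given hour of the day.

    hour:                  integer from 0 to 23
    subscription_replicas: baseline replicas that run all day
    elastic_replicas:      extra replicas added only during the peak window
    peak_hours:            (start, end) tuple with start < end; end is exclusive
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    start, end = peak_hours
    in_peak = start <= hour < end
    return subscription_replicas + (elastic_replicas if in_peak else 0)


if __name__ == "__main__":
    # Example: 2 baseline replicas, 3 extra replicas from 18:00 to 23:00.
    for h in (3, 18, 22, 23):
        print(h, target_replicas(h, 2, 3, (18, 23)))
```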

Prerequisites

This topic uses an offline modeling scenario as an example. This scenario requires the following cloud product resources. For more information about other cloud product resources, see Cloud product resource selection.

Purchase a PAI-Rec instance and configure cloud products

On the instance purchase page, set Region, Recommendation Solution Customization, Operations Tool, and Subscription Duration. Click Buy Now, confirm the order, and complete the payment.
In the PAI-Rec management console, switch to the destination region. In the navigation pane on the left, choose System Configuration > Cloud Product Configuration.
On the Modeling tab, click Edit, select the cloud product resources that you created, and then click Exit.
The parameter settings on the Engine and Monitoring and Others tabs are similar. First, configure the corresponding cloud resources, and then associate them in the PAI-Rec console.

Why do I need to use an Alibaba Cloud account to access Cloud Product Configuration?

You must use an Alibaba Cloud account, also known as a root account, to access System Configuration > Cloud Product Configuration. A root account is different from a Resource Access Management (RAM) user. For more information, see Quick Start: Create and authorize a RAM user. A root account is required because configuring cloud products involves the following steps: activate the PAI-Rec-related products, such as PAI, DataWorks, MaxCompute, OSS, Flink, PAI-FeatureStore, and Data Transmission Service; create projects or workspaces in these products; and add the PAI-Rec service-linked role (aliyunserviceroleforpairec) to these projects and workspaces. If the aliyunserviceroleforpairec role is not added correctly, subsequent operations fail due to insufficient permissions.
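
Because a missing role only surfaces later as a permission error, it can help to record the members of each project or workspace and check them in one pass. The following is a minimal sketch under stated assumptions: the member lists are collected by hand from each product's console, the helper `resources_missing_role` is hypothetical, and no Alibaba Cloud SDK call is made. Role names are compared case-insensitively, which is also an assumption.

```python
SERVICE_ROLE = "aliyunserviceroleforpairec"  # role name as written in this topic


def resources_missing_role(members_by_resource, role=SERVICE_ROLE):
    """Return the sorted names of resources whose member list lacks `role`.

    members_by_resource: dict mapping a resource label (hypothetical, for example
    "maxcompute:my_project") to the list of member or role names recorded for it.
    """
    wanted = role.lower()
    return sorted(
        name
        for name, members in members_by_resource.items()
        if wanted not in {member.lower() for member in members}
    )


if __name__ == "__main__":
    # Hand-collected example data; the labels and member names are placeholders.
    recorded = {
        "maxcompute:rec_project": ["aliyunserviceroleforpairec", "data_engineer"],
        "dataworks:rec_workspace": ["data_engineer"],
    }
    for name in resources_missing_role(recorded):
        print(f"Add {SERVICE_ROLE} to {name}")
```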