
Artificial Intelligence Recommendation: Best practices for customizing PAI-Rec recommendation algorithms

Last Updated: Dec 02, 2025

This topic uses a public dataset to help you get started with PAI-Rec. You can follow the steps to configure key features, such as feature engineering, recall, and fine-grained ranking for custom recommendation algorithms. You can then generate the code and deploy it to the corresponding workflow in DataWorks.

Prerequisites

Before you begin, complete the following prerequisites:

1. Create a PAI-Rec instance and initialize the service

  1. Log on to the Personalized Recommendation Platform home page and click Buy Now.

  2. On the PAI-Rec instance purchase page, configure the following key parameters and click Buy Now.

    • Region And Zone: The region where your cloud service is deployed.

    • Service Type: Select Premium Edition for this solution.

      Note: Compared to the Standard Edition, the Premium Edition adds data diagnostics and custom recommendation solution features.

  3. Log on to the PAI-Rec console. In the top menu bar, select a region.

  4. In the navigation pane on the left, choose Instance List. Click the instance name to go to the instance details page.

  5. In the Operation Guide section, click Init. You are redirected to the System Configurations > End-to-End Service page. Click Edit, configure the following resources, and then click Done.

    Modeling:

    • PAI Workspace: Enter the default PAI workspace that you created.

    • DataWorks Workspace: Enter the automatically generated DataWorks workspace.

    • MaxCompute Project (Workspace): Enter the MaxCompute project that you created.

    • OSS Bucket: Select the OSS bucket that you created.

    Engine:

    • Real-time Recall Engine: For Use PAI-FeatureStore, select Yes.

    • Real-time Feature Query: For Use PAI-FeatureStore, select Yes.

  6. In the navigation pane on the left, choose System Configurations > Permission Management. On the Access Service tab, check the authorization status of each cloud product to ensure that access is granted.

2. Clone the public dataset

1. Synchronize data tables

You can provide input data for this solution in two ways:

  1. Clone data for a fixed time window from the pai_online_project project. This method does not support routine task scheduling.

  2. Use a Python script to generate data. You can run a task in DataWorks to generate data for a specific period.

To schedule daily data generation and model training, use the second method. You must deploy the specified Python code to generate the required data. For more information, see the Generate data using code tab.

Synchronize data for a fixed time window

PAI-Rec provides three common tables for recommendation algorithms in the publicly accessible pai_online_project project:

  • User table: pai_online_project.rec_sln_demo_user_table

  • Item table: pai_online_project.rec_sln_demo_item_table

  • Behavior table: pai_online_project.rec_sln_demo_behavior_table

The subsequent operations in this solution are based on these three tables. The data is randomly generated and simulated and has no real business meaning. Therefore, metrics such as Area Under the Curve (AUC) obtained from training will be low. You must run SQL commands in DataWorks to synchronize the table data from the pai_online_project project to your DataWorks project, such as DataWorks_a. The procedure is as follows:

  1. Log on to the DataWorks console. In the top menu bar, select a region.

  2. In the navigation pane on the left, click Data Development And O&M > Data Development.

  3. Select the DataWorks workspace that you created and click Go To Data Development.

  4. Hover over Create and choose Create Node > MaxCompute > ODPS SQL. Configure the following parameters and click Confirm.

    • Engine Instance: Select the attached MaxCompute data source.

    • Node Type: Select the node type ODPS SQL.

    • Path: Select the path where the current node is located. For example, Business Flow/Workflow/MaxCompute.

    • Name: Enter a custom name, such as Data.

  5. In the new node, copy and run the following code to synchronize the user, item, and behavior tables from the pai_online_project project to your MaxCompute project, such as project_mc. To run the code, you must set scheduling variables that select data from bizdate back to 100 days before bizdate. Typically, set bizdate to the day before the current date. A sample scheduling-parameter configuration is sketched after the code block. Run the following code once to copy the data from the public pai_online_project project to your project:

CREATE TABLE IF NOT EXISTS rec_sln_demo_user_table_v1(
 user_id BIGINT COMMENT 'Unique user ID',
 gender STRING COMMENT 'Gender',
 age BIGINT COMMENT 'Age',
 city STRING COMMENT 'City',
 item_cnt BIGINT COMMENT 'Number of created items',
 follow_cnt BIGINT COMMENT 'Number of follows',
 follower_cnt BIGINT COMMENT 'Number of followers',
 register_time BIGINT COMMENT 'Registration time',
 tags STRING COMMENT 'User tags'
) PARTITIONED BY (ds STRING) STORED AS ALIORC;

INSERT OVERWRITE TABLE rec_sln_demo_user_table_v1 PARTITION(ds)
SELECT *
FROM pai_online_project.rec_sln_demo_user_table
WHERE ds >= "${bizdate_100}" and ds <= "${bizdate}";

CREATE TABLE IF NOT EXISTS rec_sln_demo_item_table_v1(
 item_id BIGINT COMMENT 'Item ID',
 duration DOUBLE COMMENT 'Video duration',
 title STRING COMMENT 'Title',
 category STRING COMMENT 'Primary tag',
 author BIGINT COMMENT 'Author',
 click_count BIGINT COMMENT 'Total clicks',
 praise_count BIGINT COMMENT 'Total likes',
 pub_time BIGINT COMMENT 'Publication time'
) PARTITIONED BY (ds STRING) STORED AS ALIORC;

INSERT OVERWRITE TABLE rec_sln_demo_item_table_v1 PARTITION(ds)
SELECT *
FROM pai_online_project.rec_sln_demo_item_table
WHERE ds >= "${bizdate_100}" and ds <= "${bizdate}";

CREATE TABLE IF NOT EXISTS rec_sln_demo_behavior_table_v1(
 request_id STRING COMMENT 'Instrumentation ID/Request ID',
 user_id STRING COMMENT 'Unique user ID',
 exp_id STRING COMMENT 'Experiment ID',
 page STRING COMMENT 'Page',
 net_type STRING COMMENT 'Network type',
 event_time BIGINT COMMENT 'Behavior time',
 item_id STRING COMMENT 'Item ID',
 event STRING COMMENT 'Behavior type',
 playtime DOUBLE COMMENT 'Playback/Read duration'
) PARTITIONED BY (ds STRING) STORED AS ALIORC;


INSERT OVERWRITE TABLE rec_sln_demo_behavior_table_v1 PARTITION(ds)
SELECT *
FROM pai_online_project.rec_sln_demo_behavior_table
WHERE ds >= "${bizdate_100}" and ds <= "${bizdate}";
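The scheduling-parameter screenshot is not reproduced here. A typical configuration of the two variables, assuming standard DataWorks scheduling-parameter syntax, looks like the following, where $[yyyymmdd-1] resolves to the day before the scheduled date and $[yyyymmdd-101] to 100 days before bizdate:

bizdate=$[yyyymmdd-1]
bizdate_100=$[yyyymmdd-101]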

Generate data using code

Using data from a fixed time window does not support routine task scheduling. To schedule tasks, you must deploy specific Python code to generate the required data. The procedure is as follows:

  1. In the DataWorks console, create a PyODPS 3 node. For more information, see Create and manage MaxCompute nodes.

  2. Download create_data.py and paste the file content into the PyODPS 3 node.

  3. In the right-side pane, click Scheduling Configurations, configure the parameters, and then click the Save and Submit icons in the upper-right corner.

    • Configure the scheduling parameters and note the following variable replacements (the resulting configuration is sketched after this list):

      • Replace $user_table_name with rec_sln_demo_user_table.

      • Replace $item_table_name with rec_sln_demo_item_table.

      • Replace $behavior_table_name with rec_sln_demo_behavior_table.

    • Configure scheduling dependencies.
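    The before-and-after screenshots are not reproduced here. After the replacements, the scheduling parameters, assuming the key=value form used in the DataWorks console, would read roughly as follows:

    user_table_name=rec_sln_demo_user_table
    item_table_name=rec_sln_demo_item_table
    behavior_table_name=rec_sln_demo_behavior_table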

  4. Go to the Operation Center and choose Periodic Task O&M > Periodic Tasks.

  5. In the Actions column of the target task, choose Backfill Data > Current And Descendant Nodes.

  6. In the Backfill Data panel, set the data timestamp and click Submit And Go.

    A backfill range of 60 days works well. We recommend that you set the data timestamp to start 60 days before the scheduled task date to ensure data integrity.

2. Configure dependency nodes

To ensure smooth code generation and deployment, add three placeholder nodes to your DataWorks project in advance. Configure the scheduling dependencies of these nodes to the root node of the workspace. After you complete all the settings, publish the nodes. The procedure is as follows:

  1. Hover over Create and choose Create Node > General > Virtual Node. Create three virtual nodes with the following configuration and click Confirm.

    • Node Type: Select the node type. Example: Virtual Node.

    • Path: Select the path where the current node is located. Example: Workflow/Workflow/General.

    • Name: Enter the names of the synchronized data tables:

      • rec_sln_demo_user_table_v1

      • rec_sln_demo_item_table_v1

      • rec_sln_demo_behavior_table_v1

  2. Select a node, set the node content to select 1; for each node, and then click Scheduling Configurations in the right-side pane to complete the configurations:

    • In the Time Property section, set Rerun Property to Rerun When Succeeded Or Failed.

    • In the Scheduling Dependencies > Upstream Dependencies section, enter the DataWorks workspace name, select the node with the _root suffix, and click Add.

      Configure all three virtual nodes.


  3. Click the Submit icon in front of the virtual node to submit it.

3. Register data

To configure feature engineering, recall, and sorting algorithms in the custom recommendation solution, you must first register the three tables that you synchronized to your DataWorks project. The procedure is as follows:

  1. Log on to the PAI-Rec console. In the top menu bar, select a region.

  2. In the navigation pane on the left, choose Instance List. Click the instance name to go to the instance details page.

  3. In the navigation pane on the left, choose Custom Recommendation Solution > Data Registration. On the MaxCompute Table tab, click Add Data Table. Add one user table, one item table, and one behavior table as follows, and then click Start Import.

    • MaxCompute Project: Select the MaxCompute project that you created. Example: project_mc.

    • MaxCompute Table: Select the data tables that you synchronized to the DataWorks workspace:

      • User table: rec_sln_demo_user_table_v1

      • Item table: rec_sln_demo_item_table_v1

      • Behavior table: rec_sln_demo_behavior_table_v1

    • Data Table Name: Enter a custom name, such as User Table, Item Table, and Behavior Table.

4. Create a recommendation scenario

Before you configure a recommendation task, you must create a recommendation scenario. For information about the basic concepts of recommendation scenarios and the meaning of traffic IDs, see Terms.

In the navigation pane on the left, choose Recommendation Scenarios. Click Create Scenario, create a recommendation scenario as follows, and then click OK.

• Scenario Name: Enter a custom name. Example: HomePage.

• Scenario Description: A detailed description of the scenario. Example: none.

5. Create and configure an algorithm solution

To configure a complete real-world scenario, we recommend the following recall and fine-grained ranking configurations.

  • Global hot recall: Ranks the top k items based on statistics from log data.

  • Global hot fallback recall: Uses Redis as a fallback to prevent the recommendation API from returning empty data.

  • Grouped hot recall: Recalls items by categories, such as city and gender, to help improve the accuracy of popular item recommendations.

  • etrec u2i recall: Based on the etrec collaborative filtering algorithm.

  • swing u2i recall (optional): Based on the Swing algorithm.

  • Cold-start recall (optional): Uses the DropoutNet algorithm for cold-start recall.

  • Fine-grained ranking: You can choose MultiTower for single-objective ranking or DBMTL for multi-objective ranking.

Vector recall and PDN recall are typically enabled only after the basic recall strategies are in place. Vector recall requires a vector recall engine. We do not configure vector recall in this example because FeatureDB does not support it.

This topic is designed to guide you through the configuration and deployment process. Therefore, in the recall configuration stage, we configure only global hot recall and etrec u2i recall (a collaborative filtering strategy). For the ranking configuration, we select fine-grained ranking to optimize the experience. The procedure is as follows:

  1. In the navigation pane on the left, choose Custom Recommendation Solution > Solution Configuration. Select the scenario that you created, click Create Recommendation Solution, create a solution as follows, and then click Save And Configure Algorithm Solution.

    Keep the default values for parameters that are not described. For more information, see Data Table Configuration.

    • Solution Name: Enter a custom name.

    • Scenario Name: Select the recommendation scenario that you created.

    • Offline Store: Select the MaxCompute project associated with the recommendation scenario.

    • DataWorks Workspace: Select the DataWorks workspace associated with the recommendation scenario.

    • Workflow Name: The name of the workflow created in DataWorks when you deploy the recommendation solution script. Enter a custom name, such as Flow.

    • StorageAPI Configuration: For regions in China, such as Beijing and Shanghai, you can select StorageAPI, a pay-as-you-go data transmission service. For regions outside China, such as China (Hong Kong), Singapore, and Frankfurt, you must first purchase and use a dedicated resource group for Data Transmission Service. If a pay-as-you-go option is not available, purchase a subscription Data Transmission Service, refresh the page, and select its name. Then, add a parameter to the TorchEasyRec training task of PAI-DLC in DataWorks, in a format similar to -odps_data_quota_name ot_xxxx_p#ot_yyyy.

    • slim_mode: If the DataWorks edition that you purchased has a size limit on the code packages imported by Migration Assistant, you can use this feature and manually upload the code packages that exceed the limit. For this solution, select No.

    • OSS Bucket: Select the OSS bucket associated with the recommendation scenario.

    • Project: Select the FeatureStore project that you created. For the online store, select FeatureDB.

    • User Entity: Select the user feature entity `user` that corresponds to the FeatureStore project.

    • Item Entity: Select the item feature entity `item` that corresponds to the FeatureStore project.

  2. At the Data Table Configuration node, click Add to the right of the target data table. Configure the Behavior Log Table, User Table, and Item Table as follows. Set the partition, event, feature, and timestamp fields, and then click Next.

    Keep the default values for parameters that are not described. For more information, see Data Table Configuration.

    Behavior log table resource configuration

    When you configure the behavior log table, make adjustments based on the actual data content. In this topic, the behavior log contains core information, such as request ID, unique user ID, the page where the behavior occurred, behavior timestamp, and behavior category. If the table contains richer data dimensions, we recommend that you classify this information by user and item and configure it as user information or item information for subsequent feature engineering.

    • Behavior Table Name: Select the registered behavior table. Example: rec_sln_demo_behavior_table_v1.

    • Time Partition: The partition field of the behavior table. Example: ds, in the yyyymmdd format.

    Behavior information configuration:

    • Request ID: The ID that marks each recommendation request in the log, typically a program-generated UUID. This parameter is optional. Example: request_id.

    • Behavior Event: The field that records the behavior event in the log. Example: event.

    • Behavior Event Enumeration Values: The enumeration values included in the behavior event, such as impression, click, add-to-cart, or purchase. Example: expr,click,praise.

    • Behavior Value: Represents the depth of the behavior, such as transaction price or viewing duration. Example: playtime.

    • Behavior Timestamp: The time when the log was generated, as a UNIX timestamp accurate to the second. Example: event_time.

    • Timestamp Format: Used with the behavior timestamp. Example: unixtime.

    • Behavior Scenario: The scenario field where the log occurred, such as home page, search page, or product details page. Example: page.

    • Scenario Enumeration Values: Indicates which scenario data is used. You can calculate statistics for features by scenario in subsequent feature engineering. Example: home,detail.

    User information configuration:

    • User ID: The user ID identifier in the behavior table. Example: user_id.

    • User Categorical Features: User categorical features in the behavior table, such as network, operating platform, or gender. Example: net_type.

    Item information configuration:

    • Item ID: The item ID identifier in the behavior table. Example: item_id.
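    Before you continue, you can optionally verify that the behavior timestamp really is a second-level UNIX timestamp. A minimal spot check in an ODPS SQL node (the partition value below is illustrative):

    -- FROM_UNIXTIME should return plausible datetimes if event_time
    -- is a UNIX timestamp in seconds; the ds value is an example only.
    SELECT event_time, FROM_UNIXTIME(event_time) AS event_dt
    FROM rec_sln_demo_behavior_table_v1
    WHERE ds = '20250101'
    LIMIT 10;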

    User table resource configuration

    • User Table Name: Select the registered user table. Example: rec_sln_demo_user_table_v1.

    • Time Partition: The time partition field of the user table. Example: ds, in the yyyymmdd format.

    User information configuration:

    • User ID: The user ID field in the user table. Example: user_id.

    • Registration Timestamp: The time when the user registered. Example: register_time.

    • Timestamp Format: Used with the registration timestamp. Example: unixtime.

    • Categorical Features: Categorical fields in the user table, such as gender, age group, or city. Example: gender, city.

    • Numerical Features: Numerical fields in the user table, such as number of works or points. Example: age, item_cnt, follow_cnt, follower_cnt.

    • Tag Feature: The name of the tag feature field. Example: tags.

    Item table resource configuration

    • Item Table Name: Select the registered item table. Example: rec_sln_demo_item_table_v1.

    • Time Partition: The time partition field of the item table. Example: ds, in the yyyymmdd format.

    Item information configuration:

    • Item ID: The item ID field in the item table. Example: item_id.

    • Author ID: The author of the item. Example: author.

    • Listing Timestamp: The name of the item listing timestamp field. Example: pub_time.

    • Timestamp Format: Used with the listing timestamp. Example: unixtime.

    • Categorical Features: Categorical fields in the item table, such as category. Example: category.

    • Numerical Features: Numerical fields in the item table, such as price, total sales, or number of likes. Example: click_count, praise_count.

  3. At the Feature Configuration node, configure the following parameters, click Generate Features, set the feature version, and then click Next.

    After you click Generate Features, various statistical features are derived for users and items. In this solution, we do not edit the derived features and keep the default settings. You can edit the derived features as needed. For more information, see Feature Configuration.

    • Common Statistics Period: Used for batch feature generation. To avoid generating too many features, this solution sets the statistics periods to 3, 7, and 15 days to calculate statistics for users and items over the last 3, 7, and 15 days, respectively. If the number of user behaviors is small, you can try 21 days. Example: 3,7,15.

    • Key Behaviors: Select the configured behavior events. We recommend adding them in the order expr (impression), click, praise. Example: expr, click, praise.

  4. At the Recall Configuration node, click Add to the right of the target category, configure the parameters, click Confirm, and then click Next.

    The following sections describe multiple recall configuration methods. To quickly guide you through the deployment process, you can configure only Global hot recall and etrec u2i recall. Other methods, such as vector recall and collaborative metric recall, are for reference only.


    Global hot recall

    Global hot recall generates a ranking of popular items (`top_n` represents the number of items in the ranking) based on click event statistics. If you want to modify the scoring formula for popularity or the access event, you can do so after you generate the relevant code and deploy it to the DataWorks platform.

    The scoring formula is click_uv*click_uv/(expr+adj_factor)*exp(-item_publish_days/fresh_decay_denom), where:

    • click_uv: For the same click-through rate (CTR), a higher number of clicks indicates greater popularity.

    • click_uv/(expr+adj_factor): The smoothed CTR, where click_uv is the number of unique users who clicked and expr is the number of impressions. The adjustment factor adj_factor is added to prevent the denominator from being zero and to adjust the CTR when the number of impressions is low. When impressions are few, the CTR approaches 1. Adding adj_factor moves the CTR away from 1, making it closer to the true CTR.

    • exp(-item_publish_days/fresh_decay_denom): Penalizes items that were published earlier. item_publish_days is the number of days from the publication date to the current date.
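    As a concrete illustration of the smoothing, with click_uv = 4 and expr = 5, the raw CTR is 0.8, but with adj_factor = 10 the smoothed value is 4/15 ≈ 0.27, closer to what a larger sample would show. A minimal MaxCompute SQL sketch of the score, assuming a hypothetical aggregated statistics table item_stats(item_id, click_uv, expr, item_publish_days) and illustrative constants adj_factor = 10, fresh_decay_denom = 7, and top_n = 100:

    -- Hypothetical sketch of the global hot score; the table name and
    -- constants are illustrative, not the generated production code.
    SELECT item_id,
           click_uv * click_uv / (expr + 10)      -- click_uv * smoothed CTR, adj_factor = 10
           * EXP(-item_publish_days / 7.0)        -- freshness decay, fresh_decay_denom = 7
           AS hot_score
    FROM item_stats
    ORDER BY hot_score DESC
    LIMIT 100;                                    -- top_n = 100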


    etrec u2i recall

    etrec is an item-based collaborative filtering algorithm. For more information, see Collaborative filtering etrec.


    • Training Days: The number of days of behavior logs used for training. The default is 30 days. You can increase or decrease this value based on the log volume.

    • Recall Count: The final number of user-to-item pairs generated offline.

    • U2ITrigger: Items with which the user has interacted, for example, items that the user has clicked, favorited, or purchased. This generally does not include items with only impressions.

    • Behavior Time Window: The number of days of behavior data to collect. The default is 15, which means the last 15 days.

    • Behavior Time Attenuation Coefficient: A value between 0 and 1. A larger value indicates that past behaviors decay more rapidly and carry less weight when the trigger items are constructed.

    • Trigger Selection Count: The number of item IDs to take for each user for the Cartesian product with the i2i data generated by etrec (see the sketch after this list). We recommend a value between 10 and 50. If the number of triggers is too large, recall produces too many candidate items.

    • U2I Behavior Weight: Either do not set the impression event or set its weight to 0. We recommend not setting the impression event, which means that user impression data is skipped.

    • I2I Model Settings: The parameter settings for etrec. For more information, see Collaborative filtering etrec. We recommend not setting the number of related item selections too high.
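    To make the trigger expansion concrete, the following minimal MaxCompute SQL sketch, assuming hypothetical tables user_trigger(user_id, item_id, weight) and etrec_i2i(item_id, similar_item_id, similarity), shows how each user's trigger items join against the etrec i2i output to form u2i candidates:

    -- Hypothetical sketch: expand trigger items into u2i candidates via
    -- the etrec i2i table; all table and column names are illustrative.
    SELECT t.user_id,
           s.similar_item_id       AS item_id,
           t.weight * s.similarity AS score
    FROM user_trigger t
    JOIN etrec_i2i s
    ON t.item_id = s.item_id;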

    Grouped hot recall

    You can set up rankings based on attributes, such as city and gender, to provide initial personalized recall. In the following example, a combination of gender and the bucketing number of a numerical value is used as the group.


    swing u2i recall

    Swing is a method for calculating item relevance that measures item similarity based on the User-Item-User principle.

    Vector recall

    Two vector recall methods are provided: DSSM and MIND. The key recall target settings are as follows:

    • Recall target name: Generally refers to whether an item was clicked. Set this to is_click.

    • Recall target selection: Set this to max(if(event='click', 1, 0)).

      You can use the following code for execution:

      select max(if(event='click', 1, 0)) as is_click, ...
      from ${behavior_table}
      where dt between ${bizdate_start} and ${bizdate_end}
      group by req_id, user_id, item_id

      Where:

      • ${behavior_table}: The behavior table.

      • ${bizdate_start}: The start date of the behavior time window.

      • ${bizdate_end}: The end date of the behavior time window.

      • event: The event field in the ${behavior_table} table. Select a value based on the specific field.

      • is_click: The target name.

      The formulas for dimension calculation are as follows:

      EMB_SQRT4_STEP8: Ceil((8 + Pow(count, 0.25)) / 8) * 8
      EMB_SQRT4_STEP4: Ceil((4 + Pow(count, 0.25)) / 4) * 4
      EMB_LN_STEP8:    Ceil((8 + Log(count + 1)) / 8) * 8
      EMB_LN_STEP4:    Ceil((4 + Log(count + 1)) / 4) * 4

      Here, count is the number of feature enumeration values. Use the Log function when the number of feature values is large.
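      For example, assuming the Ceil rounding above, a feature with count = 1000 under EMB_SQRT4_STEP8 gives Pow(1000, 0.25) ≈ 5.62, so the dimension is Ceil((8 + 5.62) / 8) * 8 = 2 * 8 = 16. The same arithmetic as a one-line MaxCompute SQL check:

      SELECT CEIL((8 + POW(1000, 0.25)) / 8) * 8;  -- returns 16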


    Cold-start recall

    Similar to the DSSM dual-tower recall model, it is divided into a user tower and an item tower. DropoutNet is a recall model that works for head users and items as well as for long-tail and even brand-new users and items.

    Global hot fallback recall

    Global hot fallback recall is similar to global hot recall. Its main purpose is to ensure that a sufficient candidate set can be recalled if the global hot recall engine fails. Therefore, it is stored in Redis, and its output has only one row of data.

    Collaborative metric learning i2i recall

    The collaborative metric learning (CML) i2i recall model calculates the similarity between items based on session click data.

  5. At the Ranking Configuration node, click Add to the right of Fine-grained Ranking, configure the parameters as follows, click Confirm, and then click Next.


    The platform provides multiple ranking models. For more information, see Ranking Models. The following section describes how to set the ranking parameters for the DBMTL multi-objective ranking model.


    Click Add next to Refined Ranking Target Settings (labels) to add the following two labels:

    • Target 1

    • Target 2 (note that the 'l' in 'ln' is a lowercase L)

  6. At the Generate Script node, click Generate Deployment Script.

    Important

    After the script is successfully generated, the system displays an OSS address. This OSS path stores all the files to be deployed. You can save this address locally to manually deploy the script later.

  7. After the script is generated, click OK in the dialog box. You are redirected to the Custom Recommendation Solution > Deployment Records page.

    If the generation fails, view the run logs, analyze and resolve the specific error, and then generate the script again.

6. Deploy the recommendation solution

After the script is generated, you can deploy it to DataWorks in one of two ways.

Method 1: Deploy through the Personalized Recommendation Platform

  1. Click Go To Deploy to the right of the target solution.

  2. On the Deployment Preview page, in the File Diff section, select the files to deploy. Because this is the first deployment, click Select All and then click Deploy To DataWorks.

    The page automatically returns to the Deployment Records page, which shows that the script deployment is in progress.

  3. Wait a moment, and then click the refresh icon to update the list and check the deployment status.

    • If the deployment fails, click View Log in the Actions column, analyze and resolve the specific error, and then regenerate and deploy the script.

    • When the Deployment Status changes to Success, the script is successfully deployed. You can go to the Data Development page of the DataWorks workspace configured for this solution to view the deployed code. For more information, see Data development process guide.

  4. View the task data backfill process.

    1. On the Custom Recommendation Solution > Deployment Records page, click Details in the Actions column of the successfully deployed recommendation solution.

    2. On the Deployment Preview page, click View Task Data Backfill Process to understand the backfill process and related instructions to ensure data integrity.

    3. Ensure that the user table, item table, and user behavior table partitions contain data for the last n days, where n is the sum of the training time window and the maximum feature time window. If you use the demo data from this topic, synchronize the latest data partitions. If you generate data using a Python script, backfill the data in the DataWorks Operation Center to produce the latest data partitions.
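      For example, with the default 30-day etrec training window and the 15-day maximum feature statistics period used in this topic, the three tables must contain data partitions for at least the last 45 days (30 + 15). You can verify this by listing the partitions of each table in an ODPS SQL node:

      -- List partitions to confirm that the last n days are present.
      SHOW PARTITIONS rec_sln_demo_behavior_table_v1;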

    4. Click Create Deployment Task. Under the Backfill Task List, click Start Tasks Sequentially. Ensure that all tasks run successfully. If a task fails, click Details to view the log information, analyze and resolve the error, and then rerun the task. After a successful rerun, click Continue in the upper-left corner of the page until all tasks are successful.

Method 2: Deploy using Migration Assistant

After the script is successfully generated, you can also go to the DataWorks console and manually deploy the script using the Migration Assistant feature. The key parameters are described below. For other operations, see Create and view a DataWorks import task.

  • Import Name: Set this as prompted in the console.

  • Upload Method: Select OSS File, enter the OSS Link, and click Verify.

    The deployment file is stored at the OSS address generated in Step 5, such as oss://examplebucket/algoconfig/plan/1723717372/package.zip. You can log on to the OSS console to obtain the URL of the corresponding file.

7. Freeze nodes

This topic uses demo data. After the data backfill is complete, freeze the tasks in the Operation Center (the three nodes from Step 2.2) to prevent them from being scheduled and run daily.

Go to the DataWorks Operation Center. Choose Periodic Task O&M > Periodic Tasks. Search for the name of the node that you created, such as rec_sln_demo_user_table_v1. Select the target node (Workspace.Node Name) and choose Pause (Freeze).