In recommendation scenarios, you can use the FM-Embedding scheme that is provided by Machine Learning Studio to obtain the feature vectors of each user and item. The recall module can then compute the inner product of a user's feature vector and an item's feature vector to predict the rating that the user would assign to the item. This topic describes how to use the Factorization Machine (FM) and Embedding algorithms to generate feature vectors of users and items.

Background information

AI-based recommendation is divided into two modules: sorting and recall. The recall module uses feature vectors to represent users and to-be-recommended items. The inner product of a user's feature vector and an item's feature vector indicates how interested the user is in the item. The experiment that is described in this topic is based on real recommendation data. The entire workflow of the experiment is preset in the REC-FM Embedding Matching template of Machine Learning Studio. You can quickly generate the feature vectors of users and items by dragging the components that Machine Learning Studio provides onto the canvas.
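The scoring idea behind the recall module can be illustrated with a minimal Python sketch. It is not the implementation used by Machine Learning Studio: the function names `dot` and `recall_top_k`, the item IDs, and the three-dimensional vectors are hypothetical and only show how an inner product turns two feature vectors into an interest score.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def recall_top_k(user_vec: List[float],
                 item_vecs: Dict[str, List[float]],
                 k: int) -> List[Tuple[str, float]]:
    """Score every item against one user and keep the k highest scores."""
    scored = [(item_id, dot(user_vec, vec)) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional vectors, used only to keep the example short.
user = [0.5, -1.0, 2.0]
items = {
    "item_a": [1.0, 0.0, 1.0],  # score: 0.5 + 0.0 + 2.0 = 2.5
    "item_b": [0.0, 1.0, 0.0],  # score: -1.0
    "item_c": [2.0, 1.0, 0.5],  # score: 1.0 - 1.0 + 1.0 = 1.0
}
top2 = recall_top_k(user, items, k=2)
print(top2)
```

A higher score means a stronger predicted interest, so a recall step would typically pass the highest-scoring items on to the sorting module.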

Dataset

The following table describes the fields in the dataset.
Field    Data type   Description
userid   STRING      The ID of the user.
age      DOUBLE      The age of the user.
gender   STRING      The gender of the user.
itemid   STRING      The ID of the item.
price    DOUBLE      The price of the item.
size     DOUBLE      The size of the item.
label    DOUBLE      Indicates whether the user has purchased the item. Valid values:
                       • 1: The user has purchased the item.
                       • 0: The user has not purchased the item.
The following figure shows the sample data that is used in the experiment.
Figure: Dataset
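To make the schema concrete, the following Python sketch models one dataset row with the field names and data types listed in this section. The `Sample` class, the `parse_row` helper, and the sample values are hypothetical and are not taken from the experiment data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    userid: str
    age: float
    gender: str
    itemid: str
    price: float
    size: float
    label: float  # 1.0 = purchased, 0.0 = not purchased

    def __post_init__(self) -> None:
        # The dataset description allows only the values 1 and 0 for label.
        if self.label not in (0.0, 1.0):
            raise ValueError(f"label must be 0 or 1, got {self.label}")


def parse_row(fields: Sequence[str]) -> Sample:
    """Convert raw strings, given in the column order of the dataset, to a Sample."""
    userid, age, gender, itemid, price, size, label = fields
    return Sample(userid, float(age), gender, itemid,
                  float(price), float(size), float(label))


# Made-up values for illustration only.
row = parse_row(["u001", "25", "F", "i042", "19.9", "3", "1"])
print(row)
```

In this layout, userid, age, and gender describe the user, itemid, price, and size describe the item, and label is the training target.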

Procedure

  1. Go to the Machine Learning Studio console.
    1. Log on to the PAI console.
    2. In the left-side navigation pane, choose Model Training > Studio-Modeling Visualization.
    3. On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.
  2. Create an experiment.
    1. In the left-side navigation pane, click Home.
    2. In the Templates section, click Create below REC-FM Embedding Matching.
    3. In the New Experiment dialog box, set the experiment parameters. You can use the default values of the parameters.
      Parameter     Description
      Name          The name of the experiment. Default value: REC-FM Embedding Matching.
      Project       The project in which you want to create the experiment. You cannot change the value of this parameter.
      Description   The description of the experiment. Default value: Recommended system based on FM-Embedding.
      Save To       The directory for storing the experiment. Default value: My Experiments.
    4. Click OK.
    5. Optional: Wait about 10 seconds. Then, click Experiments in the left-side navigation pane.
    6. Optional: Click REC-FM Embedding Matching_XX under My Experiments. The canvas of the experiment appears.
      My Experiments is the directory for storing the experiment that you created and REC-FM Embedding Matching_XX is the name of the experiment. In the experiment name, _XX is the ID that the system automatically creates for the experiment.
    7. View the components of the experiment on the canvas, as shown in the following figure. The system automatically creates the experiment based on the preset template.
      Figure: Matching recall experiment
      Component No. Description
      1 This component performs one-hot encoding on all feature data. One-hot encoding converts character-type data to numeric-type data. In this experiment, the One Hot Encoding-1 component first performs one-hot encoding on all feature data and creates an encoding model. Then, the One Hot Encoding-1 component exports the encoding model to the One Hot Encoding-2 and One Hot Encoding-3 components.
      2 This component creates an FM model. You can click the component and view the default parameter settings of the component on the Parameters Setting tab in the right-side pane. The default value of Dimension is 1,1,10, in which 10 indicates the number of dimensions in each feature vector to be generated.
      3 This component generates user feature codes. The input columns of this component are userid, gender, and age, which are specified by the Binarization Column parameter. The userid column that is specified by the Appended Columns parameter is appended to the output table of the component.
      4 This component generates item feature codes. The input columns of this component are itemid, price, and size, which are specified by the Binarization Column parameter. The itemid column that is specified by the Appended Columns parameter is appended to the output table of the component.
      5 These components extract feature vectors of users and items. Each component includes the following parameters:
      • Name of the Embedding Vector ID Column: the feature_id parameter of the model that is trained by the FM Train-1 component.
      • Embedding vector column name: the feature_weights parameter of the model that is trained by the FM Train-1 component.
      • Weight vector column name: the sparse columns that are exported by the One Hot Encoding-1 component.
      • Output result column name: the name of the column that contains the generated feature vectors.
  3. Run the experiment and view the result.
    1. In the top toolbar of the canvas, click Run.
    2. After the experiment is run, right-click Embedding extract-1 on the canvas and select View Data. In the dialog box that appears, view the feature vectors of users.
    3. Right-click Embedding extract-2 on the canvas and select View Data. In the dialog box that appears, view the feature vectors of items.
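
The following Python sketch illustrates one way the embedding extraction step could work. It assumes that each Embedding extract component computes a weighted sum of the per-feature embedding vectors of the FM model, where the weights come from the sparse encoded columns. This assumption, the `id:value` sparse string format, the `EMBEDDINGS` table, and all numbers are hypothetical and are not confirmed by this topic.

```python
from typing import Dict, List

# Hypothetical FM model table: feature_id -> embedding (feature_weights).
# Two dimensions keep the example short; the template default is 10.
EMBEDDINGS: Dict[int, List[float]] = {
    0: [0.1, 0.2],    # e.g. a one-hot userid feature
    1: [0.3, -0.1],   # e.g. a one-hot gender feature
    2: [0.05, 0.4],   # e.g. the age feature
    3: [0.2, 0.2],    # e.g. a one-hot itemid feature
    4: [-0.3, 0.1],   # e.g. the price feature
}


def parse_kv(sparse: str) -> Dict[int, float]:
    """Parse an assumed 'id:value,id:value' string into {feature_id: value}."""
    pairs = (token.split(":") for token in sparse.split(",") if token)
    return {int(fid): float(val) for fid, val in pairs}


def extract_embedding(sparse: str, table: Dict[int, List[float]]) -> List[float]:
    """Weighted sum of the embeddings of all active features."""
    dim = len(next(iter(table.values())))
    out = [0.0] * dim
    for fid, val in parse_kv(sparse).items():
        for d, w in enumerate(table[fid]):
            out[d] += val * w
    return out


def dot(u: List[float], v: List[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


user_kv = "0:1,1:1,2:0.5"  # encoded user features (assumed format)
item_kv = "3:1,4:2"        # encoded item features (assumed format)
user_vec = extract_embedding(user_kv, EMBEDDINGS)
item_vec = extract_embedding(item_kv, EMBEDDINGS)

# The dot product of the two aggregated vectors equals the sum of all
# pairwise (user feature, item feature) interaction terms of an FM model.
pairwise = sum(
    xu * xi * dot(EMBEDDINGS[fu], EMBEDDINGS[fi])
    for fu, xu in parse_kv(user_kv).items()
    for fi, xi in parse_kv(item_kv).items()
)
assert abs(dot(user_vec, item_vec) - pairwise) < 1e-9
print(user_vec, item_vec)
```

The final check shows why one vector per user and one vector per item are sufficient for recall: the inner product of the two aggregated vectors reproduces the user-item interaction terms of the FM model without iterating over individual feature pairs.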