This topic describes how to use the FM-Embedding for Rec-System template that is preset in Machine Learning Studio to create a recommendation model.
The Factorization Machine (FM) algorithm is a nonlinear model that takes into account the interaction between features. This algorithm is widely used in recommendation scenarios such as e-commerce, advertising, and live streaming. The FM-Embedding for Rec-System template that is provided by Machine Learning Studio includes FM training, FM prediction, and evaluation components.
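To make "interaction between features" concrete, the following is a minimal sketch of the standard second-order FM scoring function, in which each feature has a linear weight and a latent vector, and each pair of features interacts through the dot product of their latent vectors. The function name `fm_score` and the dictionary-based layout are assumptions made for illustration. This is not the implementation used by the FM components in Machine Learning Studio.

```python
def fm_score(x, w0, w, v):
    """Score one sample with a second-order factorization machine.

    x  : dict mapping feature id -> feature value (sparse sample)
    w0 : global bias
    w  : dict mapping feature id -> linear weight
    v  : dict mapping feature id -> latent vector (list of k floats)

    All names and the data layout are hypothetical, chosen for illustration.
    """
    # First-order (linear) term.
    linear = sum(w.get(i, 0.0) * xi for i, xi in x.items())

    # Second-order term, using the identity
    #   sum_{i<j} <v_i, v_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    k = len(next(iter(v.values()))) if v else 0
    pairwise = 0.0
    for f in range(k):
        total = 0.0
        total_of_squares = 0.0
        for i, xi in x.items():
            term = (v[i][f] if i in v else 0.0) * xi
            total += term
            total_of_squares += term * term
        pairwise += 0.5 * (total * total - total_of_squares)

    return w0 + linear + pairwise


sample = {1: 1.0, 2: 2.0}
score = fm_score(sample, w0=0.5, w={1: 0.1, 2: 0.2},
                 v={1: [1.0, 0.0], 2: [0.5, 2.0]})
```

The pairwise term is what makes the model nonlinear in the inputs: with two active features, it contributes the dot product of their latent vectors multiplied by both feature values.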
- Go to the Machine Learning Studio console.
- Log on to the PAI console.
- In the left-side navigation pane, choose .
- On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.
- Create an experiment.
- In the left-side navigation pane, click Home.
- In the Templates section, click Create below FM-Embedding for Rec-System.
- In the New Experiment dialog box, set the experiment parameters such as Name and Description.
- Click OK.
- Wait about 10 seconds for the canvas of the experiment to appear. The following figure shows the canvas.
- Set the parameters for FM Train-1.
- Click FM Train-1 on the canvas.
- On the Fields Setting tab in the right-side pane, set the parameters.

| Parameter | Description |
| --- | --- |
| Feature column | The name of the feature column. The data in the feature column is in the key:value format. Multiple key-value pairs in the feature column are separated with commas (,). |
| Label column | The name of the label column. The label column must be of the DOUBLE data type. |

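As an illustration of the feature column format described above, the following sketch parses one cell of comma-separated key:value pairs into a dictionary. The function name `parse_features` and the sample string are hypothetical. Keys are kept as strings because this topic does not state what type they must be.

```python
def parse_features(cell):
    """Parse a 'key:value,key:value' string into a dict of floats.

    Hypothetical helper for illustration; it is not part of
    Machine Learning Studio.
    """
    features = {}
    for pair in cell.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or an empty cell
        key, _, value = pair.partition(":")
        # float() raises ValueError if the value is missing or not numeric.
        features[key.strip()] = float(value)
    return features


parsed = parse_features("1:1.0,7:0.5,42:1")
```

A check like this can be run on a sample of rows before the table is uploaded, so that malformed cells are caught before training starts.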
- On the Parameters Setting and Tuning tabs in the right-side pane, set the training parameters. Assume that the experiment involves 120 million sample data records and 1.3 million feature data records. We recommend that you set the training parameters that are described in the following table to the recommended values, and use the default values for other parameters. You can modify the training parameters based on the amount of data that is involved.

| Tab | Parameter | Description |
| --- | --- | --- |
| Parameters Setting | Learning rate | The learning rate. Recommended value: 0.005. If the training is divergent, set this parameter to a smaller value. |
| Parameters Setting | Dimension | The number of dimensions. Recommended value: 1,1,16. |
| Parameters Setting | Block size | The size of the block. If less than 2 million feature data records are involved, we recommend that you set this parameter to 1000000. If 2 million feature data records or more are involved, you do not need to set this parameter. |
| Tuning | cluster number | The number of nodes to be used. Recommended value: 32. If a large amount of data is involved, set this parameter to a greater value. |
| Tuning | Memory size of a single node, unit M | The memory size to be allocated for each node. Recommended value: 16384. |

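The recommendations in the table can be summarized as a small lookup, sketched below. The function name `suggest_fm_settings` and the dictionary keys are hypothetical labels for the console fields, not parameter names of any API. The values apply to the data scale assumed in this topic.

```python
def suggest_fm_settings(num_features):
    """Return the recommended FM Train-1 values from the table above.

    The keys are illustrative labels for the console fields, not real
    parameter names.
    """
    settings = {
        "learning_rate": 0.005,      # lower this if training diverges
        "dimension": "1,1,16",
        "cluster_number": 32,        # raise this for larger datasets
        "memory_per_node_m": 16384,  # unit M, as labeled in the console
    }
    # Block size is recommended only below 2 million features;
    # at or above that, the field is left unset.
    if num_features < 2_000_000:
        settings["block_size"] = 1_000_000
    return settings


example = suggest_fm_settings(1_300_000)
```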
- In the top toolbar of the canvas, click Run.
- After the experiment is run, right-click Binary Classification Evaluation-1 on the canvas and select View Evaluation Report. Based on the data of the FM-Embedding for Rec-System template, the FM algorithm of Machine Learning Studio can create a model with an area under the curve (AUC) close to 0.97.
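For readers who want to interpret the AUC value in the report, the following sketch computes AUC directly from its definition: the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative sample, with ties counted as one half. The function name `auc` and the toy data are illustrative assumptions. The evaluation component may compute the metric differently, and this quadratic-time version is only suitable for small samples.

```python
def auc(labels, scores):
    """Compute AUC by comparing every positive/negative pair of scores.

    labels : iterable of 0/1 ground-truth labels
    scores : iterable of predicted scores, same length as labels
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y != 1]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


toy_auc = auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

In the toy data, three of the four positive/negative pairs are ordered correctly, so the result is 0.75. A value close to 0.97 means that nearly all such pairs are ordered correctly.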