The Factorization Machine (FM) algorithm is a nonlinear model that accounts for interactions among features. It is well suited to promoting commodities in e-commerce, advertising, and live video streaming scenarios.
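
For reference, the second-order FM model as it is commonly defined can be written as follows. This is the standard textbook formulation and not necessarily the exact form that the PAI components implement. The symbols $w_0$, $w_i$, $\mathbf{v}_i$, $n$, and $k$ are notation introduced here for illustration.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j,
\qquad
\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}
```

The three summands are the 0th order (bias), 1st order (linear), and 2nd order (pairwise interaction) terms. The pairwise term is what makes the model nonlinear in the features.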

Configure the components

Machine Learning Designer, previously known as Machine Learning Studio, provides the FM algorithm in the FM Train and FM Prediction components. You can use the templates that contain the components to create FM experiments. If you use Machine Learning Studio, you can find the FM-Embedding for Rec-System template on the Home page and click Create. If you use Machine Learning Designer, you can search for the [Alink]FM-Embedding for Rec-System template on the Pipeline Templates tab of the Visualized Modeling (Designer) page and click Create.

You can use one of the following methods to configure the FM components.

Method 1: Configure the component on the pipeline page

You can configure the parameters of the FM components on the pipeline page of Machine Learning Designer in Machine Learning Platform for AI (PAI). The following table describes the parameters.

| Component | Tab | Parameter | Description |
| --- | --- | --- | --- |
| FM Train | Fields Setting | Feature Columns | Select feature columns based on the characteristics of the input table. Columns of the STRING and DOUBLE types are supported. |
| FM Train | Fields Setting | Label Column | Select a label column based on the characteristics of the input table. Only columns of the DOUBLE type are supported. |
| FM Train | Fields Setting | Advanced Options | Available only in Machine Learning Designer. If you select Advanced Options, the Flink configuration item parameter becomes available. |
| FM Train | Fields Setting | Flink configuration item | Available only in Machine Learning Designer. Specify the Flink configuration items. For more information, see Configuration. |
| FM Train | Parameters Setting | Task Type | Select the task type. Valid values: regression and binary_classification. |
| FM Train | Parameters Setting | Number of iterations | Specify the total number of iterations. Default value: 10. |
| FM Train | Parameters Setting | Regularization coefficient | Specify three floating-point numbers separated by commas (,). These three numbers represent the regularization coefficients of the 0th order term, 1st order term, and 2nd order term. |
| FM Train | Parameters Setting | Learning rate | Specify the learning rate. If training diverges, set this parameter to a smaller value. |
| FM Train | Parameters Setting | Parameter initialization standard deviation | Specify the standard deviation used for parameter initialization. The value is of the DOUBLE type. Default value: 0.05. |
| FM Train | Parameters Setting | Dimensions | Specify three positive integers separated by commas (,). These three positive integers represent the lengths of the 0th order term, 1st order term, and 2nd order term. |
| FM Train | Parameters Setting | Block size | Specify the name of the performance metric. |
| FM Train | Parameters Setting | Output table lifecycle | Available only in Machine Learning Studio. Specify the lifecycle of the output table. |
| FM Train | Tuning | Number of Workers | Specify the number of workers. This parameter must be used together with the Memory Size per Node (MB) parameter. Valid values: 1 to 9999. |
| FM Train | Tuning | Memory Size per Node (MB) | Specify the memory size of each node. This parameter must be used together with the Number of Workers parameter. Valid values: 1024 to 64 × 1024. Unit: MB. |
| FM Prediction | Parameters Setting | Prediction Result Column | Specify the name of the prediction result column. |
| FM Prediction | Parameters Setting | Prediction Score Column | Available only in Machine Learning Studio. Specify the name of the prediction score column. |
| FM Prediction | Parameters Setting | Output Detail Column | Specify the name of the prediction detail column. |
| FM Prediction | Parameters Setting | Reserved Columns | Specify the columns that you want to keep in the output table. |
| FM Prediction | Parameters Setting | Advanced Configuration | Available only in Machine Learning Designer. If you select Advanced Configuration, the Number of Threads using by each worker and Type of ModelSize parameters become available. |
| FM Prediction | Parameters Setting | Number of Threads using by each worker | Specify the number of threads used for prediction in each worker. Default value: 1. |
| FM Prediction | Parameters Setting | Type of ModelSize | Specify the model size type. Valid values: large and small. Default value: small. |
| FM Prediction | Tuning | Number of Workers | Specify the number of workers. This parameter must be used together with the Memory Size per Core (MB) parameter. Valid values: 1 to 9999. |
| FM Prediction | Tuning | Memory Size per Core (MB) | Specify the memory size of each core. This parameter must be used together with the Number of Workers parameter. Valid values: 1024 to 64 × 1024. Unit: MB. |
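
To make the roles of Task Type, Dimensions, and Parameter initialization standard deviation more concrete, the following minimal Python sketch scores one sample with a generic second-order FM. It is not the implementation used by the FM Train or FM Prediction component. The function names `init_factors`, `fm_score`, and `fm_predict` are hypothetical, and the use of a zero-mean normal initializer and a logistic function for binary_classification are assumptions based on common FM practice.

```python
import math
import random


def init_factors(num_features, k, init_stdev=0.05, seed=0):
    """Draw every latent factor from N(0, init_stdev^2) (assumed initializer)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, init_stdev) for _ in range(k)]
            for _ in range(num_features)]


def fm_score(x, w0, w, v):
    """Second-order FM score for a sparse sample x = {feature_index: value}."""
    # 1st order term: weighted sum of the non-zero features.
    linear = sum(w[i] * xi for i, xi in x.items())
    # 2nd order term, using the identity
    #   sum_{i<j} <v_i, v_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    k = len(v[0]) if v else 0
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * xi for i, xi in x.items())
        s_sq = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict(x, w0, w, v, task="regression"):
    """Return the raw score for regression, or a probability for binary_classification."""
    score = fm_score(x, w0, w, v)
    if task == "regression":
        return score
    if task == "binary_classification":
        return 1.0 / (1.0 + math.exp(-score))  # logistic link (assumption)
    raise ValueError("task must be regression or binary_classification")


# Toy values chosen by hand; they do not come from any PAI model.
sample = {0: 1.0, 2: 2.0}
weights = [0.1, 0.2, 0.3]
factors = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]
print(fm_predict(sample, 0.5, weights, factors, task="regression"))
```

In this sketch, the length of each row of `factors` plays the role of the 2nd order length in Dimensions, and `init_stdev` plays the role of the initialization standard deviation.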

Method 2: Use PAI commands


| Component | Parameter | Required | Description | Default value |
| --- | --- | --- | --- | --- |
| FM Train | tensorColName | Yes | The name of the feature column. Data in the column must be in the key-value format. Separate multiple key-value pairs with commas (,). Example: 1:1.0,3:1.0. | None |
| FM Train | labelColName | Yes | The name of the label column. Only columns of numeric data types are supported. If the task parameter is set to binary_classification, the label value must be 0 or 1. | None |
| FM Train | task | Yes | The type of the task. Valid values: regression and binary_classification. | regression |
| FM Train | numEpochs | No | The number of iterations. | 10 |
| FM Train | dim | No | Three positive integers separated by commas (,). These three positive integers represent the lengths of the 0th order term, 1st order term, and 2nd order term. | 1,1,10 |
| FM Train | learnRate | No | The learning rate. Note: If training diverges, set the learnRate parameter to a smaller value. | 0.01 |
| FM Train | lambda | No | Three floating-point numbers separated by commas (,). These three numbers represent the regularization coefficients of the 0th order term, 1st order term, and 2nd order term. | 0.01,0.01,0.01 |
| FM Train | initStdev | No | The standard deviation of parameter initialization. | 0.05 |
| FM Prediction | predResultColName | No | The name of the prediction result column. | prediction_result |
| FM Prediction | predScoreColName | No | The name of the prediction score column. | prediction_score |
| FM Prediction | predDetailColName | No | The name of the prediction detail column. | prediction_detail |
| FM Prediction | keepColNames | No | The columns that you want to keep in the output table. | All columns |
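
The constraints in the table above can be checked before a training job is submitted. The following Python sketch is a hypothetical client-side helper, not part of PAI. It parses a key-value feature string such as 1:1.0,3:1.0, fills in the documented defaults for optional FM Train parameters, and validates task, dim, and lambda. The names `parse_kv`, `resolve_train_params`, and `check_label` are made up for illustration, and treating the keys as integer feature indexes is an assumption based on the example value.

```python
OPTIONAL_DEFAULTS = {
    "numEpochs": 10,
    "dim": "1,1,10",
    "learnRate": 0.01,
    "lambda": "0.01,0.01,0.01",
    "initStdev": 0.05,
}
REQUIRED = ("tensorColName", "labelColName", "task")
VALID_TASKS = ("regression", "binary_classification")


def parse_kv(cell):
    """Parse '1:1.0,3:1.0' into {1: 1.0, 3: 1.0} (integer keys are an assumption)."""
    features = {}
    for pair in cell.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, value = pair.split(":", 1)
        features[int(key)] = float(value)
    return features


def resolve_train_params(user_params):
    """Merge user parameters with the documented defaults and validate them."""
    missing = [name for name in REQUIRED if name not in user_params]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    params = {**OPTIONAL_DEFAULTS, **user_params}
    if params["task"] not in VALID_TASKS:
        raise ValueError("task must be regression or binary_classification")
    dims = [int(part) for part in str(params["dim"]).split(",")]
    if len(dims) != 3 or any(d <= 0 for d in dims):
        raise ValueError("dim must be three positive integers")
    lambdas = [float(part) for part in str(params["lambda"]).split(",")]
    if len(lambdas) != 3:
        raise ValueError("lambda must be three floating-point numbers")
    params["dim"] = dims
    params["lambda"] = lambdas
    return params


def check_label(task, label):
    """For binary_classification, the label value must be 0 or 1."""
    if task == "binary_classification" and label not in (0, 1):
        raise ValueError("label must be 0 or 1 for binary_classification")


# "features" and "label" are placeholder column names.
params = resolve_train_params(
    {"tensorColName": "features", "labelColName": "label",
     "task": "binary_classification"}
)
print(params["dim"], params["lambda"], parse_kv("1:1.0,3:1.0"))
```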

If you use the following data as input for the FM algorithm template, the trained model achieves an area under the curve (AUC) of about 0.97.

[Figures omitted: Input data, AUC]
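
As a reminder of what the AUC figure measures, the following Python sketch computes AUC as the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `auc` and the toy labels and scores are illustrative only. They are not the input data referred to above and do not reproduce the 0.97 result.

```python
def auc(labels, scores):
    """Pairwise AUC: P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive-negative pairs are ranked correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The double loop is quadratic in the number of samples, which is fine for a sanity check on a small prediction table. A rank-based formula is the usual choice for large outputs.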