## Overview

The Factorization Machine (FM) algorithm can be used for regression and binary classification. It is a nonlinear model that accounts for interactions between features. FM is a proven approach to recommendation and is widely used in e-commerce, advertising, and live-streaming scenarios.
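For reference, the standard second-order FM model from the literature scores a feature vector $x$ with $n$ features as shown below. The symbols $w_0$, $w_i$, $v_i$, and $k$ are notation introduced here for illustration, and this page does not confirm that the PAI implementation uses exactly this form.

$$
\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j,
\qquad
\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}
$$

The three summands are the constant term, the linear term, and the quadratic (interaction) term. Each feature $i$ carries a $k$-dimensional factor vector $v_i$, so an interaction weight is a dot product rather than a separate parameter for each feature pair.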

The FM algorithm in Machine Learning Platform for AI (PAI) was developed on big data within Alibaba and delivers strong performance and results. To see how the algorithm is used, see the corresponding template on the homepage.

The FM algorithm consists of an FM training component and an FM prediction component, both of which can be used together with the evaluation component.

## Required input data

Currently, the FM algorithm supports only data in the libsvm format. The input data has two columns: a feature column and a target column.

- Target column: double type
- Feature column: string type. Features must be entered in the k:v format and separated with commas (,).

See the following figure.
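As a rough illustration of the feature column format, the following Python sketch parses a k:v string into a dictionary. The function name `parse_features` is hypothetical and not part of PAI. The check for non-negative integer feature IDs mirrors the constraint stated for the training component.

```python
from typing import Dict


def parse_features(text: str) -> Dict[int, float]:
    """Parse a feature string such as "1:1.0,3:1.0" into {feature_id: value}."""
    features: Dict[int, float] = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty strings and trailing commas
        key, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"expected k:v, got {item!r}")
        feature_id = int(key)
        if feature_id < 0:
            raise ValueError(f"feature ID must be non-negative, got {feature_id}")
        features[feature_id] = float(value)
    return features


row = parse_features("1:1.0,3:1.0")
```

A sparse dictionary is a natural fit here because feature IDs may be nonconsecutive.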

## Components

### 1. FM training

In **Parameters Setting**, you can set the task type to **Regression** or **Binary Classification**.

#### PAI commands

| Parameter | Description | Value |
|---|---|---|
| tensorColName | The name of the feature column for training, expressed by a string in the k:v format, such as 1:1.0,3:1.0. The feature ID must be a non-negative integer. The value range is [0, Long.MAX_VALUE). Nonconsecutive values are allowed. | Required |
| labelColName | The name of the label column. The value must be a number. If the task type is binary_classification, the value is either 0 or 1. | Required |
| task | The task type. | Required. Valid values: regression and binary_classification |
| numEpochs | The number of iterations. | Optional. Default value: 10 |
| dim | The number of factors, expressed by a string that consists of three integers separated with commas (,) to indicate the length of constant term, linear term, and quadratic term. | Optional. Default value: 1,1,10 |
| learnRate | The learning rate. | Optional. Default value: 0.01 |
| lambda | The regularization coefficient, expressed by a string that consists of three floating-point numbers separated with commas (,) to indicate the regularization coefficients of constant term, linear term, and quadratic term. | Optional. Default value: 0.01,0.01,0.01 |
| initStdev | The standard deviation of parameter initialization. | Optional. Default value: 0.05 |

Note:

- If training diverges, reduce the learning rate (learnRate).
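To make the training parameters more concrete, here is a minimal Python sketch of a second-order FM trained with plain stochastic gradient descent. It is not the PAI implementation, and it rests on several assumptions:

- The optimizer (SGD) and the losses (squared loss for regression, log loss for binary classification) are choices made for this sketch.
- Only the third integer of `dim` is used, and it is read as the number of latent factors.
- The names `FMSketch` and `parse_triple` are hypothetical.

```python
import math
import random


def parse_triple(text, cast):
    """Split a string such as "1,1,10" into (constant, linear, quadratic) values."""
    parts = tuple(cast(p) for p in text.split(","))
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {text!r}")
    return parts


class FMSketch:
    """Hypothetical second-order FM trained with plain SGD (not PAI's code)."""

    def __init__(self, task="regression", dim="1,1,10", lambda_="0.01,0.01,0.01",
                 learn_rate=0.01, init_stdev=0.05, seed=0):
        if task not in ("regression", "binary_classification"):
            raise ValueError(f"unsupported task: {task!r}")
        self.task = task
        # Assumption: only the third integer of dim (factor count k) is used here.
        _, _, self.k = parse_triple(dim, int)
        self.reg0, self.reg1, self.reg2 = parse_triple(lambda_, float)
        self.learn_rate = learn_rate
        self.init_stdev = init_stdev
        self.rng = random.Random(seed)
        self.w0 = 0.0  # constant term
        self.w = {}    # feature ID -> linear weight
        self.v = {}    # feature ID -> list of k latent factors

    def _ensure(self, feature_id):
        # Create parameters lazily, so nonconsecutive feature IDs cost nothing.
        if feature_id not in self.v:
            self.w[feature_id] = 0.0
            self.v[feature_id] = [self.rng.gauss(0.0, self.init_stdev)
                                  for _ in range(self.k)]

    def _factor_sums(self, x):
        # s_f = sum_i v[i][f] * x_i, reused by the score and by the gradients.
        return [sum(self.v[i][f] * xi for i, xi in x.items()) for f in range(self.k)]

    def predict(self, x):
        for i in x:
            self._ensure(i)
        sums = self._factor_sums(x)
        squares = [sum((self.v[i][f] * xi) ** 2 for i, xi in x.items())
                   for f in range(self.k)]
        # Pairwise term via the identity 0.5 * ((sum a)^2 - sum a^2), per factor.
        pairwise = 0.5 * sum(s * s - q for s, q in zip(sums, squares))
        score = self.w0 + sum(self.w[i] * xi for i, xi in x.items()) + pairwise
        if self.task == "binary_classification":
            return 1.0 / (1.0 + math.exp(-score))  # probability of label 1
        return score

    def sgd_step(self, x, y):
        # For 0.5 * squared loss and for log loss, d(loss)/d(score) = prediction - y.
        g = self.predict(x) - y
        sums = self._factor_sums(x)
        lr = self.learn_rate
        self.w0 -= lr * (g + self.reg0 * self.w0)
        for i, xi in x.items():
            self.w[i] -= lr * (g * xi + self.reg1 * self.w[i])
            for f in range(self.k):
                vif = self.v[i][f]
                grad = g * xi * (sums[f] - vif * xi)
                self.v[i][f] = vif - lr * (grad + self.reg2 * vif)

    def fit(self, rows, num_epochs=10):
        for _ in range(num_epochs):
            for x, y in rows:
                self.sgd_step(x, y)


model = FMSketch(task="regression")
rows = [({1: 1.0, 3: 1.0}, 1.0), ({2: 1.0, 3: 1.0}, 0.0)]
model.fit(rows, num_epochs=10)
```

In this sketch, every update is scaled by `learn_rate`, which is why lowering the learning rate is the usual first remedy when the loss grows instead of shrinking.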

### 2. FM prediction

#### PAI commands

| Parameter | Description | Value |
|---|---|---|
| predResultColName | The name of the prediction result column. | Optional. Default value: prediction_result |
| predScoreColName | The name of the prediction score column. | Optional. Default value: prediction_score |
| predDetailColName | The name of the prediction detail column. | Optional. Default value: prediction_detail |
| keepColNames | The columns saved to the output result table. | Optional. Default value: all columns |

## Result evaluation

With the data from the corresponding template on the homepage, the FM algorithm of PAI produces a model with an AUC close to 0.97.
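For readers who want to check an AUC figure by hand, the following Python sketch computes AUC from 0/1 labels and prediction scores using the rank-sum formulation, with tied scores given their average rank. The function name `auc` and the sample values are illustrative only. This is not the PAI evaluation component, and the sample values are not output from the template.

```python
def auc(labels, scores):
    """Area under the ROC curve for 0/1 labels, via the rank-sum formulation."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1..j
        rank_sum += avg_rank * sum(1 for _, y in pairs[i:j] if y == 1)
        i = j
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


example = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positive examples above negative ones.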