Alternating Least Squares (ALS) is a matrix factorization algorithm. It factorizes a sparse user-item rating matrix into user factors and item factors, and then uses these factors to predict the values of missing entries, which yields a basic training model. ALS is also described as a hybrid collaborative filtering algorithm because it combines users and items.
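The alternating procedure can be sketched as follows. This is a minimal NumPy illustration of explicit-feedback ALS and is not the component's actual implementation. The function name `als_factorize` and its argument names are hypothetical; the default values mirror the component defaults listed on this page (10 factors, 10 iterations, regularization coefficient 0.1).

```python
import numpy as np

def als_factorize(triples, num_factors=10, num_iters=10, reg=0.1, seed=0):
    """Factorize sparse (user, item, rating) triples with explicit-feedback ALS.

    Hypothetical helper for illustration; returns two dicts that map
    user IDs and item IDs to their factor vectors.
    """
    users = sorted({u for u, _, _ in triples})
    items = sorted({i for _, i, _ in triples})
    u_idx = {u: k for k, u in enumerate(users)}
    i_idx = {i: k for k, i in enumerate(items)}

    rng = np.random.default_rng(seed)
    X = rng.random((len(users), num_factors)) * 0.1  # user factors
    Y = rng.random((len(items), num_factors)) * 0.1  # item factors

    # Group the observed ratings by user row and by item row.
    by_user = {k: [] for k in range(len(users))}
    by_item = {k: [] for k in range(len(items))}
    for u, i, r in triples:
        by_user[u_idx[u]].append((i_idx[i], float(r)))
        by_item[i_idx[i]].append((u_idx[u], float(r)))

    def solve_side(fixed, observed, out):
        # With one side fixed, each row of the other side is a small
        # ridge regression over that row's observed entries only.
        eye = reg * np.eye(num_factors)
        for row, pairs in observed.items():
            cols = [c for c, _ in pairs]
            vals = np.array([v for _, v in pairs])
            A = fixed[cols]
            out[row] = np.linalg.solve(A.T @ A + eye, A.T @ vals)

    for _ in range(num_iters):
        solve_side(Y, by_user, X)  # fix item factors, update user factors
        solve_side(X, by_item, Y)  # fix user factors, update item factors

    user_factors = {u: X[k] for u, k in u_idx.items()}
    item_factors = {i: Y[k] for i, k in i_idx.items()}
    return user_factors, item_factors
```

A missing entry for user `u` and item `i` can then be estimated as `user_factors[u] @ item_factors[i]`.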
Limits
The supported compute engines are MaxCompute and Realtime Compute for Apache Flink.
Configure the component in the PAI console
Input ports
| Input port (from left to right) | Data type | Recommended upstream component | Required |
| --- | --- | --- | --- |
| input table | N/A | Data Preprocessing | Yes |
Component parameters
| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | user column name | The name of the user ID column in the input table. Data in the column must be of the BIGINT type. |
| Fields Setting | item column name | The name of the item column in the input table. Data in the column must be of the BIGINT type. |
| Fields Setting | rating column name | The name of the column that contains the scores provided by users for items in the input table. Data in the column must be of the numeric type. |
| Parameters Setting | num factors | The number of factors. Valid values: (0,+∞). Default value: 10. |
| Parameters Setting | Number of iterations | The number of iterations. Valid values: (0,+∞). Default value: 10. |
| Parameters Setting | Regularization coefficient | The regularization coefficient. Valid values: (0,+∞). Default value: 0.1. |
| Parameters Setting | check box | Specifies whether to use an implicit preference model. |
| Parameters Setting | alpha parameter | The implicit preference coefficient. Valid values: (0,+∞). Default value: 40. |
| Parameters Setting | Output table lifecycle | The lifecycle of the output model table. Unit: days. |
| Tuning | Number of Workers | The number of worker nodes. Valid values: 1 to 9999. |
| Tuning | Node Memory, MB | The memory size of each worker node. Valid values: 1024 to 65536. Unit: MB. |
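To show how the alpha parameter and the regularization coefficient can interact when the implicit preference model is enabled, the following sketch performs a single user update. It assumes a widely used formulation in which each score r is converted to a binary preference p = 1 if r > 0 and a confidence c = 1 + alpha * r. This page does not state the exact formula that the component uses, so treat this as an assumption. The function name `implicit_user_update` is hypothetical.

```python
import numpy as np

def implicit_user_update(item_factors, ratings_row, alpha=40.0, reg=0.1):
    """One ALS user update under an assumed implicit-preference model.

    item_factors: (n_items, k) array of item factor vectors.
    ratings_row:  length-n_items array of one user's raw scores,
                  where 0 means that no interaction was observed.
    """
    Y = np.asarray(item_factors, dtype=float)
    r = np.asarray(ratings_row, dtype=float)

    p = (r > 0).astype(float)  # binary preference (assumed formulation)
    c = 1.0 + alpha * r        # confidence grows with the raw score

    k = Y.shape[1]
    # Weighted ridge regression: (Y^T C Y + reg * I) x = Y^T C p
    A = Y.T @ (c[:, None] * Y) + reg * np.eye(k)
    b = Y.T @ (c * p)
    return np.linalg.solve(A, b)
```

In this formulation, a larger alpha makes observed interactions weigh more heavily against unobserved ones, while the regularization coefficient keeps the factor values small.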
Examples
If you use the following data as input for the ALS Matrix Factorization component, you can obtain the following user factors and item factors. Only some rows of each table are shown.
Input data
| user_id | item_id | rating |
| --- | --- | --- |
| 10944750 | 13451 | 0 |
| 10944751 | 13452 | 1 |
| 10944752 | 13453 | 2 |
| 10944753 | 13454 | 2 |
| 10944754 | 13455 | 4 |
| ... ... | ... ... | ... ... |
Output user factor table
| user_id | factors |
| --- | --- |
| 8528750 | [0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,0.02046867,0.022253247,0.027391396,0.018985065,0.04889483] |
| 282500 | [0.116156064,0.07193632,0.090851225,0.017075706,0.025412979,0.047022138,0.12534861,0.05869226,0.11170533,0.1640192] |
| 4895250 | [0.038429666,0.061858658,0.04236993,0.055866677,0.031814687,0.0417443,0.012085311,0.0379342,0.10767074,0.028392972] |
| ... ... | ... ... |
Output item factor table
| item_id | factors |
| --- | --- |
| 24601 | [0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,0.0059205987,0.008568814,0.0015981696,0.0,0.013601779] |
| 26699 | [0.0027524426,0.0043066847,0.0031336215,0.00269448,0.0022347474,0.0020477585,0.0027995422,0.0025390312,0.0033011117,0.003957773] |
| 20751 | [0.03902271,0.050952066,0.032981463,0.03862796,0.048720762,0.027976315,0.02721664,0.018149626,0.0149896275,0.026251089] |
| ... ... | ... ... |
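The factor tables can be used to estimate a score for a user-item pair by taking the dot product of the two factor vectors. The following sketch assumes that the factors column is read as a bracketed, comma-separated string, as displayed above. The helper names `parse_factors` and `predict_score` are hypothetical, and the chosen pair of IDs is only for illustration.

```python
import json

def parse_factors(text):
    """Parse a factors cell such as "[0.1,0.2]" into a list of floats."""
    return [float(v) for v in json.loads(text)]

def predict_score(user_factors, item_factors):
    """Return the dot product of a user vector and an item vector."""
    if len(user_factors) != len(item_factors):
        raise ValueError("factor vectors must have the same length")
    return sum(u * v for u, v in zip(user_factors, item_factors))

# Factor strings copied from the example output tables on this page.
user_8528750 = parse_factors(
    "[0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,"
    "0.02046867,0.022253247,0.027391396,0.018985065,0.04889483]"
)
item_24601 = parse_factors(
    "[0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,"
    "0.0059205987,0.008568814,0.0015981696,0.0,0.013601779]"
)

# Estimated score of user 8528750 for item 24601.
print(predict_score(user_8528750, item_24601))
```

To produce recommendations for a user, the same dot product can be computed against every item vector and the items ranked by the resulting scores.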