Platform For AI: Als Matrix Factorization

Last Updated: Mar 08, 2024

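Alternating Least Squares (ALS) is a matrix factorization algorithm. It factorizes a sparse matrix, such as a user-item rating matrix, into user factors and item factors, and uses these factors to predict the values of missing entries. The result serves as a basic training model. ALS is also called a hybrid collaborative filtering algorithm because it combines information about both users and items.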
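To make the description above concrete, the following is the textbook form of the regularized objective that ALS minimizes for explicit ratings. It is a sketch under stated assumptions, not a statement of what this component computes internally: the symbols x_u (user factor vector), y_i (item factor vector), r_ui (observed rating), Ω (the set of observed user-item pairs), and λ (regularization coefficient) are notation introduced here, and plain L2 regularization is assumed.

```latex
\min_{X,\,Y} \; \sum_{(u,i) \in \Omega} \left( r_{ui} - x_u^{\top} y_i \right)^2
  + \lambda \left( \sum_{u} \lVert x_u \rVert^2 + \sum_{i} \lVert y_i \rVert^2 \right)

% With the item factors Y held fixed, each user vector has a closed-form solution
% (\Omega_u is the set of items rated by user u):
x_u = \Bigl( \sum_{i \in \Omega_u} y_i y_i^{\top} + \lambda I \Bigr)^{-1}
      \sum_{i \in \Omega_u} r_{ui} \, y_i
```

The item update is symmetric, with the roles of users and items swapped. Alternating between the two closed-form updates is what gives the algorithm its name.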

Limits

The supported compute engines are MaxCompute and Realtime Compute for Apache Flink.

Configure the component in the PAI console

  • Input ports

    | Input port (from left to right) | Data type | Recommended upstream component | Required |
    | --- | --- | --- | --- |
    | input table | N/A | Data Preprocessing | Yes |

  • Component parameters

    | Tab | Parameter | Description |
    | --- | --- | --- |
    | Fields Setting | user column name | The name of the user ID column in the input table. Data in the column must be of the BIGINT type. |
    | Fields Setting | item column name | The name of the item ID column in the input table. Data in the column must be of the BIGINT type. |
    | Fields Setting | rating column name | The name of the column in the input table that contains the ratings that users give to items. Data in the column must be of a numeric type. |
    | Parameters Setting | num factors | The number of latent factors, which is the length of each output factor vector. Valid values: positive integers, (0,+∞). Default value: 10. |
    | Parameters Setting | Number of iterations | The number of iterations. Valid values: positive integers, (0,+∞). Default value: 10. |
    | Parameters Setting | Regularization coefficient | The regularization coefficient. Valid values: (0,+∞). Default value: 0.1. |
    | Parameters Setting | check box | Specifies whether to use an implicit preference model. |
    | Parameters Setting | alpha parameter | The implicit preference coefficient. Valid values: (0,+∞). Default value: 40. |
    | Parameters Setting | Output table lifecycle | The lifecycle of the output model table. Unit: days. |
    | Tuning | Number of Workers | The number of worker nodes. Valid values: 1 to 9999. |
    | Tuning | Node Memory, MB | The memory size of each worker node. Valid values: 1024 to 65536. Unit: MB. |
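To illustrate how the number of factors, the number of iterations, and the regularization coefficient interact, here is a minimal single-machine sketch of explicit-feedback ALS in plain Python. It is not the component's implementation: the function names (`als`, `update_side`, `solve`, `regularized_loss`), the `seed` argument, and the toy ratings are all hypothetical, plain L2 regularization is assumed, and the implicit preference model and the alpha parameter are not modeled.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_side(fixed, ratings_by_key, num_factors, reg):
    """Hold one side fixed and solve a regularized least-squares problem per key."""
    out = {}
    for key, pairs in ratings_by_key.items():
        # a = reg * I + sum(v v^T), b = sum(r * v) over the observed ratings.
        a = [[reg if i == j else 0.0 for j in range(num_factors)]
             for i in range(num_factors)]
        b = [0.0] * num_factors
        for other, r in pairs:
            v = fixed[other]
            for i in range(num_factors):
                b[i] += r * v[i]
                for j in range(num_factors):
                    a[i][j] += v[i] * v[j]
        out[key] = solve(a, b)
    return out


def als(ratings, num_factors=10, num_iterations=10, reg=0.1, seed=0):
    """ratings: iterable of (user_id, item_id, rating). Returns (user_factors, item_factors)."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    # Random starting point for the item factors; user factors come from the first update.
    items = {i: [rng.random() for _ in range(num_factors)] for i in by_item}
    users = {}
    for _ in range(num_iterations):
        users = update_side(items, by_user, num_factors, reg)
        items = update_side(users, by_item, num_factors, reg)
    return users, items


def regularized_loss(ratings, users, items, reg):
    """Squared error on observed ratings plus the L2 penalty on all factors."""
    err = sum((r - dot(users[u], items[i])) ** 2 for u, i, r in ratings)
    penalty = sum(dot(v, v) for v in users.values()) + sum(dot(v, v) for v in items.values())
    return err + reg * penalty


# Hypothetical toy data: (user_id, item_id, rating).
toy_ratings = [(1, 10, 5.0), (1, 11, 1.0), (2, 10, 4.0),
               (2, 12, 2.0), (3, 11, 1.0), (3, 12, 3.0)]
users, items = als(toy_ratings, num_factors=2, num_iterations=10, reg=0.1)
print(regularized_loss(toy_ratings, users, items, reg=0.1))
```

In this sketch, a larger number of factors lets the model fit the observed ratings more closely, while a larger regularization coefficient shrinks the factor vectors. Because each half-step solves its subproblem exactly, the regularized loss does not increase from one iteration to the next.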

Examples

If you use the following data as input for the Als Matrix Factorization component, you can obtain the following user factors and item factors.

  • Input data

    | user_id | item_id | rating |
    | --- | --- | --- |
    | 10944750 | 13451 | 0 |
    | 10944751 | 13452 | 1 |
    | 10944752 | 13453 | 2 |
    | 10944753 | 13454 | 2 |
    | 10944754 | 13455 | 4 |
    | ... ... | ... ... | ... ... |

  • Output user factor table

    | user_id | factors |
    | --- | --- |
    | 8528750 | [0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,0.02046867,0.022253247,0.027391396,0.018985065,0.04889483] |
    | 282500 | [0.116156064,0.07193632,0.090851225,0.017075706,0.025412979,0.047022138,0.12534861,0.05869226,0.11170533,0.1640192] |
    | 4895250 | [0.038429666,0.061858658,0.04236993,0.055866677,0.031814687,0.0417443,0.012085311,0.0379342,0.10767074,0.028392972] |
    | ... ... | ... ... |

  • Output item factor table

    | item_id | factors |
    | --- | --- |
    | 24601 | [0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,0.0059205987,0.008568814,0.0015981696,0.0,0.013601779] |
    | 26699 | [0.0027524426,0.0043066847,0.0031336215,0.00269448,0.0022347474,0.0020477585,0.0027995422,0.0025390312,0.0033011117,0.003957773] |
    | 20751 | [0.03902271,0.050952066,0.032981463,0.03862796,0.048720762,0.027976315,0.02721664,0.018149626,0.0149896275,0.026251089] |
    | ... ... | ... ... |
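The factor tables above are typically consumed by taking the dot product of a user vector and an item vector to score that user-item pair. The sketch below shows this step using one row from each output table. The `predict` helper and the dictionary layout are hypothetical, and pairing user 8528750 with item 24601 is arbitrary and only for illustration. How the score should be interpreted depends on how the model was trained.

```python
# Factor vectors copied from the sample output tables (10 factors each).
user_factors = {
    8528750: [0.026986524, 0.03350178, 0.03532385, 0.019542359, 0.020429865,
              0.02046867, 0.022253247, 0.027391396, 0.018985065, 0.04889483],
}
item_factors = {
    24601: [0.0063337763, 0.026349949, 0.0064828005, 0.01734504, 0.022049638,
            0.0059205987, 0.008568814, 0.0015981696, 0.0, 0.013601779],
}


def predict(user_vec, item_vec):
    """Score a user-item pair as the dot product of the two factor vectors."""
    if len(user_vec) != len(item_vec):
        raise ValueError("factor vectors must have the same length")
    return sum(u * v for u, v in zip(user_vec, item_vec))


score = predict(user_factors[8528750], item_factors[24601])
print(score)
```

To rank items for one user, the same dot product can be computed against every item vector and the results sorted in descending order.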