The Linear Model Feature Importance component calculates feature importance for linear models, such as linear regression and logistic regression for binary classification. Both sparse and dense data formats are supported.

Configure the component

You can configure the component by using one of the following methods:
  • Use the Machine Learning Platform for AI console
    | Tab | Parameter | Description |
    |---|---|---|
    | Fields Setting | Feature Columns | Optional. The feature columns that are selected from the input table for training. By default, all columns other than the label column are selected. |
    | Fields Setting | Target Column | Required. The label column. Click the Directory icon. In the Select Column dialog box, enter the keywords of the column that you want to search for. Select the column and click OK. |
    | Fields Setting | Input Sparse Table | Optional. Specifies whether data in the input table is in the sparse format. |
    | Tuning | Number of Computing Cores | Optional. The number of cores used in computing. |
    | Tuning | Memory Size per Core | Optional. The memory size of each core. Unit: MB. |
  • Use commands
    PAI -name regression_feature_importance -project algo_public
        -DmodelName=xlab_m_logisticregressi_20317_v0
        -DoutputTableName=pai_temp_2252_20321_1
        -DlabelColName=y
        -DfeatureColNames=pdays,previous,emp_var_rate,cons_price_idx,cons_conf_idx,euribor3m,nr_employed,age,campaign
        -DenableSparse=false
        -DinputTableName=pai_dense_10_9;
    | Parameter | Required | Description | Default value |
    |---|---|---|---|
    | inputTableName | Yes | The name of the input table. | N/A |
    | outputTableName | Yes | The name of the output table. | N/A |
    | labelColName | Yes | The name of the label column in the input table. | N/A |
    | modelName | Yes | The name of the input model. | N/A |
    | featureColNames | No | The feature columns that are selected from the input table for training. | All columns other than the label column |
    | inputTablePartitions | No | The partitions that are selected from the input table for training. | All partitions |
    | enableSparse | No | Specifies whether data in the input table is in the sparse format. | false |
    | itemDelimiter | No | The delimiter that is used to separate key-value pairs when data in the input table is in the sparse format. | Space |
    | kvDelimiter | No | The delimiter that is used to separate keys and values when data in the input table is in the sparse format. | : |
    | lifecycle | No | The lifecycle of the output table. | Empty string |
    | coreNum | No | The number of cores. | Determined by the system |
    | memSizePerCore | No | The memory size of each core. | Determined by the system |
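
To make the roles of itemDelimiter and kvDelimiter concrete, the following Python sketch splits one sparse-format row into key-value pairs using the default delimiters (a space between pairs, a colon between key and value). The function name parse_sparse_row and the sample row are hypothetical and used only for illustration; this is not the component's own parser.

```python
def parse_sparse_row(row, item_delimiter=" ", kv_delimiter=":"):
    """Split a sparse-format string into a {key: value} dict.

    item_delimiter separates key-value pairs (default: space).
    kv_delimiter separates a key from its value (default: colon).
    """
    features = {}
    for item in row.split(item_delimiter):
        if not item:
            # Skip empty pieces caused by repeated delimiters.
            continue
        key, value = item.split(kv_delimiter, 1)
        features[key] = float(value)
    return features


# Hypothetical sample row: feature indexes 1, 4, and 7 are non-zero.
sample_row = "1:0.5 4:2.0 7:1.25"
parsed = parse_sparse_row(sample_row)

# The same data with custom delimiters (comma between pairs, "=" inside a pair).
parsed_custom = parse_sparse_row("1=0.5,4=2.0,7=1.25", item_delimiter=",", kv_delimiter="=")
```

Features that do not appear in a row are simply absent from the result, which is what makes the sparse format compact for wide tables.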

Example

  1. Execute the following SQL statements to generate training data:
    create table if not exists pai_dense_10_9 as
    select
        age, campaign, pdays, previous, emp_var_rate, cons_price_idx, cons_conf_idx, euribor3m, nr_employed, y
    from bank_data limit 10;
  2. Create the experiment shown in the following figure. For more information, see Generate a model by using an algorithm.
    In the preceding SQL statement, y is the label column for the Logistic Regression for Multiclass Classification component, and the other fields are feature columns. Retain the default values for all other parameters of the Logistic Regression for Multiclass Classification and Linear Model Feature Importance components.
    [Figure: Algorithm modeling]
  3. Run the experiment and view the prediction results.
    [Figure: Prediction results]
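    The following table describes the formulas used to calculate the metrics. Here, w_j is the model weight of the j-th feature, and STD(f_j) is the standard deviation of that feature's values.
    | Column | Formula |
    |---|---|
    | weight | abs(w_j) |
    | importance | abs(w_j) * STD(f_j) |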
  4. After the run is complete, right-click the Linear Model Feature Importance component and select View Analytics Report to view the result.
    [Figure: Result]
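
The weight and importance formulas from step 3 can be sketched in Python as follows. This is a minimal illustration, not the component's implementation: the function name, the sample weights, and the sample feature values are hypothetical, and the use of the population standard deviation is an assumption, because the source does not state which variant the component applies.

```python
import statistics


def linear_feature_importance(weights, columns):
    """Compute weight = abs(w_j) and importance = abs(w_j) * STD(f_j).

    weights: {feature name: model coefficient w_j}
    columns: {feature name: list of observed feature values f_j}
    """
    result = {}
    for name, w in weights.items():
        # Assumption: population standard deviation (pstdev).
        std = statistics.pstdev(columns[name])
        result[name] = {"weight": abs(w), "importance": abs(w) * std}
    return result


# Hypothetical coefficients and feature values, for illustration only.
weights = {"age": -0.5, "campaign": 2.0}
columns = {
    "age": [2, 4, 4, 4, 5, 5, 7, 9],       # standard deviation is 2.0
    "campaign": [3, 3, 3, 3, 3, 3, 3, 3],  # constant column, deviation 0.0
}
report = linear_feature_importance(weights, columns)
```

In this sketch, campaign has the larger weight but an importance of 0, because a feature that never varies cannot change the model output across rows. Scaling the weight by the standard deviation is what makes features measured on different scales comparable.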