The Feature Importance Filtering component filters features based on the importance values generated by the Linear Model Feature Importance, GBDT Feature Importance, and Random Forest Feature Importance components. You can use the Feature Importance Filtering component to select the top N features.
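
The idea of top N filtering can be illustrated with a minimal Python sketch. It is not the component's implementation: the function name top_n_features, the dictionary of importance values, and the rule of ranking by the largest value first are all assumptions made for illustration.

```python
from typing import Dict, List


def top_n_features(weights: Dict[str, float], n: int = 10) -> List[str]:
    """Return the names of the n features with the largest importance values.

    Assumption: a larger value means a more important feature. The real
    component may rank features differently.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Sort by value, largest first; ties are broken by feature name
    # so that the result is deterministic.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical importance values, not taken from a real run.
weights = {"age": 0.12, "campaign": 0.05, "euribor3m": 0.31, "pdays": 0.20}
selected = top_n_features(weights, n=2)
print(selected)  # ['euribor3m', 'pdays']
```

If n is larger than the number of available features, the sketch returns all of them.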

Configure the component

PAI command
PAI -name fe_filter_runner -project algo_public
    -DselectedCols=pdays,previous,emp_var_rate,cons_price_idx,cons_conf_idx,euribor3m,nr_employed,age,campaign,poutcome
    -DinputTable=pai_dense_10_10
    -DweightTable=pai_temp_2252_20319_1
    -DtopN=5
    -DmodelTable=pai_temp_2252_20320_2
    -DoutputTable=pai_temp_2252_20320_1;
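
When the command is generated by a script, the text above can be assembled from a dictionary of parameters. The following Python sketch shows one way to do that. The helper build_pai_command is hypothetical and only produces the command string; it does not submit anything to PAI.

```python
from typing import Mapping


def build_pai_command(name: str, project: str, params: Mapping[str, object]) -> str:
    """Assemble a PAI command string (hypothetical helper, text only)."""
    lines = [f"PAI -name {name} -project {project}"]
    for key, value in params.items():
        # Lists and tuples are joined with commas, as in -DselectedCols.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        lines.append(f"    -D{key}={value}")
    # The command ends with a semicolon, as in the example above.
    return "\n".join(lines) + ";"


command = build_pai_command(
    "fe_filter_runner",
    "algo_public",
    {
        "selectedCols": ["pdays", "previous", "age"],
        "inputTable": "pai_dense_10_10",
        "weightTable": "pai_temp_2252_20319_1",
        "topN": 5,
        "modelTable": "pai_temp_2252_20320_2",
        "outputTable": "pai_temp_2252_20320_1",
    },
)
print(command)
```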
Parameter: inputTable
Description: The name of the input table.
Required: Yes

Parameter: inputTablePartitions
Description: The partitions selected from the input table. By default, all partitions in the input table are selected. Specify this parameter in one of the following formats:
  • A single partition: partition_name=value
  • Multiple partitions: name1=value1,name2=value2
    Note: Multiple partitions are separated by commas (,).
  • Multi-level partitions: name1=value1/name2=value2
Required: No

Parameter: weightTable
Description: The weight table that contains the feature importance values. This table is the output table of the Linear Model Feature Importance, GBDT Feature Importance, or Random Forest Feature Importance component.
Required: Yes

Parameter: outputTable
Description: The output table that contains the top N features after filtering.
Required: Yes

Parameter: modelTable
Description: The model file generated by feature filtering.
Required: Yes

Parameter: selectedCols
Description: The columns selected from the input table. By default, all the fields in the input table are selected.
Required: No

Parameter: topN
Description: The number of top features to select. Default value: 10.
  Note: The value of this parameter must be a positive integer.
Required: No

Parameter: lifecycle
Description: The lifecycle of the output table. Default value: 7.
  Note: The value of this parameter must be a positive integer.
Required: No
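
The constraints listed for the parameters can be checked before a command is submitted. The following Python sketch is one possible way to do this, under stated assumptions: the names check_params and format_partitions are hypothetical, and the checks cover only what the parameter descriptions state (required parameters, positive integers for topN and lifecycle, and the partition formats).

```python
from typing import Dict, List

REQUIRED = ("inputTable", "weightTable", "outputTable", "modelTable")
POSITIVE_INTS = ("topN", "lifecycle")


def check_params(params: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for key in REQUIRED:
        if not params.get(key):
            problems.append(f"missing required parameter: {key}")
    for key in POSITIVE_INTS:
        if key in params:
            value = params[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"{key} must be a positive integer, got {value!r}")
    return problems


def format_partitions(partitions: List[Dict[str, str]]) -> str:
    """Format partitions for inputTablePartitions.

    Levels within one partition are joined with "/", and multiple
    partitions are joined with ",".
    """
    return ",".join(
        "/".join(f"{name}={value}" for name, value in levels.items())
        for levels in partitions
    )


# Hypothetical values for illustration only.
print(check_params({"inputTable": "pai_dense_10_10", "topN": 0}))
print(format_partitions([{"name1": "value1", "name2": "value2"}]))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue at once.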