This topic describes the Information Value (IV) method, which evaluates the predictive power of a feature.

Scenarios

IV is commonly used for feature selection. In risk control scenarios, a dataset may contain thousands or tens of thousands of features, which makes it impractical to pick the predictive ones by hand. The IV method gives each feature a score for its predictive power, so that features can be compared and selected by that score.
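To make the idea of "predictive power" concrete, the following is a minimal Python sketch of the textbook IV calculation for one categorical feature against a binary label: IV = sum over categories of (p_i - q_i) * ln(p_i / q_i), where p_i and q_i are the shares of positive and negative samples that fall into category i. This topic does not state which exact formula or smoothing the database uses internally, so treat this as an illustration only. The function name `information_value`, the smoothing count `eps`, and the sample data are assumptions made for this example.

```python
import math
from collections import Counter


def information_value(x, y, eps=0.5):
    """Return the IV of categorical feature x against binary label y (0 or 1).

    eps is a hypothetical smoothing count added to every category so that a
    category with no positive or no negative samples does not cause log(0).
    With eps=0 the function fails if either class has no samples at all.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    # Count positive and negative samples per category.
    pos = Counter(value for value, label in zip(x, y) if label == 1)
    neg = Counter(value for value, label in zip(x, y) if label == 0)
    categories = set(pos) | set(neg)

    total_pos = sum(pos.values()) + eps * len(categories)
    total_neg = sum(neg.values()) + eps * len(categories)

    iv = 0.0
    for category in categories:
        p = (pos[category] + eps) / total_pos  # share of positives
        q = (neg[category] + eps) / total_neg  # share of negatives
        iv += (p - q) * math.log(p / q)        # each term is >= 0
    return iv


# Hypothetical toy data: an airline code and whether the flight was delayed.
airline = ["AA", "AA", "BB", "BB"]
delay_unrelated = [0, 1, 0, 1]   # delays are spread evenly across airlines
delay_related = [1, 1, 0, 0]     # delays depend entirely on the airline

iv_unrelated = information_value(airline, delay_unrelated)
iv_related = information_value(airline, delay_related)
```

In this sketch a feature whose categories have the same share of positives and negatives gets an IV of 0, and the IV grows as the two distributions diverge, which is why a higher score indicates a more predictive feature.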

Syntax

CREATE FEATURE feature_name WITH (feature_class = '', x_cols = '', y_cols = '', parameters = ()) AS (SELECT select_expr [, select_expr] ... FROM table_reference)
Parameter description:
| Parameter | Description |
| --- | --- |
| feature_name | The name of the feature. |
| feature_class | The type of the feature. Set the value to iv. |
| x_cols | The list of independent variables. Separate multiple variables with commas (,). |
| y_cols | The dependent variable. |
| parameters | Custom parameters for creating the feature. The IV method supports only categorical features. The value can only be set to categorical_features. Separate multiple features with commas (,). |
| select_expr | The name of the column used to create the feature. |
| table_reference | The name of the table containing the column used to create the feature. |

Example

/*polar4ai*/CREATE FEATURE iv_001 WITH (
    feature_class = 'iv',
    x_cols='Airline,Flight,AirportFrom,AirportTo,DayOfWeek,Time,Length',
    y_cols='Delay',
    parameters=(categorical_feature='Airline,Flight,AirportFrom,AirportTo,DayOfWeek')
) AS (SELECT * FROM airlines_test_1000);