The scorecard is a common modeling tool in the credit risk assessment field. The scorecard performs binning to discretize variables and uses linear models, such as linear regression and logistic regression, to train a model. The model training process includes feature selection and score transformation. The scorecard also allows you to add constraints to the variables during model training.
Background information
 Feature engineering
The main difference between the scorecard and normal linear models is that the scorecard performs feature engineering before it trains linear models. The Scorecard Training component supports the following methods for feature engineering:
 The Binning component is used to implement feature discretization. Then, one-hot encoding is performed for each variable based on the binning results to generate N dummy variables, where N is the number of bins.
Note When you convert original variables into dummy variables, you can specify constraints for these dummy variables.
 The Binning component is used to implement feature discretization. Then, weight of evidence (WOE) conversion is performed to replace the original value of a variable with the WOE value of the bin into which the variable falls.
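The two transformations above can be sketched in plain Python. The bin assignments, labels, and bin count below are hypothetical, not part of the component:

```python
import math

# Hypothetical data: bin index per sample (from the Binning component) and binary labels.
bins = [0, 1, 1, 2, 0, 2, 1, 0]
labels = [1, 0, 1, 0, 0, 1, 0, 0]
n_bins = 3

# One-hot (dummy) encoding: each sample becomes an N-dimensional indicator vector,
# where N is the number of bins.
dummies = [[1 if b == i else 0 for i in range(n_bins)] for b in bins]

# WOE per bin: log((pos_i / pos_total) / (neg_i / neg_total)).
pos_total = sum(labels)
neg_total = len(labels) - pos_total
woe = {}
for i in range(n_bins):
    pos_i = sum(1 for b, y in zip(bins, labels) if b == i and y == 1)
    neg_i = sum(1 for b, y in zip(bins, labels) if b == i and y == 0)
    woe[i] = math.log((pos_i / pos_total) / (neg_i / neg_total))

# WOE conversion replaces each sample's raw value with the WOE of its bin.
woe_encoded = [woe[b] for b in bins]
```

A production binning table would also need to handle bins with zero positives or negatives, which this sketch does not.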
 Score transformation
In scenarios such as credit scoring, you must perform a linear transformation to convert the predicted sample odds into a score. The following formula is used for linear transformation: log(odds) = a × score + b. You can use the following parameters to specify the linear transformation relationship:
 scaledValue: specifies a scaled score.
 odds: specifies the odds at the scaled score.
 pdo: specifies the number of points required to double the odds.
For example, if scaledValue is 800, odds is 50, and pdo is 25, the values of a and b are obtained by solving the following equations, and the linear transformation then converts odds into scores: log(50) = a × 800 + b, log(100) = a × 825 + b.
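A minimal sketch of this calculation, using the example values scaledValue = 800, odds = 50, and pdo = 25:

```python
import math

scaled_value, odds, pdo = 800, 50, 25

# Solve log(odds) = a * score + b from the two equations
# log(50) = a * 800 + b and log(100) = a * 825 + b:
a = math.log(2) / pdo                 # doubling the odds adds pdo points
b = math.log(odds) - a * scaled_value

def score(o):
    """Convert predicted odds into a score via the linear transformation."""
    return (math.log(o) - b) / a

print(round(score(50)))   # 800
print(round(score(100)))  # 825
```

Each further doubling of the odds adds another 25 points, so odds of 200 map to a score of 850.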
The scaling information is specified in the JSON format by using the Dscale parameter. Example: {"scaledValue": 800, "odds": 50, "pdo": 25}. If the Dscale parameter is left empty or set to null, no score transformation is performed. If you specify the Dscale parameter, you must also specify scaledValue, odds, and pdo.
 Constraint addition during training
During scorecard training, you can add constraints to variables. For example, you can fix the score of a specific bin to a constant value, require the scores of two bins to meet a specific proportion, or restrict the order of bin scores, such as sorting bin scores by WOE value. The implementation of constraints depends on an underlying optimization algorithm that supports constraints. You can specify the constraints in the Binning component in the Machine Learning Platform for AI (PAI) console. After the constraints are specified, the Binning component generates JSON-formatted constraints and automatically passes them to the connected training component. The system supports the following JSON-formatted constraints:
 "<": The weights of variables must be sorted in ascending order.
 ">": The weights of variables must be sorted in descending order.
 "=": The weight of a specific variable must be a fixed value.
 "%": The weights of two variables must meet a proportional relationship.
 "UP": the upper limit for the weights of variables.
 "LO": the lower limit for the weights of variables.
{ "name": "feature0", "<": [ [0,1,2,3] ], ">": [ [4,5,6] ], "=": [ "3:0","4:0.25" ], "%": [ ["6:1.0","7:1.0"] ] }
 Built-in constraints
Each original variable has a built-in constraint: for each variable, the average score of the population must be 0. Due to this constraint, the value of scaled_weight in the intercept entry of the scorecard model equals the average score of the population across all variables.
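The built-in constraint can be illustrated as follows: bin scores are shifted so that the population-weighted average score of each variable is 0, and the shift is absorbed into the intercept. The bin scores and sample counts below are hypothetical:

```python
# Hypothetical bin scores and sample counts for one variable.
scores = [30.0, 10.0, -20.0]
counts = [100, 300, 200]

n = sum(counts)
avg = sum(s * c for s, c in zip(scores, counts)) / n   # population average score

centered = [s - avg for s in scores]   # per-bin scores after the built-in constraint
intercept_shift = avg                  # absorbed into the intercept's scaled_weight

# The population-weighted average of the centered scores is now 0.
assert abs(sum(s * c for s, c in zip(centered, counts)) / n) < 1e-9
```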
 Optimization algorithms
On the Parameters Setting tab, select Advanced Options. Then, you can configure the optimization algorithm that is used during scorecard training. The system supports the following optimization algorithms:
 LBFGS: This algorithm is a first-order optimization algorithm that is suitable for large amounts of feature data. The algorithm does not support constraints. If you select this algorithm, the system automatically ignores the constraints that you specify.
 Newton's method: This algorithm is a classic second-order optimization algorithm. It converges fast and is accurate. However, the algorithm is not suitable for large amounts of feature data because it must compute the second-order Hessian matrix. The algorithm does not support constraints. If you select this algorithm, the system automatically ignores the constraints that you specify.
 Barrier method: This algorithm is a second-order optimization algorithm. If no constraints are specified, it is completely equivalent to Newton's method. The barrier method provides almost the same computing performance and accuracy as the SQP algorithm. In most cases, we recommend that you select SQP.
 SQP: This algorithm is a second-order optimization algorithm. If no constraints are specified, it is completely equivalent to Newton's method. The SQP algorithm provides almost the same computing performance and accuracy as the barrier method. In most cases, we recommend that you select SQP.
Note LBFGS and Newton's method are unconstrained optimization algorithms. The barrier method and SQP are constrained optimization algorithms.
 If you are not familiar with optimization algorithms, we recommend that you set the Optimization Algorithm parameter to Auto-selected, which is the default. In this case, the system selects the most appropriate algorithm based on the amount of data and the constraints.
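To give intuition for how a barrier method turns a constrained problem into an unconstrained one, here is a toy one-dimensional sketch, unrelated to the component's actual solver: minimize (x − 3)² subject to x ≤ 2. The problem and all values are hypothetical:

```python
import math

# Minimize (x - 3)^2 subject to x <= 2 via a log-barrier:
#   x(mu) = argmin (x - 3)^2 - mu * log(2 - x)
# Setting the derivative to zero gives 2(x - 3) + mu / (2 - x) = 0,
# whose feasible root is:
def barrier_solution(mu):
    return 2 - (-1 + math.sqrt(1 + 2 * mu)) / 2

# As the barrier weight mu shrinks, the solution approaches the
# constrained optimum x = 2; the unconstrained optimum x = 3 is infeasible.
path = [barrier_solution(mu) for mu in (1.0, 0.1, 0.001)]
```

Each solution on the path stays strictly inside the feasible region, which is the defining property of interior-point (barrier) methods.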
 Feature selection
The Scorecard Training component supports stepwise feature selection. Stepwise feature selection is a combination of forward and backward selection: each time a forward selection step adds a new variable to the model, a backward selection step is also performed to remove the variables whose significance no longer meets requirements. Stepwise feature selection supports multiple objective functions and feature transformation methods, and therefore also supports multiple selection standards. The following standards are supported:
 Marginal contribution: This standard can be applied to all objective functions and feature engineering methods.
For this standard, two models must be trained: Model A and Model B. Model A does not contain Variable X, and Model B contains Variable X in addition to all the variables of Model A. The difference between the objective function values of the two models at convergence is the marginal contribution of Variable X relative to all the other variables in Model B. In scenarios where variables are converted into dummy variables, the marginal contribution of Variable X is computed by adding or removing all of its dummy variables at once. Therefore, the marginal contribution standard is supported by all feature engineering methods.
Marginal contribution is flexible and is not limited to a specific type of model, and only variables that contribute to the objective function are passed to the model. Compared with statistical significance, marginal contribution has one disadvantage: statistical significance typically uses 0.05 as the threshold, whereas marginal contribution does not provide a recommended threshold for beginners. We recommend that you set the threshold to 10E-5.
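The two-model comparison can be sketched as follows. For a single binary variable, the maximum log-likelihoods of both logistic models have closed forms (one fitted rate per group), so the marginal contribution can be computed directly. The data below are hypothetical:

```python
import math

# Hypothetical binary feature x and binary label y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def log_likelihood(groups):
    """Max log-likelihood of a logistic model that fits one rate per group."""
    ll = 0.0
    for n, k in groups:                 # n samples, k positives
        p = k / n
        if 0 < p < 1:
            ll += k * math.log(p) + (n - k) * math.log(1 - p)
    return ll

# Model A: intercept only (one group). Model B: intercept plus x (one group per x value).
ll_a = log_likelihood([(len(y), sum(y))])
ll_b = log_likelihood([
    (x.count(0), sum(yi for xi, yi in zip(x, y) if xi == 0)),
    (x.count(1), sum(yi for xi, yi in zip(x, y) if xi == 1)),
])

# Difference of the converged objective values; compare against a threshold such as 10E-5.
marginal_contribution = ll_b - ll_a
```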
 Score test: This standard supports only WOE conversion and logistic regression without feature engineering.
During a forward selection, a model that has only intercept options is trained first. In each subsequent iteration, the score chi-squares of the variables that are not yet in the model are measured, and the variable with the largest score chi-square is added to the model. In addition, the p-value of that variable is calculated based on the chi-square distribution. If the p-value is greater than the given SLENTRY value, the variable is not added to the model, and feature selection is terminated.
After the forward selection step is complete, a backward selection step is performed on the variables in the model. The Wald chi-square of each variable and the related p-value are calculated. If the p-value is greater than the given SLSTAY value, the variable is removed from the model. Then, the system starts a new iteration.
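The p-value comparison can be illustrated directly. For one degree of freedom, the chi-square survival function has a closed form via the complementary error function, so no statistics library is needed. The chi-square statistic below is hypothetical:

```python
import math

def chi2_pvalue_df1(stat):
    """P(X > stat) for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

SLENTRY = 0.05

# Hypothetical score chi-square of the best candidate variable.
score_chi2 = 5.3
p = chi2_pvalue_df1(score_chi2)

# The variable enters the model only if its p-value does not exceed SLENTRY.
enters = p <= SLENTRY
```

The same conversion applies to the Wald chi-square in the backward step, with SLSTAY as the threshold.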
 F test: This standard supports only WOE conversion and linear regression without feature engineering.
During a forward selection, a model that has only intercept options is trained first. In each subsequent iteration, the F-values of the variables that are not yet in the model are calculated. F-value calculation is similar to marginal contribution calculation: two models must be trained to calculate the F-value of a variable. The F-value follows an F-distribution, and the related p-value can be calculated based on the probability density function of the F-distribution. If the p-value is greater than the given SLENTRY value, the variable is not added to the model, and the forward selection is terminated.
During the backward selection, the F-value is used to calculate the significance of a variable in a way similar to the score test.
 Forced selection of the variables that you want to pass to a model
Before feature selection is performed, you can specify the variables that you want to forcibly pass to the model. No forward or backward selection is performed for these variables; they are added to the model directly, regardless of their significance. You can specify the maximum number of iterations and the significance thresholds by using the Dselected parameter in the JSON format. Example:
{"max_step": 2, "slentry": 0.0001, "slstay": 0.0001}
If the Dselected parameter is left empty or the max_step parameter is set to 0, no feature selection is performed.
Configure the component
pai -name=linear_model -project=algo_public
    -DinputTableName=input_data_table
    -DinputBinTableName=input_bin_table
    -DinputConstraintTableName=input_constraint_table
    -DoutputTableName=output_model_table
    -DlabelColName=label
    -DfeatureColNames=feaname1,feaname2
    -Doptimization=barrier_method
    -Dloss=logistic_regression
    -Dlifecycle=8
Parameter  Description  Required  Default value 

inputTableName  The name of the input feature table.  Yes  N/A 
inputTablePartitions  The partitions that are selected from the input feature table.  No  Full table 
inputBinTableName  The name of the binning result table. If you specify this parameter, the system automatically performs discretization for features based on the binning rules in the binning result table.  No  N/A 
featureColNames  The feature columns that are selected from the input table.  No  All columns except the label column 
labelColName  The name of the label column.  Yes  N/A 
outputTableName  The name of the output model table.  Yes  N/A 
inputConstraintTableName  The name of the table that stores constraints. The constraints are a JSON string that is stored in a cell of the table.  No  N/A 
optimization  The optimization algorithm. The valid values correspond to the algorithms described in the Optimization algorithms section; barrier_method is used in the preceding example.  No  auto 
loss  The loss type. Valid values: logistic_regression and least_square.  No  logistic_regression 
iterations  The maximum number of iterations for optimization.  No  100 
l1Weight  The parameter weight of L1 regularization. Only LBFGS supports this parameter.  No  0 
l2Weight  The parameter weight of L2 regularization.  No  0 
m  The historical step size for optimization that is performed by using the LBFGS algorithm. Only the LBFGS algorithm supports this parameter.  No  10 
scale  The weight scaling information of the scorecard.  No  Empty string 
selected  Specifies whether to enable feature selection during scorecard training.  No  Empty string 
convergenceTolerance  The convergence tolerance.  No  1e-6 
positiveLabel  The label value that indicates positive samples.  No  1 
lifecycle  The lifecycle of the output table.  No  N/A 
coreNum  The number of cores.  No  Determined by the system 
memSizePerCore  The memory size of each core. Unit: MB.  No  Determined by the system 
Output
Column  Data type  Description 

feaname  STRING  The feature name. 
binid  BIGINT  The bin ID. 
bin  STRING  The description of the bin, which indicates the interval of the bin. 
constraint  STRING  The constraints that are added to the bin during training. 
weight  DOUBLE  The weight of a binning variable. For a nonscorecard model without binning, this field indicates the weight of a model variable. 
scaled_weight  DOUBLE  The score that is linearly transformed from the weight of a binning variable in scorecard training. 
woe  DOUBLE  A statistical metric. It indicates the WOE value of a bin in the training set. 
contribution  DOUBLE  A statistical metric. It indicates the marginal contribution value of a bin in the training set. 
total  BIGINT  A statistical metric. It indicates the total number of samples in a bin in the training set. 
positive  BIGINT  A statistical metric. It indicates the number of positive samples in a bin in the training set. 
negative  BIGINT  A statistical metric. It indicates the number of negative samples in a bin in the training set. 
percentage_pos  DOUBLE  A statistical metric. It indicates the proportion of positive samples in a bin to total positive samples in the training set. 
percentage_neg  DOUBLE  A statistical metric. It indicates the proportion of negative samples in a bin to total negative samples in the training set. 
test_woe  DOUBLE  A statistical metric. It indicates the WOE value of a bin in the testing set. 
test_contribution  DOUBLE  A statistical metric. It indicates the marginal contribution value of a bin in the testing set. 
test_total  BIGINT  A statistical metric. It indicates the total number of samples in a bin in the testing set. 
test_positive  BIGINT  A statistical metric. It indicates the number of positive samples in a bin in the testing set. 
test_negative  BIGINT  A statistical metric. It indicates the number of negative samples in a bin in the testing set. 
test_percentage_pos  DOUBLE  A statistical metric. It indicates the proportion of positive samples in a bin to total positive samples in the testing set. 
test_percentage_neg  DOUBLE  A statistical metric. It indicates the proportion of negative samples in a bin to total negative samples in the testing set. 