Platform for AI: Linear SVM

Last Updated: Apr 02, 2026

Linear support vector machine (SVM) is a binary classifier built on statistical learning theory. It minimizes structural risk by finding the decision boundary that maximizes the margin between two classes, and it does so without kernel functions.

Use Linear SVM when:

  • The classification task is binary (two classes only).

Linear SVM does not support multiclass classification. For multiclass tasks, use a different component.

Algorithm reference

The Linear SVM component implements the Trust Region Newton method for L2-SVM. For details, see the "Trust region method for L2-SVM" section in Trust Region Newton Method for Large-Scale Logistic Regression.
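
To make the L2-SVM term concrete, the following minimal sketch evaluates a squared-hinge objective with separate costs for the positive and negative class. It is an illustration only: the function name `l2_svm_objective`, the absence of a bias term, and the exact way the two costs enter the sum are assumptions, not the component's confirmed internals. The trust region Newton solver itself is not shown.

```python
def l2_svm_objective(w, samples, labels, positive_cost=1.0, negative_cost=1.0):
    """Squared-hinge (L2-SVM) objective with per-class costs.

    Assumptions: labels are +1 or -1, and no bias term is modeled.
    """
    # Regularization term: 0.5 * ||w||^2
    regularizer = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for x, y in zip(samples, labels):
        # Signed margin y * (w . x); values >= 1 incur no loss.
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        cost = positive_cost if y > 0 else negative_cost
        loss += cost * max(0.0, 1.0 - margin) ** 2
    return regularizer + loss
```

Raising one of the two costs scales the loss of that class only, which is why the penalty factors shift the decision boundary toward protecting the more heavily weighted class.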

Configure the component

Two methods are available: the Machine Learning Designer visual interface and the PAI command line. Use the Designer interface for interactive experimentation; use the PAI command for scripted or automated workflows.

Method 1: Configure in Machine Learning Designer

Input port

The Linear SVM component has a single input port. Connect it to the Read Table component.

Parameters

Tab | Parameter | Required | Description
Fields Setting | Feature Columns | Yes | Columns used as features. Accepted data types: BIGINT or DOUBLE.
Fields Setting | Label Column | Yes | Column containing class labels. Accepted data types: BIGINT, DOUBLE, or STRING.
Parameters Setting | Positive Sample Label | No | The label value treated as the positive class. If omitted, the component selects a value at random. Specify this parameter when the class distribution is imbalanced.
Parameters Setting | Positive Penalty Factor | No | Cost assigned to misclassifying a positive sample. Increasing this value makes the model penalize positive-class errors more heavily, which is useful when false negatives are more costly. Valid values: (0, +∞). Default: 1.0.
Parameters Setting | Negative Penalty Factor | No | Cost assigned to misclassifying a negative sample. Increasing this value makes the model penalize negative-class errors more heavily, which is useful when false positives are more costly. Valid values: (0, +∞). Default: 1.0.
Parameters Setting | Convergence Coefficient | No | Convergence tolerance (epsilon). Training stops when the change between iterations falls below this value. Lower values increase training precision but require more iterations. Valid values: (0, 1). Default: 0.001.
Tuning | Cores | No | Number of CPU cores for training. Auto-allocated if omitted.
Tuning | Memory Size per Core | No | Memory allocated per core, in MB. Auto-allocated if omitted.

Output port

The component outputs a binary model in the same format as a batch model. Connect the output port to a downstream prediction component.

Method 2: Run a PAI command

Use the SQL Script component to submit PAI commands. For setup instructions, see SQL Script.

PAI -name LinearSVM -project algo_public
    -DinputTableName="bank_data"
    -DmodelName="xlab_m_LinearSVM_6143"
    -DfeatureColNames="pdays,emp_var_rate,cons_conf_idx"
    -DlabelColName="y"
    -DpositiveLabel="0"
    -DpositiveCost="1.0"
    -DnegativeCost="1.0"
    -Depsilon="0.001";
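
For scripted workflows, the command text above can be generated from a parameter dictionary rather than edited by hand. The following is a minimal sketch: `build_pai_command` is a hypothetical helper, not part of any PAI SDK, and it only renders the string. Submitting the command is left to whatever client you already use.

```python
def build_pai_command(name, project, params):
    """Render a PAI command string from a dict of -D parameters.

    Hypothetical helper: it formats text only and does not submit anything.
    """
    lines = [f"PAI -name {name} -project {project}"]
    for key, value in params.items():
        # Each parameter becomes one indented -Dkey="value" line.
        lines.append(f'    -D{key}="{value}"')
    return "\n".join(lines) + ";"


command = build_pai_command(
    "LinearSVM",
    "algo_public",
    {
        "inputTableName": "bank_data",
        "modelName": "xlab_m_LinearSVM_6143",
        "featureColNames": "pdays,emp_var_rate,cons_conf_idx",
        "labelColName": "y",
        "positiveLabel": "0",
        "positiveCost": "1.0",
        "negativeCost": "1.0",
        "epsilon": "0.001",
    },
)
print(command)
```

Keeping the parameters in a dictionary makes it straightforward to vary one value, such as positiveCost, across several runs.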

Parameters

Parameter | Required | Description | Default
inputTableName | Yes | Name of the input table. | (none)
inputTablePartitions | No | Partitions to use for training. Formats: Partition_name=value (single) or name1=value1/name2=value2 (multi-level). Separate multiple partitions with commas. | All partitions
modelName | Yes | Name for the output model. | (none)
featureColNames | Yes | Feature columns from the input table. | (none)
labelColName | Yes | Label column name. | (none)
positiveLabel | No | Label value for the positive class. | Random value from label column
positiveCost | No | Positive penalty factor. Increasing this value makes the model penalize positive-class errors more heavily. Valid values: (0, +∞). | 1.0
negativeCost | No | Negative penalty factor. Increasing this value makes the model penalize negative-class errors more heavily. Valid values: (0, +∞). | 1.0
epsilon | No | Convergence tolerance. Valid values: (0, 1). | 0.001
enableSparse | No | Set to true if the input data is in sparse format. | false
itemDelimiter | No | Delimiter separating key-value pairs in sparse input. | , (comma)
kvDelimiter | No | Delimiter separating keys and values in sparse input. | : (colon)
coreNum | No | Number of CPU cores. Must be a positive integer. | Auto-allocated
memSizePerCore | No | Memory per core, in MB. Valid values: 1–65536. | Auto-allocated
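
The ranges in the table can be checked before a command is submitted. The sketch below is one possible client-side check, assuming parameters are held in a dictionary keyed by the names above. `validate_linear_svm_params` is a hypothetical helper, and the component's own validation may differ.

```python
REQUIRED = ("inputTableName", "modelName", "featureColNames", "labelColName")


def _is_positive_int(value):
    """True for 1, 2, ... given as int or as a digit string."""
    return str(value).isdigit() and int(value) > 0


def validate_linear_svm_params(params):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"{name} is required")
    for name in ("positiveCost", "negativeCost"):
        if name in params and not float(params[name]) > 0:
            problems.append(f"{name} must be in (0, +inf)")
    if "epsilon" in params and not 0 < float(params["epsilon"]) < 1:
        problems.append("epsilon must be in (0, 1)")
    if "coreNum" in params and not _is_positive_int(params["coreNum"]):
        problems.append("coreNum must be a positive integer")
    if "memSizePerCore" in params:
        mem = params["memSizePerCore"]
        if not (_is_positive_int(mem) and int(mem) <= 65536):
            problems.append("memSizePerCore must be in 1-65536")
    return problems
```

Returning a list instead of raising on the first error lets a script report every misconfigured value in one pass.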

Usage notes

Imbalanced classes: If the positive and negative samples are heavily skewed, set Positive Sample Label explicitly and raise positiveCost or negativeCost to weight the minority class higher. For example, if false negatives are more harmful than false positives, increase positiveCost above 1.0.
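
One common starting point for choosing the costs is to weight the minority class by the ratio of class counts. The sketch below applies that heuristic. It is an assumption for illustration, not a rule prescribed by the component, and `suggest_costs` is a hypothetical helper; treat the result as an initial value to tune against validation metrics.

```python
from collections import Counter


def suggest_costs(labels, positive_label):
    """Suggest positiveCost/negativeCost from class counts.

    Heuristic (assumption): the minority class gets cost
    majority_count / minority_count, the majority class keeps 1.0.
    """
    counts = Counter(labels)
    n_pos = counts[positive_label]
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    if n_pos < n_neg:
        return {"positiveCost": n_neg / n_pos, "negativeCost": 1.0}
    return {"positiveCost": 1.0, "negativeCost": n_pos / n_neg}
```

For a label column with 4 positive and 6 negative rows, this yields a positiveCost of 1.5 and leaves negativeCost at 1.0.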

Sparse data: For high-dimensional sparse features (for example, text or one-hot encoded data), set enableSparse=true and configure itemDelimiter and kvDelimiter to match your data format.
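
As an illustration of the key-value layout these delimiters describe, the sketch below encodes a dense row as delimited pairs. Using zero-based column indices as keys is an assumption, as is the helper name `to_sparse_kv`; check the key convention your input table actually uses before relying on it.

```python
def to_sparse_kv(row, item_delimiter=",", kv_delimiter=":"):
    """Encode the nonzero entries of a dense row as key-value pairs.

    Assumption: keys are zero-based column indices.
    """
    return item_delimiter.join(
        f"{index}{kv_delimiter}{value}"
        for index, value in enumerate(row)
        if value != 0
    )


print(to_sparse_kv([0.0, 1.5, 0.0, 2.0]))  # prints 1:1.5,3:2.0
```

The delimiters passed here should match the itemDelimiter and kvDelimiter values given to the component.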

Resource tuning: Leave Cores and Memory Size per Core unset for most workloads, because the platform auto-allocates them based on data size. Set these manually only when you need predictable resource usage.

Example

This example trains a Linear SVM binary classifier on a small dataset.

  1. Use the following training data as the input.

    id y f0 f1 f2 f3 f4 f5 f6 f7
    1 -1 -0.294118 0.487437 0.180328 -0.292929 -1 0.00149028 -0.53117 -0.0333333
    2 +1 -0.882353 -0.145729 0.0819672 -0.414141 -1 -0.207153 -0.766866 -0.666667
    3 -1 -0.0588235 0.839196 0.0491803 -1 -1 -0.305514 -0.492741 -0.633333
    4 +1 -0.882353 -0.105528 0.0819672 -0.535354 -0.777778 -0.162444 -0.923997 -1
    5 -1 -1 0.376884 -0.344262 -0.292929 -0.602837 0.28465 0.887276 -0.6
    6 +1 -0.411765 0.165829 0.213115 -1 -1 -0.23696 -0.894962 -0.7
    7 -1 -0.647059 -0.21608 -0.180328 -0.353535 -0.791962 -0.0760059 -0.854825 -0.833333
    8 +1 0.176471 0.155779 -1 -1 -1 0.052161 -0.952178 -0.733333
    9 -1 -0.764706 0.979899 0.147541 -0.0909091 0.283688 -0.0909091 -0.931682 0.0666667
    10 -1 -0.0588235 0.256281 0.57377 -1 -1 -1 -0.868488 0.1
  2. Use the following test data as the input.

    id y f0 f1 f2 f3 f4 f5 f6 f7
    1 +1 -0.882353 0.0854271 0.442623 -0.616162 -1 -0.19225 -0.725021 -0.9
    2 +1 -0.294118 -0.0351759 -1 -1 -1 -0.293592 -0.904355 -0.766667
    3 +1 -0.882353 0.246231 0.213115 -0.272727 -1 -0.171386 -0.981213 -0.7
    4 -1 -0.176471 0.507538 0.278689 -0.414141 -0.702128 0.0491804 -0.475662 0.1
    5 -1 -0.529412 0.839196 -1 -1 -1 -0.153502 -0.885568 -0.5
    6 +1 -0.882353 0.246231 -0.0163934 -0.353535 -1 0.0670641 -0.627669 -1
    7 -1 -0.882353 0.819095 0.278689 -0.151515 -0.307329 0.19225 0.00768574 -0.966667
    8 +1 -0.882353 -0.0753769 0.0163934 -0.494949 -0.903073 -0.418778 -0.654996 -0.866667
    9 +1 -1 0.527638 0.344262 -0.212121 -0.356974 0.23696 -0.836038 -0.8
    10 +1 -0.882353 0.115578 0.0163934 -0.737374 -0.56974 -0.28465 -0.948762 -0.933333
  3. Create the pipeline shown in the following figure. For more information, see Algorithm modeling.

    [Figure: Linear SVM pipeline]

  4. Configure the parameters listed in the following table for the Linear SVM component. Keep the default values for all other parameters.

    Tab | Parameter | Value
    Fields Setting | Feature Columns | Select f0, f1, f2, f3, f4, f5, f6, and f7.
    Fields Setting | Label Column | Select y.
  5. Run the pipeline and view the prediction results.