A linear support vector machine (SVM) is a binary classifier built on statistical learning theory. It finds the decision boundary that maximizes the margin between the two classes by minimizing structural risk, and it does not use kernel functions.
Use Linear SVM when:
- The classification task is binary (two classes only).
Linear SVM does not support multiclass classification. For multiclass tasks, use a different component.
Algorithm reference
The Linear SVM component implements the Trust Region Newton method for L2-SVM. For details, see the "Trust region method for L2-SVM" section in Trust Region Newton Method for Large-Scale Logistic Regression.
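For orientation, the L2-SVM objective in the referenced paper is an L2 penalty on the weights plus a squared hinge loss. The following is a minimal sketch in plain Python that evaluates that objective, assuming the positive and negative penalty factors act as per-class weights on the loss. How the component combines them internally is not stated here, and the function and argument names are illustrative only.

```python
def l2_svm_objective(w, X, y, positive_cost=1.0, negative_cost=1.0):
    """Evaluate 0.5 * ||w||^2 + sum_i C_i * max(0, 1 - y_i * (w . x_i))^2.

    Labels in y must be +1 or -1. C_i is positive_cost for +1 samples and
    negative_cost for -1 samples (an assumed weighting, for illustration).
    """
    regularizer = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        slack = max(0.0, 1.0 - margin)  # zero when the sample is outside the margin
        cost = positive_cost if yi > 0 else negative_cost
        loss += cost * slack * slack    # squared hinge: the "L2" in L2-SVM
    return regularizer + loss
```

Raising one of the two costs increases the contribution of margin violations from that class only, which is the effect the penalty factor parameters describe.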
Configure the component
Two methods are available: the Machine Learning Designer visual interface and the PAI command line. Use the Designer interface for interactive experimentation; use the PAI command for scripted or automated workflows.
Method 1: Configure in Machine Learning Designer
Input port
The Linear SVM component has a single input port. Connect it to the Read Table component.
Parameters
| Tab | Parameter | Required | Description |
|---|---|---|---|
| Fields Setting | Feature Columns | Yes | Columns used as features. Accepted data types: BIGINT or DOUBLE. |
| Fields Setting | Label Column | Yes | Column containing class labels. Accepted data types: BIGINT, DOUBLE, or STRING. |
| Parameters Setting | Positive Sample Label | No | The label value treated as the positive class. If omitted, the component selects a value at random. Specify this parameter when the class distribution is imbalanced. |
| Parameters Setting | Positive Penalty Factor | No | Cost assigned to misclassifying a positive sample. Increasing this value makes the model penalize positive-class errors more heavily, which is useful when false negatives are more costly. Valid values: (0, +∞). Default: 1.0. |
| Parameters Setting | Negative Penalty Factor | No | Cost assigned to misclassifying a negative sample. Increasing this value makes the model penalize negative-class errors more heavily, which is useful when false positives are more costly. Valid values: (0, +∞). Default: 1.0. |
| Parameters Setting | Convergence Coefficient | No | Convergence tolerance (epsilon). Training stops when the change between iterations falls below this value. Lower values increase training precision but require more iterations. Valid values: (0, 1). Default: 0.001. |
| Tuning | Cores | No | Number of CPU cores for training. Auto-allocated if omitted. |
| Tuning | Memory Size per Core | No | Memory allocated per core, in MB. Auto-allocated if omitted. |
Output port
The component outputs a binary model, in the same format as batch models, for use by downstream prediction components.
Method 2: Run a PAI command
Use the SQL Script component to submit PAI commands. For setup instructions, see SQL Script.
```sql
PAI -name LinearSVM -project algo_public
    -DinputTableName="bank_data"
    -DmodelName="xlab_m_LinearSVM_6143"
    -DfeatureColNames="pdays,emp_var_rate,cons_conf_idx"
    -DlabelColName="y"
    -DpositiveLabel="0"
    -DpositiveCost="1.0"
    -DnegativeCost="1.0"
    -Depsilon="0.001";
```
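For scripted workflows, the command text can be generated rather than typed by hand. The following is a minimal sketch that renders the command above from a dictionary. The helper name `build_linear_svm_command` is hypothetical and not part of PAI; the sketch only produces the string, which you would still submit through the SQL Script component.

```python
def build_linear_svm_command(params, project="algo_public"):
    """Render a PAI LinearSVM command from a dict of parameter names to values.

    Hypothetical helper for illustration: it only formats text and does not
    check values or contact any service.
    """
    lines = [f"PAI -name LinearSVM -project {project}"]
    for name, value in params.items():
        lines.append(f'    -D{name}="{value}"')
    return "\n".join(lines) + ";"


feature_columns = ["pdays", "emp_var_rate", "cons_conf_idx"]
command = build_linear_svm_command({
    "inputTableName": "bank_data",
    "modelName": "xlab_m_LinearSVM_6143",
    "featureColNames": ",".join(feature_columns),  # comma-separated, as in the command above
    "labelColName": "y",
    "positiveLabel": "0",
    "positiveCost": "1.0",
    "negativeCost": "1.0",
    "epsilon": "0.001",
})
print(command)
```

Parameters appear in dictionary insertion order, so the rendered text matches the order in which they were supplied.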
Parameters
| Parameter | Required | Description | Default |
|---|---|---|---|
| inputTableName | Yes | Name of the input table. | N/A |
| inputTablepartitions | No | Partitions to use for training. Formats: Partition_name=value (single) or name1=value1/name2=value2 (multi-level). Separate multiple partitions with commas. | All partitions |
| modelName | Yes | Name for the output model. | N/A |
| featureColNames | Yes | Feature columns from the input table. | N/A |
| labelColName | Yes | Label column name. | N/A |
| positiveLabel | No | Label value for the positive class. | Random value from label column |
| positiveCost | No | Positive penalty factor. Increasing this value makes the model penalize positive-class errors more heavily. Valid values: (0, +∞). | 1.0 |
| negativeCost | No | Negative penalty factor. Increasing this value makes the model penalize negative-class errors more heavily. Valid values: (0, +∞). | 1.0 |
| epsilon | No | Convergence tolerance. Valid values: (0, 1). | 0.001 |
| enableSparse | No | Set to true if the input data is in sparse format. | false |
| itemDelimiter | No | Delimiter separating key-value pairs in sparse input. | , (comma) |
| kvDelimiter | No | Delimiter separating keys and values in sparse input. | : (colon) |
| coreNum | No | Number of CPU cores. Must be a positive integer. | Auto-allocated |
| memSizePerCore | No | Memory per core, in MB. Valid values: 1–65536. | Auto-allocated |
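The required flags and valid ranges listed for these parameters can be checked on the client side before a command is submitted. The following is a minimal sketch of such a check. The function `validate_linear_svm_params` is hypothetical, and the platform performs its own validation, which may differ in detail.

```python
REQUIRED = ("inputTableName", "modelName", "featureColNames", "labelColName")


def validate_linear_svm_params(params):
    """Return a list of problems found in a dict of LinearSVM parameters.

    Hypothetical pre-flight check based only on the documented ranges.
    Numeric values may be given as numbers or numeric strings.
    """
    problems = []
    for name in REQUIRED:
        if not params.get(name):
            problems.append(f"{name} is required")
    for name in ("positiveCost", "negativeCost"):
        if name in params and not float(params[name]) > 0:
            problems.append(f"{name} must be greater than 0")
    if "epsilon" in params and not 0 < float(params["epsilon"]) < 1:
        problems.append("epsilon must be in (0, 1)")
    if "coreNum" in params:
        core_num = str(params["coreNum"])
        if not (core_num.isdigit() and int(core_num) > 0):
            problems.append("coreNum must be a positive integer")
    if "memSizePerCore" in params:
        mem = str(params["memSizePerCore"])
        if not (mem.isdigit() and 1 <= int(mem) <= 65536):
            problems.append("memSizePerCore must be between 1 and 65536")
    return problems
```

An empty list means that no documented constraint was violated. It does not guarantee that the job will run, because table and column names are not checked.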
Usage notes
Imbalanced classes: If the positive and negative samples are heavily skewed, set Positive Sample Label explicitly and raise positiveCost or negativeCost to weight the minority class higher. For example, if false negatives are more harmful than false positives, increase positiveCost above 1.0.
Sparse data: For high-dimensional sparse features (for example, text or one-hot encoded data), set enableSparse=true and configure itemDelimiter and kvDelimiter to match your data format.
Resource tuning: Leave Cores and Memory Size per Core unset for most workloads, because the platform auto-allocates resources based on data size. Set these manually only when you need predictable resource usage.
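To illustrate the sparse-data note, the following sketch converts a dense feature row to a key-value string using the default delimiters, and parses it back. It assumes that keys are zero-based integer column indexes and that zero values are omitted. This page does not specify the key format, so treat both as assumptions and match whatever your table actually stores. The function names are illustrative.

```python
def to_sparse_kv(values, item_delimiter=",", kv_delimiter=":"):
    """Encode a dense row as 'key:value' items, skipping zero entries.

    Keys are zero-based column indexes (an assumption for this sketch).
    """
    return item_delimiter.join(
        f"{index}{kv_delimiter}{value}"
        for index, value in enumerate(values)
        if value != 0
    )


def from_sparse_kv(text, item_delimiter=",", kv_delimiter=":"):
    """Decode a sparse string back into a {index: value} dict."""
    pairs = {}
    if not text:
        return pairs
    for item in text.split(item_delimiter):
        key, value = item.split(kv_delimiter)
        pairs[int(key)] = float(value)
    return pairs


row = [0, 1.5, 0, 2]
encoded = to_sparse_kv(row)
decoded = from_sparse_kv(encoded)
```

If your data uses other separators, pass them to both functions and set itemDelimiter and kvDelimiter to the same characters.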
Example
This example trains a Linear SVM binary classifier on a small dataset.
- Use the following training data as the input.

| id | y | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | -1 | -0.294118 | 0.487437 | 0.180328 | -0.292929 | -1 | 0.00149028 | -0.53117 | -0.0333333 |
| 2 | +1 | -0.882353 | -0.145729 | 0.0819672 | -0.414141 | -1 | -0.207153 | -0.766866 | -0.666667 |
| 3 | -1 | -0.0588235 | 0.839196 | 0.0491803 | -1 | -1 | -0.305514 | -0.492741 | -0.633333 |
| 4 | +1 | -0.882353 | -0.105528 | 0.0819672 | -0.535354 | -0.777778 | -0.162444 | -0.923997 | -1 |
| 5 | -1 | -1 | 0.376884 | -0.344262 | -0.292929 | -0.602837 | 0.28465 | 0.887276 | -0.6 |
| 6 | +1 | -0.411765 | 0.165829 | 0.213115 | -1 | -1 | -0.23696 | -0.894962 | -0.7 |
| 7 | -1 | -0.647059 | -0.21608 | -0.180328 | -0.353535 | -0.791962 | -0.0760059 | -0.854825 | -0.833333 |
| 8 | +1 | 0.176471 | 0.155779 | -1 | -1 | -1 | 0.052161 | -0.952178 | -0.733333 |
| 9 | -1 | -0.764706 | 0.979899 | 0.147541 | -0.0909091 | 0.283688 | -0.0909091 | -0.931682 | 0.0666667 |
| 10 | -1 | -0.0588235 | 0.256281 | 0.57377 | -1 | -1 | -1 | -0.868488 | 0.1 |

- Use the following test data as the input.

| id | y | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | +1 | -0.882353 | 0.0854271 | 0.442623 | -0.616162 | -1 | -0.19225 | -0.725021 | -0.9 |
| 2 | +1 | -0.294118 | -0.0351759 | -1 | -1 | -1 | -0.293592 | -0.904355 | -0.766667 |
| 3 | +1 | -0.882353 | 0.246231 | 0.213115 | -0.272727 | -1 | -0.171386 | -0.981213 | -0.7 |
| 4 | -1 | -0.176471 | 0.507538 | 0.278689 | -0.414141 | -0.702128 | 0.0491804 | -0.475662 | 0.1 |
| 5 | -1 | -0.529412 | 0.839196 | -1 | -1 | -1 | -0.153502 | -0.885568 | -0.5 |
| 6 | +1 | -0.882353 | 0.246231 | -0.0163934 | -0.353535 | -1 | 0.0670641 | -0.627669 | -1 |
| 7 | -1 | -0.882353 | 0.819095 | 0.278689 | -0.151515 | -0.307329 | 0.19225 | 0.00768574 | -0.966667 |
| 8 | +1 | -0.882353 | -0.0753769 | 0.0163934 | -0.494949 | -0.903073 | -0.418778 | -0.654996 | -0.866667 |
| 9 | +1 | -1 | 0.527638 | 0.344262 | -0.212121 | -0.356974 | 0.23696 | -0.836038 | -0.8 |
| 10 | +1 | -0.882353 | 0.115578 | 0.0163934 | -0.737374 | -0.56974 | -0.28465 | -0.948762 | -0.933333 |

- Create the pipeline shown in the following figure. For more information, see Algorithm modeling.
- Configure the parameters listed in the following table for the Linear SVM component. Keep the default values for all other parameters.

| Tab | Parameter | Value |
|---|---|---|
| Fields Setting | Feature Columns | Select f0, f1, f2, f3, f4, f5, f6, and f7. |
| Fields Setting | Label Column | Select y. |

- Run the pipeline and view the prediction results.
