Platform For AI: IForest Anomaly Detection

Last Updated: Mar 11, 2026

The IForest (isolation forest) Anomaly Detection component identifies abnormal data points by building its trees on subsamples of the input data rather than on the full dataset. Subsampling reduces computational complexity while maintaining high detection effectiveness.
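To make the subsampling idea concrete, the following is a minimal sketch of the textbook isolation forest scoring scheme on a single numeric feature. It is not the component's implementation, which may differ in detail. The function names (`average_path_length`, `build_tree`, `path_length`, `anomaly_scores`) and the value 10.0 in the demo data are assumptions made for illustration; the defaults of 100 trees and 256 sampled rows mirror the defaults documented for this component.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # used to approximate harmonic numbers


def average_path_length(n):
    """Expected path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively split the values at a random point until they are isolated."""
    lo, hi = (min(values), max(values)) if values else (0.0, 0.0)
    if depth >= max_depth or len(values) <= 1 or lo == hi:
        return {"size": len(values)}  # leaf node
    split = rng.uniform(lo, hi)
    left = [v for v in values if v < split]
    right = [v for v in values if v >= split]
    return {
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, x):
    """Depth at which x lands in a leaf, adjusted for the points left in that leaf."""
    depth = 0
    while "split" in node:
        node = node["left"] if x < node["split"] else node["right"]
        depth += 1
    return depth + average_path_length(node["size"])


def anomaly_scores(values, num_trees=100, subsampling_size=256, seed=0):
    """Score each value; scores closer to 1 mean the value is easier to isolate."""
    rng = random.Random(seed)
    sample_size = min(subsampling_size, len(values))
    if sample_size < 2:
        raise ValueError("need at least two rows to build a tree")
    max_depth = math.ceil(math.log2(sample_size))
    # Each tree sees only a random subsample, which keeps tree building cheap.
    trees = [
        build_tree(rng.sample(values, sample_size), 0, max_depth, rng)
        for _ in range(num_trees)
    ]
    norm = average_path_length(sample_size)
    return [
        2.0 ** (-sum(path_length(t, x) for t in trees) / (num_trees * norm))
        for x in values
    ]


# Illustrative data: the last value (10.0) is a made-up outlier.
demo_values = [0.24, 0.41, 0.55, 0.63, 0.73, 0.73, 10.0]
for value, score in zip(demo_values, anomaly_scores(demo_values)):
    print(value, round(score, 3))
```

Points that are separated from the rest after only a few random splits have short average path lengths and therefore higher scores, which is why a small subsample per tree is usually enough to expose them.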

Configure the component

Configure the IForest Anomaly Detection component using one of these methods:

Use Designer UI

Configure the component parameters on the workflow page in Designer.

| Tab | Parameter | Description |
| --- | --- | --- |
| Fields Setting | Feature Columns | Feature columns for training. Cannot be configured if Vector Column or Tensor Column is set. |
| Fields Setting | Group Columns | Columns for grouping data. |
| Fields Setting | Tensor Column | Name of the tensor column. Cannot be configured if Vector Column or Feature Columns is set. |
| Fields Setting | Vector Column | Name of the vector column. Cannot be configured if Tensor Column or Feature Columns is set. |
| Parameter Settings | Prediction Result Column | Name of the prediction result column. |
| Parameter Settings | Maximum Number of Outliers per Group | Maximum number of outliers to detect in each group. |
| Parameter Settings | Maximum Ratio of Outliers | Maximum ratio of outliers that the algorithm can detect. |
| Parameter Settings | Maximum Number of Samples per Group | Maximum number of samples in each group. |
| Parameter Settings | Number of Trees in the Model | Number of trees in the model. Default: 100. |
| Parameter Settings | Outlier Score Threshold | Data points with scores greater than this threshold are identified as outliers. |
| Parameter Settings | Prediction Details Column | Name of the column that stores prediction details. |
| Parameter Settings | Number of Rows Sampled per Tree | Number of rows to sample for each tree. Must be a positive integer in the range [2, 100000]. Default: 256. |
| Parameter Settings | Number of Threads | Number of threads for the component. Default: 1. |
| Execution Tuning | Number of Workers | Number of workers. Used with Memory per Worker. Must be a positive integer in the range [1, 9999]. |
| Execution Tuning | Memory per Worker (MB) | Memory size of each worker, in MB. Must be a positive integer in the range [1024, 65536]. |

Note

Feature Columns, Tensor Column, and Vector Column are mutually exclusive. Use only one parameter to specify input features.
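The constraints listed above (mutually exclusive input-column parameters and the documented integer ranges) can be checked before a workflow is run. The sketch below is a hypothetical helper, not part of Designer or PyAlink; the function name `validate_iforest_config` and its argument names are assumptions chosen for illustration.

```python
def validate_iforest_config(feature_cols=None, tensor_col=None, vector_col=None,
                            subsampling_size=256, num_workers=None,
                            memory_per_worker_mb=None):
    """Return a list of human-readable problems; an empty list means the config looks valid."""
    errors = []

    # Feature Columns, Tensor Column, and Vector Column are mutually exclusive.
    inputs = {
        "Feature Columns": feature_cols,
        "Tensor Column": tensor_col,
        "Vector Column": vector_col,
    }
    chosen = [name for name, value in inputs.items() if value]
    if len(chosen) > 1:
        errors.append("Only one input parameter may be set, got: " + ", ".join(chosen))

    def check_range(label, value, low, high):
        # Unset optional values are skipped; set values must be integers within [low, high].
        if value is None:
            return
        if not isinstance(value, int) or not low <= value <= high:
            errors.append(f"{label} must be an integer in [{low}, {high}], got {value!r}")

    check_range("Number of Rows Sampled per Tree", subsampling_size, 2, 100000)
    check_range("Number of Workers", num_workers, 1, 9999)
    check_range("Memory per Worker (MB)", memory_per_worker_mb, 1024, 65536)
    return errors


# A valid configuration produces no messages.
print(validate_iforest_config(feature_cols=["val"], num_workers=2, memory_per_worker_mb=4096))

# Setting two input parameters and too little memory produces two messages.
for problem in validate_iforest_config(feature_cols=["val"], vector_col="vec",
                                       memory_per_worker_mb=512):
    print(problem)
```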

Use Python code

Configure component parameters using the PyAlink Script component, which allows calling Python code. For more information, see PyAlink Script.

| Parameter | Required | Description | Default |
| --- | --- | --- | --- |
| predictionCol | Yes | Name of the prediction result column. | N/A |
| featureCols | No | Names of the feature columns. Array type. | Select All |
| groupCols | No | Names of the group columns. Multiple columns supported. | None |
| maxOutlierNumPerGroup | No | Maximum number of outliers in each group. | None |
| maxOutlierRatio | No | Maximum ratio of outliers that the algorithm can detect. | None |
| maxSampleNumPerGroup | No | Maximum number of samples in each group. | None |
| numTrees | No | Number of trees in the model. | 100 |
| outlierThreshold | No | Data points with scores greater than this threshold are identified as outliers. | None |
| predictionDetailCol | Yes | Name of the column that contains prediction details. | N/A |
| tensorCol | No | Name of the tensor column. | None |
| vectorCol | No | Name of the vector column. | None |
| subsamplingSize | No | Number of rows sampled for each tree. Must be a positive integer in the range [2, 100000]. | 256 |
| numThreads | No | Number of threads for the component. | 1 |
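The parameters outlierThreshold, maxOutlierRatio, and maxOutlierNumPerGroup all limit which rows are reported as outliers. The sketch below shows one plausible way such limits could combine within a single group: keep rows whose score exceeds the threshold, then cap the count by ratio and by absolute number, preferring the highest scores. This interaction is an assumption for illustration only, and the component's actual precedence rules may differ. The function `select_outliers` and its arguments are hypothetical.

```python
import math


def select_outliers(scores, threshold=None, max_ratio=None, max_num=None):
    """Return the sorted row indices flagged as outliers within one group.

    Assumed behavior (not confirmed by the component documentation):
    rows are ranked by score, filtered by the threshold, then truncated
    to the tighter of the ratio cap and the absolute cap.
    """
    # Rank row indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Keep only rows whose score is strictly greater than the threshold.
    if threshold is not None:
        ranked = [i for i in ranked if scores[i] > threshold]

    # Apply the caps; each cap can only shrink the result.
    limit = len(ranked)
    if max_ratio is not None:
        limit = min(limit, math.floor(max_ratio * len(scores)))
    if max_num is not None:
        limit = min(limit, max_num)

    return sorted(ranked[:limit])


group_scores = [0.9, 0.2, 0.8, 0.7]  # made-up scores for one group
print(select_outliers(group_scores, threshold=0.5))
print(select_outliers(group_scores, threshold=0.5, max_ratio=0.5))
print(select_outliers(group_scores, threshold=0.5, max_ratio=0.5, max_num=1))
```

When grouping is used, the same selection would be applied to each group independently under this reading.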

Example code:

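from pyalink.alink import *
import pandas as pd

# Sample input: one numeric feature ("val") and a label column.
df = pd.DataFrame([
    [0.73, 0],
    [0.24, 0],
    [0.63, 0],
    [0.55, 0],
    [0.73, 0],
    [0.41, 0]
])

dataOp = BatchOperator.fromDataframe(df, schemaStr='val double, label int')

# Configure the detector on the "val" feature column.
outlierOp = IForestOutlierBatchOp()\
    .setFeatureCols(["val"])\
    .setOutlierThreshold(3.0)\
    .setPredictionCol("pred")\
    .setPredictionDetailCol("pred_detail")

# Link the input data to the detector before printing; an unlinked operator has no input.
dataOp.link(outlierOp).print()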