Generate cluster predictions using a trained Gaussian Mixture Model (GMM).
Limits
Supported computing engines: MaxCompute, Flink, or DLC.
Configure component in Designer
Configure the following parameters in Designer.
| Tab | Parameter | Description |
| --- | --- | --- |
| Fields setting | Vector column name | Name of the vector column containing input data. |
| | Reserved columns | Columns preserved in prediction output. |
| Parameters setting | Prediction result column name | Name of the column storing cluster assignments. |
| | Prediction detail column name | Name of the column storing probability distributions across clusters. |
| | Number of threads for the component | Number of threads used for prediction. Default: 1. |
| Execution tuning | Number of workers | Number of parallel workers. Used with Memory per worker (MB). Value ranges from 1 to 9999. See Estimate resource usage. |
| | Memory per worker (MB) | Memory allocated to each worker. Ranges from 1024 MB to 65536 MB. See Estimate resource usage. |
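
To illustrate what the prediction result and prediction detail columns hold, the following is a minimal sketch of GMM prediction for a single input vector. It is not the component's implementation. The function name `gmm_predict`, the parameter layout, and the diagonal-covariance simplification are assumptions made for this example only.

```python
import math


def gmm_predict(x, weights, means, variances):
    """Return (cluster_id, probabilities) for one input vector x.

    Assumption for this sketch: each component has a diagonal covariance,
    so variances[k][d] is the variance of dimension d in component k.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of a diagonal Gaussian, summed over dimensions.
        log_pdf = sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        log_terms.append(math.log(w) + log_pdf)

    # Normalize in log space (log-sum-exp) to avoid underflow.
    peak = max(log_terms)
    exp_terms = [math.exp(t - peak) for t in log_terms]
    total = sum(exp_terms)
    probabilities = [e / total for e in exp_terms]

    # The cluster assignment is the component with the highest probability.
    cluster_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return cluster_id, probabilities


# Hypothetical two-component model over 2-dimensional vectors.
cluster, probs = gmm_predict(
    x=[0.5, -0.2],
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [10.0, 10.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
)
print(cluster, probs)
```

In this sketch, `cluster` corresponds to the value written to the prediction result column, and `probs` corresponds to the per-cluster probabilities written to the prediction detail column.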
Estimate resource usage
Estimate resource requirements using the following guidelines.
- Memory per node: Allocate approximately 30 times the model size in memory per node. Example: For a 1 GB model, allocate 30 GB per node.
- Number of nodes: Adding nodes initially improves performance, but communication overhead eventually degrades speed. Stop adding nodes when performance decreases.
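
The two guidelines can be sketched as small helper functions. This is an illustration, not an official sizing tool. The function names are hypothetical, and the sketch assumes that a "node" corresponds to a worker, that 1 GB equals 1024 MB, and that runtimes for candidate worker counts come from your own trial runs.

```python
import math

# Limits of the "Memory per worker (MB)" parameter.
MIN_MEMORY_MB = 1024
MAX_MEMORY_MB = 65536
# Guideline: roughly 30 times the model size per node.
MODEL_SIZE_MULTIPLIER = 30


def estimate_memory_per_worker_mb(model_size_mb):
    """Apply the 30x guideline, then clamp to the allowed parameter range."""
    estimate = math.ceil(model_size_mb * MODEL_SIZE_MULTIPLIER)
    return min(max(estimate, MIN_MEMORY_MB), MAX_MEMORY_MB)


def pick_worker_count(runtimes_by_workers):
    """Return the last worker count before measured runtime stops improving.

    runtimes_by_workers maps a worker count to a measured runtime in
    seconds. Returns None if no measurements are given.
    """
    best = None
    for workers in sorted(runtimes_by_workers):
        runtime = runtimes_by_workers[workers]
        if best is not None and runtime >= runtimes_by_workers[best]:
            break  # Adding workers no longer helps, so stop here.
        best = workers
    return best


# A 1 GB (1024 MB) model needs about 30 GB (30720 MB) per worker.
print(estimate_memory_per_worker_mb(1024))

# Hypothetical trial measurements: runtime gets worse at 8 workers.
print(pick_worker_count({1: 100.0, 2: 60.0, 4: 40.0, 8: 45.0}))
```

If the 30x estimate exceeds 65536 MB, the clamped value no longer satisfies the guideline, so treat that result as a signal that the model may be too large for a single worker at the maximum setting.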