This topic describes how to use the AI_CLASSIFY function to classify text with a large language model (LLM).
Limitations
This function requires Ververica Runtime (VVR) 11.4 or later.
The throughput of AI_CLASSIFY operators is subject to the rate limits of Alibaba Cloud Model Studio. When the rate limits for a model are reached, the Flink job will be backpressured with AI_CLASSIFY operators as the bottleneck. In some cases, timeout errors and job restarts may be triggered.
Syntax
AI_CLASSIFY(
  MODEL => MODEL <MODEL NAME>,
  INPUT => <INPUT COLUMN NAME>,
  LABELS => <LABELS>
)
Input parameters
Parameter | Data type | Description |
MODEL <MODEL NAME> | MODEL | The name of the registered model. For more information about registering a model service, see Model settings. Note: The output type of this model must be VARIANT. |
<INPUT COLUMN NAME> | STRING | The data to be classified by the model. |
<LABELS> | ARRAY<STRING> | The expected categories. Note: This input parameter must be a constant. |
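To illustrate the constant requirement on <LABELS>, the sketch below passes an array literal. It assumes the general_model model and the products view defined in the Example section. The commented-out variant uses a hypothetical label_list column that is not part of this topic; given the note above, a call like that is expected to be rejected.

```sql
-- Accepted: LABELS is an array literal, so it is a constant.
SELECT id, category
FROM products,
LATERAL TABLE(
  AI_CLASSIFY(
    MODEL => MODEL general_model,
    INPUT => content,
    LABELS => ARRAY['Digital', 'Clothing']));

-- Expected to be rejected: label_list is a hypothetical ARRAY<STRING> column,
-- so the labels would vary per row instead of being a constant.
-- SELECT id, category
-- FROM products,
-- LATERAL TABLE(
--   AI_CLASSIFY(MODEL general_model, content, label_list));
```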
Return values
Column | Data type | Description |
category | STRING | The category determined by the model. |
confidence | DOUBLE | The confidence level output by the model. |
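Because category and confidence are returned as ordinary columns, they can be used in the rest of the query. The following is a minimal sketch, assuming the general_model model and the products view from the Example section, that keeps only rows the model classifies with high confidence. The 0.9 threshold is an arbitrary value chosen for illustration.

```sql
-- Keep only classifications that the model reports with high confidence.
-- The 0.9 threshold is illustrative, not a recommended value.
SELECT id, content, category, confidence
FROM products,
LATERAL TABLE(
  AI_CLASSIFY(
    MODEL => MODEL general_model,
    INPUT => content,
    LABELS => ARRAY['Digital', 'Clothing']))
WHERE confidence >= 0.9;
```

How well the reported confidence tracks actual accuracy depends on the model, so a threshold like this is best validated against labeled samples before it is relied on.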
Example
Test data
id | content | label |
1 | Li-Ning Way of Wade 10 Basketball Shoes, Performance Basketball Shoes, Shock Absorption and Rebound, Black/Red | Clothing |
2 | Apple iPhone 15 Pro Max 256GB, Space Black, 5G Phone, A17 Pro Chip, Titanium Frame | Digital |
Test statements
The following sample SQL statements register the Qwen-Plus model and use the AI_CLASSIFY function to categorize products.
CREATE TEMPORARY MODEL general_model
INPUT (`input` STRING)
OUTPUT (`content` VARIANT)
WITH (
  'provider' = 'openai-compat',
  'endpoint' = '<YOUR ENDPOINT>',
  'apiKey' = '<YOUR KEY>',
  'model' = 'qwen-plus'
);

CREATE TEMPORARY VIEW products(id, content)
AS VALUES
  (1, 'Li-Ning Way of Wade 10 Basketball Shoes, Performance Basketball Shoes, Shock Absorption and Rebound, Black/Red'),
  (2, 'Apple iPhone 15 Pro Max 256GB, Space Black, 5G Phone, A17 Pro Chip, Titanium Frame');

-- Call AI_CLASSIFY with positional arguments
SELECT id, category, confidence FROM products,
LATERAL TABLE(
  AI_CLASSIFY(
    MODEL general_model, content, ARRAY['Digital', 'Clothing']));

-- Call AI_CLASSIFY with named arguments
SELECT id, category, confidence FROM products,
LATERAL TABLE(
  AI_CLASSIFY(
    MODEL => MODEL general_model,
    INPUT => content,
    LABELS => ARRAY['Digital', 'Clothing']));
Output
id | category | confidence |
1 | Clothing | 0.95 |
2 | Digital | 0.99 |
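The classification result can also feed a downstream aggregation. As a rough sketch, assuming the same general_model model and products view, the following query counts products per predicted category. The product_count alias is introduced here for illustration only.

```sql
-- Count how many products the model assigns to each category.
SELECT category, COUNT(*) AS product_count
FROM products,
LATERAL TABLE(
  AI_CLASSIFY(
    MODEL general_model, content, ARRAY['Digital', 'Clothing']))
GROUP BY category;
```

Each input row still triggers one model call, so the rate limits described in the Limitations section apply to this query as well.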