Build Real-Time Sentiment Analysis with AI_SENTIMENT - Realtime Compute for Apache Flink

AI_SENTIMENT classifies the sentiment of a text column using a Large Language Model (LLM) and returns a score, label, and confidence value for each row.

Limitations

Requires Ververica Runtime (VVR) 11.4 or later.
Throughput is bounded by the rate limit of the model platform. When requests exceed that limit, the AI_SENTIMENT operator becomes a bottleneck and causes backpressure in the job. In severe cases, the operator times out and the job restarts.
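
One way to stay further below the rate limit is to send fewer rows to the model. The following is a minimal sketch, not a documented tuning method: it assumes that each row reaching AI_SENTIMENT results in a model request, and it uses a hypothetical source table `comment_source` with columns `id` and `user_comment`. The model `general_model` is the one registered in the Example section.

```sql
-- Hypothetical source table: comment_source(id, user_comment STRING).
-- Drop rows that carry no text so they never reach the model.
CREATE TEMPORARY VIEW comments_to_score AS
SELECT id, user_comment
FROM comment_source
WHERE user_comment IS NOT NULL
  AND CHAR_LENGTH(TRIM(user_comment)) > 0;

-- Only the filtered rows are passed to AI_SENTIMENT.
SELECT id, score, label, confidence
FROM comments_to_score,
LATERAL TABLE(
  AI_SENTIMENT(MODEL general_model, user_comment));
```

Filtering in a separate view keeps the predicate ahead of the function call instead of relying on the planner to push it down.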

Syntax

AI_SENTIMENT(
  MODEL => MODEL <model_name>,
  INPUT => <input_column>
)

Input parameters

Parameter	Data type	Description
`MODEL <model_name>`	MODEL	The registered model service to use. Register a model service in Model Settings. The model's output type must be VARIANT.
`<input_column>`	STRING	The text column to analyze.

Outputs

AI_SENTIMENT returns one row per input row with the following columns:

Column	Data type	Description
`score`	DOUBLE	Sentiment score from `-1.0` to `1.0`. Reference values: `-1.0` extremely negative, `-0.5` moderately negative, `0.0` neutral, `0.5` moderately positive, `1.0` extremely positive.
`label`	STRING	Sentiment classification. Valid values: `positive`, `negative`, `neutral`.
`confidence`	DOUBLE	Model confidence in the predicted label, between `0.0` and `1.0`.
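
The reference values for `score` can be turned into readable bands with a CASE expression. This is a sketch only: the cut-off values are assumed midpoints between the reference values and are not defined by AI_SENTIMENT, and `score_band` is a hypothetical column name. It reuses the `general_model` model and `movie_comment` view from the Example section.

```sql
-- Map the numeric score onto the five reference bands.
-- The thresholds (-0.75, -0.25, 0.25, 0.75) are assumed midpoints.
-- A NULL score matches no branch, so score_band is NULL for that row.
SELECT
  id,
  score,
  label,
  confidence,
  CASE
    WHEN score <= -0.75 THEN 'extremely negative'
    WHEN score <= -0.25 THEN 'moderately negative'
    WHEN score <   0.25 THEN 'neutral'
    WHEN score <   0.75 THEN 'moderately positive'
    WHEN score >=  0.75 THEN 'extremely positive'
  END AS score_band
FROM movie_comment,
LATERAL TABLE(
  AI_SENTIMENT(MODEL general_model, user_comment));
```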

Example

The following example registers a Qwen-Plus model and uses AI_SENTIMENT to classify the sentiment of movie comments.

Test data

id	movie_name	user_comment	actual_label
1	Good Stuff	I loved the part where the child guessed sounds. It was one of the most romantic narratives I've seen in a movie. Very gentle and full of love.	positive
2	Dumpling Queen	Nothing remarkable.	negative

SQL statement

-- Register the model service
CREATE TEMPORARY MODEL general_model
INPUT (`input` STRING)
OUTPUT (`content` VARIANT)
WITH (
    'provider' = 'openai-compat',
    'endpoint'  = '<YOUR ENDPOINT>',
    'apiKey'    = '<YOUR KEY>',
    'model'     = 'qwen-plus'
);

-- Create a view with the test data
CREATE TEMPORARY VIEW movie_comment(id, movie_name, user_comment, actual_label)
AS VALUES
  (1, 'Good Stuff',      'I loved the part where the child guessed sounds. It was one of the most romantic narratives I''ve seen in a movie. Very gentle and full of love.', 'positive'),
  (2, 'Dumpling Queen',  'Nothing remarkable.', 'negative');

-- Call AI_SENTIMENT using positional arguments
SELECT id, movie_name, actual_label, score, label, confidence
FROM movie_comment,
LATERAL TABLE(
  AI_SENTIMENT(MODEL general_model, user_comment));

-- Call AI_SENTIMENT using named arguments
SELECT id, movie_name, actual_label, score, label, confidence
FROM movie_comment,
LATERAL TABLE(
  AI_SENTIMENT(
    MODEL => MODEL general_model,
    INPUT => user_comment));

Replace the following placeholders with your actual values:

Placeholder	Description
`<YOUR ENDPOINT>`	The endpoint of your model service
`<YOUR KEY>`	The API key for your model service

Output

The predicted `label` matches `actual_label` for both rows.

id	movie_name	actual_label	score	label	confidence
1	Good Stuff	positive	0.8	positive	0.95
2	Dumpling Queen	negative	-1.0	negative	0.95
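
The comparison between `label` and `actual_label` can also be computed in the query instead of checked by eye. The following sketch assumes the same `general_model` model and `movie_comment` view as above. The column names `predicted_label`, `is_match`, and `needs_review` are hypothetical, and the `0.8` confidence threshold is an arbitrary value chosen for illustration.

```sql
-- Flag rows where the prediction disagrees with the known label,
-- and rows whose confidence falls below an assumed threshold of 0.8.
SELECT
  id,
  movie_name,
  actual_label,
  label AS predicted_label,
  confidence,
  (label = actual_label) AS is_match,
  (confidence < 0.8)     AS needs_review
FROM movie_comment,
LATERAL TABLE(
  AI_SENTIMENT(MODEL general_model, user_comment));
```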