Realtime Compute for Apache Flink: AI_SUMMARIZE

Last Updated: Mar 26, 2026

AI_SUMMARIZE is a table-valued function that calls a large language model (LLM) to generate a text summary of an input string.

Limitations

  • Requires Ververica Runtime (VVR) 11.4 or later.

  • Throughput is bounded by the rate limits of the underlying model platform. When those limits are reached, the Flink job experiences backpressure and AI_SUMMARIZE becomes the bottleneck. In severe cases, this triggers timeout errors and job restarts.
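
One way to ease pressure on the model platform is to send fewer rows to the function. The sketch below is an illustration only, not a documented tuning option: it assumes the `general_model` model and the `infos` view defined in the Examples section, and both the view name `long_infos` and the 200-character threshold are hypothetical choices.

```sql
-- Sketch only: `long_infos` and the 200-character cutoff are hypothetical.
-- Short descriptions are dropped here, so they should not reach the model.
CREATE TEMPORARY VIEW long_infos AS
SELECT id, description
FROM infos
WHERE CHAR_LENGTH(description) > 200;

-- Summarize only the rows that passed the filter.
SELECT id, summary
FROM long_infos, LATERAL TABLE(
  AI_SUMMARIZE(MODEL general_model, description, 10));
```

Filtering upstream reduces the number of model calls but does not lift the platform's rate limits, so backpressure can still occur under heavy load.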

Syntax

AI_SUMMARIZE(
  MODEL => MODEL <model_name>,
  INPUT => <input_column>,
  MAX_LENGTH => <max_length>
)

AI_SUMMARIZE accepts arguments in either positional or named style. The form above uses named arguments.

Parameters

| Parameter | Data type | Description |
|---|---|---|
| MODEL <model_name> | MODEL | Name of the registered model service. The model must return output of type VARIANT. See Model Settings to register a model service. |
| <input_column> | STRING | The column whose content is summarized by the model. |
| <max_length> | INTEGER | Maximum length of the model output. Must be a constant value. |

Output

| Column | Data type | Description |
|---|---|---|
| summary | STRING | The generated summary. |

Examples

Full example with table data

The following example creates a Qwen-Plus model, loads test data into a temporary view, and calls AI_SUMMARIZE on a table column using both positional and named argument styles.

Test data

| id | description |
|---|---|
| 1 | What is Flink? Apache Flink is an open source distributed stream processing framework for stateful computation over real-time data streams and batch data. In simple terms: Flink is a compute engine for processing real-time data. It handles continuous data streams such as website clicks, Internet of Things sensor data, and stock trades. It provides low latency, high throughput, and exactly-once semantics. It supports both stream processing and batch processing. |

SQL

CREATE TEMPORARY MODEL general_model
INPUT (`input` STRING)
OUTPUT (`content` VARIANT)
WITH (
    'provider' = 'openai-compat',
    'endpoint' = '<YOUR ENDPOINT>',
    'apiKey' = '<YOUR KEY>',
    'model' = 'qwen-plus'
);

CREATE TEMPORARY VIEW infos(id, description)
AS VALUES (1, '
What is Flink?
Apache Flink is an open source distributed stream processing framework for stateful computation over real-time data streams and batch data.
In simple terms:
Flink is a compute engine for processing real-time data.
It handles continuous data streams such as website clicks, Internet of Things sensor data, and stock trades.
It provides low latency, high throughput, and exactly-once semantics.
It supports both stream processing and batch processing.
');

-- Positional argument style
SELECT id, summary
FROM infos, LATERAL TABLE(
  AI_SUMMARIZE(MODEL general_model, description, 10));

-- Named argument style
SELECT id, summary
FROM infos, LATERAL TABLE(
  AI_SUMMARIZE(
    MODEL => MODEL general_model,
    INPUT => description,
    MAX_LENGTH => 10));

Output

| id | summary |
|---|---|
| 1 | Real-time stream processing engine |
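
In a deployed job, the summaries would typically be written to a sink rather than returned by an ad hoc SELECT. The following is a minimal sketch under stated assumptions: it reuses `general_model` and `infos` from the SQL above, and the sink table `summary_sink` is a hypothetical name that uses the standard Flink `print` connector purely for illustration.

```sql
-- Hypothetical sink for illustration; the 'print' connector writes each row
-- to the task logs instead of an external system.
CREATE TEMPORARY TABLE summary_sink (
  id INT,
  summary STRING
) WITH (
  'connector' = 'print'
);

-- Same call as the named-argument query above, routed into the sink.
INSERT INTO summary_sink
SELECT id, summary
FROM infos, LATERAL TABLE(
  AI_SUMMARIZE(
    MODEL => MODEL general_model,
    INPUT => description,
    MAX_LENGTH => 10));
```

Replacing the `print` connector with a real sink changes only the WITH clause of `summary_sink`; the AI_SUMMARIZE call itself stays the same.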