
Realtime Compute for Apache Flink: AI_EMBED

Last Updated: Dec 09, 2025

This topic describes how to use the AI_EMBED function to generate vector representations (embeddings) of text.

Limitations

  • This feature requires Ververica Runtime (VVR) 11.4 or later.

  • The throughput of AI_EMBED operators is subject to the rate limits of Alibaba Cloud Model Studio. When the rate limit for a model is reached, the Flink job is backpressured, with the AI_EMBED operators as the bottleneck. In some cases, this can trigger timeout errors and job restarts.

Syntax

AI_EMBED(
  MODEL => MODEL <MODEL NAME>, 
  INPUT => <INPUT COLUMN NAME>
)
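
AI_EMBED is a table function: in a query it is wrapped in LATERAL TABLE and joined with the source table, as the Example section below shows. A minimal call sketch, assuming a model named embedding_model has already been registered; source_table and text_column are hypothetical placeholders for your own table and STRING column:

SELECT id, embedding
FROM source_table,
LATERAL TABLE(
  AI_EMBED(
    MODEL embedding_model,
    text_column
  ));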

Input

| Parameter | Data type | Description |
| --- | --- | --- |
| MODEL <MODEL NAME> | MODEL | The name of the registered model. Note: the model's output type must be ARRAY<FLOAT>. |
| <INPUT COLUMN NAME> | STRING | The source text for the model to analyze. |

Result

| Parameter | Data type | Description |
| --- | --- | --- |
| embedding | ARRAY<FLOAT> | The generated 1024-dimensional vector. |
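
Because the result is a plain ARRAY<FLOAT>, it can be post-processed with Flink's built-in collection functions. For example, the following sketch checks the vector length with CARDINALITY; it reuses the embedding_model model and the infos view defined in the Example section below:

SELECT id, CARDINALITY(embedding) AS dim, embedding
FROM infos,
LATERAL TABLE(AI_EMBED(MODEL embedding_model, content));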

Example

Test data

| id | content |
| --- | --- |
| 1 | Flink |

Test statement

The following SQL example registers the text-embedding-v3 model and calls AI_EMBED to generate embedding vectors.

CREATE TEMPORARY MODEL embedding_model
INPUT (`input` STRING)
OUTPUT (`embedding` ARRAY<FLOAT>)
WITH (
    'provider' = 'openai-compat',
    'endpoint' = '<YOUR ENDPOINT>',
    'apiKey' = '<YOUR KEY>',
    'model' = 'text-embedding-v3',
    'dimension' = '1024'
);

CREATE TEMPORARY VIEW infos(id, content)
AS VALUES (1, 'Flink');

-- Use positional argument to call AI_EMBED
SELECT id, embedding
FROM infos,
LATERAL TABLE(
  AI_EMBED(
    MODEL embedding_model, 
    content
    )); 

-- Use named argument to call AI_EMBED
SELECT id, embedding
FROM infos,
LATERAL TABLE(
  AI_EMBED(
    MODEL => MODEL embedding_model, 
    INPUT => content
    )); 
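
In a real job, the generated vectors are usually written to a sink table rather than returned by an ad-hoc SELECT. The following sketch builds on the statements above; the sink name embedding_sink is hypothetical, and the print connector is used only for local verification:

-- Hypothetical sink table for inspecting results
CREATE TEMPORARY TABLE embedding_sink (
  id INT,
  embedding ARRAY<FLOAT>
) WITH (
  'connector' = 'print'
);

-- Write the id and its embedding vector to the sink
INSERT INTO embedding_sink
SELECT id, embedding
FROM infos,
LATERAL TABLE(
  AI_EMBED(
    MODEL => MODEL embedding_model,
    INPUT => content
  ));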

Outputs

| id | embedding |
| --- | --- |
| 1 | [-0.13219477, 0.054332353, -0.033010617, -0.0039787884, ...] |