This document describes the data definition language (DDL) statements for registering, querying, modifying, and deleting AI models in Flink SQL.
Usage notes
Supports AI model services from Alibaba Cloud Model Studio, Platform for AI (PAI), and other providers with OpenAI-compatible APIs.
The inference service deployed on Platform for AI (PAI) must be in the same region as your Realtime Compute for Apache Flink service.
Requires Ververica Runtime (VVR) version 11.1 or later.
CREATE MODEL
Register a model
In the Data Query editor, run the following command.
CREATE [TEMPORARY] MODEL [catalog_name.][db_name.]model_name
INPUT ( { <physical_column_definition> } [, ...n] )
OUTPUT ( { <physical_column_definition> } [, ...n] )
WITH (key1=val1, key2=val2, ...)
<physical_column_definition>:
column_name column_type [COMMENT column_comment]
Clause | Description | Key parameters | Schema constraints | Example |
INPUT | Defines the column names, data types, and order of the model's input data. | | Requires exactly one column of type STRING. | |
OUTPUT | Defines the column names, data types, and order of the model's output data. | | Constraints vary by model task type. | |
WITH | See WITH parameters. | | None. | |
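As a rough sketch of how the optional parts of the syntax fit together, the following statement registers a temporary model and attaches a comment to each column. The model name `tmp_sentiment_model`, the column names, and the comment text are hypothetical, and the endpoint and API key are placeholders to replace with your own values.

```sql
-- Hypothetical names; placeholders in angle brackets must be replaced.
CREATE TEMPORARY MODEL tmp_sentiment_model
INPUT (`review` STRING COMMENT 'Review text sent to the model')
OUTPUT (`label` STRING COMMENT 'Text returned by the model')
WITH (
  'provider'='openai-compat',
  'endpoint'='<Endpoint>',
  'api-key'='<api-key>',
  'model'='qwen3-235b-a22b'
);
```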
Examples
Alibaba Cloud Model Studio
CREATE MODEL model_bailian
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
'provider'='openai-compat',
'endpoint'='<Endpoint>',
'api-key'='<bailian-key>',
'model'='qwen3-235b-a22b'
);
The endpoint format for Alibaba Cloud Model Studio is <base-url>/compatible-mode/v1/<task>. For example, https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions.
base-url:
Internet access: https://dashscope-intl.aliyuncs.com. To use internet access, you must enable it for your Flink workspace. For more information, see Select a network connection type.
Internal network access: See Access Alibaba Cloud Model Studio APIs over an internal network.
Internal network access is supported from the same region and across regions. To access the service from a different region (for example, a Flink instance in Shanghai accessing a Model Studio service in Beijing), you must configure Cloud Enterprise Network (CEN). For more information, see Cross-region private access to Alibaba Cloud Model Studio APIs.
Important: Data processing for Alibaba Cloud Model Studio occurs in the selected Model Studio region. If you have data residency requirements, select a region that meets your compliance needs. For more information, see Service deployment scope comparison.
task: The model task type, for example chat/completions or embeddings (see WITH parameters).
Platform for AI (PAI)
CREATE MODEL model_pai
INPUT (`input` STRING)
OUTPUT (`embedding` ARRAY<FLOAT>)
WITH (
'provider'='openai-compat',
'endpoint'='<VPC_endpoint>',
'api-key'='<Token>',
'model'='qwen3-235b-a22b'
);
To get the endpoint and API key, you must first deploy a model service. For more information, see Deploy DeepSeek-V3 and DeepSeek-R1 models with one click and Quick start for Elastic Algorithm Service (EAS).
Model Gallery
Log on to the Platform for AI (PAI) console.
In the left navigation pane, go to and click the name of the target service.
Click View Invocation Information.
The VPC endpoint uses HTTP. You must change it to HTTPS and append /v1/<task> to the URL, where task is the model task type, such as chat/completions. For example:
https://************.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/quickstart_deploy_20250722_7b22/v1/chat/completions.
Use the value from the Token field for the api-key parameter.
Elastic Algorithm Service (EAS)
In the left navigation pane, go to , and click the name of the target service to go to the Overview page.
In the Basic Information section, click View Invocation Information.
On the Invocation Information panel, copy the endpoint and token:
The VPC endpoint uses HTTP. You must change it to HTTPS and append /v1/<task> to the URL, where task is the model task type, such as chat/completions. For example:
https://************.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/quickstart_deploy_20250722_7b22/v1/chat/completions.
Use the value from the Token field for the api-key parameter.
For more information, see Invoke a service by using a public or internal endpoint through a gateway.
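To illustrate how the copied invocation information maps to the DDL, the sketch below registers a chat model against the example VPC endpoint shown above, already switched to HTTPS and suffixed with /v1/chat/completions. The model name `model_pai_chat` is hypothetical, and `<Token>` and `<model-name>` are placeholders for the values of your own deployed service.

```sql
-- Endpoint taken from the example above: HTTPS scheme plus the /v1/<task> suffix.
CREATE MODEL model_pai_chat
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
  'provider'='openai-compat',
  'endpoint'='https://************.vpc.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/quickstart_deploy_20250722_7b22/v1/chat/completions',
  'api-key'='<Token>',        -- value of the Token field
  'model'='<model-name>'      -- placeholder for the deployed model
);
```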
WITH parameters
General
Parameter | Description | Type | Required | Default | Notes |
provider | The model service type. | string | Yes | None | Note: For PAI or other OpenAI-compatible model services, the value must be openai-compat. |
endpoint | The endpoint for the OpenAI-compatible model service, such as an embeddings or chat/completions service. | string | Yes | None | |
api-key | The API key for the model service. | string | | None | For more information, see Obtain an API key. Previous key name: |
max-context-size | The maximum context size for a single request, in tokens. | integer | No | None | If the maximum context size is exceeded, the action defined in context-overflow-action is taken. Note: Supported in VVR 11.2 and later. |
context-overflow-action | The action to take when a request's context exceeds the token limit. | string | No | | Note: Supported in VVR 11.2 and later. Valid values: |
max-context-size | The maximum context length (number of records). | integer | No | None | Note: Supported in VVR 11.2 and later. |
context-overflow-action | The action to take when the context length exceeds the record limit. | string | No | truncated-tail | Note: Supported in VVR 11.2 and later. Valid values: |
error-handling-strategy | The strategy for handling model request errors. | string | No | retry | Note: Supported in VVR 11.4 and later. Valid values: |
retry-num | The number of retry attempts. | integer | No | 100 | Takes effect only when error-handling-strategy is set to retry. Note: Supported in VVR 11.4 and later. |
retry-fallback-strategy | The fallback strategy to use after the maximum number of retries is reached. | string | No | failover | Takes effect only when error-handling-strategy is set to retry. Note: Supported in VVR 11.4 and later. Valid values: |
retry-backoff-strategy | The retry backoff strategy. | string | No | fixed | Note: Supported in VVR 11.4 and later. Valid values: |
retry-backoff-base-interval | The base time interval for the retry backoff strategy. | duration | No | 1 s | Note: Supported in VVR 11.4 and later. |
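As a minimal sketch of how the error-handling parameters combine, the statement below spells out the retry settings explicitly. Every value except 'retry-num' is the default listed in the table; the retry count of 5 and the model name `model_with_retry` are illustrative assumptions. These keys require VVR 11.4 or later.

```sql
-- Illustrative values; all keys come from the General parameters table.
CREATE MODEL model_with_retry
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
  'provider'='openai-compat',
  'endpoint'='<Endpoint>',
  'api-key'='<api-key>',
  'model'='qwen3-235b-a22b',
  'error-handling-strategy'='retry',     -- default strategy
  'retry-num'='5',                       -- assumed value; the default is 100
  'retry-fallback-strategy'='failover',  -- applied after the last retry fails
  'retry-backoff-strategy'='fixed',
  'retry-backoff-base-interval'='1 s'
);
```

Lowering 'retry-num' from its default shortens the time a failing request can hold up the job before the fallback strategy applies.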
chat/completions
The following parameters are specific to chat/completions model tasks:
Parameter | Description | Type | Required | Default | Notes |
model | The model to invoke. | string | Yes | None | Supports text generation models. Note: You are charged based on the selected model and the number of tokens in the input and output. |
system-prompt | The system prompt for the request. | string | Yes | "You are a helpful assistant." | Note: In VVR 11.6 and later, you can set this parameter to an empty value. Previous key name: |
temperature | Controls the smoothness of the probability distribution for each candidate token. | float | No | None | Valid range: [0, 2). A value of 0 is not recommended. A higher temperature makes the output more random, while a lower value makes it more deterministic. |
top-p | The probability threshold for nucleus sampling. | float | No | None | A higher value increases randomness, while a lower value increases determinism. Previous key name: |
stop | A stop sequence. | string | No | None | The model stops generating tokens when this sequence is produced. The sequence is not included in the final output. |
max-tokens | The maximum number of tokens that the model can generate. | integer | No | None | Previous key name: |
content-type | The type of input data. | string | No | text | Note: Supported in VVR 11.6 and later. |
presence-penalty | Controls token repetition. | double | No | None | Valid range: -2.0 to 2.0. Positive values penalize tokens that have already appeared in the text, making the model more likely to discuss new topics. Note: Supported in VVR 11.3 and later. |
n | The number of output choices to generate for each input message. | integer | No | None | Note: Supported in VVR 11.3 and later. |
seed | A random number seed for the model's response. | long | No | None | If specified, the model provider attempts deterministic sampling, so repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed. Note: Supported in VVR 11.3 and later. |
response-format | The format of the return value. | string | No | text | Note: Supported in VVR 11.3 and later. Valid values: |
extra-header | Extra HTTP headers for the request. | string | No | None | Must be a JSON-formatted string. The values in the JSON key-value pairs must be strings or lists of strings. Note: Supported in VVR 11.3 and later. |
extra-body | Extra HTTP body for the request. | string | No | None | Must be a JSON-formatted string. Note: Supported in VVR 11.3 and later. |
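The sketch below shows one way the chat/completions parameters might be combined for a short classification task. The model name `model_chat_tuned`, the prompt text, the sampling values, and the header name `X-Request-Source` are all illustrative assumptions; only the parameter keys come from the table above. The 'seed' and 'extra-header' keys require VVR 11.3 or later.

```sql
-- Illustrative values; keys are taken from the chat/completions table.
CREATE MODEL model_chat_tuned
INPUT (`input` STRING)
OUTPUT (`content` STRING)
WITH (
  'provider'='openai-compat',
  'endpoint'='<Endpoint>',
  'api-key'='<api-key>',
  'model'='qwen3-235b-a22b',
  'system-prompt'='Classify the sentiment of the input as positive or negative.',
  'temperature'='0.2',   -- low but nonzero, within [0, 2)
  'top-p'='0.9',
  'max-tokens'='64',     -- caps the length of the generated answer
  'seed'='42',           -- best-effort determinism only
  'extra-header'='{"X-Request-Source":"flink-sql"}'  -- JSON string with string values
);
```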
embeddings
The following parameters are specific to embeddings model tasks:
Parameter | Description | Type | Required | Default | Notes |
model | The model to invoke. | string | Yes | None | Supports text embedding models. Note: You are charged based on the selected model and the number of tokens in the input and output. |
dimension | The dimension of the output vectors. | integer | No | None | The supported dimensions depend on the specific model. Common values include 1024, 768, and 512. |
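For illustration, an embeddings model on Alibaba Cloud Model Studio could be registered roughly as follows. The endpoint is assembled from the <base-url>/compatible-mode/v1/<task> format with embeddings as the task. The model name `text-embedding-v3` and its support for 1024 dimensions are assumptions to verify against the provider's model list, and `model_embedding` is a hypothetical name.

```sql
-- Assumed model name and dimension; check what your provider supports.
CREATE MODEL model_embedding
INPUT (`input` STRING)
OUTPUT (`embedding` ARRAY<FLOAT>)
WITH (
  'provider'='openai-compat',
  'endpoint'='https://dashscope-intl.aliyuncs.com/compatible-mode/v1/embeddings',
  'api-key'='<bailian-key>',
  'model'='text-embedding-v3',
  'dimension'='1024'
);
```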
Querying models
In the Data Query editor, run one of the following commands.
List the names of registered models:
SHOW MODELS [ ( FROM | IN ) [catalog_name.]database_name ];
Show the statement used to create a model:
SHOW CREATE MODEL [catalog_name.][db_name.]model_name;
Show the input and output schema of a model:
DESCRIBE MODEL [catalog_name.][db_name.]model_name;
Example
SHOW MODELS;
-- RESULT
-- +------------+
-- | model name |
-- +------------+
-- | m          |
-- +------------+
DESCRIBE MODEL m;
-- RESULT
-- +---------+--------+------+----------+
-- | name | type | null | is input |
-- +---------+--------+------+----------+
-- | content | STRING | TRUE | TRUE |
-- | label | BIGINT | TRUE | FALSE |
-- +---------+--------+------+----------+
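The optional qualifiers in the syntax above can be used as in the following sketch, which assumes a hypothetical catalog `my_catalog` and database `my_db` that contain the model `m`.

```sql
-- Hypothetical catalog and database names.
SHOW MODELS FROM my_catalog.my_db;     -- list models in a specific database
SHOW CREATE MODEL my_catalog.my_db.m;  -- print the CREATE MODEL statement
DESCRIBE MODEL my_catalog.my_db.m;     -- show input and output columns
```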
Modifying models
In the Data Query editor, run the following command.
ALTER MODEL [IF EXISTS] [catalog_name.][db_name.]model_name {
  RENAME TO new_model_name
  | SET (key1=val1, ...)
  | RESET (key1, ...)
}
Examples
Rename a registered model:
ALTER MODEL m RENAME TO m1; -- Renames the model to m1.
Modify a model parameter:
ALTER MODEL m SET ('endpoint' = '<Your_Endpoint>'); -- Adjusts the endpoint path.
Reset a model parameter to its default value:
ALTER MODEL m RESET ('endpoint'); -- Resets the endpoint path.
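Because SET and RESET accept a list of keys, several parameters can be changed in one statement. The sketch below assumes that `m` is a chat/completions model; the values are illustrative.

```sql
-- Illustrative values; IF EXISTS avoids an error when the model is missing.
ALTER MODEL IF EXISTS m SET ('temperature' = '0.2', 'max-tokens' = '256');
ALTER MODEL IF EXISTS m RESET ('temperature', 'max-tokens');
```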
Deleting models
In the Data Query editor, run the following command.
DROP [TEMPORARY] MODEL [IF EXISTS] [catalog_name.][db_name.]model_name
Example
DROP MODEL m;
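The optional keywords can be combined as in this sketch. The names `tmp_model`, `my_catalog`, and `my_db` are hypothetical.

```sql
-- Hypothetical names; IF EXISTS makes both statements safe to rerun.
DROP TEMPORARY MODEL IF EXISTS tmp_model;
DROP MODEL IF EXISTS my_catalog.my_db.m;
```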