AI_GENERATE is an AI function in MaxCompute that invokes a model to perform inference based on a given prompt. You can use this function directly in SQL to process unstructured data for various scenarios, including natural language generation, complex logic analysis, sentiment analysis, and multimodal understanding, without depending on external services.
Syntax
The function signature for AI_GENERATE varies depending on whether you use a large language model (LLM) or a multimodal large language model (MLLM).
For LLMs, the function signature is as follows:
STRING AI_GENERATE( STRING <model_name>, STRING <version_name>, STRING <prompt> [, STRING <model_parameters>] );
For MLLMs, the function signature is as follows:
STRING AI_GENERATE( STRING <model_name>, STRING <version_name>, STRING | BINARY <unstructured_data>, STRING <prompt>, STRING <type> [, STRING <model_parameters>] );
Parameters
model_name: Required. STRING. The name of the model to use. The model can be a large language model (LLM) or a multimodal large language model (MLLM).
version_name: Required. STRING. The version of the model to use. You can specify DEFAULT_VERSION to call the default version.
prompt: Required. STRING. The prompt to send to the model. This can be a STRING constant, a column name, or an expression.
unstructured_data: Required for MLLMs. STRING or BINARY. The multimodal data to process. You can specify the URL of an image, audio, or video file as a STRING, or provide the binary data of an image as BINARY. If you use the BINARY type, you must also specify a BINARY input parameter when creating the model.
type: When unstructured_data is a STRING URL for an image, audio, or video, this parameter is required. Valid values are IMAGE, AUDIO, and VIDEO. For example:
Audio:
AI_GENERATE(model, version, audio_url, prompt, 'AUDIO');
Video:
AI_GENERATE(model, version, video_url, prompt, 'VIDEO');
Image:
AI_GENERATE(model, version, image_url, prompt, 'IMAGE');
model_parameters: Optional. STRING. A JSON string that specifies parameters for the model invocation, such as max_tokens, temperature, and top_p. Example: '{"max_tokens": 500, "temperature": 0.6, "top_p": 0.95}'. The following parameters are supported:
max_tokens: The maximum number of tokens to generate in a single model call. The default value is 4096 for MaxCompute public models.
temperature: A value between 0 and 1 that controls the randomness of the model output. A higher value leads to more diverse and random output.
top_p: A value between 0 and 1 that controls the randomness and diversity of the model output. A higher value leads to more random output.
To call a public model with an AI function, run SET odps.sql.using.public.model=true; to enable access.
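The optional model_parameters argument can be illustrated with a minimal sketch. It reuses the public model and DEFAULT_VERSION that appear in the examples on this page; the prompt text and the specific parameter values are illustrative assumptions, not recommended settings.

```sql
-- Enable access to public models and schema-level (three-part) names.
SET odps.sql.using.public.model=true;
SET odps.namespace.schema=true;

-- LLM call with the optional fourth argument: a JSON string of model parameters.
-- The prompt and the parameter values below are illustrative assumptions.
SELECT AI_GENERATE(
    bigdata_public_modelset.default.Qwen3-0.6B-GGUF,
    DEFAULT_VERSION,
    'Summarize the benefits of columnar storage in one sentence.',
    '{"max_tokens": 200, "temperature": 0.2, "top_p": 0.9}'
) AS summary;
```

A lower temperature such as 0.2 is used here on the assumption that a short factual summary benefits from less random output. Omit the fourth argument to use the model defaults.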
Return value
Returns a STRING containing the content generated by the model.
Examples
Example 1: Generate content
Calls the Qwen3-0.6B-GGUF public model in MaxCompute to generate content.
SET odps.sql.using.public.model=true;
SET odps.namespace.schema=true;
SELECT AI_GENERATE(bigdata_public_modelset.default.Qwen3-0.6B-GGUF,DEFAULT_VERSION,'what is the capital of China');
-- Returns:
-- "The capital of China is **Beijing**."
Example 2: Perform sentiment analysis
Calls the Qwen3-1.7B-GGUF public model in MaxCompute to perform sentiment analysis on user comments.
SET odps.sql.using.public.model=true;
SET odps.namespace.schema=true;
SELECT
    prompt,
    AI_GENERATE(
        bigdata_public_modelset.default.Qwen3-1.7B-GGUF,
        DEFAULT_VERSION,
        concat('Perform sentiment analysis on the following comment. The output must be one of the following three options: Positive, Negative, or Neutral. Comment to analyze:', prompt)
    ) AS generated_text
FROM (
    VALUES
    ('The weather is great today, and I am in a good mood! It is sunny and perfect for a walk. /no_think'),
    ('The weather is great today, and I am in a good mood! It is sunny /no_think'),
    ('Technology is advancing rapidly, and artificial intelligence is changing lives. /no_think'),
    ('The prevention and control measures are excellent. Thumbs up to the medical staff! /no_think'),
    ('The quality of this product is very poor. /no_think')
) t (prompt);
-- Returns:
+-----------------------------------------------------------------------------------------+----------------+
| prompt | generated_text |
+-----------------------------------------------------------------------------------------+----------------+
| The weather is great today, and I am in a good mood! It is sunny and perfect for a walk. /no_think | "Positive" |
| The weather is great today, and I am in a good mood! It is sunny /no_think | "Positive" |
| Technology is advancing rapidly, and artificial intelligence is changing lives. /no_think | "Neutral" |
| The prevention and control measures are excellent. Thumbs up to the medical staff! /no_think | "Positive" |
| The quality of this product is very poor. /no_think | "Negative" |
+-----------------------------------------------------------------------------------------+----------------+
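In the sample output above, each label is returned wrapped in double quotes. The following is a minimal sketch of one way to normalize the labels and count comments per sentiment. The table user_comments and its column comment_text are hypothetical names introduced only for illustration, and the sketch assumes the model returns exactly one label per comment, which is not guaranteed.

```sql
SET odps.sql.using.public.model=true;
SET odps.namespace.schema=true;

-- user_comments and comment_text are hypothetical names used for illustration.
SELECT
    sentiment,
    COUNT(*) AS comment_count
FROM (
    SELECT
        -- Strip the double quotes that wrap the label in the sample output.
        TRIM(REPLACE(
            AI_GENERATE(
                bigdata_public_modelset.default.Qwen3-1.7B-GGUF,
                DEFAULT_VERSION,
                concat('Perform sentiment analysis on the following comment. The output must be one of the following three options: Positive, Negative, or Neutral. Comment to analyze:', comment_text, ' /no_think')
            ),
            '"', ''
        )) AS sentiment
    FROM user_comments
) t
GROUP BY sentiment;
```

Because model output is free text, any value other than the three expected labels appears as its own group, which makes unexpected responses easy to spot.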
Example 3: Process multimodal data
This example calls a pre-created PAI-EAS remote model named PAI_EAS_Qwen25_Omni_3B and uses an Object Table to invoke the AI function. For detailed instructions, see Use a MaxCompute remote model to automatically generate e-commerce product descriptions.
VPC endpoints (recommended) and PAI-EAS public endpoints are supported.
If you use a PAI-EAS VPC endpoint, you must establish a dedicated network connection and specify the configured network connection name in the AI function call. For configuration instructions, see Access a VPC over a leased line.
If you use a PAI-EAS public endpoint, you must add the endpoint to the list of allowed external network addresses in MaxCompute before you call the AI function. For configuration instructions, see Configure network access.
Process images
You can call the model with an image URL and a prompt to tag product categories. In this example, the Object Table is named image_demo.
SELECT
    key,
    AI_GENERATE(
        PAI_EAS_Qwen25_Omni_3B,
        v1,
        image_url,
        "Identify and extract the product category from the e-commerce sales poster. The result must be one of the following six options: Cosmetics, Apparel, Daily Necessities, Food, Other, or Electronic Products. Do not include any other text or information.",
        "IMAGE"
    ) AS item_catagory
FROM (
    SELECT
        GET_SIGNED_URL_FROM_OSS(
            'project_test_model.default.image_demo',
            key,
            604800
        ) AS image_url,
        key AS key
    FROM project_test_model.default.image_demo
)
LIMIT 10;
-- Returns:
+--------------------+---------------------+
| key                | item_catagory       |
+--------------------+---------------------+
| alimamazszw-1.jpg  | Food                |
| alimamazszw-10.jpg | Electronic Products |
| alimamazszw-11.jpg | Electronic Products |
| alimamazszw-12.jpg | Cosmetics           |
| alimamazszw-13.jpg | Electronic Products |
| alimamazszw-14.jpg | Daily Necessities   |
| alimamazszw-15.jpg | Cosmetics           |
| alimamazszw-16.jpg | Cosmetics           |
| alimamazszw-18.jpg | Daily Necessities   |
+--------------------+---------------------+
Process audio
You can call the model with an audio URL and a prompt to tag audio categories. In this example, the Object Table is named music_demo. The audio dataset is available at this download link.
SELECT
    key,
    AI_GENERATE(
        PAI_EAS1_Qwen25_Omni_3B,
        v1,
        audio_url,
        "Accurately analyze the music genre of the audio. The result must be one of the following seven options, with no additional information: Classical, Country, Hip-Hop, Metal, Pop, Reggae, or Rock.",
        "AUDIO"
    ) AS item_catagory
FROM (
    SELECT
        GET_SIGNED_URL_FROM_OSS(
            'project_test_model.default.music_demo',
            key,
            604800
        ) AS audio_url,
        key AS key
    FROM project_test_model.default.music_demo
)
LIMIT 42;
FAQ
Troubleshoot PAI-EAS remote models
If your job that calls a MaxCompute remote model with the AI_GENERATE function returns an empty result, follow these troubleshooting steps:
CREATE MODEL parameters
PAI_EAS_SERVICE_NAME: Check whether the PAI-EAS service has been added to a service group.
If the service is in a service group, you must specify the service name in the format of group_name.service_name, such as group.service_name.
If the service is not in a service group, specify only the PAI-EAS service name.
ENDPOINT: When you set the ENDPOINT parameter for the model, do not include the path after the .com domain, regardless of whether you use a PAI-EAS public endpoint or a VPC endpoint (recommended). Correct format example: http://1*************70.cn-shanghai.pai-eas.aliyuncs.com.
APIKEY: Ensure that you have provided the correct PAI-EAS service token.
For more information about how to obtain these parameter values, see Obtain an endpoint and a token.
AI_GENERATE function parameters
When you process audio or video data, the unstructured_data parameter does not support the BINARY type.
If you use a STRING URL and still receive an empty result, ensure that you have correctly specified the type parameter. The supported values are IMAGE, AUDIO, and VIDEO. For example:
SELECT
    key,
    AI_GENERATE(
        PAI_EAS1_Qwen25_Omni_3B,
        v1,
        audio_url,
        "Accurately analyze the music genre of the audio. The result must be one of the following seven options, with no additional information: Classical, Country, Hip-Hop, Metal, Pop, Reggae, or Rock.",
        "AUDIO"
    ) AS item_catagory
FROM (
    SELECT
        GET_SIGNED_URL_FROM_OSS(
            'project_test_model.default.music_demo',
            key,
            604800
        ) AS audio_url,
        key AS key
    FROM project_test_model.default.music_demo
)
LIMIT 42;
Runtime state
Disable query acceleration to determine whether it interferes with the model call. Add the following statement before the AI_GENERATE function call:
set odps.mcqa.disable=true;
Query acceleration, which is enabled by default in some scenarios, can interfere with remote model calls and must be manually disabled.
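While working through these checks, it can help to rerun the call on a small sample and list only the objects that still come back empty. The sketch below reuses the model, Object Table, and prompt from the audio example on this page. It assumes that an empty result surfaces as NULL or an empty string, which this page does not specify, and the sample size of 5 is arbitrary.

```sql
-- Disable query acceleration for this session before calling the remote model.
set odps.mcqa.disable=true;

-- Run the model on a few objects and keep only rows with an empty result.
-- Assumption: an empty result is returned as NULL or ''.
SELECT key, item_catagory
FROM (
    SELECT
        key,
        AI_GENERATE(
            PAI_EAS1_Qwen25_Omni_3B,
            v1,
            audio_url,
            "Accurately analyze the music genre of the audio. The result must be one of the following seven options, with no additional information: Classical, Country, Hip-Hop, Metal, Pop, Reggae, or Rock.",
            "AUDIO"
        ) AS item_catagory
    FROM (
        SELECT
            GET_SIGNED_URL_FROM_OSS(
                'project_test_model.default.music_demo',
                key,
                604800
            ) AS audio_url,
            key AS key
        FROM project_test_model.default.music_demo
        LIMIT 5
    ) src
) t
WHERE item_catagory IS NULL OR item_catagory = '';
```

If this query returns no rows once acceleration is disabled, query acceleration was the likely cause. If every sampled row is still empty, recheck the CREATE MODEL parameters and the type argument described above.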
PAI-EAS resources
Check if the PAI-EAS resources allocated to the remote model are sufficient. If the job fails with an out-of-memory (OOM) error, scale up the resources and try again.