All Products
Search
Document Center

Hologres:AI function list

最終更新日:Apr 02, 2026

Hologres V3.2 introduces AI Function, which includes operators such as embedding, rank, and LLM. You can use standard SQL to call these functions and invoke models directly, eliminating the need for a separate inference service. This enables AI applications such as building enterprise-level knowledge bases and performing inference. To use AI Function, you must deploy a model in Hologres, for example, from Model Studio or on a Hologres AI node. To deploy a model on an AI node, you must first purchase AI node (GPU) resources.

Prerequisites

  • A model must be deployed in Hologres (supported model sources include Model Studio and Hologres AI nodes). If you deploy a Hologres AI node model, you must purchase AI resources. For deployment instructions, see Deploy a model.

  • Each AI Function supports a set of recommended models. For details, see the summary table below and the "AI Function and models" section.

Usage limits

  • AI functions are supported only in Hologres version 3.2 or later.

  • Each AI function has specific usage limits. For more information, see the documentation for each function.

AI functions

Hologres supports the following AI functions:

  • By default, Hologres assigns each AI function an optimal model from your deployed models. You can view these assignments in the system table. To use a different model, update the assignment in the system table. For more information, see Modify the Model for an AI Function.

  • If you deploy multiple models in an instance, specify the model name when calling a function to avoid ambiguity.

  • When deploying a model to an AI node, remember that resource requirements vary by model. Purchase the correct specifications as described in AI Resource Pricing and Purchase.

Function

Description

Supported models

Supported versions

ai_gen

Uses a prompt to invoke a large language model (LLM) for inference on text and images. After deploying the appropriate generative models, it supports multimodal generation tasks, including text-to-image, image editing, multi-image fusion, text-to-video, and video generation from a start frame, start/end frames, or a reference image.

Text: Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

Image: Qwen/Qwen2.5-VL-xB series models.

Multimodal generation: For model names and task types related to image and video generation, see the Image Generation/Editing and Video Generation topics in Model Studio.

  • Text inference: Hologres V3.2 and later

  • Image inference: Hologres V4.0 and later

to_file

Converts a URL to the FILE data type.

No model is required. Typically used with object tables.

Hologres V4.0 and later

prompt

Assembles a prompt for a large language model, including multimodal prompts.

No model is required.

ai_parse_document

Parses unstructured data, such as PDFs and images, into text.

ds4sd/docling-models.

ai_embed

Computes a fixed-dimension vector embedding for input text or an image.

Text:

  • iic/nlp_gte_sentence-embedding_chinese series.

  • Qwen/Qwen3-Embedding-XB series.

Image: clip-ViT-B series.

  • Text vectors: Hologres V3.2 and later

  • Image vectors: Hologres V4.0 and later

ai_rank

Calculates a relevance score for input text.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

Hologres V3.2.2 and later

ai_chunk

Performs long-text chunking.

recursive-character-text-splitter.

ai_classify

Classifies input text based on a set of labels.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

Hologres V3.2 and later

ai_extract

Extracts specified information from the input text.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_mask

Masks specified information in the input text by replacing it with the [MASKED] placeholder.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_fix_grammar

Fixes grammar errors in the input text.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_summarize

Generates a summary of an input text.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_translate

Translates the input text into a specified language.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_similarity

Calculates the similarity score between two input texts.

Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

ai_analyze_sentiment

Analyzes the sentiment of the input text.

  • Qwen3 series large language models. We recommend the Qwen/Qwen3-32B model.

  • iic/nlp_structbert_sentiment-classification_chinese-base model.

AI function

ai_gen

  • Description: Invokes a large language model to perform inference on text or images using a prompt and returns the result. After deploying a multimodal model in the console, you can also use the ai_gen function for text-to-image, image editing, multi-image fusion, text-to-video, and video generation from a start frame, start and end frames, or a reference image. For more information about model names and parameters, see Model Studio Models.

    --Text inference
    SELECT ai_gen([model,] text)
    --Image inference
    SELECT ai_gen([model,] text, file)
    --Encapsulated prompt
    SELECT ai_gen([model,] prompt)
    --Multimodal generation (example: specify a model and a JSON prompt, where file is used for OSS authentication)
    SELECT ai_gen('<model_name>', <prompt_json>::text, to_file('oss://...','oss-xxx-internal.aliyuncs.com','acs:ram::...'));
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns an optimal model from your deployed models. To use a different model, you must first deploy the target model and then modify the system table configuration. For more information, see Modify the Model for an AI Function.

    • text: Required. The input prompt. Supported data types are CHAR, VARCHAR, and TEXT.

    • file: Required. A FILE data type value. For example, you can convert an image to the FILE data type using an Object Table. This parameter is supported only in Hologres V4.0 and later.

    • prompt: Required. A JSON value returned by the prompt() function.

  • Return Values

    • Returns the large language model's response.

    • Returns NULL if the text parameter is NULL.

    • Returns an empty string ("") if the text parameter is an empty string.

    • Throws an error if the prompt parameter is NULL.

  • Examples

    • Text inference

      CREATE TABLE questions (
          question TEXT
      );
      
      INSERT INTO questions (question) VALUES
        ('What is artificial intelligence?'),
        ('How can I improve my spoken English?'),
        ('What are the key points for a healthy diet?');
      
      SELECT
        question,
        ai_gen('Answer the following question in 20 words: ' || question) AS answer
      FROM
        questions;

      The following result is returned.

       question                                  |	answer
        ------------------------------------------|-------
      How can I improve my spoken English?        |Speak and practice more, imitate pronunciation, build vocabulary, and be brave to speak up.
      What are the key points for a healthy diet? |Eat a balanced diet, control oil, salt, and sugar intake, eat more fruits and vegetables, and maintain regular eating habits.
      What is artificial intelligence?            |AI is a computer system that simulates human intelligence, capable of learning, reasoning, perception, and problem-solving.
    • Image inference

      SELECT  ai_gen('jpg_llm','What is in this picture?', to_file('oss://****/bd****k/val/images/b9b53753-91a5****.jpg','oss-cn-hangzhou-internal.aliyuncs.com','acs:ram::****' ) )

      The following result is returned.

      ai_gen
      -----
      This image shows a city street scene. A white car with license plate BTB-9784 is parked on the side of the road. There are several cars on the street, including a yellow taxi. In the background, there are buildings and trees. The weather appears to be gloomy, possibly a rainy day. There are also pedestrians and traffic lights on the street.
    • Multimodal generation (image and video)

      Set the prompt to a JSON string that contains business parameters. Pass a to_file function call as the third argument to complete OSS authentication. This is required for permission checks even if the file is not used to read image or video content. The function typically returns a JSON string. The fields in the JSON string, such as image_urls and video_url, vary depending on the model. If you configure the output_dir parameter to write the results to your OSS bucket, public network traffic charges may be incurred. Evaluate the costs before you proceed.

      Text-to-image (Example)

      SELECT ai_gen(
        'qwen_image_2_pro',
        json_build_object(
          'prompt', 'Example: Generate a product promotion image',
          'parameters', json_build_object('n', 1, 'size', '1024*1024', 'watermark', false)
        )::text,
        to_file('oss://your-bucket/path/placeholder.png', 'oss-cn-hangzhou-internal.aliyuncs.com', 'acs:ram::your-account-id:role/your-role')
      );

      Text-to-video (Example)

      SELECT ai_gen(
        'wan26_t2v',
        json_build_object(
          'prompt', 'Example: Show a game character''s entrance in 10 seconds',
          'parameters', json_build_object(
            'size', '1280*720',
            'prompt_extend', true,
            'duration', 5,
            'shot_type', 'multi'
          )
        )::text,
        to_file('oss://your-bucket/path/placeholder.png', 'oss-cn-hangzhou-internal.aliyuncs.com', 'acs:ram::your-account-id:role/your-role')
      );

      Use cases and corresponding models (The deployment name in the console is authoritative. The following table is for reference only.)

      Use case

      Model examples

      Text-to-image

      qwen-image-2.0-pro, qwen-image-max, qwen-image, etc.

      Image editing / multi-image fusion

      qwen-image-edit, qwen-image-2.0-pro, etc.

      Text-to-video

      wan2.6-t2v

      Video generation from a start frame / start and end frames / a reference image

      wan2.6-i2v-flash, wan2.2-kf2v-flash, wan2.6-r2v, etc.

      For a complete pipeline example of generating a game ad video, see Best practice: Generate game ad videos by using AI Function.

to_file

  • Converts a URL to the FILE data type.

    SELECT to_file(oss_url, oss_endpoint, oss_rolearn);
  • Usage notes

    This is a utility function and does not require a model.

  • Parameters

    • oss_url: Required. TEXT. The URL of the OSS object.

    • oss_endpoint: Required. TEXT. The OSS region endpoint. Only Classic Network endpoints are supported.

    • oss_rolearn: Required. The RoleARN for OSS access.

  • Return value

    Returns a value of the FILE data type. The function throws an error if the URL is invalid or the specified object does not exist.

  • Example

    SELECT to_file('oss://****/bd****k/val/images/b9b53753-91a5****.jpg','oss-cn-hangzhou-internal.aliyuncs.com','acs:ram::****' );

prompt

  • Description: This helper function for AI Function constructs a prompt for a large model and supports multimodal content.

    SELECT prompt('<template_string>', <expr_1> [ , <expr_2>, ... ])
        FROM <table>;
  • Usage notes

    Scalar strings are not supported. If you have only a single string, pass it directly to the large model instead of using the prompt function. Typically, you use this function with columns from a table specified in the FROM clause.

  • Parameters

    • template_string: Required. A TEXT value that specifies the prompt template. Use placeholders such as {0} and {1} for variables.

    • <expr_1> [, <expr_2>, ...]: One or more expressions to substitute into the template_string. Supported data types include TEXT, NUMERIC, and FILE.

  • Return value

    The function typically returns a JSON value in the following format:

    {
      "prompt": "<template_string>",
      "args": ARRAY(<value_1>, <value_2>, ...)
    }

    Special cases:

    • If the template_string parameter is NULL, the function returns an error.

    • If an expression evaluates to NULL, the function replaces its corresponding placeholder in the template with the string 'None'.

    • If all expressions for a given row evaluate to NULL, the function does not filter out the row. Instead, it sets all values in the args array of the returned JSON to 'None'.

  • Example

    CREATE TABLE customer_service_konwledge_detail(
      question TEXT, 
      question_summarize TEXT
    );
    
    INSERT INTO customer_service_konwledge_detail VALUES
    ('An instance is suddenly experiencing many OOM SQL errors.', 'Backend investigation revealed that an increase in customer traffic exceeded the existing resource capacity. The customer resolved the issue by scaling.'),
    ('DataWorks cannot connect to Hologres.', 'This is not our issue. Please contact the DataWorks on-call engineer.');
    
    -- prompt
    SELECT question, question_summarize, ai_gen(
      prompt('Ticket content: {0}, Ayun Customer Service response: {1}. Does this answer solve the customer issue? Output only "yes" or "no".', question, question_summarize)) 
          FROM customer_service_konwledge_detail;
         

    The query returns the following result.

                       question                   |                                                 question_summarize                                                 | ai_gen 
    ----------------------------------------------+--------------------------------------------------------------------------------------------------------------------+--------
     DataWorks cannot connect to Hologres.        | This is not our issue. Please contact the DataWorks on-call engineer.                                              | no
     An instance is suddenly experiencing many OOM SQL errors. | Backend investigation revealed that an increase in customer traffic exceeded the existing resource capacity. The customer resolved the issue by scaling. | yes
    (2 rows)

ai_parse_document

  • Description: Parses unstructured data from various file formats, such as PDF, image, Word, PPT, TXT, and Markdown, into text.

    SELECT ai_parse_document([model,] input_bytes , input_format [, output_format]);
    SELECT ai_parse_document([model,] file [, input_format, output_format]);
  • Parameters

    Parameter

    Description

    model

    Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from your deployed models. To use a different model, you must first deploy the target model and then modify the system table configuration. For more information, see Modify the model for an AI function.

    input_bytes

    Required. The binary content of the file to parse. Type: BYTEA.

    file

    Required. The file to parse. This parameter is typically used with Object Tables. Type: File.

    input_format

    Optional. The format of the input file. Supported formats include PDF, Word, PPT, TXT, IMAGE, and AUTO. The default value is auto. Type: TEXT.

    • Supported image formats: ["jpg", "jpeg", "png", "tif", "tiff", "bmp"].

    • When an OSS directory contains documents of different types, you can set input_format=auto. The model will then automatically determine the file type based on its file extension.

    output_format

    Optional. The format of the parsed result. Supported formats include JSON and markdown. The default value is JSON. Type: TEXT.

  • Return value

    Returns a TEXT value formatted according to the output_format setting.

    • If output_format is JSON, the function returns a JSON string as a TEXT value. You must explicitly cast this value to JSON before use.

    • If parsing fails, the function returns an error message as a TEXT value instead of raising an error.

  • Examples

    • Convert a single PDF file in OSS into text.

      SELECT object_uri, etag, ai_parse_document(to_file ('oss://your-bucket/bs_challenge_financial_14b_dataset/pdf', 'oss-cn-hangzhou-internal.aliyuncs.com', 'acs:ram::123456789012:role/your-role-name'), 'auto', 'markdown') AS doc 
          FROM your_object_table LIMIT 1;
    • Convert unstructured data in an Object Table to text. For more information, see Unstructured Data (Object Table).

ai_embed

  • Description: Computes a fixed-dimension continuous vector for the input text or image.

    --Compute a text vector
    SELECT ai_embed([model,] content);
    --Compute an image vector
    SELECT ai_embed([model,] file);
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system assigns the optimal model from your deployed models. To switch to a different model, you must first deploy the target model and then update the system table configuration. For more information, see Modify the model for an AI Function.

    • content: Required. The input text. Supported data types: CHAR, VARCHAR, and TEXT.

    • file: Required. A FILE type object, typically representing an image. This parameter is supported only in Hologres V4.0 and later.

  • Return value

  • Examples

    • Text embedding

      SELECT ai_embed('Hologres is a one-stop real-time data warehouse engine developed by Alibaba. It supports real-time ingestion, updates, processing, and analysis of massive data.');

      The following result is returned.

      ai_embed
      -------
      {-0.020090256, -0.009496426, -0.01584659, ..., -0.057956327}
    • Image embedding

      -- Compute an embedding for an image from OSS.
      SELECT ai_embed('clip-ViT-B-32', to_file('oss://****', 'oss-cn-hangzhou-internal.aliyuncs.com', 'acs:ram::****'));

ai_rank

  • Description: Calculates the relevance score between two sentences.

    SELECT ai_rank([model,] source_sentence, sentence_to_compare);
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns an optimal model from your deployed models. To use a different model, first deploy the target model and then modify the system table configuration. For more information, see Modify the model for an AI Function.

    • source_sentence: Required. The source text. Supported character types: CHAR, VARCHAR, and TEXT.

    • sentence_to_compare: Required. The text to compare with the source_sentence parameter. Supported character types: CHAR, VARCHAR, and TEXT.

  • Return value

    • Returns a FLOAT relevance score in the range [0, 1]. A higher value indicates higher relevance.

    • If either the source_sentence or sentence_to_compare parameter is NULL, the function returns 0.

  • Example

    SELECT knowledge, ai_rank('What is Alibaba''s revenue in 2024?', knowledge) AS score
      FROM (
          VALUES ('Amazon''s 2024 revenue was 638 billion USD'), 
                 ('Alibaba''s 2024 revenue was 941.168 billion CNY'), 
                 ('Alibaba''s 2023 revenue was 868.687 billion CNY')
      ) AS knowledge_table(knowledge)
      ORDER BY score DESC;

    The query returns the following result.

    knowledge	                                   | score
    ---------------------------------------------|-------
    Alibaba's 2024 revenue was 941.168 billion CNY |0.899999976
    Alibaba's 2023 revenue was 868.687 billion CNY  |0.200000003
    Amazon's 2024 revenue was 638 billion USD	   |0.100000001

ai_chunk

  • Description: Splits long text into chunks.

    SELECT ai_chunk([model,] long_sentence[, chunk_size, chunk_overlap, separators])
  • Parameters

    Parameter

    Description

    model

    Optional. The model to use for the AI function. By default, the system automatically selects an optimal model from the ones you have deployed. To use a different model, deploy it and then modify the system table configuration. For more information, see Modify the model for an AI function.

    long_sentence

    Required. The source text to split into chunks. Supported character types are CHAR, VARCHAR, and TEXT.

    chunk_size

    Optional. The length of each chunk. This parameter supports the INT data type. The default value is 300.

    chunk_overlap

    Optional. The overlap length between adjacent chunks. This overlap helps preserve semantic integrity by preventing sentences from being split across chunks. This parameter supports the INT data type. The default value is 50.

    separators

    Optional. The text separators, specified as a TEXT[] array. The default value is ["\n\n", "\n", " ", ""], which is suitable for English text. For Chinese text, use Chinese separators such as ["\n\n", "\n", "。", "!", "?", ";", ",", " "].

  • Return value

    • Returns a TEXT[] array of chunks.

    • Returns NULL if the long_sentence parameter is NULL.

  • Example

    SELECT ai_chunk('Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与即席分析(Ad Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解决方案。',40,10);

    The function returns the following result.

    ai_chunk
    ---
    "{"Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data","Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标","工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分Po","语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与","维分析(OLAP)与即席分析(Ad","Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度","支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、D","te、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解","线一体化全栈数仓解决方案。"}"

ai_classify

  • Description: Classifies the input text using the specified classification labels.

    SELECT ai_classify([model,] content, labels)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from your deployed models. To use a different model, deploy the target model and modify the system table configuration. For more information, see Modify the Model for an AI Function.

    • content: Required. The input text to classify. Supported character types are CHAR, VARCHAR, and TEXT.

    • labels: Required. An ARRAY of expected classification labels. The number of labels must be between 2 and 20.

  • Return value

    • Returns the matching classification label as a TEXT value.

    • If the content parameter is NULL, the function returns NULL.

    • If the content parameter is an empty string (""), the function returns NULL.

    • If the labels parameter has the wrong number of values, an error is thrown.

  • Example

    CREATE TABLE product_detail(
        product_name TEXT,
        product_desc TEXT
    );
    INSERT INTO product_detail VALUES
    ('iPhone','Apple smartphone'),
    ('P50','Huawei smartphone'),
    ('X200','Vivo smartphone'),
    ('dress_01','Dior dress'),
    ('pants_01','Dior pants'),
    ('oolong_tea','Chayan Yuese milk tea'),
    ('sandwich_cookie','Oreo cookie');
    
    -- Classify text using ai_classify
    SELECT
        product_name,
        ai_classify(product_desc, ARRAY['Electronics', 'Apparel', 'Food']) AS category
      FROM
         product_detail
      LIMIT 10;

    The query returns the following result:

    product_name	| category
    ----------------|-----------
    dress_01	    | Apparel
    iPhone	        | Electronics
    oolong_tea	    | Food
    P50	            | Electronics
    X200	        | Electronics
    pants_01	    | Apparel
    sandwich_cookie	| Food

ai_extract

  • Description: Extracts entities specified by labels from the input text.

    SELECT ai_extract([model,] content, labels)
  • Parameters

    • Model: Optional. The name of the model for the AI Function. By default, the system automatically assigns an optimal model from your deployed models. To use a different model, you must first deploy the model and then modify the system table configuration. For more information, see Modify the model for an AI Function.

    • Content: Required. The source text for extraction. Supported character types are CHAR, VARCHAR, and TEXT.

    • Labels: Required. An ARRAY of labels specifying the entities to extract. The number of labels must be between 1 and 20.

  • Return value

    • Returns a JSON object that contains the extracted information for each label.

    • Returns NULL if the content parameter is NULL or an empty string ("").

    • Raises an error if the number of labels is not between 1 and 20.

  • Example

    CREATE TABLE users (
      user_id TEXT,
      resume TEXT
    );
    
    INSERT INTO users (user_id, resume) VALUES
      ('u001', 'Name: Zhang San, male, 28. Email: zhangsan@example.com. Phone: 1380013****. Extensive work experience.'),
      ('u002', 'Name: Li Si, female, 35. Phone: 1390013****. Email: lisi@example.com. Has management experience.'),
      ('u003', 'Name: Wang Wu, male, 25. Email: wangwu@example.com. Phone: 1370013****.');
    
    SELECT
      user_id,
      ai_extract(resume, ARRAY['name','email','phone','gender','age']) AS user_desc_obj
    FROM
      users;

    The query returns the following result.

    user_id	| user_desc_obj
    --------|---------------
    u002	| "{\"name\":\"Li Si\",\"age\":\"35\",\"gender\":\"female\",\"phone\":\"1390013****\",\"email\":\"lisi@example.com\"}"
    u003	| "{\"name\":\"Wang Wu\",\"age\":\"25\",\"gender\":\"male\",\"phone\":\"1370013****\",\"email\":\"wangwu@example.com\"}"
    u001	| "{\"name\":\"Zhang San\",\"age\":\"28\",\"gender\":\"male\",\"phone\":\"1380013****\",\"email\":\"zhangsan@example.com\"}"

ai_mask

  • Masks specified information in the input text and replaces it with the [MASKED] placeholder.

    SELECT ai_mask([model,] content, labels)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from your deployed models. To use a different model, you must first deploy the target model and then update the system table configuration. For more information, see Modify the model for an AI Function.

    • content: Required. The input text to mask. This parameter supports character types such as CHAR, VARCHAR, and TEXT.

    • labels: Required. An array of labels that specify the information to mask. The number of labels must be between 1 and 20. This parameter supports the ARRAY type.

  • Return value

    • Returns the masked text.

    • If the content parameter is NULL, the function returns NULL.

    • If the content parameter is an empty string (""), the function returns an empty string.

    • If the number of labels is invalid, the function throws an error.

  • Usage example

    SELECT ai_mask(
      'User Wang Xiaoming, ID number: 23030611111111, mobile phone number: 1388888****.',
      ARRAY['ID number', 'mobile phone number']); 

    The query returns the following result.

    ai_mask
    -------
    User Wang Xiaoming, ID number: [MASKED], mobile phone number: [MASKED].

ai_fix_grammar

  • Description: Corrects grammatical errors in the text.

    SELECT ai_fix_grammar([model,] content)
  • Parameters

    • model: Optional. The name of the model used by the ai function. By default, the system assigns an optimal model from your deployed models. To use a different model, deploy it and modify the system table configuration. For more information, see Modify the model for an ai function.

    • content: Required. The text to correct. Supported character types are CHAR, VARCHAR, and TEXT.

  • Return value

    • Returns the corrected text.

    • If the content parameter is NULL, the function returns NULL.

    • If the content parameter is an empty string (""), the function returns an empty string ("").

  • Example

    SELECT ai_fix_grammar('He dont know what to did.');

    Output:

    ai_fix_grammar
    --------------
    He doesn't know what to do.

ai_summarize

  • Description: Generates a summary of the input text.

    SELECT ai_summarize([model,] content[, max_words])
  • Parameters

    • model: Optional. The name of the model used for the AI function. By default, the system automatically assigns an optimal model from the deployed models. To switch models, you must first deploy the target model and then modify the system table configuration. For more information, see Change the model associated with an AI function.

    • content: Required. The input text to summarize. Supported data types are CHAR, VARCHAR, and TEXT.

    • max_words: Optional. The maximum number of words for the generated summary. The model generates a summary with a word count close to this value. The default is 50. A value of 0 indicates no limit.

  • Return value

    • Returns a summary of the input text.

    • Returns NULL if the content parameter is NULL.

    • Returns an empty string ("") if the content parameter is an empty string.

    • Throws an error if max_words is less than 0.

  • Example

    SELECT ai_summarize('Hologres is a one-stop real-time data warehouse developed by Alibaba Cloud. It supports real-time writes, updates, processing, and analysis for massive data. The service supports standard SQL (compatible with the PostgreSQL protocol and syntax, and most PostgreSQL functions), petabyte-scale multidimensional analysis (OLAP) and ad hoc analysis, and high-concurrency, low-latency online data services (Serving). It also provides fine-grained isolation for various workloads and enterprise-level security. Deeply integrated with MaxCompute, Flink, and DataWorks, Hologres offers a full-stack, enterprise-level solution for offline and online integration.', 15);

    The query returns the following result.

    ai_summarize
    ------------
    Hologres is an Alibaba Cloud real-time data warehouse for massive data processing and analysis.

ai_translate

  • Description: Translates text into a target language.

    SELECT ai_translate([model,] content, to_lang)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from the deployed models. To use a different model, first deploy the target model, and then modify the system table configuration. For details, see Modify the Model for an AI Function.

    • content: Required. The text to translate. Supported data types are CHAR, VARCHAR, and TEXT.

    • to_lang: Required. The target language code. For details, see ISO-639.

  • Return Value

    • Returns the translated text.

    • If the content parameter is NULL, the function returns NULL.

    • If the content parameter is an empty string, the function returns an empty string.

    • If the value of to_lang is invalid, the function throws an error.

  • Example

    SELECT ai_translate('Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与即席分析(Ad Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解决方案。', 'en');

    The query returns the following:

    ai_translate
    -----------
    Hologres is a one-stop real-time data warehouse engine developed by Alibaba. It supports real-time ingestion, updates, processing, and analysis for massive datasets. The engine is compatible with standard SQL and the PostgreSQL protocol and syntax, and supports most PostgreSQL functions. It provides petabyte-scale multi-dimensional analysis (OLAP) and ad-hoc analysis, high-concurrency and low-latency online data services (serving), fine-grained isolation for multiple workloads, and enterprise-level security capabilities. Deeply integrated with MaxCompute, Flink, and DataWorks, it offers a full-stack, enterprise-level solution for online and offline integration.

ai_similarity

  • Description: Calculates the similarity between two input texts.

    SELECT ai_similarity([model,] text1, text2)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from your deployed models. To use a different model, you must first deploy the target model and then modify the system table configuration. For more information, see Modify the model for an AI Function.

    • text1 and text2: The two texts to compare. Supports the CHAR, VARCHAR, and TEXT character types.

  • Return value

    • Returns a float value in the range [0, 1], where a larger value indicates a higher similarity. A value of 0 indicates no similarity, and a value of 1 indicates that the two texts are identical.

    • Returns 0 if either text1 or text2 is NULL.

    • Returns 1 if both text1 and text2 are empty strings ("").

    • Returns 0 if one parameter is an empty string ("") and the other is a non-empty string.

  • Example

    CREATE TABLE products2 (
        product_name TEXT
    );
    
    INSERT INTO products2 (product_name) VALUES
      ('White shirt'), ('Black suit pants'), ('Casual top'), ('Sports jacket'), ('White dress'),
      ('Bluetooth headphones'), ('Milk chocolate'), ('White top'), ('Men''s T-shirt'), ('Down jacket');
    
    SELECT product_name FROM products2 
      ORDER BY ai_similarity(product_name, 'White top') DESC LIMIT 5;
    

    The query returns the following results.

    product_name
    ----------
    White top
    White shirt
    Casual top
    White dress
    Men's T-shirt

ai_analyze_sentiment

  • Description: Performs sentiment analysis on the input text.

    SELECT ai_analyze_sentiment([model,] content);
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the optimal model from the deployed models. To use a different model, first deploy the target model, and then modify the system table configuration. For more information, see Modify the model for an AI Function.

    • content: Required. The input text. Supported character types: CHAR, VARCHAR, and TEXT.

  • Return value

    • Returns character-type sentiment labels. The returned labels vary by model.

      • For Qwen3 series large models, the function returns one of the following labels: positive, negative, neutral, or mixed.

      • For the iic/nlp_structbert_sentiment-classification_chinese-base model, the function returns the label with the highest probability, which can be positive, negative, or NULL.

    • Returns NULL if the content parameter is NULL or an empty string ("").

  • Examples

    -- Using a large language model:
    SELECT ai_analyze_sentiment('On my wedding night and when I top the exam list.');
    -- output example: positive
    
    -- Using the iic/nlp_structbert_sentiment-classification_chinese-base model:
    SELECT ai_analyze_sentiment('Drizzle falls at Qingming; travelers on the road feel their souls break.');
    -- output example: negative

AI functions and models

Function and model mapping

Hologres provides the list_ai_function_infos system table to view the mapping between AI Functions and models. After you deploy a model in the Hologres console, this system table automatically updates to show the model mapped to each AI Function. You can then invoke the corresponding model by calling the AI Function.

Note

Each AI Function must be paired with a compatible model. For example, ai_embed works with an embedding model, and ai_classify is for a large language model. If you have deployed only one type of model in your instance, some AI Functions might not have a model assigned to them. You cannot use an AI Function if no compatible model is deployed and mapped to it.

SELECT * FROM list_ai_function_infos();

The query returns the following results.

    function_name     |    model_name    
----------------------+------------------
 ai_embed             | my_gte_embedding
 ai_classify          | my_qwen32b
 ai_extract           | my_qwen32b

Modify the model mapping

Hologres automatically assigns a default model from your deployments to an AI Function. You can change this assignment. After you update the assignment, any call to the AI Function will invoke the new model.

  • Global modification

    SELECT set_ai_function_info('<function_name>', '<model_name>');

    Parameters

    • function_name: The name of the AI Function. For a list of supported function names, see AI Function summary.

    • model_name: The name of a deployed model. To view deployed model names, log on to the Hologres console and navigate to the AI Node page.

    Note

    An error occurs if the specified function name or deployed model name does not exist.

    Example

    SELECT set_ai_function_info('ai_embed', 'my_gte_embedding');
  • Session-level modification

    When you use an AI Function to invoke a model, a session-level configuration takes precedence over the global configuration (ai_function_info).

    -- This setting applies only to the current connection.
    SET hg_experimental_ai_function_name_to_model_name_mapping='<function_name>:<model_name>[,<function_name1>:<model_name1>]';

Best practices

Now that you've learned the basics of AI Function, explore these real-world best practices to learn how to solve complex business problems.