All Products
Search
Document Center

ApsaraDB for SelectDB:AI_AGG

Last Updated:Mar 28, 2026

AI_AGG is an aggregate function that uses a large language model (LLM) to summarize or analyze a text column based on a natural language instruction. For example, AI_AGG(details, 'Summarize every ticket into one short paragraph of 40 words or less.') returns a single paragraph summarizing all ticket details per group.

Syntax

AI_AGG([<resource_name>], <expr>, <instruction>)

Parameters

ParameterDescription
<resource_name>(Optional) The name of the AI resource to use. If omitted, the session default set by SET default_ai_resource is used.
<expr>The text column to aggregate. Each value must be fewer than 128 K characters.
<instruction>The natural language instruction that tells the LLM how to aggregate the data. Must be a literal string value — column references and expressions are not accepted.

Return value

  • Returns a string containing the aggregation result.

  • Returns NULL if all input values are NULL.

  • Because the result is generated by an LLM, the output may vary between executions.

Usage notes

Writing effective instructions

The quality of the output depends heavily on the instruction you provide. Follow these guidelines:

  • Use declarative statements, not questions.

    • Instead of: 'Can you summarize this?'

    • Use: 'Summarize every ticket detail into one short paragraph of 40 words or less.'

  • Describe what the data represents.

    • Instead of: 'Summarize'

    • Use: 'Summarize the customer support tickets'

  • Describe the intended use case and desired output format.

    • Instead of: 'Find the best review'

    • Use: 'Find the most positive review to feature on the product page, in one sentence'

Setting a default AI resource

To avoid specifying resource_name in every query, set a session default:

SET default_ai_resource = '<resource_name>';

After setting the default, omit resource_name from the function call:

AI_AGG(<expr>, <instruction>)

Examples

Summarize customer support tickets by issue type

The following table contains sample customer support tickets.

CREATE TABLE support_tickets (
    ticket_id      BIGINT,
    customer_name  VARCHAR(100),
    subject        VARCHAR(200),
    details        TEXT
)
DUPLICATE KEY(ticket_id)
DISTRIBUTED BY HASH(ticket_id) BUCKETS 5
PROPERTIES (
    "replication_num" = "1"
);

INSERT INTO support_tickets VALUES
(1, 'Alice', 'Login Failure', 'Cannot log in after password reset. Tried clearing cache and different browsers.'),
(2, 'Bob', 'Login Failure', 'Same problem as Alice. Also seeing 502 errors on the SSO page.'),
(3, 'Carol', 'Payment Declined', 'Credit card charged twice but order still shows pending.'),
(4, 'Dave', 'Slow Dashboard', 'Dashboard takes >30 seconds to load since the last release.'),
(5, 'Eve', 'Login Failure', 'Getting redirected back to login after entering 2FA code.');

Use AI_AGG to produce one summary per issue type.

SELECT
    subject,
    AI_AGG(
        'ai_resource_name',
        details,
        'Summarize every ticket detail into one short paragraph of 40 words or less.'
    ) AS ai_summary
FROM support_tickets
GROUP BY subject;

Expected output:

+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| subject          | ai_summary                                                                                                                                                                                                |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Slow Dashboard   | The dashboard loading time has significantly increased to over 30 seconds following the latest release, indicating a potential issue with the recent update.                                              |
| Login Failure    | User experiences login issues, including redirection post-2FA, inability to log in after password reset despite using different browsers and clearing cache, and encountering 502 errors on the SSO page. |
| Payment Declined | The customer's credit card was charged twice, but the order status remains pending, indicating a potential issue with the transaction processing or system update.                                        |
+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Aggregate e-commerce product reviews

The following table contains sample user reviews from an e-commerce platform.

CREATE TABLE product_reviews (
    review_id   BIGINT,
    product_id  BIGINT,
    rating      TINYINT,
    comment     STRING
)
DUPLICATE KEY(review_id)
DISTRIBUTED BY HASH(product_id) BUCKETS 10
PROPERTIES (
    "replication_num" = "1"
);

INSERT INTO product_reviews VALUES
(1, 1001, 5, 'The shoe size is perfect, they are comfortable, the color is nice, and delivery was fast!'),
(2, 1001, 4, 'Good quality, but the sole is a bit stiff and needs a few days to break in.'),
(3, 1001, 3, 'It looks the same as the picture, but had a slight glue smell on arrival.'),
(4, 1002, 5, 'The cup is compact, juices quickly, is easy to clean, and is perfect for taking to work.'),
(5, 1002, 3, 'It is a bit loud, but acceptable. A full charge only makes 5 cups of juice.'),
(6, 1002, 2, 'It stopped charging after two weeks, and the after-sales replacement process is too slow.'),
(7, 1003, 5, 'The fabric is breathable, not stuffy. The cuff design is thoughtful, and the UPF50+ really prevents sunburn.'),
(8, 1003, 4, 'Nice color, but the zipper stutters and requires force.'),
(9, 1004, 5, 'Great noise cancellation. I can listen to music quietly on the subway. The battery lasts a week on one charge.');

This example uses the session default AI resource to omit resource_name from the function call.

SET default_ai_resource = 'ai_resource_name';

SELECT
    product_id,
    AI_AGG(
        comment,
        'Summarize user reviews into a single sentence of 50 characters or less, highlighting the pros and cons mentioned most by buyers.'
    ) AS review_summary
FROM product_reviews
GROUP BY product_id;

Expected output:

+------------+------------------------------------------------------------------------------------------------------------------------------------+
| product_id | review_summary                                                                                                                     |
+------------+------------------------------------------------------------------------------------------------------------------------------------+
|       1003 | The product has breathable fabric, good sun protection, and a nice color, but the zipper is not smooth.                              |
|       1004 | Users praise its effective noise cancellation and long battery life, requiring a charge only once a week.                            |
|       1001 | Buyers generally find the shoes comfortable and stylish with fast delivery, but note the sole is stiff and has a slight glue smell.  |
|       1002 | Buyers find the juicer cup compact, fast, and easy to clean, but criticize its short battery life and slow replacement process.      |
+------------+------------------------------------------------------------------------------------------------------------------------------------+