
ApsaraDB RDS: AI (rds_ai)

Last Updated: Mar 28, 2026

The rds_ai extension lets you call large language models (LLMs) and run embedding operations directly from SQL in ApsaraDB RDS for PostgreSQL. It integrates with Alibaba Cloud Model Studio models — including Qwen and text-embedding-v3 — so you can build AI features without moving data out of your database.

With rds_ai, you can run LLM Q&A, convert text to vectors, retrieve semantically similar records, and run end-to-end retrieval-augmented generation (RAG) pipelines — all from standard PostgreSQL queries. It also supports custom models, so you can connect any HTTP-compatible model API.

Prerequisites

Before you begin, make sure you have:

  • An ApsaraDB RDS for PostgreSQL instance that meets the following version requirements. To upgrade, see Upgrade a minor engine version.

    Major version          Minimum minor engine version
    PostgreSQL 16          20241230
    PostgreSQL 14, 15, 17  20250430
    PostgreSQL 18          20251130
  • A privileged account for your instance. To create one, see Create an account.

  • An active Alibaba Cloud Model Studio subscription with an API key. To get one, see Obtain an API key.

Network configuration

By default, an RDS for PostgreSQL instance cannot access external networks. To enable access to external models, you must configure a NAT gateway for the instance's VPC. For more information about NAT gateways, see Use the SNAT feature of an Internet NAT gateway to access the Internet.

Install the extension

Important

Confirm your instance's major and minor engine versions before installing. See Prerequisites.

Option 1: Install from the console

  1. Go to the RDS Instances page, select a region, and click the target instance ID.

  2. In the left navigation pane, click Plug-ins.

  3. On the Plugin Marketplace page, click Install next to rds_ai.

  4. Select the target database and account, then click Install.

To uninstall, go to the Installed Plugins tab on the Manage Plugins page.

Option 2: Install with SQL

Run the following command using a privileged account:

CREATE EXTENSION IF NOT EXISTS rds_ai CASCADE;

Installing rds_ai also installs the pgvector and pgsql-http extensions.

To verify the installation:

SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('rds_ai', 'vector', 'http');

To remove the extension:

DROP EXTENSION rds_ai;

Default models

rds_ai ships with the following default models:

API     Parameter                    Default model      Options
Prompt  rds_ai.default_prompt_model  qwen-plus          qwen-plus, qwen-max, qwen-turbo
Embed   rds_ai.default_embed_model   text-embedding-v3  text-embedding-v3

To change the default prompt model, add rds_ai to the Running Value of the shared_preload_libraries parameter, then set the model. For instructions on updating parameters, see Set instance parameters. For example, set Running Value to 'pg_stat_statements,auto_explain,rds_ai', then run:

SET rds_ai.default_prompt_model TO 'qwen-max';
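To confirm the change took effect in your session, query the parameter. SHOW works for extension-defined settings such as this one once rds_ai is preloaded:

```sql
-- Display the current default prompt model
SHOW rds_ai.default_prompt_model;
```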

Configure the extension

Before calling any model, set the API key and model URLs.

Step 1: Set the API key

-- Set the API key for a specific model
SELECT rds_ai.update_model('qwen-plus', 'token', 'sk-****');

-- Set the same API key for all models
SELECT rds_ai.update_model(model_name, 'token', 'sk-****')
FROM rds_ai.model_list;

Step 2: Set the model URLs

-- Prompt models (qwen-plus, qwen-max, qwen-turbo share the same endpoint)
SELECT rds_ai.update_model('qwen-plus', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
SELECT rds_ai.update_model('qwen-max', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
SELECT rds_ai.update_model('qwen-turbo', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');

-- Embedding model
SELECT rds_ai.update_model('text-embedding-v3', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding');

Step 3: (Optional) Configure request timeouts

rds_ai uses the pgsql-http extension to make remote calls. Set timeouts to interrupt long-running requests.

Note

These settings apply to the current session only. Reset them for each new connection.

-- Request timeout (milliseconds)
SET http.timeout_msec TO 200000;
SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');

-- Connection timeout (milliseconds)
SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');

LLM Q&A

Use rds_ai.prompt to send a question to a Qwen text generation model and get a response. The default model is qwen-plus.

Simple call:

SELECT rds_ai.prompt('Give me a recipe using carrots, potatoes, and eggplants.');

Full call with explicit parameters:

SELECT rds_ai.prompt(
  model_name => 'qwen-plus',                                 -- model name
  content    => 'Give me a recipe using carrots, potatoes, and eggplants.',  -- prompt text
  args       => '{"top_p": 0.7}'::jsonb                      -- model parameters
);

Text-to-vector conversion

Use rds_ai.embed to convert text to a vector using the text-embedding-v3 model. The output is a 1024-dimensional dense vector by default. To request a sparse vector, pass {"output_type": "sparse"} — the output is 250,002 dimensions.

Dense vector (default):

SELECT rds_ai.embed(
  'A strong wind blows from a high sky, and monkeys cry sadly.'
);

Sparse vector:

SELECT rds_ai.embed(
  content => 'A strong wind blows from a high sky, and monkeys cry sadly.',
  args    => '{"output_type": "sparse"}'::jsonb
);

Store embeddings in a table:

  1. Create the table:

    CREATE TABLE test_embed (
      a text,
      b vector(1024),         -- dense vector column
      c sparsevec(250002)     -- sparse vector column
    );
  2. Insert text:

    INSERT INTO test_embed (a) VALUES ('hello world');
  3. Write the embeddings:

    -- Write a dense vector
    UPDATE test_embed
    SET b = rds_ai.embed(a, '{"output_type": "dense"}'::jsonb)::vector;
    
    -- Write a sparse vector
    UPDATE test_embed
    SET c = rds_ai.embed(a, '{"output_type": "sparse"}'::jsonb)::sparsevec;
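A quick way to sanity-check the stored embeddings is to order rows by distance from a freshly computed vector. This sketch assumes pgvector's <=> cosine-distance operator, which is available because pgvector is installed alongside rds_ai:

```sql
-- Rows whose text is closest in meaning to the query come first
SELECT a,
       b <=> rds_ai.embed('hello world')::vector AS cosine_distance
FROM test_embed
ORDER BY cosine_distance
LIMIT 5;
```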

Semantic vector retrieval

Use rds_ai.retrieve to find the top-N most similar records to a query. It embeds the query using text-embedding-v3 and compares against stored vectors using the distance type you specify.

Choose a distance type

Distance type  When to use
cosine         Comparing text meaning regardless of length; the most common choice for semantic search
L2             When the magnitude of vectors matters, not just their direction
L1             For sparse or high-dimensional vectors where L2 is computationally expensive
negative       For normalized vectors, where the (negative) inner product is equivalent to cosine similarity but faster to compute

The default distance type for rds_ai.retrieve is cosine.

Choose a vector index type

Without an index, PostgreSQL performs an exact sequential scan — accurate but slow for large datasets. Add a vector index for faster approximate nearest neighbor (ANN) search.

Index type  Speed     Accuracy            Memory  When to use
None        Slow      100% recall         Low     Small tables or when exact results are required
HNSW        Fast      High (tunable)      Higher  Most production workloads; best query speed
IVFFlat     Moderate  Moderate (tunable)  Lower   Large datasets where memory is constrained

Example: Set up a table and run retrieval

  1. Create a table and add an embedding column:

    CREATE TABLE test_rag (
      id    SERIAL PRIMARY KEY,
      chunk TEXT
    );
    
    ALTER TABLE test_rag ADD COLUMN embedding VECTOR(1024);
    UPDATE test_rag SET embedding = rds_ai.embed(chunk)::vector;
  2. Create a vector index:

    Note: Both example indexes use the vector_cosine_ops operator class, which matches the default cosine distance type used by rds_ai.retrieve.
    -- HNSW index (recommended for most workloads)
    CREATE INDEX ON test_rag USING hnsw (embedding vector_cosine_ops);
    
    -- IVFFlat index (for memory-constrained environments)
    CREATE INDEX ON test_rag USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
  3. Run a similarity search:

    -- Simple call (uses defaults: text-embedding-v3, top 10, cosine distance)
    SELECT * FROM rds_ai.retrieve(
      'Why is PostgreSQL considered the most advanced open-source database?',
      'public', 'test_rag', 'chunk', 'embedding'
    );

    Full call with explicit parameters:

    SELECT * FROM rds_ai.retrieve(
      embed_model   => 'text-embedding-v3',
      question      => 'Why is PostgreSQL considered the most advanced open-source database?',
      source_schema => 'public',         -- schema of the source table
      source_table  => 'test_rag',       -- table to search
      chunk_col     => 'chunk',          -- text column
      vector_col    => 'embedding',      -- vector column
      topn          => 10,               -- number of results to return
      embed_args    => '{}'::jsonb,      -- extra parameters for the embedding model
      distance_type => 'cosine'          -- distance algorithm: L1, L2, cosine, or negative
    );
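Under the hood, rds_ai.retrieve amounts to embedding the question and ordering rows by a pgvector distance operator. A hand-written equivalent of the cosine search above might look like the following sketch (the operator mapping comes from pgvector: <=> for cosine, <-> for L2, <+> for L1, and <#> for negative inner product; <+> requires pgvector 0.7 or later):

```sql
-- Hand-written cosine top-10, equivalent in spirit to the rds_ai.retrieve call above
SELECT chunk,
       embedding <=> rds_ai.embed(
         'Why is PostgreSQL considered the most advanced open-source database?'
       )::vector AS distance
FROM public.test_rag
ORDER BY distance
LIMIT 10;
```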

RAG Q&A

rds_ai.rag combines vector retrieval and LLM generation in a single call. It retrieves the top-N relevant chunks from your table and passes them as context to the prompt model.

The default models are text-embedding-v3 for embedding and qwen-plus for text generation.

Simple call:

SELECT * FROM rds_ai.rag(
  'Why is PostgreSQL considered the most advanced open-source database?',
  'public', 'test_rag', 'chunk', 'embedding'
);

Full call with explicit parameters:

SELECT * FROM rds_ai.rag(
  embed_model   => 'text-embedding-v3',  -- embedding model
  prompt_model  => 'qwen-plus',          -- text generation model
  question      => 'Why is PostgreSQL considered the most advanced open-source database?',
  source_schema => 'public',             -- schema of the source table
  source_table  => 'test_rag',           -- table to search
  chunk_col     => 'chunk',              -- text column
  vector_col    => 'embedding',          -- vector column
  topn          => 10,                   -- number of chunks to retrieve
  embed_args    => '{}'::jsonb,          -- extra parameters for the embedding model
  prompt_args   => '{}'::jsonb,          -- extra parameters for the prompt model
  distance_type => 'L2'                  -- distance algorithm: L1, L2, cosine, or negative
);

Custom models

rds_ai stores model configurations in the rds_ai.model_list table. Add any HTTP-compatible model by inserting a row into this table.

Important

If you encounter issues when adding a custom model, contact us.

rds_ai.model_list fields

  • model_name (name): Model name. Primary key.
  • request_type (text): HTTP method (for example, POST).
  • request_header (http.http_header[]): HTTP request headers, used for authentication.
  • uri (text): Model endpoint URL.
  • content_type (text): Request content type (for example, application/json).
  • content_template (text): Request body template with %s placeholders filled at call time.
  • json_path (text): SQL expression used to parse the HTTP response.
  • token (text): Authentication token placed into the request header.
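To check what is currently configured, for example which endpoint or HTTP method a model uses, you can query the table directly. A simple sketch that leaves the token column out of the select list so credentials are not echoed:

```sql
-- List configured models and their endpoints
SELECT model_name, request_type, uri, content_type
FROM rds_ai.model_list
ORDER BY model_name;
```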

Add a custom model

SELECT rds_ai.add_model(
  'test-model',                                          -- model name
  'POST',                                                -- HTTP method
  ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- request header
  'https://****.com',                                    -- model URL
  'application/json',                                    -- content type
  '{"key":"%s"}',                                        -- request body template
  'SELECT %L'                                            -- JSON response parser
);

Call a custom model

Use rds_ai.raw_invoke_model to get the full HTTP response, or rds_ai.invoke_model to extract just the field defined in json_path:

-- Returns the full HTTP response
SELECT * FROM rds_ai.raw_invoke_model('qwen-plus', ARRAY['who are you']);

-- Returns the parsed result from json_path
SELECT * FROM rds_ai.invoke_model('qwen-plus', ARRAY['who are you']);

Delete a custom model

SELECT rds_ai.del_model('test-model');

Add a PAI-RAG model

Deploy your RDS instance and PAI-EAS service in the same VPC to keep all traffic within your private network.

  1. Deploy the RAG service in PAI-EAS. See Deploy an LLM-based RAG chatbot using PAI-EAS and ApsaraDB RDS for PostgreSQL.

  2. Get the service endpoint and token:

    1. Click the RAG service name to open the Service Details page.

    2. In Basic Information, click View Endpoint Information.

    3. In the Invocation Information dialog box, copy the service endpoint and token.

  3. Add the PAI-RAG model. Replace the URL with your actual endpoint and append /service/query:

    SELECT rds_ai.add_model(
      'pai-rag',
      'POST',
      ARRAY[('Authorization', '%s')]::http.http_header[],
      'http://rds-pai-rag-demo.****.cn-hangzhou.pai-eas.aliyuncs.com/service/query',
      'application/json',
      '{"question": "%s"}',
      'SELECT (%L::jsonb->''answer'')::text'
    );
  4. Set the authentication token:

    SELECT rds_ai.update_model('pai-rag', 'token', 'MTFkYjMwZjgzYzA1YmE2N2YyNWMxM2NkNDVjMjEzNjYxMDAzMzE5****');
  5. Verify the model:

    SELECT rds_ai.invoke_model('pai-rag', ARRAY['What parameters are related to WAL log accumulation?']);

For more information on calling PAI-RAG models, see Call PAI models using APIs.

Add a Function Compute RAG model

  1. Deploy the AgentCraft application in Function Compute. See Cloud Deployment of AgentCraft.

  2. Create an agent and a client access integration. When you create the integration (as described in Integrate an agent into a DingTalk robot), save the endpoint and token.

  3. Add the Function Compute RAG model. Replace the URL with your actual endpoint and append /completions:

    SELECT rds_ai.add_model(
      'fc-rag',
      'POST',
      ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],
      'https://agentcrckend-de-obnhjsknam.cn-hangzhou.fcapp.run/v1/chat/completions',
      'application/json',
      '{
        "messages": [{"role": "user", "content": "%s"}],
        "stream": false,
        "max_tokens": 1024
      }',
      'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text'
    );
  4. Set the authentication token:

    SELECT rds_ai.update_model('fc-rag', 'token', '8UiGAziWgYGPxM3qR5sAChBfDJRt****');
  5. Verify the model:

    SELECT rds_ai.invoke_model('fc-rag', ARRAY['What parameters are related to WAL log accumulation?']);

API reference

rds_ai.prompt

Sends a prompt to an LLM and returns the response.

rds_ai.prompt(
  model_name TEXT,
  content    TEXT,
  args       jsonb DEFAULT '{}'
) RETURNS text
  • model_name (text): Model to call. Default: qwen-plus.
  • content (text): Prompt text. Required.
  • args (jsonb): Additional model parameters (for example, {"top_p": 0.2}). Default: {}.

rds_ai.embed

Converts text to a vector using the specified embedding model.

rds_ai.embed(
  model_name TEXT,
  content    TEXT,
  args       jsonb DEFAULT '{}'
) RETURNS text
  • model_name (text): Model to call. Default: text-embedding-v3.
  • content (text): Text to convert. Required.
  • args (jsonb): Additional parameters (for example, {"output_type": "dense"}). Default: {}.

rds_ai.retrieve

Searches a table for the top-N most similar records to the query text.

rds_ai.retrieve(
  model_name    TEXT,
  question      TEXT,
  source_schema TEXT,
  source_table  TEXT,
  chunk_col     TEXT,
  vector_col    TEXT,
  topn          INT DEFAULT 10,
  embed_args    jsonb DEFAULT '{}',
  distance_type TEXT DEFAULT 'cosine'
) RETURNS TABLE(chunk TEXT, distance float)
  • model_name (text): Embedding model to call. Default: text-embedding-v3.
  • question (text): Query text to search for. Required.
  • source_schema (text): Schema of the table to search. Required.
  • source_table (text): Table to search. Required.
  • chunk_col (text): Column containing the text chunks. Required.
  • vector_col (text): Column containing the stored vectors. Required.
  • topn (int): Number of results to return. Default: 10.
  • embed_args (jsonb): Extra parameters for the embedding model. Default: {}.
  • distance_type (text): Distance algorithm: L1, L2, cosine, or negative. Default: cosine.

rds_ai.rag

Retrieves relevant chunks from a table and passes them to an LLM to generate a grounded answer.

rds_ai.rag(
  embed_model   TEXT,
  prompt_model  TEXT,
  question      TEXT,
  source_schema TEXT,
  source_table  TEXT,
  chunk_col     TEXT,
  vector_col    TEXT,
  topn          INT DEFAULT 10,
  embed_args    jsonb DEFAULT '{}',
  prompt_args   jsonb DEFAULT '{}',
  distance_type TEXT DEFAULT 'L2'
) RETURNS text
  • embed_model (text): Embedding model for vector search. Default: text-embedding-v3.
  • prompt_model (text): LLM for generating the answer. Default: qwen-plus.
  • question (text): Question to answer. Required.
  • source_schema (text): Schema of the table to search. Required.
  • source_table (text): Table to search. Required.
  • chunk_col (text): Column containing the text chunks. Required.
  • vector_col (text): Column containing the stored vectors. Required.
  • topn (int): Number of chunks to retrieve. Default: 10.
  • embed_args (jsonb): Extra parameters for the embedding model. Default: {}.
  • prompt_args (jsonb): Extra parameters for the prompt model. Default: {}.
  • distance_type (text): Distance algorithm: L1, L2, cosine, or negative. Default: L2.

rds_ai.show_models

Lists all configured models.

rds_ai.show_models() RETURNS setof rds_ai.model_list

rds_ai.add_model

Adds a model configuration.

rds_ai.add_model(
  model_name       TEXT,
  request_type     TEXT,
  request_header   http.http_header[],
  uri              TEXT,
  content_type     TEXT,
  content_template TEXT,
  json_path        TEXT
) RETURNS TEXT
  • model_name (TEXT): Name to identify the model. Example: 'text-embedding-v3'
  • request_type (TEXT): HTTP method. Example: 'POST'
  • request_header (http.http_header[]): Request headers. Use %s as a placeholder for the token. Example: ARRAY[('Authorization', 'Bearer %s')]
  • uri (TEXT): Model endpoint URL. Example: 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding'
  • content_type (TEXT): Request content type. Example: 'application/json'
  • content_template (TEXT): Request body template. Use %s for values filled at call time. Example: '{ "model": "text-embedding-v3", "input": {"texts": ["%s"]}, "parameters": %s }'
  • json_path (TEXT): SQL expression to extract the result from the HTTP response. Use 'SELECT %L' if the exact path is unknown. Example: 'SELECT %L'

rds_ai.update_model

Updates a field in an existing model configuration.

rds_ai.update_model(
  model_name  TEXT,
  config_name TEXT,
  value       TEXT
) RETURNS void
  • model_name (TEXT): Name of the model to update. Example: 'text-embedding-v3'
  • config_name (TEXT): Field to update. See rds_ai.add_model for field names. Example: 'uri'
  • value (TEXT): New value for the field. Example: 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding'

rds_ai.del_model

Deletes a model configuration.

rds_ai.del_model(model_name text) RETURNS void
  • model_name (text): Name of the model to delete. Example: 'test-model'

rds_ai.invoke_model

Calls a model and returns the result parsed by the json_path expression in the model configuration.

rds_ai.invoke_model(
  model_name  TEXT,
  params_list TEXT[]
) RETURNS TEXT

  • model_name (text): Model to call. Example: 'qwen-plus'
  • params_list (text[]): Values to fill into content_template, in order. Example: ARRAY['who are you', '{}']

rds_ai.raw_invoke_model

Calls a model and returns the complete HTTP response.

rds_ai.raw_invoke_model(
  model_name  TEXT,
  params_list TEXT[]
) RETURNS http.http_response

  • model_name (text): Model to call. Example: 'qwen-plus'
  • params_list (text[]): Values to fill into content_template, in order. Example: ARRAY['who are you', '{}']

What's next