
ApsaraDB RDS:Use the AI capabilities provided by the rds_ai extension

Last Updated:Jun 17, 2025

ApsaraDB RDS for PostgreSQL supports the rds_ai extension, which integrates advanced models from Alibaba Cloud Model Studio, including the Qwen series and text embedding models. The extension enables a wide range of applications, such as LLM-based Q&A services, text-to-vector conversion, top-N similar vector retrieval, and Q&A services based on retrieval-augmented generation (RAG). You can also deploy custom models through the rds_ai extension to build AI-driven applications in RDS for PostgreSQL. This topic describes how to create this extension and use its AI capabilities.

Prerequisites

  • The major and minor engine versions of your RDS instance meet the requirements in the following table:

    Major engine version | Minor engine version
    PostgreSQL 16 | 20241230 or later
    PostgreSQL 14, PostgreSQL 15, and PostgreSQL 17 | 20250430 or later

    For more information about how to update the minor engine version, see Update the minor engine version.

  • A privileged account is created for your RDS instance. For more information about how to create a privileged account, see Create an account.

  • Alibaba Cloud Model Studio is activated and an API key is available. For more information, see Obtain an API key.

Network configurations

By default, an RDS for PostgreSQL instance can access resources only over the internal network. Therefore, you must create a NAT gateway for the virtual private cloud (VPC) in which the RDS instance resides to allow the RDS instance to access external models. For more information about NAT gateways, see Use the SNAT feature of an Internet NAT gateway to access the Internet.

Create and delete rds_ai

Important

Before creating the rds_ai extension, make sure that your RDS for PostgreSQL instance meets the requirements described in Prerequisites.

  • Create and delete the rds_ai extension on the Plug-in page

    1. Log on to the ApsaraDB RDS console and go to the Instances page. In the top navigation bar, select the region in which the RDS instance resides. Then, find the RDS instance and click its ID.

    2. In the left-side navigation pane, click Plug-ins.

    3. On the Extension Marketplace tab, click Install in the rds_ai section.

    4. In the dialog box that appears, select a database and an account. Then, click Install to create the rds_ai extension in the selected database.

    (Optional) If you no longer need the rds_ai extension, you can delete it in the Installed Extensions list on the Extension Management tab.

  • Create and delete the rds_ai extension by executing a SQL statement

    • Create the extension

      CREATE EXTENSION IF NOT EXISTS rds_ai CASCADE;
      Note
      • Only privileged accounts have the permissions to execute the preceding statement. For more information about how to create a privileged account, see Create an account.

      • When you create the rds_ai extension, the pgvector and pgsql-http extensions are automatically created.

      • You can execute the SELECT * FROM pg_extension; statement to view the created extensions.

    • Delete the extension

      DROP EXTENSION rds_ai;

Use default models to implement AI capabilities

The rds_ai extension supports the default models described in the following table. You can also create custom models based on your requirements.

Interface type | Parameter | Data type | Default model
Prompt API | rds_ai.default_prompt_model | enum | Qwen-Plus (default value), Qwen-Max, or Qwen-Turbo
Embedding API | rds_ai.default_embed_model | enum | text-embedding-v3

Note

To change the default model of the extension, add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure instance parameters, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.

For example, you can set the Running Value of the shared_preload_libraries parameter to 'pg_stat_statements,auto_explain,rds_ai'.

The default models of the rds_ai extension provide the following AI capabilities:

Basic settings

  1. Before you use rds_ai to invoke a large language model (LLM), configure an API key.

    -- Configure an API key for the specified model.
    SELECT rds_ai.update_model('qwen-plus', 'token', 'sk-****'); 
    
    -- Configure an API key for all models specified by rds_ai.model_list.
    SELECT rds_ai.update_model(model_name,'token','sk-****') FROM rds_ai.model_list;
  2. Execute the following statements to configure the URLs for default models:

    --qwen-plus model
    SELECT rds_ai.update_model('qwen-plus', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation'); 
    
    --qwen-max model
    SELECT rds_ai.update_model('qwen-max', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation'); 
    
    --qwen-turbo model
    SELECT rds_ai.update_model('qwen-turbo', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation'); 
    
    --text-embedding-v3 model
    SELECT rds_ai.update_model('text-embedding-v3', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding'); 
  3. The rds_ai extension uses the pgsql-http extension to remotely invoke models. Execute the following statements to configure timeout settings that interrupt long-running invocations.

    Note

    The following timeout settings are available only in the current session. You must configure them again when you establish a new connection.

    -- Configure the request timeout period in milliseconds. 
    SET http.timeout_msec TO 200000;
    
    SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
    
    -- Configure the connection timeout period.
    SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');

LLM-based Q&A

By default, the rds_ai extension uses the text generation model provided by Qwen to implement LLM-based Q&A.

  1. (Optional) Configure the default model used to implement LLM-based Q&A. By default, the Qwen-Plus model is used.

    Note

    To change the default model of the extension, add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure instance parameters, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.

    For example, you can set the Running Value of the shared_preload_libraries parameter to 'pg_stat_statements,auto_explain,rds_ai'.

    SET rds_ai.default_prompt_model TO 'qwen-max';
  2. Use the rds_ai.prompt function to invoke the default model for Q&A services.

    Sample command:

    SELECT rds_ai.prompt('Give me a recipe for a dish using potatoes, eggplants, and radishes. ');

    Complete command:

    SELECT rds_ai.prompt(
      model_name=>'qwen-plus', -- Specify the LLM used for the Q&A service.
      content=>'Give me a recipe for a dish using potatoes, eggplants, and radishes. ',  -- Specify the question.
      args=>'{"top_p": 0.7}'::jsonb -- Specify the parameters required for invoking the model.
    );

Text-to-vector conversion

By default, the rds_ai extension uses the text-embedding-v3 model to convert texts to vectors.

  1. Convert the input text to vectors.

    • You can invoke the rds_ai.embed function to convert input texts to vectors without specifying parameters. By default, texts are converted to dense vectors.

      SELECT rds_ai.embed(
          'The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again' -- Specify the texts that you want to convert.
      );
    • You can also specify parameters to convert input texts to sparse vectors.

      -- Convert the input text to sparse vectors that have 250,002 dimensions by default.
      SELECT rds_ai.embed(
          content=>'The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again',
          args=>'{"output_type": "sparse"}'::jsonb -- Specify parameters for the conversion.
      );
  2. Save the vectors to the specified table.

    1. Create a table to save the vectors. In this example, a table named test_embed is created.

      CREATE TABLE test_embed(a text, b vector(1024), c sparsevec(250002));
    2. Insert the text that you want to convert into the table.

      INSERT INTO test_embed (a) values ('hello world');
    3. Use the rds_ai.embed function to invoke the text-embedding-v3 model for text-to-vector conversion. Then, write the vectors to the test_embed table.

      • Write dense vectors to the table.

        UPDATE test_embed 
        SET b = rds_ai.embed(a, '{"output_type": "dense"}'::jsonb)::vector;
      • Write sparse vectors to the table.

        UPDATE test_embed 
        SET c = rds_ai.embed(a, '{"output_type": "sparse"}'::jsonb)::sparsevec;
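To make the two output formats concrete, the following standalone Python sketch mimics the text representations of the pgvector vector and sparsevec column types used above (this is an illustration only; the embedding values are made up, and real values come from the text-embedding-v3 model):

```python
# Illustrative only: mimics the pgvector text formats for the vector and
# sparsevec types. Embedding values are invented for demonstration.

def to_dense_literal(values):
    """Serialize a dense embedding the way the pgvector vector type prints it."""
    return "[" + ",".join(str(v) for v in values) + "]"

def to_sparse_literal(index_value_pairs, dim):
    """Serialize a sparse embedding as the sparsevec text format:
    {1-based-index:value,...}/dimensions."""
    inner = ",".join(f"{i}:{v}" for i, v in sorted(index_value_pairs.items()))
    return "{" + inner + "}/" + str(dim)

print(to_dense_literal([0.1, 0.0, 0.25]))            # [0.1,0.0,0.25]
print(to_sparse_literal({3: 0.25, 1: 0.1}, 250002))  # {1:0.1,3:0.25}/250002
```

A dense embedding stores every dimension; a sparse one stores only the non-zero dimensions together with the total dimension count, which is why the sparse format suits the 250,002-dimension output.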

Top-N similar vector retrieval

By default, the rds_ai extension uses the text-embedding-v3 model to retrieve similar vectors.

  1. Create a test table and write vector data to the table.

    -- Create a test table named test_rag.
    CREATE TABLE test_rag (
        id SERIAL PRIMARY KEY,
        chunk TEXT
    );
    
    -- Insert the text chunks that you want to retrieve. Replace the sample text with your own data.
    INSERT INTO test_rag (chunk) VALUES ('PostgreSQL is the most advanced open source database');
    
    -- Add the embedding column to store the vectors converted from the text chunks.
    ALTER TABLE test_rag ADD COLUMN embedding VECTOR(1024);
    
    UPDATE test_rag SET embedding=rds_ai.embed(chunk)::vector;
  2. Create a vector index for embeddings.

    Note

    You must specify the vector_cosine_ops class for the vector index that you want to create.

    -- Create a vector index based on the HNSW algorithm.
    CREATE INDEX ON test_rag USING hnsw (embedding vector_cosine_ops);
    -- Create a vector index based on the IVFFlat algorithm. 
    CREATE INDEX ON test_rag USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
  3. Use the rds_ai.retrieve function to invoke the text-embedding-v3 model to retrieve vectors in the test table.

    Sample command:

    SELECT * FROM rds_ai.retrieve
    ('Why is PostgreSQL the most advanced open source database',
    'public', 'test_rag', 'chunk', 'embedding');

    Complete command:

    SELECT * FROM rds_ai.retrieve
    (
      embed_model=>'text-embedding-v3', -- Specify the model used for vector retrieval.
      question=>'Why is PostgreSQL the most advanced open source database',
      source_schema=>'public', -- Specify the schema of the table that stores the vectors to be retrieved.
      source_table=>'test_rag', -- Specify the table that stores the vectors to be retrieved.
      chunk_col=>'chunk', -- Specify the column that stores the chunk text.
      vector_col=>'embedding', -- Specify the column that stores the vectors.
      -- Specify the following default parameters.
      topn=>10, -- Specify the top N value for vector retrieval.
      embed_args=>'{}'::jsonb, -- Specify the parameter that is required by the model.
      distance_type=>'cosine' -- Specify the algorithm used to calculate the distance between vectors. Valid values: L1, L2, cosine, and negative.
    );
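Conceptually, rds_ai.retrieve embeds the question and then ranks the stored vectors by distance. The following self-contained Python sketch illustrates only the ranking step with cosine distance (my own illustration, not the extension's implementation; the sample chunks and vectors are invented):

```python
import math

def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity, the metric behind vector_cosine_ops."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def retrieve_topn(query_vec, rows, topn=10):
    """Rank (chunk, vector) rows by cosine distance to the query, ascending."""
    ranked = sorted(rows, key=lambda r: cosine_distance(query_vec, r[1]))
    return [(chunk, cosine_distance(query_vec, vec)) for chunk, vec in ranked[:topn]]

# Invented 2-dimensional vectors; real embeddings have 1,024 dimensions.
rows = [("about PostgreSQL", [1.0, 0.0]),
        ("about cooking",    [0.0, 1.0]),
        ("about databases",  [0.9, 0.1])]
print(retrieve_topn([1.0, 0.0], rows, topn=2))
```

The chunk whose vector points closest to the query vector ranks first; in the real function this ranking is accelerated by the HNSW or IVFFlat index created in the previous step.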

RAG-based Q&A

By default, the rds_ai extension uses the text-embedding-v3 model and the text generation model provided by Qwen to implement RAG-based Q&A services.

Sample command:

-- rds_ai uses the text-embedding-v3 model for vector-related operations. The text generation model used by rds_ai is specified by the rds_ai.default_prompt_model parameter.
SELECT * FROM rds_ai.rag
('Why is PostgreSQL the most advanced open source database', 
'public', 'test_rag', 'chunk', 'embedding');

Complete command:

SELECT * FROM rds_ai.rag
(
  embed_model=>'text-embedding-v3', -- Specify the vector model used for RAG-based Q&A services.
  prompt_model=>'qwen-plus', -- Specify the prompt model used for RAG-based Q&A.
  question=>'Why is PostgreSQL the most advanced open source database',
  source_schema=>'public', -- Specify the schema of the table that stores the vectors to be retrieved.
  source_table=>'test_rag', -- Specify the table that stores the vectors to be retrieved.
  chunk_col=>'chunk', -- Specify the column that stores the chunk text.
  vector_col=>'embedding', -- Specify the column that stores the vectors.
  -- Specify the following default parameters.
  topn=>10,
  embed_args=>'{}'::jsonb,  -- Specify the parameter that is required by the vector model used for RAG-based Q&A.
  prompt_args=>'{}'::jsonb, -- Specify the parameter that is required by the prompt model used for RAG-based Q&A.
  distance_type=>'L2' -- Specify the algorithm used to calculate the distance between vectors. Valid values: L1, L2, cosine, and negative.
);
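Under the hood, a RAG pipeline chains the two previous capabilities: retrieve the top-N chunks, then send them to the prompt model as context. The following Python sketch shows that data flow with stub functions standing in for rds_ai.retrieve and rds_ai.prompt (the prompt wording is my own; the extension's internal prompt template is not documented here):

```python
def build_rag_prompt(question, chunks):
    """Combine retrieved chunks with the user question, as a RAG pipeline does.
    The exact template that rds_ai uses internally may differ."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def rag(question, retrieve_fn, prompt_fn, topn=10):
    chunks = retrieve_fn(question, topn)                  # stands in for rds_ai.retrieve
    return prompt_fn(build_rag_prompt(question, chunks))  # stands in for rds_ai.prompt

# Stub implementations to show the flow end to end; a real prompt_fn would
# invoke the LLM instead of echoing the prompt back.
answer = rag(
    "Why is PostgreSQL the most advanced open source database",
    retrieve_fn=lambda q, n: ["PostgreSQL supports extensions.",
                              "PostgreSQL is ACID-compliant."],
    prompt_fn=lambda p: p,
)
print(answer)
```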

Use custom models to implement AI capabilities

Important

Contact us if you have problems when you add custom models.

You can add a custom model by inserting a record for the model into the metadata table rds_ai.model_list. The following table describes the fields of the rds_ai.model_list table:

Field | Data type | Description
model_name | name | The name of the model. This field is the primary key of the rds_ai.model_list table.
request_type | text | The method of the HTTP requests sent to invoke the model.
request_header | http.http_header[] | The HTTP request headers that include authentication information. The data type of this field is defined by the pgsql-http extension.
uri | text | The URL of the model.
content_type | text | The content type of the request. Example: application/json.
content_template | text | The template of the request body. The placeholders in the template are replaced with actual values when the model is invoked.
json_path | text | The SQL statement used to resolve the response and obtain the required information.
token | text | The token specified in the request headers.
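To picture the record that a custom model occupies in the metadata table, the following Python structure mirrors the fields above (an illustration only, not the real server-side type; the sample endpoint and values are hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class ModelRecord:
    """Mirrors one row of rds_ai.model_list (illustrative, not the real type)."""
    model_name: str                       # primary key
    request_type: str = "POST"            # HTTP request method
    request_header: list = field(default_factory=list)  # e.g. [("Authorization", "Bearer %s")]
    uri: str = ""                         # model endpoint URL
    content_type: str = "application/json"
    content_template: str = "{}"          # request body with %s placeholders
    json_path: str = "SELECT %L"          # SQL that resolves the response
    token: str = ""                       # fills the header placeholder

# Hypothetical record, matching the shape of the add_model example below.
demo = ModelRecord(
    model_name="test-model",
    request_header=[("Authorization", "Bearer %s")],
    uri="https://example.invalid",        # placeholder endpoint
    content_template='{"key":"%s"}',
)
print(demo.model_name, demo.request_type)
```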

Basic operations

  • Add a custom model

    SELECT rds_ai.add_model(
        'test-model',                                              -- The model name.
        'POST',                                                    -- The HTTP request method.
        ARRAY[('Authorization', 'Bearer %s')]::http.http_header[], -- The request header.
        'https://****.com',                                        -- The model URL specified in the request.
        'application/json',                                        -- The content type of the request.
        '{"key":"%s"}',                                            -- The template of the request body.
        'SELECT %L'                                                -- The SQL statement used to resolve the response.
    );
  • Use a custom model

    • The following command shows how to use the rds_ai.raw_invoke_model function to invoke Qwen-Plus based on its settings in the metadata table. In the request, an array is specified to fulfill the template. The complete response is returned.

      SELECT * FROM
      rds_ai.raw_invoke_model('qwen-plus', ARRAY['who are you']);
    • The following command shows how to use the rds_ai.invoke_model function to invoke Qwen-Plus based on its settings in the metadata table. Specific fields are resolved from the response and returned based on the json_path field specified for Qwen-Plus.

      SELECT * FROM
      rds_ai.invoke_model('qwen-plus', ARRAY['who are you']);
  • Delete a custom model

    SELECT rds_ai.del_model('test-model');
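The params_list array that you pass to rds_ai.invoke_model fills the %s placeholders in the model's content_template in order. The following Python sketch reproduces that substitution (an illustration assuming simple positional %s replacement, which matches the templates shown in this topic):

```python
import json

# Template of the request body, as stored in the content_template field.
template = '{"model": "text-embedding-v3", "input": {"texts": ["%s"]}, "parameters": %s}'

# Same shape as ARRAY['who are you', '{}'] in the SQL examples.
params_list = ["who are you", "{}"]

# Python's % operator performs the same positional substitution.
body = template % tuple(params_list)
print(body)

# The filled template must be valid JSON for a JSON API.
json.loads(body)
```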

Add a RAG-based PAI model

To add a RAG-based PAI model, we recommend that you deploy your RDS for PostgreSQL instance and the Elastic Algorithm Service (EAS) module of Platform for AI (PAI) in the same VPC, and configure the VPC endpoint of PAI in the settings of rds_ai. In addition, you must use the VPC endpoint of the RDS for PostgreSQL instance to connect to the EAS module of PAI. This way, different services communicate only within the VPC to ensure data security.

  1. Deploy a RAG-based service in the EAS module of PAI. For more information, see Use the EAS module of PAI and ApsaraDB RDS for PostgreSQL to deploy a RAG-based LLM chatbot.

  2. Obtain the invocation information of the RAG-based service.

    1. Click the name of the RAG-based service to go to the Service Details page.

    2. In the Basic Information section, click View Endpoint Information.

    3. In the Invocation Method dialog box, obtain the endpoint and token of the service.

  3. Add the RAG-based PAI model.

    SELECT rds_ai.add_model(
        'pai-rag',      -- The model name.
        'POST',         -- The HTTP request method.
        ARRAY[('Authorization','%s')]::http.http_header[],  -- The request header.
        'http://rds-pai-rag-demo.****.cn-hangzhou.pai-eas.aliyuncs.com/service/query',  -- The model URL specified in the request.
        'application/json',  -- The content type of the request.
        '{
            "question": "%s"
        }',  -- The template of the request body.
        'SELECT (%L::jsonb->''answer'')::text'  -- The SQL statement used to resolve the response.
    );
  4. Configure the token of the RAG-based PAI model.

    SELECT rds_ai.update_model('pai-rag', 'token','MTFkYjMwZjgzYzA1YmE2N2YyNWMxM2NkNDVjMjEzNjYxMDAzMzE5****');
  5. (Optional) Test the RAG-based PAI model.

    Sample command

    SELECT rds_ai.invoke_model('pai-rag', ARRAY['Which parameters may result in the accumulation of WAL logs']);

Add a RAG-based Function Compute model

  1. Deploy an AgentCraft application in Function Compute.

  2. Create an agent and configure client access information for the agent.

  3. Add a RAG-based Function Compute model.

    SELECT rds_ai.add_model(
        'fc-rag',      -- The model name.
        'POST',         -- The HTTP request method.
        ARRAY[('Authorization','Bearer %s')]::http.http_header[],  -- The request header.
        'https://agentcrckend-de-obnhjsknam.cn-hangzhou.fcapp.run/v1/chat/completions',  -- The model URL specified in the request.
        'application/json',  -- The content type of the request.
        '{ 
            "messages":[ { "role": "user", "content": "%s" } ], 
            "stream": false, 
            "max_tokens": 1024 
         }', 
        'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text'  -- The SQL statement used to resolve the response.
    );
    Note

    Replace the URL in the preceding example with the actual model URL appended with /completions.

  4. Configure the token of the RAG-based Function Compute model.

    SELECT rds_ai.update_model('fc-rag', 'token','8UiGAziWgYGPxM3qR5sAChBfDJRt****');
  5. (Optional) Test the RAG-based Function Compute model.

    Sample command:

    SELECT rds_ai.invoke_model('fc-rag', ARRAY['Which parameters may result in the accumulation of WAL logs']);

Functions provided by rds_ai

rds_ai.raw_invoke_model: invokes custom models and returns the complete HTTP response.

rds_ai.raw_invoke_model(
    model_name TEXT, 
    params_list TEXT[]
) 
RETURNS http.http_response

Parameter | Data type | Description | Example
model_name | text | The name of the model. | 'qwen-plus'
params_list | text[] | A sequential list of the parameters that fill the content_template field of the model to invoke. | ARRAY['who are you', '{}']

rds_ai.invoke_model: invokes custom models and returns the result that is resolved from the HTTP response by using the SQL statement specified in the json_path field.

rds_ai.invoke_model(
    model_name TEXT, 
    params_list TEXT[]
) 
RETURNS http.http_response

Parameter | Data type | Description | Example
model_name | text | The name of the model. | 'qwen-plus'
params_list | text[] | A sequential list of the parameters that fill the content_template field of the model to invoke. | ARRAY['who are you', '{}']

rds_ai.embed: converts the specified text into vectors by using the specified model and returns the result that is resolved from the HTTP response by using the SQL statement specified for the json_path field.

rds_ai.embed(
    model_name TEXT, 
    content TEXT, 
    args jsonb DEFAULT '{}' 
) RETURNS text

Parameter | Data type | Description | Example
model_name | text | The name of the model. If this parameter is not specified, text-embedding-v3 is invoked. | 'text-embedding-v3'
content | text | The text that you want to convert to vectors. | 'who are you'
args | jsonb | The parameters that are required to invoke the model. | '{"output_type": "dense"}'

rds_ai.prompt: invokes the specified model to perform a prompt-based operation.

rds_ai.prompt(
    model_name TEXT, 
    content TEXT, 
    args jsonb DEFAULT '{}'
) 
RETURNS text

Parameter | Data type | Description | Example
model_name | text | The name of the model. If this parameter is not specified, Qwen-Plus is invoked. | 'qwen-plus'
content | text | The prompt text that you send to the model. | 'who are you'
args | jsonb | The parameters that are required to invoke the model. | '{"top_p": 0.2}'

rds_ai.retrieve: invokes the specified model to retrieve stored vectors based on the specified text.

rds_ai.retrieve(
    model_name TEXT, 
    question TEXT, 
    source_schema TEXT, 
    source_table TEXT, 
    chunk_col TEXT, 
    vector_col TEXT, 
    topn INT DEFAULT 10, 
    embed_args jsonb DEFAULT '{}', 
    distance_type TEXT DEFAULT 'L2'
) 
RETURNS TABLE(chunk TEXT, distance float);

Parameter | Data type | Description | Example
model_name | text | The name of the model. If this parameter is not specified, text-embedding-v3 is invoked. | 'text-embedding-v3'
question | text | The text based on which you want to retrieve vectors. | 'who are you'
source_schema | text | The schema of the table that stores the vectors to retrieve. | 'public'
source_table | text | The name of the table that stores the vectors to retrieve. | 'vec_tbl'
chunk_col | text | The column that stores the chunk text. | 'chunk'
vector_col | text | The vector column of the table. | 'vec'
topn | int | The number of most similar vectors to retrieve. Default value: 10. | 20
embed_args | jsonb | The parameters that are required for text-to-vector conversion. This parameter plays the same role as the args parameter of the rds_ai.embed function. | '{"output_type": "dense"}'
distance_type | text | The algorithm used to calculate the distance between vectors. Default value: L2. | 'L1'

rds_ai.rag: invokes the specified model to perform a prompt-based operation on the result of vector retrieval.

rds_ai.rag(
    embed_model TEXT,
    prompt_model TEXT,
    question TEXT,
    source_schema TEXT,
    source_table TEXT,
    chunk_col TEXT,
    vector_col TEXT,
    topn INT DEFAULT 10,
    embed_args jsonb DEFAULT '{}',
    prompt_args jsonb DEFAULT '{}',
    distance_type TEXT DEFAULT 'L2'
) 
returns text

Parameter | Data type | Description | Example
embed_model | text | The name of the embedding model. If this parameter is not specified, text-embedding-v3 is invoked. | 'text-embedding-v3'
prompt_model | text | The name of the prompt model. If this parameter is not specified, Qwen-Plus is invoked. | 'qwen-plus'
question | text | The text based on which you want to retrieve vectors. | 'who are you'
source_schema | text | The schema of the table that stores the vectors to retrieve. | 'public'
source_table | text | The name of the table that stores the vectors to retrieve. | 'vec_tbl'
chunk_col | text | The column that stores the chunk text. | 'chunk'
vector_col | text | The vector column of the table. | 'vec'
topn | int | The number of most similar vectors to retrieve. Default value: 10. | 20
embed_args | jsonb | The parameters that are required for text-to-vector conversion. This parameter plays the same role as the args parameter of the rds_ai.embed function. | '{"output_type": "dense"}'
prompt_args | jsonb | The parameters that are required for prompt-based operations. This parameter plays the same role as the args parameter of the rds_ai.prompt function. | '{"top_p": 0.2}'
distance_type | text | The algorithm used to calculate the distance between vectors. Default value: L2. | 'L1'

rds_ai.show_models: returns the metadata of all configured models.

rds_ai.show_models() 
RETURNS setof rds_ai.model_list

rds_ai.del_model: deletes a model.

rds_ai.del_model (model_name text) 
RETURNS void

Parameter | Data type | Description | Example
model_name | text | The name of the model that you want to delete. | 'qwen-plus'

rds_ai.add_model: adds a model.

rds_ai.add_model(
    model_name TEXT,
    request_type TEXT,
    request_header http.http_header[],
    uri TEXT,
    content_type TEXT,
    content_template TEXT,
    json_path TEXT
) 
RETURNS TEXT

Parameter | Data type | Description | Example
model_name | text | The name of the model that you want to add. | 'text-embedding-v3'
request_type | text | The request method. | 'POST'
request_header | http.http_header[] | The headers in the HTTP request. The value of the token field replaces the %s placeholder when the model is invoked. | ARRAY[('Authorization', 'Bearer %s')]
uri | text | The URL of the model. | 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding'
content_type | text | The content type of the request. | 'application/json'
content_template | text | The template of the request body. When you call the rds_ai.invoke_model function, the values in the params_list parameter replace the %s placeholders in the template in order. | '{"model": "text-embedding-v3", "input": {"texts": ["%s"]}, "parameters": %s}'
json_path | text | The SQL statement used to resolve the response. When you call the rds_ai.invoke_model function, this statement extracts the required fields from the JSON response body. If you are not sure how to configure this statement, set it to 'SELECT %L'. | 'SELECT %L'
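The json_path statement receives the raw response body in place of %L and extracts the field you need. As an illustration, the following Python sketch reproduces what 'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text' does to a made-up chat-completion response (the response body is hypothetical):

```python
import json

# Hypothetical response body, shaped like a chat-completion API response.
raw_response = '{"choices": [{"message": {"role": "assistant", "content": "hello"}}]}'

# Equivalent of  ->'choices'->0->'message'->>'content'  in the json_path SQL.
content = json.loads(raw_response)["choices"][0]["message"]["content"]
print(content)  # hello
```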

rds_ai.update_model: updates the metadata of configured models.

rds_ai.update_model(
    model_name TEXT,
    config_name TEXT, 
    value TEXT
) 
RETURNS void

Parameter | Data type | Description | Example
model_name | text | The name of the model for which you want to update metadata. | 'text-embedding-v3'
config_name | text | The field that you want to update. For more information about the available fields, see the parameters of the rds_ai.add_model function. | 'uri'
value | text | The new value of the field that you want to update. | 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding'