ApsaraDB RDS for PostgreSQL provides the rds_ai AI extension. This extension integrates advanced models from Alibaba Cloud Model Studio, such as Qwen and text embedding models. You can use this extension to implement various scenarios in your ApsaraDB RDS for PostgreSQL database, including large language model (LLM) Q&A, text-to-vector conversion, top-N similar vector retrieval, and retrieval-augmented generation (RAG) Q&A. In addition, rds_ai supports custom models, which let you add the models that you need to implement a wide range of AI applications in ApsaraDB RDS for PostgreSQL.
Prerequisites
Your instance meets the following version requirements:

| Major version | Minor engine version |
| --- | --- |
| PostgreSQL 16 | 20241230 or later |
| PostgreSQL 14, 15, 17 | 20250430 or later |
| PostgreSQL 18 | 20251130 or later |
For instructions on minor engine version upgrades, see Upgrade a minor engine version.
You have created a privileged account for ApsaraDB RDS for PostgreSQL. For more information, see Create an account.
The models used in this topic are provided by Alibaba Cloud Model Studio. You must activate Model Studio and obtain an API key. For more information, see Obtain an API key.
Network configuration
By default, an ApsaraDB RDS for PostgreSQL database cannot access external networks. To allow access to external models, you can configure a NAT Gateway for the VPC where your ApsaraDB RDS for PostgreSQL instance resides. For more information about NAT gateways, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
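After the rds_ai extension is installed (it pulls in pgsql-http, which it uses for outbound calls), a quick way to confirm that the NAT Gateway configuration works is to issue a test HTTP request from the database. This is a sketch, assuming the pgsql-http functions live in the http schema as elsewhere in this topic; a returned HTTP status code means the instance can reach the endpoint:

```sql
-- Test outbound connectivity from the instance; returns an HTTP status
-- code if the Model Studio endpoint is reachable through the NAT Gateway.
SELECT status FROM http.http_get('https://dashscope-intl.aliyuncs.com');
```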
Create and delete the extension
Before you install the extension, confirm the major and minor engine versions of your ApsaraDB RDS for PostgreSQL instance to ensure that the extension is supported. For more information, see Prerequisites.
Install the plugin on the Plugin Management page.
Go to the RDS Instances page, select a region in the top navigation bar, and then click the ID of the target instance.
In the navigation pane on the left, click Plug-ins.
On the Plugin Marketplace page, click Install for the rds_ai plugin.
In the window that appears, select the target database and account. Then, click Install to install the extension in the selected database.
(Optional) To uninstall the plugin, go to the Installed Plugins tab of the Plug-ins page.
Install the extension using an SQL command
Create the extension
```sql
CREATE EXTENSION IF NOT EXISTS rds_ai CASCADE;
```

Note: Only privileged accounts can run this command. For more information about how to create a privileged account, see Create an account.

When you create the rds_ai extension, the pgvector and pgsql-http extensions are also created. You can execute the following statement to view the installed extensions:

```sql
SELECT * FROM pg_extension;
```
Delete the extension

```sql
DROP EXTENSION rds_ai;
```
Default models
The rds_ai extension supports the following default models. You can also add custom models as needed.
| API type | Parameter | Parameter type | Default model |
| --- | --- | --- | --- |
| Prompt API | rds_ai.default_prompt_model | enum | qwen-plus |
| Embed API | rds_ai.default_embed_model | enum | text-embedding-v3 |
To change the default model, you must add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure parameters, see Set instance parameters. For example, change the Running Value to 'pg_stat_statements,auto_explain,rds_ai'.
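Because these are ordinary PostgreSQL configuration parameters (they are set with SET later in this topic), you can inspect the current defaults with SHOW:

```sql
-- Check which models rds_ai currently uses by default.
SHOW rds_ai.default_prompt_model;
SHOW rds_ai.default_embed_model;
```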
The default models of the rds_ai extension provide the capabilities described in the following sections.
Basic settings
Before you use rds_ai to call an LLM, you must configure the corresponding API key.
```sql
-- Set the API key for the target model.
SELECT rds_ai.update_model('qwen-plus', 'token', 'sk-****');
-- Set the API key for all models in rds_ai.model_list.
SELECT rds_ai.update_model(model_name, 'token', 'sk-****') FROM rds_ai.model_list;
```

Run the following commands to configure the URLs of the default models:

```sql
-- qwen-plus model
SELECT rds_ai.update_model('qwen-plus', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-max model
SELECT rds_ai.update_model('qwen-max', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-turbo model
SELECT rds_ai.update_model('qwen-turbo', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- text-embedding-v3 model
SELECT rds_ai.update_model('text-embedding-v3', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding');
```

The rds_ai extension uses the pgsql-http extension to make remote calls to models. You can configure the following timeout settings to interrupt long-running calls.

Note: The following timeout settings are at the session level. If you create a new connection, you must configure them again.

```sql
-- Set the request timeout in milliseconds.
SET http.timeout_msec TO 200000;
SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
-- Set the connection timeout in milliseconds.
SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');
```
LLM-based Q&A
The default LLM-based Q&A model for the rds_ai extension is Qwen-Text Generation.
(Optional) Set the default large language model for Q&A. If this is not configured, the system uses the qwen-plus model by default.

Note: To change the default model, you must add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure parameters, see Set instance parameters. For example, change the Running Value to 'pg_stat_statements,auto_explain,rds_ai'.

```sql
SET rds_ai.default_prompt_model TO 'qwen-max';
```

Use rds_ai.prompt to call the default LLM for Q&A. For example:

```sql
SELECT rds_ai.prompt('Give me a recipe using carrots, potatoes, and eggplants.');
```

Complete API call:

```sql
SELECT rds_ai.prompt(
    model_name=>'qwen-plus',  -- model name
    content=>'Give me a recipe using carrots, potatoes, and eggplants.',  -- question
    args=>'{"top_p": 0.7}'::jsonb  -- parameters for calling the large model
);
```
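Because rds_ai.prompt is a regular SQL function, it can also be applied per row, for example to fill an answer column for a table of questions. This is a minimal sketch with a hypothetical qa_demo table that is not part of this topic; note that each row triggers one model call:

```sql
-- Hypothetical demo table; answers are generated by the default LLM.
CREATE TABLE qa_demo(question text, answer text);
INSERT INTO qa_demo(question) VALUES ('What is PostgreSQL?');
UPDATE qa_demo SET answer = rds_ai.prompt(question);
SELECT question, answer FROM qa_demo;
```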
Text-to-vector conversion
The default text-to-vector conversion model for the rds_ai extension is text-embedding-v3.
By default, this operation outputs a dense vector.
```sql
SELECT rds_ai.embed(
    'A strong wind blows from a high sky, and monkeys cry sadly. The water is clear and the sand is white, and birds are flying back. The leaves of the boundless forest fall, and the endless Yangtze River rolls on.'  -- text to be converted to a vector
);
```

You can also specify that a sparse vector is output:

```sql
-- Convert text to a sparse vector, with 250002 dimensions by default.
SELECT rds_ai.embed(
    content=>'A strong wind blows from a high sky, and monkeys cry sadly. The water is clear and the sand is white, and birds are flying back. The leaves of the boundless forest fall, and the endless Yangtze River rolls on.',
    args=>'{"output_type": "sparse"}'::jsonb  -- parameter list
);
```
Store the vector conversion result in the target table.
Create the test_embed table.
```sql
CREATE TABLE test_embed(a text, b vector(1024), c sparsevec(250002));
```

Insert the text content to be converted:

```sql
INSERT INTO test_embed (a) VALUES ('hello world');
```

Use rds_ai.embed to call the text-embedding-v3 model and write the conversion result to the test_embed table.

Write a dense vector:

```sql
UPDATE test_embed SET b = rds_ai.embed(a, '{"output_type": "dense"}'::jsonb)::vector;
```

Write a sparse vector:

```sql
UPDATE test_embed SET c = rds_ai.embed(a, '{"output_type": "sparse"}'::jsonb)::sparsevec;
```
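Once vectors are stored this way, you can also query them directly with pgvector's distance operators instead of the rds_ai helper functions. This is a sketch using the test_embed table above and pgvector's cosine-distance operator <=>; the query string is only an illustration:

```sql
-- Find the 5 stored texts closest to the embedding of a query string.
SELECT a,
       b <=> rds_ai.embed('hello world', '{"output_type": "dense"}'::jsonb)::vector AS cosine_distance
FROM test_embed
ORDER BY cosine_distance
LIMIT 5;
```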
Top-N similar vector retrieval
The rds_ai extension uses the text-embedding-v3 model by default for similar vector retrieval. For example:
Create a test table.
```sql
-- Create the test table test_rag.
CREATE TABLE test_rag (
    id SERIAL PRIMARY KEY,
    chunk TEXT
);
-- Add an embedding column to store the vectors converted from the chunk.
ALTER TABLE test_rag ADD COLUMN embedding VECTOR(1024);
UPDATE test_rag SET embedding = rds_ai.embed(chunk)::vector;
```

Create a vector index.

Note: When you create a vector index, you must select the vector_cosine_ops type.

```sql
-- Create an HNSW index.
CREATE INDEX ON test_rag USING hnsw (embedding vector_cosine_ops);
-- Create an IVFFlat index.
CREATE INDEX ON test_rag USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```

Use rds_ai.retrieve to call the text-embedding-v3 model to perform vector retrieval in the test table:

```sql
SELECT * FROM rds_ai.retrieve('Why is PostgreSQL considered the most advanced open-source database?', 'public', 'test_rag', 'chunk', 'embedding');
```

Complete API call:

```sql
SELECT * FROM rds_ai.retrieve(
    embed_model=>'text-embedding-v3',  -- The vector model to use.
    question=>'Why is PostgreSQL considered the most advanced open-source database?',
    source_schema=>'public',  -- The schema of the table for the similar vector search.
    source_table=>'test_rag',  -- The table for the similar vector search.
    chunk_col=>'chunk',  -- The chunk column.
    vector_col=>'embedding',  -- The vector column.
    -- The following are default parameters.
    topn=>10,  -- Set top_n.
    embed_args=>'{}'::jsonb,  -- Pass parameters for the text-to-vector model.
    distance_type=>'cosine'  -- The distance algorithm. Supported values are L1, L2, cosine, and negative.
);
```
RAG-based Q&A
The rds_ai extension uses the text-embedding-v3 model and the Qwen-Text Generation model by default for RAG-based Q&A.
```sql
-- Vector model: text-embedding-v3. Text generation model: specified by the rds_ai.default_prompt_model parameter.
SELECT * FROM rds_ai.rag('Why is PostgreSQL considered the most advanced open-source database?', 'public', 'test_rag', 'chunk', 'embedding');
```

Complete API call:

```sql
SELECT * FROM rds_ai.rag(
    embed_model=>'text-embedding-v3',  -- The vector model to use.
    prompt_model=>'qwen-plus',  -- The prompt model to use.
    question=>'Why is PostgreSQL considered the most advanced open-source database?',
    source_schema=>'public',  -- The schema of the table for the similar vector search.
    source_table=>'test_rag',  -- The table for the similar vector search.
    chunk_col=>'chunk',  -- The chunk column.
    vector_col=>'embedding',  -- The vector column.
    -- The following are default parameters.
    topn=>10,
    embed_args=>'{}'::jsonb,  -- Parameters for the text-to-vector model.
    prompt_args=>'{}'::jsonb,  -- Parameters for the large language model prompt.
    distance_type=>'L2'  -- The distance algorithm. Supported values are L1, L2, cosine, and inner product.
);
```

Custom models
If you encounter any issues when you add a custom model, contact us.
You can add a custom model using the rds_ai.model_list metadata table. The fields of the rds_ai.model_list table are as follows:
| Field | Type | Description |
| --- | --- | --- |
| model_name | name | The model name. This field has a PRIMARY KEY constraint. |
| request_type | text | The HTTP method to call. |
| request_header | http.http_header[] | A type provided by the pgsql-http extension. It records the header information of the HTTP request, mainly for authentication. |
| uri | text | The URL of the model. |
| content_type | text | The request type, such as application/json. |
| content_template | text | The template for the request body. It usually contains placeholders that are filled in during the API call. |
| json_path | text | The SQL statement used to parse the http_response and extract the required content. |
| token | text | The key used in the request header. |
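Before you call a model, you can inspect its registered configuration directly in this metadata table. This is a minimal sketch; the column names match the table above:

```sql
-- List each registered model with its endpoint and request template.
SELECT model_name, request_type, uri, content_type, content_template
FROM rds_ai.model_list
ORDER BY model_name;
```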
Add a custom model
```sql
SELECT rds_ai.add_model(
    'test-model',  -- model name
    'POST',  -- HTTP request type
    ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- http_header
    'https://****.com',  -- request URL
    'application/json',  -- request Content-Type
    '{"key":"%s"}',  -- request_body template
    'SELECT %L'  -- parse the HTTP request result
);
```

Use a custom model
Use the rds_ai.raw_invoke_model function to send a request based on the configuration of the qwen-plus model. Use an ARRAY to fill the call template. The function returns the complete HTTP request result.

```sql
SELECT * FROM rds_ai.raw_invoke_model('qwen-plus', ARRAY['who are you']);
```

Use the rds_ai.invoke_model function to make a request based on the configuration of the qwen-plus model. The function extracts the specified fields according to the json_path field in the configuration.

```sql
SELECT * FROM rds_ai.invoke_model('qwen-plus', ARRAY['who are you']);
```
Delete a custom model

```sql
SELECT rds_ai.del_model('test-model');
```

Add a PAI-RAG model
We recommend that you deploy PAI-EAS and the ApsaraDB RDS for PostgreSQL instance in the same VPC. Configure the PAI VPC address in the rds_ai extension. The connection address for PAI-EAS must also use the ApsaraDB RDS for PostgreSQL VPC address. This ensures that all communication occurs within your VPC and does not traverse the public network, which enhances security.
Deploy the RAG service in PAI-EAS. For more information, see Deploy an LLM-based RAG chatbot using PAI-EAS and ApsaraDB RDS for PostgreSQL.
Obtain the call information for the RAG service.
Click the RAG service name to go to the Service Details page.
In the Basic Information section, click View Endpoint Information.
In the Invocation Information dialog box, obtain the service endpoint and Token.
Add the PAI-RAG model.
```sql
SELECT rds_ai.add_model(
    'pai-rag',  -- model name
    'POST',  -- request method
    ARRAY[('Authorization','%s')]::http.http_header[],  -- request header
    'http://rds-pai-rag-demo.****.cn-hangzhou.pai-eas.aliyuncs.com/service/query',  -- request URL
    'application/json',  -- request content format
    '{ "question": "%s" }',  -- request body
    'SELECT (%L::jsonb->''answer'')::text'  -- parsing path
);
```

Note: Replace the model URL with the actual URL and append /service/query. For more information about calling PAI-RAG models, see Call PAI models using APIs.
Configure the token for the PAI-RAG model.
```sql
SELECT rds_ai.update_model('pai-rag', 'token','MTFkYjMwZjgzYzA1YmE2N2YyNWMxM2NkNDVjMjEzNjYxMDAzMzE5****');
```
Verify the new PAI-RAG model. For example:
```sql
SELECT rds_ai.invoke_model('pai-rag', ARRAY['What parameters are related to WAL log accumulation?']);
```

Add a Function Compute (FC) RAG model
Deploy the AgentCraft application in Function Compute. For more information, see Cloud Deployment of AgentCraft.
You have created an agent and a Client Access. For example, when you created the Client Integration as described in Integrate an agent into a DingTalk robot, you obtained and saved the endpoint and token.
Add the FC Function Compute RAG model.
```sql
SELECT rds_ai.add_model(
    'fc-rag',  -- model name
    'POST',  -- request method
    ARRAY[('Authorization','Bearer %s')]::http.http_header[],  -- request header
    'https://agentcrckend-de-obnhjsknam.cn-hangzhou.fcapp.run/v1/chat/completions',  -- request URL
    'application/json',  -- request content format
    '{ "messages":[ { "role": "user", "content": "%s" } ], "stream": false, "max_tokens": 1024 }',  -- request body
    'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text'  -- parsing path
);
```

Note: Replace the model URL with the actual URL and append /completions.

Configure the token for the Function Compute RAG model:

```sql
SELECT rds_ai.update_model('fc-rag', 'token','8UiGAziWgYGPxM3qR5sAChBfDJRt****');
```
Verify the new Function Compute RAG model. For example:

```sql
SELECT rds_ai.invoke_model('fc-rag', ARRAY['What parameters are related to WAL log accumulation?']);
```