ApsaraDB RDS for PostgreSQL introduces support for the rds_ai extension, which integrates the advanced models of Alibaba Cloud Model Studio, including Qwen and the text embedding model. This extension enables a wide range of applications, such as LLM-based Q&A services, text-to-vector conversion, top-N similar vector retrieval, and Q&A services based on retrieval-augmented generation (RAG). In addition, you can deploy custom models through the rds_ai extension to develop various AI-driven applications in RDS for PostgreSQL. This topic describes how to deploy this extension and use its AI capabilities.
Prerequisites
The major and minor engine versions of your RDS instance meet the requirements in the following table:
Major engine version | Minor engine version |
PostgreSQL 16 | 20241230 or later |
PostgreSQL 14, PostgreSQL 15, and PostgreSQL 17 | 20250430 or later |
For more information about how to update the minor engine version, see Update the minor engine version.
A privileged account is created for your RDS instance. For more information about how to create a privileged account, see Create an account.
Alibaba Cloud Model Studio is activated and an API key is available. For more information, see Obtain an API key.
Network configurations
By default, an RDS for PostgreSQL instance can access resources only over the internal network. Therefore, you must create a NAT gateway for the virtual private cloud (VPC) in which the RDS instance resides to allow the RDS instance to access external models. For more information about NAT gateways, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
Create and delete rds_ai
Before creating the rds_ai extension, make sure that your RDS for PostgreSQL instance meets the requirements described in Prerequisites.
Create and delete the rds_ai extension on the Plug-in page
Log on to the ApsaraDB RDS console and go to the Instances page. In the top navigation bar, select the region in which the RDS instance resides. Then, find the RDS instance and click its ID.
In the left-side navigation pane, click Plug-ins.
On the Extension Marketplace tab, click Install in the rds_ai section.
In the dialog box that appears, select a database and an account. Then, click Install to create the rds_ai extension in the selected database.
(Optional) If you no longer need the rds_ai extension, you can delete it on the Installed Extensions page of the Extension Management tab.
Create and delete the rds_ai extension by executing a SQL statement
Create the extension
CREATE EXTENSION IF NOT EXISTS rds_ai CASCADE;
Note: Only privileged accounts have the permissions to execute the preceding statement. For more information about how to create a privileged account, see Create an account.
When you create the rds_ai extension, the pgvector and pgsql-http extensions are automatically created.
You can execute the following statement to view the created extensions:
SELECT * FROM pg_extension;
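For example, the following query checks only the extensions created in this step. The names vector and http are the catalog names under which the pgvector and pgsql-http extensions are registered:

```sql
-- The result should include rds_ai together with its dependencies.
SELECT extname, extversion
FROM pg_extension
WHERE extname IN ('rds_ai', 'vector', 'http');
```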
Delete the extension
DROP EXTENSION rds_ai;
Use default models to implement AI capabilities
The rds_ai extension supports the default models described in the following table. You can also create custom models based on your requirements.
Interface type | Parameter | Data type | Default model |
Prompt API | rds_ai.default_prompt_model | enum | qwen-plus |
Embedding API | rds_ai.default_embed_model | enum | text-embedding-v3 |
To change the default models of the extension, you must first add rds_ai to the Running Value of the shared_preload_libraries parameter. For example, you can set the Running Value of the shared_preload_libraries parameter to 'pg_stat_statements,auto_explain,rds_ai'. For more information about how to configure instance parameters, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.
The default models of the rds_ai extension provide the following AI capabilities:
Basic settings
Before you use rds_ai to invoke a large language model (LLM), configure an API key.
-- Configure an API key for the specified model.
SELECT rds_ai.update_model('qwen-plus', 'token', 'sk-****');
-- Configure an API key for all models listed in rds_ai.model_list.
SELECT rds_ai.update_model(model_name, 'token', 'sk-****') FROM rds_ai.model_list;
Execute the following statements to configure the URLs for default models:
-- qwen-plus model
SELECT rds_ai.update_model('qwen-plus', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-max model
SELECT rds_ai.update_model('qwen-max', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-turbo model
SELECT rds_ai.update_model('qwen-turbo', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- text-embedding-v3 model
SELECT rds_ai.update_model('text-embedding-v3', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding');
The rds_ai extension uses the pgsql-http extension to remotely invoke models. Execute the following statements to configure timeout settings that interrupt long-running invocations.
Note: The following timeout settings are available only in the current session. You must configure them again when you establish a new connection.
-- Configure the request timeout period in milliseconds.
SET http.timeout_msec TO 200000;
SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
-- Configure the connection timeout period in milliseconds.
SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');
LLM-based Q&A
By default, the rds_ai extension uses the text generation model provided by Qwen to implement LLM-based Q&A.
(Optional) Configure the default model used to implement LLM-based Q&A. By default, the Qwen-Plus model is used.
Note: To change the default model, rds_ai must be included in the Running Value of the shared_preload_libraries parameter, for example, 'pg_stat_statements,auto_explain,rds_ai'. For more information about how to configure instance parameters, see Modify the parameters of an ApsaraDB RDS for PostgreSQL instance.
SET rds_ai.default_prompt_model TO 'qwen-max';
Use the rds_ai.prompt function to invoke the default model for Q&A services.
Sample command:
SELECT rds_ai.prompt('Give me a recipe for a dish using potatoes, eggplants, and radishes. ');
Complete command:
SELECT rds_ai.prompt(
    model_name=>'qwen-plus',  -- Specify the LLM used for the Q&A service.
    content=>'Give me a recipe for a dish using potatoes, eggplants, and radishes.',  -- Specify the question.
    args=>'{"top_p": 0.7}'::jsonb  -- Specify the parameters required for invoking the model.
);
Text-to-vector conversion
By default, the rds_ai extension uses the text-embedding-v3 model to convert texts to vectors.
Convert the input text to vectors.
You can invoke the rds_ai.embed function to convert input texts to vectors without specifying parameters. By default, texts are converted to dense vectors.
SELECT rds_ai.embed(
    'The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again'  -- Specify the text that you want to convert.
);
You can also specify parameters to convert input texts to sparse vectors.
-- Convert the input text to a sparse vector that has 250,002 dimensions by default.
SELECT rds_ai.embed(
    content=>'The quality of the clothes is excellent, very beautiful, worth the wait, I like it and will buy here again',
    args=>'{"output_type": "sparse"}'::jsonb  -- Specify parameters for the conversion.
);
Save the vectors to the specified table.
Create a table to save the vectors. In this example, a table named test_embed is created.
CREATE TABLE test_embed(a text, b vector(1024), c sparsevec(250002));
Insert the text that you want to convert into the table.
INSERT INTO test_embed (a) VALUES ('hello world');
Use the rds_ai.embed function to invoke the text-embedding-v3 model for text-to-vector conversion. Then, write the vectors to the test_embed table.
Write dense vectors to the table.
UPDATE test_embed SET b = rds_ai.embed(a, '{"output_type": "dense"}'::jsonb)::vector;
Write sparse vectors to the table.
UPDATE test_embed SET c = rds_ai.embed(a, '{"output_type": "sparse"}'::jsonb)::sparsevec;
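To verify the conversion results, you can query the dimensionality of the stored vectors. The following sketch uses vector_dims, a function provided by the pgvector extension, and assumes the test_embed table from the preceding steps:

```sql
-- Check that the dense column b was populated with 1024-dimension vectors.
SELECT a, vector_dims(b) AS dense_dims
FROM test_embed
WHERE b IS NOT NULL;
```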
Top-N similar vector retrieval
By default, the rds_ai extension uses the text-embedding-v3 model to retrieve similar vectors.
Create a test table.
-- Create a test table named test_rag.
CREATE TABLE test_rag (
    id SERIAL PRIMARY KEY,
    chunk TEXT
);
-- Add the embedding column to store the vectors converted from the text chunks.
ALTER TABLE test_rag ADD COLUMN embedding VECTOR(1024);
UPDATE test_rag SET embedding = rds_ai.embed(chunk)::vector;
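The UPDATE statement above only populates rows that already exist in test_rag. A minimal way to load sample chunks first (the sample text is illustrative):

```sql
-- Insert sample text chunks; run the embedding UPDATE afterwards.
INSERT INTO test_rag (chunk) VALUES
    ('PostgreSQL supports advanced index types such as GiST, GIN, and BRIN.'),
    ('PostgreSQL provides full ACID compliance and a rich extension ecosystem.');
```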
Create a vector index for embeddings.
Note: You must specify the vector_cosine_ops class for the vector index that you want to create.
-- Create a vector index based on the HNSW algorithm.
CREATE INDEX ON test_rag USING hnsw (embedding vector_cosine_ops);
-- Create a vector index based on the IVFFlat algorithm.
CREATE INDEX ON test_rag USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
Use the rds_ai.retrieve function to invoke the text-embedding-v3 model to retrieve vectors in the test table.
Sample command:
SELECT * FROM rds_ai.retrieve ('Why is PostgreSQL the most advanced open source database', 'public', 'test_rag', 'chunk', 'embedding');
Complete command:
SELECT * FROM rds_ai.retrieve(
    embed_model=>'text-embedding-v3',  -- Specify the model used for vector retrieval.
    question=>'Why is PostgreSQL the most advanced open source database',
    source_schema=>'public',  -- Specify the schema of the table that stores the vectors to be retrieved.
    source_table=>'test_rag',  -- Specify the table that stores the vectors to be retrieved.
    chunk_col=>'chunk',  -- Specify the column that stores the chunk text.
    vector_col=>'embedding',  -- Specify the column that stores the vectors.
    -- The following parameters use default values.
    topn=>10,  -- Specify the top N value for vector retrieval.
    embed_args=>'{}'::jsonb,  -- Specify the parameters required by the model.
    distance_type=>'cosine'  -- Specify the algorithm used to calculate the distance between vectors. Valid values: L1, L2, cosine, and negative.
);
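Conceptually, this retrieval resembles an ORDER BY on a pgvector distance operator. The following hand-written equivalent for cosine distance is an illustrative sketch, not the extension's actual implementation:

```sql
-- <=> is pgvector's cosine distance operator; lower values mean higher similarity.
SELECT chunk,
       embedding <=> rds_ai.embed('Why is PostgreSQL the most advanced open source database')::vector AS distance
FROM test_rag
ORDER BY distance
LIMIT 10;
```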
RAG-based Q&A
By default, the rds_ai extension uses the text-embedding-v3 model and the text generation model provided by Qwen to implement RAG-based Q&A services.
Sample command:
-- rds_ai uses the text-embedding-v3 model for vector-related operations. The text generation model used by rds_ai is specified by the rds_ai.default_prompt_model parameter.
SELECT * FROM rds_ai.rag
('Why is PostgreSQL the most advanced open source database',
'public', 'test_rag', 'chunk', 'embedding');
Complete command:
SELECT * FROM rds_ai.rag
(
embed_model=>'text-embedding-v3', -- Specify the vector model used for RAG-based Q&A services.
prompt_model=>'qwen-plus', -- Specify the prompt model used for RAG-based Q&A.
question=>'Why is PostgreSQL the most advanced open source database',
source_schema=>'public', -- Specify the schema of the table that stores the vectors to be retrieved.
source_table=>'test_rag', -- Specify the table that stores the vectors to be retrieved.
chunk_col=>'chunk', -- Specify the column that stores the chunk text.
vector_col=>'embedding', -- Specify the column that stores the vectors.
-- Specify the following default parameters.
topn=>10,
embed_args=>'{}'::jsonb, -- Specify the parameter that is required by the vector model used for RAG-based Q&A.
prompt_args=>'{}'::jsonb, -- Specify the parameter that is required by the prompt model used for RAG-based Q&A.
distance_type=>'L2' -- Specify the algorithm used to calculate the distance between vectors. Valid values: L1, L2, cosine, and inner product.
);
Use custom models to implement AI capabilities
Contact us if you have problems when you add custom models.
You can add a custom model by inserting a record for the model into the metadata table rds_ai.model_list. The following table shows the fields included in the rds_ai.model_list table:
Field | Data type | Description |
model_name | name | The name of the model. This field is the primary key of the rds_ai.model_list table. |
request_type | text | The HTTP request method used to invoke the model. |
request_header | http.http_header[] | The HTTP request headers that include authentication information. The data type of this field is defined by the pgsql-http extension. |
uri | text | The URL of the model. |
content_type | text | The content type of the request. Example: application/json. |
content_template | text | The template of the request body. You must replace the placeholders in the template with actual values when you invoke the model. |
json_path | text | The SQL statement used to resolve the response and obtain required information. |
token | text | The token specified in the headers. |
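The %s placeholders work positionally: based on the examples in this topic, the token fills the placeholder in request_header, and each element of the array passed to the invoke functions fills one placeholder in content_template in order. The following hypothetical model illustrates this (the model name and template are assumptions, not defaults shipped with rds_ai):

```sql
-- Hypothetical custom model whose request-body template contains two
-- placeholders, e.g. content_template = '{"system": "%s", "question": "%s"}'.
-- Each array element fills one %s in order (an assumption based on the
-- examples in this topic).
SELECT * FROM rds_ai.invoke_model('my-model', ARRAY['You are a PostgreSQL DBA.', 'What does VACUUM do?']);
```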
Basic operations
Add a custom model
SELECT rds_ai.add_model(
    'test-model',  -- The model name.
    'POST',  -- The HTTP request method.
    ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- The HTTP request headers.
    'https://****.com',  -- The model URL specified in the request.
    'application/json',  -- The content type of the request.
    '{"key":"%s"}',  -- The template of the request body.
    'SELECT %L'  -- The SQL statement used to resolve the response.
);
Use a custom model
The following command shows how to use the rds_ai.raw_invoke_model function to invoke Qwen-Plus based on its settings in the metadata table. The array elements fill the placeholders in the request template, and the complete response is returned.
SELECT * FROM rds_ai.raw_invoke_model('qwen-plus', ARRAY['who are you']);
The following command shows how to use the rds_ai.invoke_model function to invoke Qwen-Plus based on its settings in the metadata table. Specific fields are resolved from the response and returned based on the json_path field specified for Qwen-Plus.
SELECT * FROM rds_ai.invoke_model('qwen-plus', ARRAY['who are you']);
Delete a custom model
SELECT rds_ai.del_model('test-model');
Add a RAG-based PAI model
To add a RAG-based PAI model, we recommend that you deploy your RDS for PostgreSQL instance and the Elastic Algorithm Service (EAS) module of Platform for AI (PAI) in the same VPC, and configure the VPC endpoint of PAI in the settings of rds_ai. In addition, you must use the VPC endpoint of the RDS for PostgreSQL instance to connect to the EAS module of PAI. This way, different services communicate only within the VPC to ensure data security.
Deploy a RAG-based service in the EAS module of PAI. For more information, see Use the EAS module of PAI and ApsaraDB RDS for PostgreSQL to deploy a RAG-based LLM chatbot.
Obtain the invocation information of the RAG-based service.
Click the name of the RAG-based service to go to the Service Details page.
In the Basic Information section, click View Endpoint Information.
In the Invocation Method dialog box, obtain the endpoint and token of the service.
Add the RAG-based PAI model.
SELECT rds_ai.add_model(
    'pai-rag',  -- The model name.
    'POST',  -- The HTTP request method.
    ARRAY[('Authorization','%s')]::http.http_header[],  -- The request header.
    'http://rds-pai-rag-demo.****.cn-hangzhou.pai-eas.aliyuncs.com/service/query',  -- The model URL specified in the request.
    'application/json',  -- The content type of the request.
    '{ "question": "%s" }',  -- The template of the request body.
    'SELECT (%L::jsonb->''answer'')::text'  -- The SQL statement used to resolve the response.
);
Note: Replace the URL in the preceding example with the actual model URL appended with /service/query.
For more information about how to invoke a RAG-based PAI model, see Call API operations to perform model inference.
Configure the token of the RAG-based PAI model.
SELECT rds_ai.update_model('pai-rag', 'token','MTFkYjMwZjgzYzA1YmE2N2YyNWMxM2NkNDVjMjEzNjYxMDAzMzE5****');
(Optional) Test the RAG-based PAI model.
Sample command:
SELECT rds_ai.invoke_model('pai-rag', ARRAY['Which parameters may result in the accumulation of WAL logs']);
Add a RAG-based Function Compute model
Deploy an AgentCraft application in Function Compute.
Create an agent and configure client access information for the agent.
Add a RAG-based Function Compute model.
SELECT rds_ai.add_model(
    'fc-rag',  -- The model name.
    'POST',  -- The HTTP request method.
    ARRAY[('Authorization','Bearer %s')]::http.http_header[],  -- The request header.
    'https://agentcrckend-de-obnhjsknam.cn-hangzhou.fcapp.run/v1/chat/completions',  -- The model URL specified in the request.
    'application/json',  -- The content type of the request.
    '{ "messages":[ { "role": "user", "content": "%s" } ], "stream": false, "max_tokens": 1024 }',  -- The template of the request body.
    'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text'  -- The SQL statement used to resolve the response.
);
Note: Replace the URL in the preceding example with the actual application URL appended with /completions.
Configure the token of the RAG-based Function Compute model.
SELECT rds_ai.update_model('fc-rag', 'token','8UiGAziWgYGPxM3qR5sAChBfDJRt****');
(Optional) Test the RAG-based Function Compute model.
Sample command:
SELECT rds_ai.invoke_model('fc-rag', ARRAY['Which parameters may result in the accumulation of WAL logs']);