ApsaraDB RDS for PostgreSQL provides the rds_ai AI extension. This extension integrates advanced models from Alibaba Cloud Model Studio, such as Qwen and text embedding models. You can use this extension to implement various scenarios in your ApsaraDB RDS for PostgreSQL database, including large language model (LLM) Q&A, text-to-vector conversion, top-N similar vector retrieval, and retrieval-augmented generation (RAG) Q&A. In addition, rds_ai supports custom models, which let you add the models that you need to implement a wide range of AI applications in ApsaraDB RDS for PostgreSQL.
Prerequisites
Your instance meets the following version requirements:

| Major version | Minor engine version |
| --- | --- |
| PostgreSQL 16 | 20241230 or later |
| PostgreSQL 14, 15, 17 | 20250430 or later |
| PostgreSQL 18 | 20251130 or later |
For instructions on minor engine version upgrades, see Upgrade a minor engine version.
You have created a privileged account for ApsaraDB RDS for PostgreSQL. For more information, see Create an account.
The models used in this topic are provided by Alibaba Cloud Model Studio. You must activate Model Studio and obtain an API key. For more information, see Obtain an API key.
Network configuration
By default, an ApsaraDB RDS for PostgreSQL database cannot access external networks. To allow access to external models, you can configure a NAT Gateway for the VPC where your ApsaraDB RDS for PostgreSQL instance resides. For more information about NAT gateways, see Use the SNAT feature of an Internet NAT gateway to access the Internet.
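After the rds_ai extension is installed (it pulls in pgsql-http, which it uses for outbound calls), a quick way to confirm that the NAT Gateway configuration works is to issue a test HTTP request from the database. This is a sketch, assuming the pgsql-http functions live in the http schema as elsewhere in this topic; a returned HTTP status code means the instance can reach the endpoint:

```sql
-- Test outbound connectivity from the instance; returns an HTTP status
-- code if the Model Studio endpoint is reachable through the NAT Gateway.
SELECT status FROM http.http_get('https://dashscope-intl.aliyuncs.com');
```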
Create and delete the extension
Before you install the extension, confirm the major and minor engine versions of your ApsaraDB RDS for PostgreSQL instance to ensure that the extension is supported. For more information, see Prerequisites.
Install the plugin on the Plugin Management page.
Go to the RDS Instances page, select a region in the top navigation bar, and then click the ID of the target instance.
In the navigation pane on the left, click Plug-ins.
On the Plugin Marketplace page, click Install for the rds_ai plugin.
In the window that appears, select the target database and account. Then, click Install to install the extension in the selected database.
(Optional) To uninstall the plugin, go to the Installed Plugins tab of the Plug-ins page.
Install the extension using an SQL command
Create the extension
```sql
CREATE EXTENSION IF NOT EXISTS rds_ai CASCADE;
```

Note: Only privileged accounts can run this command. For more information about how to create a privileged account, see Create an account.

When you create the rds_ai extension, the pgvector and pgsql-http extensions are also created. You can execute the following statement to view the installed extensions:

```sql
SELECT * FROM pg_extension;
```
Delete the extension

```sql
DROP EXTENSION rds_ai;
```
Default models
The rds_ai extension supports the following default models. You can also add custom models as needed.
| API type | Parameter | Parameter type | Default model |
| --- | --- | --- | --- |
| Prompt API | rds_ai.default_prompt_model | enum | qwen-plus |
| Embed API | rds_ai.default_embed_model | enum | text-embedding-v3 |
To change the default model, you must add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure parameters, see Set instance parameters. For example, change the Running Value to 'pg_stat_statements,auto_explain,rds_ai'.
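Because these are ordinary PostgreSQL configuration parameters (they are set with SET later in this topic), you can inspect the current defaults with SHOW:

```sql
-- Check which models rds_ai currently uses by default.
SHOW rds_ai.default_prompt_model;
SHOW rds_ai.default_embed_model;
```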
The default models of the rds_ai extension provide the capabilities described in the following sections.
Basic settings
Before you use rds_ai to call an LLM, you must configure the corresponding API key.
```sql
-- Set the API key for the target model.
SELECT rds_ai.update_model('qwen-plus', 'token', 'sk-****');
-- Set the API key for all models in rds_ai.model_list.
SELECT rds_ai.update_model(model_name, 'token', 'sk-****') FROM rds_ai.model_list;
```

Run the following commands to configure the URLs of the default models:

```sql
-- qwen-plus model
SELECT rds_ai.update_model('qwen-plus', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-max model
SELECT rds_ai.update_model('qwen-max', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- qwen-turbo model
SELECT rds_ai.update_model('qwen-turbo', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation');
-- text-embedding-v3 model
SELECT rds_ai.update_model('text-embedding-v3', 'uri', 'https://dashscope-intl.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding');
```

The rds_ai extension uses the pgsql-http extension to make remote calls to models. You can configure the following timeout settings to interrupt long-running calls.

Note: The following timeout settings are at the session level. If you create a new connection, you must configure them again.

```sql
-- Set the request timeout in milliseconds.
SET http.timeout_msec TO 200000;
SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
-- Set the connection timeout in milliseconds.
SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');
```
LLM-based Q&A
The default LLM-based Q&A model for the rds_ai extension is Qwen-Text Generation.
(Optional) Set the default large language model for Q&A. If this is not configured, the system uses the qwen-plus model by default.

Note: To change the default model, you must add rds_ai to the Running Value of the shared_preload_libraries parameter. For more information about how to configure parameters, see Set instance parameters. For example, change the Running Value to 'pg_stat_statements,auto_explain,rds_ai'.

```sql
SET rds_ai.default_prompt_model TO 'qwen-max';
```

Use rds_ai.prompt to call the default LLM for Q&A. For example:

```sql
SELECT rds_ai.prompt('Give me a recipe using carrots, potatoes, and eggplants.');
```

Complete API call:

```sql
SELECT rds_ai.prompt(
    model_name=>'qwen-plus',  -- model name
    content=>'Give me a recipe using carrots, potatoes, and eggplants.',  -- question
    args=>'{"top_p": 0.7}'::jsonb  -- parameters for calling the large model
);
```
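Because rds_ai.prompt is a regular SQL function, it can also be applied per row, for example to fill an answer column for a table of questions. This is a minimal sketch with a hypothetical qa_demo table that is not part of this topic; note that each row triggers one model call:

```sql
-- Hypothetical demo table; answers are generated by the default LLM.
CREATE TABLE qa_demo(question text, answer text);
INSERT INTO qa_demo(question) VALUES ('What is PostgreSQL?');
UPDATE qa_demo SET answer = rds_ai.prompt(question);
SELECT question, answer FROM qa_demo;
```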
Text-to-vector conversion
The default text-to-vector conversion model for the rds_ai extension is text-embedding-v3.
By default, this operation outputs a dense vector.
```sql
SELECT rds_ai.embed(
    'A strong wind blows from a high sky, and monkeys cry sadly. The water is clear and the sand is white, and birds are flying back. The leaves of the boundless forest fall, and the endless Yangtze River rolls on.'  -- text to be converted to a vector
);
```

You can also specify that a sparse vector is output:

```sql
-- Convert text to a sparse vector, with 250002 dimensions by default.
SELECT rds_ai.embed(
    content=>'A strong wind blows from a high sky, and monkeys cry sadly. The water is clear and the sand is white, and birds are flying back. The leaves of the boundless forest fall, and the endless Yangtze River rolls on.',
    args=>'{"output_type": "sparse"}'::jsonb  -- parameter list
);
```
Store the vector conversion result in the target table.
Create the test_embed table.
```sql
CREATE TABLE test_embed(a text, b vector(1024), c sparsevec(250002));
```

Insert the text content to be converted:

```sql
INSERT INTO test_embed (a) VALUES ('hello world');
```

Use rds_ai.embed to call the text-embedding-v3 model and write the conversion result to the test_embed table.

Write a dense vector:

```sql
UPDATE test_embed SET b = rds_ai.embed(a, '{"output_type": "dense"}'::jsonb)::vector;
```

Write a sparse vector:

```sql
UPDATE test_embed SET c = rds_ai.embed(a, '{"output_type": "sparse"}'::jsonb)::sparsevec;
```
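Once vectors are stored this way, you can also query them directly with pgvector's distance operators instead of the rds_ai helper functions. This is a sketch using the test_embed table above and pgvector's cosine-distance operator <=>; the query string is only an illustration:

```sql
-- Find the 5 stored texts closest to the embedding of a query string.
SELECT a,
       b <=> rds_ai.embed('hello world', '{"output_type": "dense"}'::jsonb)::vector AS cosine_distance
FROM test_embed
ORDER BY cosine_distance
LIMIT 5;
```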
Top-N similar vector retrieval
The rds_ai extension uses the text-embedding-v3 model by default for similar vector retrieval. For example:
Create a test table.
```sql
-- Create the test table test_rag.
CREATE TABLE test_rag (
    id SERIAL PRIMARY KEY,
    chunk TEXT
);
-- Add an embedding column to store the vectors converted from the chunk.
ALTER TABLE test_rag ADD COLUMN embedding VECTOR(1024);
UPDATE test_rag SET embedding = rds_ai.embed(chunk)::vector;
```

Create a vector index.

Note: When you create a vector index, you must select the vector_cosine_ops type.

```sql
-- Create an HNSW index.
CREATE INDEX ON test_rag USING hnsw (embedding vector_cosine_ops);
-- Create an IVFFlat index.
CREATE INDEX ON test_rag USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```

Use rds_ai.retrieve to call the text-embedding-v3 model to perform vector retrieval in the test table:

```sql
SELECT * FROM rds_ai.retrieve('Why is PostgreSQL considered the most advanced open-source database?', 'public', 'test_rag', 'chunk', 'embedding');
```

Complete API call:

```sql
SELECT * FROM rds_ai.retrieve(
    embed_model=>'text-embedding-v3',  -- The vector model to use.
    question=>'Why is PostgreSQL considered the most advanced open-source database?',
    source_schema=>'public',  -- The schema of the table for the similar vector search.
    source_table=>'test_rag',  -- The table for the similar vector search.
    chunk_col=>'chunk',  -- The chunk column.
    vector_col=>'embedding',  -- The vector column.
    -- The following are default parameters.
    topn=>10,  -- Set top_n.
    embed_args=>'{}'::jsonb,  -- Pass parameters for the text-to-vector model.
    distance_type=>'cosine'  -- The distance algorithm. Supported values are L1, L2, cosine, and negative.
);
```
RAG-based Q&A
The rds_ai extension uses the text-embedding-v3 model and the Qwen-Text Generation model by default for RAG-based Q&A.
```sql
-- Vector model: text-embedding-v3. Text generation model: specified by the rds_ai.default_prompt_model parameter.
SELECT * FROM rds_ai.rag('Why is PostgreSQL considered the most advanced open-source database?', 'public', 'test_rag', 'chunk', 'embedding');
```

Complete API call:

```sql
SELECT * FROM rds_ai.rag(
    embed_model=>'text-embedding-v3',  -- The vector model to use.
    prompt_model=>'qwen-plus',  -- The prompt model to use.
    question=>'Why is PostgreSQL considered the most advanced open-source database?',
    source_schema=>'public',  -- The schema of the table for the similar vector search.
    source_table=>'test_rag',  -- The table for the similar vector search.
    chunk_col=>'chunk',  -- The chunk column.
    vector_col=>'embedding',  -- The vector column.
    -- The following are default parameters.
    topn=>10,
    embed_args=>'{}'::jsonb,  -- Parameters for the text-to-vector model.
    prompt_args=>'{}'::jsonb,  -- Parameters for the large language model prompt.
    distance_type=>'L2'  -- The distance algorithm. Supported values are L1, L2, cosine, and inner product.
);
```

Custom models
If you encounter any issues when you add a custom model, contact us.
You can add a custom model using the rds_ai.model_list metadata table. The fields of the rds_ai.model_list table are as follows:
| Field | Type | Description |
| --- | --- | --- |
| model_name | name | The model name. This field has a PRIMARY KEY constraint. |
| request_type | text | The HTTP method to call. |
| request_header | http.http_header[] | A type provided by the pgsql-http extension. It records the header information of the HTTP request, mainly for authentication. |
| uri | text | The URL of the model. |
| content_type | text | The request type, such as application/json. |
| content_template | text | The template for the request body. It usually contains placeholders that are filled in during the API call. |
| json_path | text | The SQL statement used to parse the http_response and extract the required content. |
| token | text | The key used in the request header. |
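Before you call a model, you can inspect its registered configuration directly in this metadata table. This is a minimal sketch; the column names match the table above:

```sql
-- List each registered model with its endpoint and request template.
SELECT model_name, request_type, uri, content_type, content_template
FROM rds_ai.model_list
ORDER BY model_name;
```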
Add a custom model
```sql
SELECT rds_ai.add_model(
    'test-model',  -- model name
    'POST',  -- HTTP request type
    ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- http_header
    'https://****.com',  -- request URL
    'application/json',  -- request Content-Type
    '{"key":"%s"}',  -- request_body template
    'SELECT %L'  -- parse the HTTP request result
);
```

Use a custom model
Use the rds_ai.raw_invoke_model function to send a request based on the configuration of the qwen-plus model. Use an ARRAY to fill the call template. The function returns the complete HTTP request result.

```sql
SELECT * FROM rds_ai.raw_invoke_model('qwen-plus', ARRAY['who are you']);
```

Use the rds_ai.invoke_model function to make a request based on the configuration of the qwen-plus model. The function extracts the specified fields according to the json_path field in the configuration.

```sql
SELECT * FROM rds_ai.invoke_model('qwen-plus', ARRAY['who are you']);
```
Delete a custom model

```sql
SELECT rds_ai.del_model('test-model');
```

Add a PAI-RAG model
We recommend that you deploy PAI-EAS and the ApsaraDB RDS for PostgreSQL instance in the same VPC. Configure the PAI VPC address in the rds_ai extension. The connection address for PAI-EAS must also use the ApsaraDB RDS for PostgreSQL VPC address. This ensures that all communication occurs within your VPC and does not traverse the public network, which enhances security.
Deploy the RAG service in PAI-EAS. For more information, see Deploy an LLM-based RAG chatbot using PAI-EAS and ApsaraDB RDS for PostgreSQL.
Obtain the call information for the RAG service.
Click the RAG service name to go to the Service Details page.
In the Basic Information section, click View Endpoint Information.
In the Invocation Information dialog box, obtain the service endpoint and Token.
Add the PAI-RAG model.
```sql
SELECT rds_ai.add_model(
    'pai-rag',  -- model name
    'POST',  -- request method
    ARRAY[('Authorization','%s')]::http.http_header[],  -- request header
    'http://rds-pai-rag-demo.****.cn-hangzhou.pai-eas.aliyuncs.com/service/query',  -- request URL
    'application/json',  -- request content format
    '{ "question": "%s" }',  -- request body
    'SELECT (%L::jsonb->''answer'')::text'  -- parsing path
);
```

Note: Replace the model URL with the actual URL and append /service/query. For more information about calling PAI-RAG models, see Call PAI models using APIs.
Configure the token for the PAI-RAG model.
```sql
SELECT rds_ai.update_model('pai-rag', 'token','MTFkYjMwZjgzYzA1YmE2N2YyNWMxM2NkNDVjMjEzNjYxMDAzMzE5****');
```
Verify the new PAI-RAG model. For example:
```sql
SELECT rds_ai.invoke_model('pai-rag', ARRAY['What parameters are related to WAL log accumulation?']);
```

Add a Function Compute (FC) RAG model
Deploy the AgentCraft application in Function Compute. For more information, see Cloud Deployment of AgentCraft.
You have created an agent and a Client Access. For example, when you created the Client Integration as described in Integrate an agent into a DingTalk robot, you obtained and saved the endpoint and token.
Add the FC Function Compute RAG model.
```sql
SELECT rds_ai.add_model(
    'fc-rag',  -- model name
    'POST',  -- request method
    ARRAY[('Authorization','Bearer %s')]::http.http_header[],  -- request header
    'https://agentcrckend-de-obnhjsknam.cn-hangzhou.fcapp.run/v1/chat/completions',  -- request URL
    'application/json',  -- request content format
    '{ "messages":[ { "role": "user", "content": "%s" } ], "stream": false, "max_tokens": 1024 }',  -- request body
    'SELECT (%L::jsonb->''choices''->0->''message''->>''content'')::text'  -- parsing path
);
```

Note: Replace the model URL with the actual URL and append /completions.

Configure the token for the Function Compute RAG model:

```sql
SELECT rds_ai.update_model('fc-rag', 'token','8UiGAziWgYGPxM3qR5sAChBfDJRt****');
```
Verify the new Function Compute RAG model. For example:

```sql
SELECT rds_ai.invoke_model('fc-rag', ARRAY['What parameters are related to WAL log accumulation?']);
```