The rds_ai extension for ApsaraDB RDS for PostgreSQL integrates Alibaba Cloud Model Studio models — including Qwen and text embedding models — directly into your database. This tutorial walks through building a complete Retrieval-Augmented Generation (RAG) pipeline entirely in SQL, without writing application code or managing external RAG frameworks.
Prerequisites
Before you begin, make sure you have:
An RDS instance running PostgreSQL 16 with minor engine version 20241230 or later. To upgrade, see Update the minor engine version
A privileged account on the RDS instance. See Create an account
Alibaba Cloud Model Studio activated with an API key ready. See Obtain an API key
The RDS instance network configured to reach external models. See Network configurations
How it works
This pipeline runs entirely within your RDS instance through five stages:
Document chunking — Split documents into fixed-size chunks using SQL. (Runs locally in your PostgreSQL instance)
Text embedding — Convert chunks into embedding vectors and tsvectors using rds_ai. (Calls the remote embedding model)
Multimodal retrieval — Retrieve candidates by both vector similarity and keyword matching. (Runs locally)
Result reranking — Re-score candidates to surface the most relevant results. (Local algorithm or remote reranking model)
LLM invocation — Send the top results to an LLM and return the answer. (Calls the remote chat model)
Key functions
| Function | Description |
|---|---|
rds_ai.update_model(model_name, 'token', 'sk-...') | Set the API key for a model |
rds_ai.embed(text)::vector | Convert text to an embedding vector (remote call) |
rds_ai.retrieve(query, schema, table, text_col, embedding_col) | Retrieve chunks by vector similarity |
rds_ai.rank(query, result_array) | Rerank retrieved chunks by relevance score (remote call) |
rds_ai.prompt(prompt_text) | Send a prompt to the default LLM and return the response (remote call) |
rds_ai.rag(query, schema, table, text_col, embedding_col) | Run vector retrieval and LLM invocation in one call |
rds_ai.add_model(...) | Register a custom model |
http.http_set_curlopt(option, value) | Set HTTP curl options for the current session |
Step 1: Install and configure rds_ai
Install the extension
Open the Instances page in the ApsaraDB RDS console. In the top navigation bar, select the region where your instance resides, then click the instance ID.
In the left-side navigation pane, click Plug-ins.
On the Extension Marketplace tab, click Install in the rds_ai section.
In the dialog that appears, select your database and account, then click Install.
Wait for the instance status to change from Maintaining Instance to Running.
Installation takes about one minute. Refresh the page to check the current status.
Configure the extension
Connect to your RDS for PostgreSQL instance and run the following statements in the public schema.
Set the API key for all default models:
-- Configure the API key for all models listed in rds_ai.model_list. SELECT rds_ai.update_model(model_name, 'token', 'sk-****') FROM rds_ai.model_list;Configure request timeouts. The rds_ai extension calls models over HTTP using the pgsql-http extension, so long-running invocations need an explicit timeout.
ImportantThese settings apply only to the current session. Reconfigure them each time you open a new session.
-- Set the request timeout in milliseconds. SET http.timeout_msec TO 200000; SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000'); -- Set the connection timeout in milliseconds. SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');
Step 2: Prepare data for RAG
This step sets up the tables, chunking logic, and triggers that automatically embed documents as they are inserted.
Create the
doctable for raw documents and thechunktable for processed chunks, then add indexes:CREATE TABLE doc ( id SERIAL PRIMARY KEY, title VARCHAR(255), content TEXT ); CREATE TABLE chunk ( id SERIAL PRIMARY KEY, doc_id INTEGER NOT NULL, text TEXT, embedding VECTOR(1024), ts_vector_extra tsvector ); -- Index for joining chunks back to their source document. CREATE INDEX idx_doc_id ON chunk (doc_id); -- HNSW index for approximate nearest-neighbor vector search. -- Use the default index for most workloads. -- Use the index with explicit m and ef_construction parameters -- to tune the speed-recall tradeoff for larger datasets. CREATE INDEX ON chunk USING hnsw (embedding vector_cosine_ops); CREATE INDEX ON chunk USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64); -- GIN index for full-text keyword search. CREATE INDEX chunk_text_gin ON chunk USING gin (ts_vector_extra);Create the
split_textfunction to split a long text into fixed-size chunks with a configurable overlap. For complex documents, use a dedicated text chunking framework instead.CREATE OR REPLACE FUNCTION split_text( input_text TEXT, chunk_size INT, chunk_overlap INT ) RETURNS SETOF TEXT AS $$ DECLARE current_idx INT; start_idx INT; chunk TEXT; BEGIN -- Validate that overlap is smaller than chunk size. IF chunk_overlap >= chunk_size THEN RAISE EXCEPTION 'chunk_overlap must be less than chunk_size'; END IF; current_idx := 1; LOOP -- Apply overlap by moving the start position back. start_idx := current_idx - chunk_overlap; IF start_idx < 1 THEN start_idx := 1; END IF; chunk := substr(input_text, start_idx, chunk_size); IF chunk IS NULL OR length(chunk) = 0 THEN EXIT; END IF; RETURN NEXT chunk; current_idx := current_idx + chunk_size - chunk_overlap; END LOOP; END; $$ LANGUAGE plpgsql;Create triggers to automate chunking, embedding, and tsvector generation whenever a document is inserted:
NoteSkip this step and go directly to Step 3 if your documents are complex and difficult to split using a fixed-size function.
-- Split inserted documents into chunks and write them to the chunk table. CREATE OR REPLACE FUNCTION insert_into_chunk() RETURNS TRIGGER AS $$ BEGIN INSERT INTO chunk (doc_id, text) SELECT NEW.id, result FROM split_text(NEW.content, 300, 50) AS result; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER after_insert_doc AFTER INSERT ON doc FOR EACH ROW EXECUTE FUNCTION insert_into_chunk(); -- Remove chunks when their source document is deleted. CREATE OR REPLACE FUNCTION delete_from_chunk() RETURNS TRIGGER AS $$ BEGIN DELETE FROM chunk WHERE doc_id = OLD.id; RETURN OLD; END; $$ LANGUAGE plpgsql; CREATE TRIGGER after_delete_doc AFTER DELETE ON doc FOR EACH ROW EXECUTE FUNCTION delete_from_chunk(); -- Call rds_ai.embed to generate an embedding vector for each new or updated chunk. CREATE OR REPLACE FUNCTION update_chunk_embedding() RETURNS TRIGGER AS $$ BEGIN NEW.embedding := rds_ai.embed(NEW.text)::vector; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER set_chunk_embedding BEFORE INSERT OR UPDATE ON chunk FOR EACH ROW EXECUTE FUNCTION update_chunk_embedding(); -- Build tsvectors for keyword-based retrieval. CREATE TRIGGER embedding_tsvector_update BEFORE UPDATE OR INSERT ON chunk FOR EACH ROW EXECUTE PROCEDURE tsvector_update_trigger( 'ts_vector_extra', 'pg_catalog.english', 'text' );Insert documents into the
doctable. The triggers created in the previous step automatically split, embed, and index each document. The following sample data is for reference:INSERT INTO doc (title, content) VALUES ('A Brief History of PostgreSQL', 'The object-relational database management system now known as PostgreSQL is derived from the POSTGRES package written at the University of California, Berkeley. After more than two decades of development, PostgreSQL is the most advanced open-source database available in the world.\n\n2.1. The POSTGRES Project at Berkeley\nThe POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foundation (NSF), and ESL, Inc. The implementation of POSTGRES began in 1986. The original concepts of the system are detailed in [ston86]. The initial data model definition is found in [rowe87]. The design of the rule system at that time is described in [ston87a]. The theoretical foundation and architecture of the storage manager are described in detail in [ston87b].\n\nSince then, POSTGRES has undergone several major version updates. The first demonstration system was available in 1987 and was exhibited at the ACM-SIGMOD conference in 1988. Version 1 was released in June 1989 (see [ston90a]) for some external users. In response to criticism of the first rule system ([ston89]), the rule system was redesigned ([ston90b]), and version 2, using the new rule system, was released in June 1990. Version 3 appeared in 1991, adding support for multiple storage managers and improving the query executor and rewriting the rule system. Subsequent versions, until the release of Postgres95 (see below), focused mainly on portability and reliability.\n\nPOSTGRES has been used to implement many different research and production applications. These applications include a financial data analysis system, a jet engine performance monitoring package, an asteroid tracking database, a medical information database, and some geographic information systems. POSTGRES has also been used by many universities for teaching purposes. Finally, Illustra Information Technologies (later merged into Informix, which is now owned by IBM) commercialized the code. In late 1992, POSTGRES became the primary data manager for the Sequoia 2000 scientific computing project.\n\nIn 1993, the number of external users nearly doubled. As the number of users increased, the time spent on source code maintenance increased and took up too much time that should have been used for database research. To reduce the support burden, the POSTGRES project at Berkeley officially ended at version 4.2.\n\n2.2. Postgres95\nIn 1994, Andrew Yu and Jolly Chen added an SQL language interpreter to POSTGRES. They subsequently released the source code under the new name Postgres95 on the Internet for everyone to use, becoming the open-source successor to the original POSTGRES Berkeley code.\n\nThe source code of Postgres95 is entirely ANSI C, and the code size was reduced by 25%. Many internal modifications improved performance and maintainability. The 1.0.x version of Postgres95 was about 30-50% faster than version 4.2 of POSTGRES when running the Wisconsin Benchmark test. In addition to fixing some bugs, the following are some major improvements:\n\nThe original query language PostQUEL was replaced by SQL (implemented on the server side). The interface library libpq was named after PostQUEL. Before PostgreSQL, subqueries were not supported (see below), but they could be simulated in Postgres95 by user-defined SQL functions. Aggregate functions were re-implemented. Support for GROUP BY query clauses was also added.\n\nA new program for interactive SQL queries using the Readline of GNU was added (psql). This program largely replaced the old monitor program.\n\nA new frontend library (libpgtcl) was added to support Tcl-based clients. A sample shell (pgtclsh) provided new Tcl commands for interaction between Tcl programs and the Postgres95 server.\n\nThe interface for large objects was completely rewritten. The inversion file system was removed, retaining inversion as the only mechanism for storing large objects.\n\nThe instance-level rule system was removed. However, rules still exist in the form of rewrite rules.\n\nA brief tutorial introducing SQL and Postgres95 features was added to the released source code.\n\nPostgres95 could be compiled using unpatched GCC (fixing double-precision data alignment issues).\n\n\n2.3. PostgreSQL\nBy 1996, it was clear that the name Postgres95 was outdated. We chose a new name, PostgreSQL, to reflect the relationship between the original POSTGRES and the latest version with SQL capabilities. The version number also started from 6.0, returning to the sequence initially started by the Berkeley POSTGRES project.\n\nMany people continue to use Postgres to refer to PostgreSQL (now rarely in all uppercase letters) due to tradition or easier pronunciation. This usage is widely accepted as a nickname or alias.\n\nThe development of Postgres95 focused on identifying and understanding existing issues in the backend code. The development of PostgreSQL shifted to some controversial features and functions, with work in all areas being carried out simultaneously.\n\n Changes to PostgreSQL since then can be found in Appendix E.'), ('Overview of MySQL Database Management System', 'MySQL is the most popular open-source SQL database management system, developed, released, and supported by MySQL AB. MySQL AB is a commercial company founded by several MySQL developers. It is a second-generation open-source company that combines open-source values, methods, and a successful business model.\n\nThe MySQL website (http://www.mysql.com/) provides the latest information about MySQL and MySQL.\n\n· MySQL is a database management system.\n\nA database is a structured collection of data. It can be anything from a simple shopping list to an art gallery or a vast amount of information in a corporate network. To add data to a database or access and process data stored in a computer database, you need to use a database management system such as the MySQL server. Computers are ideal tools for handling large amounts of data, so database management systems play a critical central role in computing, either as standalone utilities or as components of other applications.\n\nMySQL is a relational database management system.\n\nA relational database stores data in different tables rather than putting all data in one large repository. This increases speed and flexibility. SQL in MySQL refers to Structured Query Language. SQL is the most commonly used standardized language for accessing databases, defined by the ANSI/ISO SQL standard. The SQL standard has evolved since 1986, with several versions. In this manual, SQL-92 refers to the standard released in 1992, SQL:1999 refers to the standard released in 1999, and SQL:2003 refers to the current version of the standard. We use the term SQL standard to denote the current version of the SQL standard.\n\nMySQL software is open-source software.\n\nOpen source means that anyone can use and change the software. Anyone can download MySQL software from the Internet without paying any fees. If you want, you can study the source code and make appropriate changes to meet your own needs. MySQL software is licensed under the GPL (GNU General Public License), http://www.mysql.com/company/legal/licensing/).\n\nThe MySQL database server is fast, reliable, and easy to use.\n\nIf it is what you are looking for, give it a try. The MySQL server also has a set of practical features developed through close cooperation with our users. Our benchmark homepage provides comparison results of the MySQL server and other database managers. See 7.1.4 MySQL Benchmark Suite.\n\nThe MySQL server was initially developed to handle large databases, and it is faster than existing solutions. Over the years, it has been successfully used in many demanding production environments. Although MySQL is constantly evolving, the MySQL server currently offers rich and useful features. It has good connectivity, speed, and security, making MySQL very suitable for accessing databases on the Internet.\n\nThe MySQL server operates in client/server mode or embedded systems.\n\nThe MySQL database software is a client/server system consisting of a multithreaded SQL server that supports different backends, several different client programs and libraries, numerous management tools, and extensive application programming interfaces (APIs).\n\nWe can also provide the MySQL server as an embedded multithreaded library that you can link to your application, resulting in a smaller, faster, and more manageable product.\n\nThere is a large amount of shared MySQL software available.\n\nIt is very likely that your favorite applications and languages support the MySQL database server.\n\nThe official pronunciation of MySQL is My Ess Que Ell (not my sequel), but we do not mind if you pronounce it as my sequel or in any other local way.'), ('What is SQL Server?', 'Microsoft SQL Server is a relational database management system (RDBMS). Applications and tools connect to SQL Server instances or databases and communicate using Transact-SQL (T-SQL).\n\nDeployment Options\nSQL Server can be installed on Windows or Linux, deployed in Linux containers, or deployed on Azure virtual machines or other virtual machine platforms. You may have previously referred to it as a boxed product.\n\nWhich SQL Server versions are supported depends on your licensing agreement, but for the purposes of this document, we refer to SQL Server 2016 (13.x) and later versions. For documentation on SQL Server 2014 (12.x) and earlier versions, see the documentation for previous versions of SQL Server. To find out which SQL Server versions are currently supported, see SQL Server End of Support Options.\n\nThe following products and services also use the underlying SQL Server database engine:\n\nAzure SQL Database\nAzure SQL Managed Instance\nMicrosoft Analytics Platform System (PDW)\nAzure Synapse Analytics\nAzure SQL Edge\nFor a list of features supported by various versions of SQL Server on Windows, see:\n\nSQL Server 2022 versions and their supported features\nSQL Server 2019 versions and their supported features\nVersions and features supported by SQL Server 2017\nVersions and features supported by SQL Server 2016\nSQL Server Components and Technologies\nThis section introduces some key technologies available in SQL Server.\n\nComponent\tDescription\nDatabase Engine\tThe database engine is the core service for storing, processing, and protecting data. The database engine provides controlled access and transaction processing to meet the requirements of the most demanding data consumption applications within an enterprise. The database engine also provides comprehensive support for business continuity and database recovery (SQL Server) to maintain business continuity.\nMachine Learning Services (MLS)\tSQL Server Machine Learning Services supports integrating machine learning into enterprise workflows using popular R and Python languages.\n\nMachine Learning Services (in-database) integrates R and Python with SQL Server, allowing users to easily generate, retrain models, and score models by calling stored procedures. Machine Learning Server provides enterprise-level support for R and Python, allowing users to use them without SQL Server.\nIntegration Services (SSIS)\tSQL Server Integration Services is a platform for building high-performance data integration solutions, including packages for extract, transform, and load (ETL) processing for data warehouses.\nAnalysis Services (SSAS)\tSQL Server Analysis Services is an analytical data platform and toolset for personal, team, and corporate business intelligence. Server and client designers support traditional OLAP solutions, new tabular modeling solutions, and self-service analytics and collaboration by using Power Pivot, Excel, and SharePoint Server environments. Analysis Services also includes data mining, allowing you to discover patterns and relationships hidden in large amounts of data.\nReporting Services (SSRS)\tSQL Server Reporting Services provides web-enabled enterprise-level reporting capabilities. It allows users to create reports that extract content from multiple data sources, publish reports in various formats, and centrally manage security and subscriptions.\nReplication\tSQL Server Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Replication allows you to distribute data to different locations and distribute it to remote or mobile users over local area networks, wide area networks, dial-up connections, wireless connections, and the Internet.\nData Quality Services (DQS)\tData Quality Services provides a knowledge-driven data cleansing solution. DQS allows you to generate a knowledge base and then use this knowledge base, along with computer-assisted and interactive methods, to perform data correction and deduplication. You can use cloud-based reference data services and generate a data management solution that integrates DQS with SQL Server Integration Services and Master Data Services.\nMaster Data Services (MDS)\tMaster Data Services is the master data management solution for SQL Server. Solutions built on Master Data Services help ensure that reports and analyses are based on the appropriate information. With Master Data Services, you can create a central repository for master data and maintain an auditable, secure record of master data as it changes over time.\n') ;Verify that the
chunktable contains the split chunks, their embedding vectors, and tsvectors:SELECT * FROM chunk;
Step 3: Generate embeddings in batches
Skip this step if you created the embedding trigger in Step 2. The trigger automatically generates embeddings when documents are inserted.
For complex documents that cannot be split by a fixed-size function, use a framework such as LangChain to prepare chunks in advance. After chunks are in the chunk table without embeddings, use one of the following methods to generate them.
Use the same embedding model here as you used when creating the chunks. Comparing embeddings generated by different models produces meaningless retrieval results.
Option 1: Recursive (recommended)
Update all rows in one statement. rds_ai calls the embedding API once per row:
UPDATE chunk
SET embedding = rds_ai.embed(text)::vector;Option 2: Batch
Use batch mode when you need finer control over API call batching.
Register the batch embedding model:
SELECT rds_ai.add_model( 'text-embedding-v3-batch', -- Model name 'POST', -- Request method ARRAY[('Authorization', 'Bearer %s')]::http.http_header[], -- Request header 'https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings', -- Request URL 'application/json', -- Content type '{ "model": "text-embedding-v3", "input": %s }', -- Request body 'SELECT %L::jsonb->''data''' -- SQL to process the response );Set the API key for the batch model:
SELECT rds_ai.update_model( 'text-embedding-v3-batch', 'token', 'sk-xxxxxx' );Test the model:
SELECT * FROM rds_ai.invoke_model( 'text-embedding-v3-batch', ARRAY['["01","02","03"]'] );Run the batch embedding loop:
DO $$ DECLARE batch_size INT := 20; -- Number of rows per API call; adjust based on model limits pointer INT := 0; -- Current offset into the table record_count INT; -- Total number of rows to process input_text TEXT; -- JSON array of texts: ["item1", "item2", ..., "item n"] json_results JSON; -- Embedding API response json_item JSON; -- Single element from the response array idx INT; -- Index within the current batch BEGIN SELECT COUNT(*) INTO record_count FROM chunk; WHILE pointer < record_count LOOP -- Build the input array for this batch. SELECT json_agg(text::TEXT) INTO input_text FROM ( SELECT text FROM chunk ORDER BY id LIMIT batch_size OFFSET pointer ) AS subquery; -- Call the embedding model. json_results := rds_ai.invoke_model('text-embedding-v3-batch', ARRAY[input_text]); -- Write each embedding back to the corresponding row. FOR idx IN 0..json_array_length(json_results) - 1 LOOP json_item := json_results->idx; UPDATE chunk SET embedding = (SELECT (json_item->>'embedding')::VECTOR(1024)) WHERE id = (SELECT id FROM chunk ORDER BY id LIMIT 1 OFFSET pointer + idx); END LOOP; pointer := pointer + batch_size; RAISE NOTICE '%/% rows processed.', pointer, record_count; END LOOP; END; $$ LANGUAGE plpgsql;
You can choose either embedding method based on your needs. However, the recursive method (Option 1) is recommended for most use cases.
Step 4: Implement multimodal retrieval
Vector search finds semantically related content even when exact words differ. Keyword search finds results that contain the specific terms in the query. Combining both improves recall across a wider range of queries: for a question about a specific error code, keyword search finds the exact term; for a conceptual question where the wording varies, vector search finds the relevant passage.
Create a multi_retrieve function that runs both retrieval methods and returns their combined results:
CREATE OR REPLACE FUNCTION multi_retrieve(query TEXT)
RETURNS TABLE(ret_chunk text, score numeric, method text, rank int) AS $$
DECLARE
rec RECORD;
BEGIN
-- Keyword-based retrieval using tsvector full-text search.
FOR rec IN
SELECT
text,
subquery.score,
'retrieve_by_key_word' AS method,
RANK() OVER (ORDER BY subquery.score DESC) AS rank_id
FROM (
SELECT
text,
ts_rank(
ts_vector_extra,
to_tsquery(replace(
text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
))
) AS score
FROM chunk
WHERE ts_vector_extra @@ to_tsquery(replace(
text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
))
) AS subquery
ORDER BY subquery.score DESC
LIMIT 5
LOOP
ret_chunk := rec.text;
score := rec.score;
method := rec.method;
rank := rec.rank_id;
RETURN NEXT;
END LOOP;
-- Vector retrieval using cosine distance on embedding vectors.
FOR rec IN
SELECT
*,
'retrieve_by_vector' AS method,
RANK() OVER (ORDER BY distance) AS rank_id
FROM rds_ai.retrieve(
query,
'public',
'chunk',
'text',
'embedding',
distance_type => 'cosine',
topn => 5
)
LOOP
ret_chunk := rec.chunk;
score := rec.distance;
method := rec.method;
rank := rec.rank_id;
RETURN NEXT;
END LOOP;
END;
$$ LANGUAGE plpgsql;To retrieve using only vector similarity, call rds_ai.retrieve directly instead of using multi_retrieve.
Test retrieval with a sample query:
SELECT *
FROM multi_retrieve('Introduce the origins of the PostgreSQL database.')
ORDER BY method, rank;Step 5: Rerank the results
Reranking re-scores the combined retrieval results to surface the most relevant chunks before passing them to the LLM. Two approaches are available.
Use rds_ai.rank (model-based reranking)
rds_ai.rank calls a remote reranking model to score each candidate chunk against the query. This approach is more accurate but adds a remote API call:
DROP FUNCTION get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
result_array TEXT[];
BEGIN
SELECT array_agg(ret_chunk)
INTO result_array
FROM multi_retrieve(query);
RETURN QUERY
SELECT *
FROM rds_ai.rank(query, result_array)
ORDER BY score_value DESC
LIMIT top_n;
END $$ LANGUAGE plpgsql;
SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');Use the RRF function (algorithm-based reranking)
Reciprocal Rank Fusion (RRF) combines the vector and keyword rankings using a score formula, with no remote API call. Use this approach when lower latency matters more than reranking accuracy.
The RRF score for each chunk is 1/(60 + vector_rank) + 1/(60 + keyword_rank). Chunks that rank highly in both retrieval methods get a higher combined score.
Create the aggregate function:
CREATE TYPE score_agg_state AS (
vector_score numeric,
keyword_score numeric
);
CREATE OR REPLACE FUNCTION score_agg_transfn(state score_agg_state, rank numeric, method text)
RETURNS score_agg_state AS $$
BEGIN
CASE method
WHEN 'retrieve_by_vector' THEN
state.vector_score := COALESCE(1 / (60+rank), 0);
WHEN 'retrieve_by_key_word' THEN
state.keyword_score := COALESCE(1 / (60+rank), 0);
END CASE;
RETURN state;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION score_agg_finalfn(state score_agg_state)
RETURNS numeric AS $$
BEGIN
RETURN COALESCE(state.vector_score, 0) + COALESCE(state.keyword_score, 0);
END;
$$ LANGUAGE plpgsql;
CREATE AGGREGATE rff_function(numeric, text)(
SFUNC = score_agg_transfn,
STYPE = score_agg_state,
FINALFUNC = score_agg_finalfn,
INITCOND = '(0, 0)'
);Apply the RRF function to rerank the retrieval results:
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score numeric, value TEXT) AS $$
BEGIN
RETURN QUERY
SELECT
rff_function(rank, method) AS score,
ret_chunk AS value
FROM
multi_retrieve(query)
GROUP BY
value
ORDER BY
score DESC
LIMIT top_n;
END $$ LANGUAGE plpgsql;Rerank embedding-only retrieval results
If you use vector retrieval without the keyword component, call rds_ai.retrieve directly and pass the results to rds_ai.rank:
DROP FUNCTION get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
result_array TEXT[];
BEGIN
SELECT array_agg(chunk)
INTO result_array
FROM rds_ai.retrieve(query, 'public', 'chunk', 'text', 'embedding');
RETURN QUERY
SELECT *
FROM rds_ai.rank(query, result_array)
ORDER BY score_value DESC
LIMIT top_n;
END $$ LANGUAGE plpgsql;
-- Test
SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');Step 6: Return answers from the LLM
With multimodal retrieval
Use the reranked multimodal results as context for LLM-based Q&A.
Create a helper function to escape special characters in text before embedding it in a JSON prompt. This is required because
rds_ai.promptassembles a JSON object from its parameters:CREATE OR REPLACE FUNCTION escape_for_json(input TEXT) RETURNS TEXT AS $$ BEGIN RETURN replace(replace(input, '"', '\\"'), E'\n', '\\n'); END; $$ LANGUAGE plpgsql;Create the
ragfunction, which combines the reranked results into a prompt and passes it to the LLM:CREATE OR REPLACE FUNCTION rag(query TEXT) RETURNS TEXT AS $$ DECLARE prompt_content TEXT; result TEXT; BEGIN -- Aggregate the top reranked chunks into a single prompt. WITH rank_result AS ( SELECT string_agg(value, ',') AS prompt_content_pre FROM get_reranked_results(query) ) SELECT 'Based on the following content' || prompt_content_pre || 'answer my question,' || query INTO prompt_content FROM rank_result; -- Send the prompt to the LLM and return the answer. SELECT rds_ai.prompt(escape_for_json(prompt_content)) INTO result; RETURN result; END; $$ LANGUAGE plpgsql;Test the RAG function:
SELECT rag('Introduce the origins of the PostgreSQL database.');
With vector retrieval only
For a simpler pipeline based on vector retrieval alone, use the built-in rds_ai.rag function:
SELECT *
FROM rds_ai.rag(
'Introduce the origins of the PostgreSQL database.',
'public',
'chunk',
'text',
'embedding'
);