
ApsaraDB RDS: Use the rds_ai extension to implement RAG

Last Updated: Mar 21, 2025

ApsaraDB RDS for PostgreSQL supports the rds_ai extension, which integrates advanced models from Alibaba Cloud Model Studio, including Qwen and the text embedding model. This topic describes how to use the rds_ai extension to implement Retrieval-Augmented Generation (RAG) within a database by executing SQL statements.

Prerequisites

  • The RDS instance runs PostgreSQL 16.

  • The RDS instance runs the minor engine version 20241230 or later.

    To upgrade the minor engine version, see Update the minor engine version.

  • A privileged account is created for the RDS instance. For more information, see Create an account.

  • Alibaba Cloud Model Studio is activated and an API key is available to invoke models. For more information, see Obtain an API key.

  • The network of the RDS instance is configured to access external models. For more information, see Network configurations.

Solution overview

The solution described in this topic is ideal for simple scenarios in which you want to implement RAG without coding.

Using this solution, you can efficiently implement RAG-based operations by executing SQL statements. You do not need to be an expert on RAG or worry about the O&M of RAG frameworks, runtime environments, or complex components.

The solution involves the following key stages:

  1. Document chunking: Execute SQL statements to split the document into multiple chunks.

  2. Text embedding: Use rds_ai to convert the text into embeddings and tsvectors.

  3. Multimodal retrieval: Use rds_ai to retrieve information from various dimensions such as embeddings and keywords.

  4. Result reranking: Use rds_ai to rerank the retrieval result for relevance optimization.

  5. Large language model (LLM) invocation: Use rds_ai to invoke LLMs for RAG-based Q&A.

Step 1: Install and configure rds_ai

  1. Install the rds_ai extension.

    1. Go to the Instances page in the ApsaraDB RDS console. In the top navigation bar, select the region in which your RDS for PostgreSQL instance resides and then click the ID of your RDS instance in the instance list.

    2. In the left-side navigation pane, click Plug-ins.

    3. On the Extension Marketplace tab, click Install in the rds_ai section.

    4. In the dialog box that appears, select your database and account, and then click Install.

    5. Wait until the instance status changes from Maintaining Instance to Running.

      Note

      The installation takes about one minute. You can refresh the page to check the instance status.

  2. Connect to your RDS for PostgreSQL instance and configure the rds_ai extension in the public schema of your database.

    1. Configure the API key used to invoke all default models.

      -- Configure the API key used to invoke all models specified in rds_ai.model_list.
      SELECT rds_ai.update_model(model_name,'token','sk-****') FROM rds_ai.model_list;
    2. The rds_ai extension invokes remote models by using the pgsql-http extension. Therefore, you must configure timeout settings to interrupt long-running invocations.

      Important

      The following timeout settings are available only in the current session. You must configure them again when you establish a new session.

      -- Configure the request timeout period in milliseconds.
      SET http.timeout_msec TO 200000;
      
      SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
      
      -- Configure the connection timeout period.
      SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');
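
Optionally, verify the configuration before you continue. The first query lists the models registered in rds_ai.model_list, whose tokens you updated in the previous step. The second statement sends a minimal test prompt; it assumes that the default model used by rds_ai.prompt is reachable from your instance, and it consumes a small number of tokens.

-- List the registered models whose tokens were configured.
SELECT model_name FROM rds_ai.model_list;

-- Send a minimal test prompt. This call invokes the remote model and consumes tokens.
SELECT rds_ai.prompt('Reply with the word OK.');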

Step 2: Prepare data for RAG

Use SQL statements to insert knowledge base document data into the database. Before insertion, escape special characters in the text. For example, use \n to replace line feeds.

  1. Create a table named doc to store document data and a table named chunk to store the chunks. Then, create indexes on the two tables.

    CREATE TABLE doc (
        id SERIAL PRIMARY KEY,
        title VARCHAR(255),
        content TEXT
    );
    
    CREATE TABLE chunk (
        id SERIAL PRIMARY KEY,
        doc_id INTEGER NOT NULL,
        text TEXT,
        embedding VECTOR(1024),
        ts_vector_extra tsvector
    );
    
    -- Create indexes on the created tables.
    CREATE INDEX idx_doc_id ON chunk (doc_id);
    
    CREATE INDEX chunk_text_gin ON chunk USING gin (ts_vector_extra);
    
    CREATE INDEX ON chunk USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
  2. Create the split_text function to split long text into chunks of a fixed length with a specified overlap between adjacent chunks. This function is for reference only. To split complex texts, we recommend that you use a specialized text chunking framework. A quick sanity check of this function is shown at the end of this step.

    CREATE OR REPLACE FUNCTION split_text(
        input_text TEXT,
        chunk_size INT,
        chunk_overlap INT
    ) RETURNS SETOF TEXT AS $$
    DECLARE
        current_idx INT;
        start_idx INT;
        chunk TEXT;
    BEGIN
        -- Raise an exception if the overlap length is greater than or equal to the chunk length.
        IF chunk_overlap >= chunk_size THEN
            RAISE EXCEPTION 'chunk_overlap must be less than chunk_size';
        END IF;
    
        current_idx := 1;
    
        LOOP
            -- Specify the starting position.
            start_idx := current_idx - chunk_overlap;
            
            IF start_idx < 1 THEN
                start_idx := 1;
            END IF;
    
            chunk := substr(input_text, start_idx, chunk_size);
    
            IF chunk IS NULL OR length(chunk) = 0 THEN
                EXIT;
            END IF;
    
            RETURN NEXT chunk;
            current_idx := current_idx + chunk_size - chunk_overlap;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;
  3. Create triggers so that when document data is inserted into the doc table, the chunks are automatically inserted into the chunk table and converted into embeddings and tsvectors.

    If the document data is complex and difficult to split into chunks, see Step 3: Convert document data into embeddings in batches.

    -- Create a trigger to automatically execute the split_text function to insert the split data into the chunk table when data is added to or deleted from the doc table.
    CREATE OR REPLACE FUNCTION insert_into_chunk() 
    RETURNS TRIGGER AS $$
    BEGIN
        INSERT INTO chunk (doc_id, text)
        SELECT NEW.id, result 
        FROM split_text(NEW.content, 300, 50) AS result;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER after_insert_doc
    AFTER INSERT ON doc
    FOR EACH ROW
    EXECUTE FUNCTION insert_into_chunk();
    
    CREATE OR REPLACE FUNCTION delete_from_chunk() 
    RETURNS TRIGGER AS $$
    BEGIN
        DELETE FROM chunk WHERE doc_id = OLD.id;
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER after_delete_doc
    AFTER DELETE ON doc
    FOR EACH ROW
    EXECUTE FUNCTION delete_from_chunk();
    
    -- Create a trigger to automatically execute the rds_ai.embed function to convert the chunks in the chunk table to embeddings.
    CREATE OR REPLACE FUNCTION update_chunk_embedding()
    RETURNS TRIGGER AS $$
    BEGIN
        NEW.embedding := rds_ai.embed(NEW.text)::vector;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER set_chunk_embedding
    BEFORE INSERT OR UPDATE ON chunk
    FOR EACH ROW
    EXECUTE FUNCTION update_chunk_embedding();
    
    -- Create a trigger to automatically convert the chunks in the chunk table to tsvectors for keyword-based retrieval.
    CREATE TRIGGER embedding_tsvector_update
    BEFORE UPDATE OR INSERT ON chunk
    FOR EACH ROW
    EXECUTE PROCEDURE tsvector_update_trigger(
        'ts_vector_extra',
        'pg_catalog.english',
        'text'
    );
  4. Insert knowledge base document data into the doc table. The data in the statement is for reference only.

    INSERT INTO doc (title, content) VALUES 
      ('A Brief History of PostgreSQL', 'The object-relational database management system now known as PostgreSQL is derived from the POSTGRES package written at the University of California, Berkeley. After more than two decades of development, PostgreSQL is the most advanced open-source database available in the world.\n\n2.1. The POSTGRES Project at Berkeley\nThe POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foundation (NSF), and ESL, Inc. The implementation of POSTGRES began in 1986. The original concepts of the system are detailed in [ston86]. The initial data model definition is found in [rowe87]. The design of the rule system at that time is described in [ston87a]. The theoretical foundation and architecture of the storage manager are described in detail in [ston87b].\n\nSince then, POSTGRES has undergone several major version updates. The first demonstration system was available in 1987 and was exhibited at the ACM-SIGMOD conference in 1988. Version 1 was released in June 1989 (see [ston90a]) for some external users. In response to criticism of the first rule system ([ston89]), the rule system was redesigned ([ston90b]), and version 2, using the new rule system, was released in June 1990. Version 3 appeared in 1991, adding support for multiple storage managers and improving the query executor and rewriting the rule system. Subsequent versions, until the release of Postgres95 (see below), focused mainly on portability and reliability.\n\nPOSTGRES has been used to implement many different research and production applications. These applications include a financial data analysis system, a jet engine performance monitoring package, an asteroid tracking database, a medical information database, and some geographic information systems. POSTGRES has also been used by many universities for teaching purposes. Finally, Illustra Information Technologies (later merged into Informix, which is now owned by IBM) commercialized the code. In late 1992, POSTGRES became the primary data manager for the Sequoia 2000 scientific computing project.\n\nIn 1993, the number of external users nearly doubled. As the number of users increased, the time spent on source code maintenance increased and took up too much time that should have been used for database research. To reduce the support burden, the POSTGRES project at Berkeley officially ended at version 4.2.\n\n2.2. Postgres95\nIn 1994, Andrew Yu and Jolly Chen added an SQL language interpreter to POSTGRES. They subsequently released the source code under the new name Postgres95 on the Internet for everyone to use, becoming the open-source successor to the original POSTGRES Berkeley code.\n\nThe source code of Postgres95 is entirely ANSI C, and the code size was reduced by 25%. Many internal modifications improved performance and maintainability. The 1.0.x version of Postgres95 was about 30-50% faster than version 4.2 of POSTGRES when running the Wisconsin Benchmark test. In addition to fixing some bugs, the following are some major improvements:\n\nThe original query language PostQUEL was replaced by SQL (implemented on the server side). The interface library libpq was named after PostQUEL. Before PostgreSQL, subqueries were not supported (see below), but they could be simulated in Postgres95 by user-defined SQL functions. Aggregate functions were re-implemented. 
Support for GROUP BY query clauses was also added.\n\nA new program for interactive SQL queries using the Readline of GNU was added (psql). This program largely replaced the old monitor program.\n\nA new frontend library (libpgtcl) was added to support Tcl-based clients. A sample shell (pgtclsh) provided new Tcl commands for interaction between Tcl programs and the Postgres95 server.\n\nThe interface for large objects was completely rewritten. The inversion file system was removed, retaining inversion as the only mechanism for storing large objects.\n\nThe instance-level rule system was removed. However, rules still exist in the form of rewrite rules.\n\nA brief tutorial introducing SQL and Postgres95 features was added to the released source code.\n\nPostgres95 could be compiled using unpatched GCC (fixing double-precision data alignment issues).\n\n\n2.3. PostgreSQL\nBy 1996, it was clear that the name Postgres95 was outdated. We chose a new name, PostgreSQL, to reflect the relationship between the original POSTGRES and the latest version with SQL capabilities. The version number also started from 6.0, returning to the sequence initially started by the Berkeley POSTGRES project.\n\nMany people continue to use Postgres to refer to PostgreSQL (now rarely in all uppercase letters) due to tradition or easier pronunciation. This usage is widely accepted as a nickname or alias.\n\nThe development of Postgres95 focused on identifying and understanding existing issues in the backend code. The development of PostgreSQL shifted to some controversial features and functions, with work in all areas being carried out simultaneously.\n\n Changes to PostgreSQL since then can be found in Appendix E.'),
      ('Overview of MySQL Database Management System', 'MySQL is the most popular open-source SQL database management system, developed, released, and supported by MySQL AB. MySQL AB is a commercial company founded by several MySQL developers. It is a second-generation open-source company that combines open-source values, methods, and a successful business model.\n\nThe MySQL website (http://www.mysql.com/) provides the latest information about MySQL and MySQL.\n\n·         MySQL is a database management system.\n\nA database is a structured collection of data. It can be anything from a simple shopping list to an art gallery or a vast amount of information in a corporate network. To add data to a database or access and process data stored in a computer database, you need to use a database management system such as the MySQL server. Computers are ideal tools for handling large amounts of data, so database management systems play a critical central role in computing, either as standalone utilities or as components of other applications.\n\nMySQL is a relational database management system.\n\nA relational database stores data in different tables rather than putting all data in one large repository. This increases speed and flexibility. SQL in MySQL refers to Structured Query Language. SQL is the most commonly used standardized language for accessing databases, defined by the ANSI/ISO SQL standard. The SQL standard has evolved since 1986, with several versions. In this manual, SQL-92 refers to the standard released in 1992, SQL:1999 refers to the standard released in 1999, and SQL:2003 refers to the current version of the standard. We use the term SQL standard to denote the current version of the SQL standard.\n\nMySQL software is open-source software.\n\nOpen source means that anyone can use and change the software. Anyone can download MySQL software from the Internet without paying any fees. If you want, you can study the source code and make appropriate changes to meet your own needs. MySQL software is licensed under the GPL (GNU General Public License), http://www.mysql.com/company/legal/licensing/).\n\nThe MySQL database server is fast, reliable, and easy to use.\n\nIf it is what you are looking for, give it a try. The MySQL server also has a set of practical features developed through close cooperation with our users. Our benchmark homepage provides comparison results of the MySQL server and other database managers. See 7.1.4 MySQL Benchmark Suite.\n\nThe MySQL server was initially developed to handle large databases, and it is faster than existing solutions. Over the years, it has been successfully used in many demanding production environments. Although MySQL is constantly evolving, the MySQL server currently offers rich and useful features. 
It has good connectivity, speed, and security, making MySQL very suitable for accessing databases on the Internet.\n\nThe MySQL server operates in client/server mode or embedded systems.\n\nThe MySQL database software is a client/server system consisting of a multithreaded SQL server that supports different backends, several different client programs and libraries, numerous management tools, and extensive application programming interfaces (APIs).\n\nWe can also provide the MySQL server as an embedded multithreaded library that you can link to your application, resulting in a smaller, faster, and more manageable product.\n\nThere is a large amount of shared MySQL software available.\n\nIt is very likely that your favorite applications and languages support the MySQL database server.\n\nThe official pronunciation of MySQL is My Ess Que Ell (not my sequel), but we do not mind if you pronounce it as my sequel or in any other local way.'),
      ('What is SQL Server?', 'Microsoft SQL Server is a relational database management system (RDBMS). Applications and tools connect to SQL Server instances or databases and communicate using Transact-SQL (T-SQL).\n\nDeployment Options\nSQL Server can be installed on Windows or Linux, deployed in Linux containers, or deployed on Azure virtual machines or other virtual machine platforms. You may have previously referred to it as a boxed product.\n\nWhich SQL Server versions are supported depends on your licensing agreement, but for the purposes of this document, we refer to SQL Server 2016 (13.x) and later versions. For documentation on SQL Server 2014 (12.x) and earlier versions, see the documentation for previous versions of SQL Server. To find out which SQL Server versions are currently supported, see SQL Server End of Support Options.\n\nThe following products and services also use the underlying SQL Server database engine:\n\nAzure SQL Database\nAzure SQL Managed Instance\nMicrosoft Analytics Platform System (PDW)\nAzure Synapse Analytics\nAzure SQL Edge\nFor a list of features supported by various versions of SQL Server on Windows, see:\n\nSQL Server 2022 versions and their supported features\nSQL Server 2019 versions and their supported features\nVersions and features supported by SQL Server 2017\nVersions and features supported by SQL Server 2016\nSQL Server Components and Technologies\nThis section introduces some key technologies available in SQL Server.\n\nComponent\tDescription\nDatabase Engine\tThe database engine is the core service for storing, processing, and protecting data. The database engine provides controlled access and transaction processing to meet the requirements of the most demanding data consumption applications within an enterprise. The database engine also provides comprehensive support for business continuity and database recovery (SQL Server) to maintain business continuity.\nMachine Learning Services (MLS)\tSQL Server Machine Learning Services supports integrating machine learning into enterprise workflows using popular R and Python languages.\n\nMachine Learning Services (in-database) integrates R and Python with SQL Server, allowing users to easily generate, retrain models, and score models by calling stored procedures. Machine Learning Server provides enterprise-level support for R and Python, allowing users to use them without SQL Server.\nIntegration Services (SSIS)\tSQL Server Integration Services is a platform for building high-performance data integration solutions, including packages for extract, transform, and load (ETL) processing for data warehouses.\nAnalysis Services (SSAS)\tSQL Server Analysis Services is an analytical data platform and toolset for personal, team, and corporate business intelligence. Server and client designers support traditional OLAP solutions, new tabular modeling solutions, and self-service analytics and collaboration by using Power Pivot, Excel, and SharePoint Server environments. Analysis Services also includes data mining, allowing you to discover patterns and relationships hidden in large amounts of data.\nReporting Services (SSRS)\tSQL Server Reporting Services provides web-enabled enterprise-level reporting capabilities. 
It allows users to create reports that extract content from multiple data sources, publish reports in various formats, and centrally manage security and subscriptions.\nReplication\tSQL Server Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Replication allows you to distribute data to different locations and distribute it to remote or mobile users over local area networks, wide area networks, dial-up connections, wireless connections, and the Internet.\nData Quality Services (DQS)\tData Quality Services provides a knowledge-driven data cleansing solution. DQS allows you to generate a knowledge base and then use this knowledge base, along with computer-assisted and interactive methods, to perform data correction and deduplication. You can use cloud-based reference data services and generate a data management solution that integrates DQS with SQL Server Integration Services and Master Data Services.\nMaster Data Services (MDS)\tMaster Data Services is the master data management solution for SQL Server. Solutions built on Master Data Services help ensure that reports and analyses are based on the appropriate information. With Master Data Services, you can create a central repository for master data and maintain an auditable, secure record of master data as it changes over time.\n')
    ;
  5. View the data in the chunk table. The data should include the text chunks split from the document data and the embeddings and tsvectors converted from the chunks.

    SELECT * FROM chunk;
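
If the chunk table is large, a more targeted check than SELECT * can confirm that every chunk received an embedding and a tsvector. The following query uses only the columns defined earlier in this step:

SELECT doc_id,
       COUNT(*) AS chunk_count,
       COUNT(*) FILTER (WHERE embedding IS NOT NULL) AS embedded_chunks,
       COUNT(*) FILTER (WHERE ts_vector_extra IS NOT NULL) AS indexed_chunks
FROM chunk
GROUP BY doc_id
ORDER BY doc_id;

You can also run the split_text function on its own to see how the chunk size and overlap parameters behave. The sample string and parameter values are only illustrative:

SELECT * FROM split_text('PostgreSQL is an advanced open-source relational database system.', 20, 5);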

Step 3: Convert document data into embeddings in batches

Skip this step if you created triggers for automatic text embedding following Substep 3 in Step 2: Prepare data for RAG.

In scenarios where document data is complex and difficult to split into chunks, we recommend that you use a RAG framework such as LangChain to develop a chunking and embedding script. You can also perform text embedding on an existing chunk table in the database: invoke the rds_ai.embed function to convert document data row by row, or add a custom model to convert document data in batches.

  • Convert document data row by row

    Invoke the rds_ai.embed function on each record to convert its text into an embedding and save the result in the embedding column.

    -- Invoke rds_ai.embed on every row in the table to generate embeddings.
    UPDATE chunk
    SET embedding = rds_ai.embed(text)::vector;
  • Convert document data in batches

    1. Invoke rds_ai.add_model to create the text-embedding-v3-batch model.

      SELECT rds_ai.add_model(
          'text-embedding-v3-batch',          -- Model name
          'POST',                             -- Request method 
          ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- Request header
          'https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings', -- Request URL
          'application/json',                 -- Request content type
          '{
              "model": "text-embedding-v3",
              "input": %s
          }',                               -- Request body
          'SELECT %L::jsonb->''data''' -- SQL to process the response
      );
    2. Configure the API key used to invoke the text-embedding-v3-batch model.

      SELECT rds_ai.update_model(
          'text-embedding-v3-batch',
          'token',
          'sk-xxxxxx'
      );
    3. Test the model.

      SELECT * 
      FROM rds_ai.invoke_model(
          'text-embedding-v3-batch',
          ARRAY['["01","02","03"]']
      );

You can choose either of the preceding embedding methods based on your test results. The following code provides an example of converting the data in the chunk table into embeddings in batches by using the text-embedding-v3-batch model:

DO $$
DECLARE
    batch_size INT := 20;         -- Batch size for embedding calls, can be adjusted based on the model
    pointer INT := 0;             -- Pointer to traverse the table
    record_count INT;             -- Total number of records in the table
    input_text TEXT;              -- Concatenated string of fields to be embedded in ["item1", "item2",..., "item n"] format
    json_results JSON;            -- JSON response from embedding API call
    json_item JSON;               -- Single element from the JSON array response containing index and embedding fields
    idx INT;                      -- Index from the JSON response
BEGIN
    -- Get the total number of records
    SELECT COUNT(*) INTO record_count FROM chunk;

    -- Loop through each batch
    WHILE pointer < record_count LOOP
        -- Construct input
        SELECT json_agg(text::TEXT) INTO input_text
        FROM (
            SELECT text
            FROM chunk
            ORDER BY id
            LIMIT batch_size OFFSET pointer
        ) AS subquery;

        -- Invoke the embedding model
        json_results := rds_ai.invoke_model('text-embedding-v3-batch', ARRAY[input_text]);

        -- Loop through the model results and update corresponding rows
        FOR idx IN 0..json_array_length(json_results) - 1 LOOP
            json_item := json_results->idx;
            UPDATE chunk
            SET embedding = (SELECT (json_item->>'embedding')::VECTOR(1024))
            WHERE id = (SELECT id FROM chunk ORDER BY id LIMIT 1 OFFSET pointer + idx);
        END LOOP;

        -- Update the pointer for the next batch
        pointer := pointer + batch_size;
        RAISE NOTICE '%/% done.', pointer, record_count;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
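
Regardless of which embedding method you use, you can check whether any chunks still lack embeddings before you move on to retrieval:

-- Count chunks that have not been converted into embeddings yet.
SELECT COUNT(*) AS missing_embeddings FROM chunk WHERE embedding IS NULL;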

Step 4: Implement multimodal retrieval

Implement data retrieval based on embeddings and keywords. The following code provides an example:

CREATE OR REPLACE FUNCTION multi_retrieve(query TEXT)
RETURNS TABLE(ret_chunk text, score numeric, method text, rank int) AS $$
DECLARE
    rec RECORD;
BEGIN
    -- Retrieve by keyword
    FOR rec IN
        SELECT
            text,
            subquery.score,
            'retrieve_by_key_word' AS method,
            RANK() OVER (ORDER BY subquery.score DESC) AS rank_id
        FROM (
            SELECT
                text,
                ts_rank(
                    ts_vector_extra,
                    to_tsquery(replace(
                        text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
                    ))
                ) AS score
            FROM chunk
            WHERE ts_vector_extra @@ to_tsquery(replace(
                text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
            ))
        ) AS subquery
        ORDER BY subquery.score DESC
        LIMIT 5
    LOOP
        ret_chunk := rec.text;
        score := rec.score;
        method := rec.method;
        rank := rec.rank_id;
        RETURN NEXT;
    END LOOP;

    -- Retrieve by vector
    FOR rec IN
        SELECT
            *,
            'retrieve_by_vector' AS method,
            RANK() OVER (ORDER BY distance) AS rank_id
        FROM rds_ai.retrieve(
            query, 
            'public', 
            'chunk', 
            'text', 
            'embedding', 
            distance_type => 'cosine', 
            topn => 5
        )
    LOOP
        ret_chunk := rec.chunk;
        score := rec.distance;
        method := rec.method;
        rank := rec.rank_id;
        RETURN NEXT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
Note

To retrieve data based only on embeddings, call the rds_ai.retrieve function.
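
Based on the call signature used in the multi_retrieve function above, an embedding-only retrieval might look like the following example. The chunk and distance output columns are the ones referenced in that function:

SELECT chunk, distance
FROM rds_ai.retrieve(
    'Introduce the origins of the PostgreSQL database.',
    'public',
    'chunk',
    'text',
    'embedding',
    distance_type => 'cosine',
    topn => 5
);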

You can execute the following statement to test whether multimodal retrieval works.

SELECT * 
FROM multi_retrieve('Introduce the origins of the PostgreSQL database.') 
ORDER BY METHOD, RANK;

Step 5: Rerank the result

Rerank multimodal retrieval result

You can use the rds_ai.rank function or a custom RFF function, which implements Reciprocal Rank Fusion (RRF), to rerank the multimodal retrieval result.

Use rds_ai.rank for reranking

DROP FUNCTION IF EXISTS get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
    result_array TEXT[];
BEGIN
    SELECT array_agg(ret_chunk)
    INTO result_array
    FROM multi_retrieve(query);

    RETURN QUERY
    SELECT *
    FROM rds_ai.rank(query, result_array)
    ORDER BY score_value DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;

SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');

Use the RFF function for reranking

-- Create an aggregate function that implements RRF (Reciprocal Rank Fusion) scoring.
CREATE TYPE score_agg_state AS (
    vector_score numeric,
    keyword_score numeric
);

CREATE OR REPLACE FUNCTION score_agg_transfn(state score_agg_state, rank numeric, method text)
RETURNS score_agg_state AS $$
BEGIN
    CASE method
        WHEN 'retrieve_by_vector' THEN
            state.vector_score := COALESCE(1 / (60+rank), 0);
        WHEN 'retrieve_by_key_word' THEN
            state.keyword_score := COALESCE(1 / (60+rank), 0);
    END CASE;
    RETURN state;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION score_agg_finalfn(state score_agg_state)
RETURNS numeric AS $$
BEGIN
    RETURN COALESCE(state.vector_score, 0) + COALESCE(state.keyword_score, 0);
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE rff_function(numeric, text)(
    SFUNC = score_agg_transfn,
    STYPE = score_agg_state,
    FINALFUNC = score_agg_finalfn,
    INITCOND = '(0, 0)'
);

Invoke the function to rerank the result.

CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score numeric, value TEXT) AS $$
BEGIN
    RETURN QUERY
    SELECT 
        rff_function(rank, method) AS score,  --  Use rff_function(rank, method) to calculate the score
        ret_chunk AS value
    FROM 
        multi_retrieve(query)
    GROUP BY 
        value     
    ORDER BY 
        score DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;
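
As with the rds_ai.rank-based variant, you can verify the RRF-based function with a test query. The top_n argument is set to 3 here only to display several fused results:

SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.', 3);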

Rerank embedding-based retrieval result

You can execute the following SQL statement to invoke the rds_ai.rank function to rerank the result of embedding-based retrieval:

DROP FUNCTION IF EXISTS get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
    result_array TEXT[];
BEGIN
    SELECT array_agg(chunk)
    INTO result_array
    FROM rds_ai.retrieve(query, 'public', 'chunk', 'text', 'embedding');
    RETURN QUERY
    SELECT *
    FROM rds_ai.rank(query, result_array)
    ORDER BY score_value DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;

-- Test
SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');

Step 6: Invoke LLMs to return the result

Return multimodal retrieval result

  1. Create the escape_for_json function to escape special characters, such as line feeds, in the retrieved text within the database. This step is necessary because the rds_ai.prompt function assembles its parameters into a JSON request object. A quick check of this helper is shown after this procedure.

    CREATE OR REPLACE FUNCTION escape_for_json(input TEXT)
    RETURNS TEXT AS $$
    BEGIN
        RETURN replace(replace(input, '"', '\\"'), E'\n', '\\n');
    END;
    $$ LANGUAGE plpgsql;
  2. Create a function named rag to return the multimodal retrieval result. The following code provides an example:

    CREATE OR REPLACE FUNCTION rag(query TEXT) RETURNS TEXT AS $$
    DECLARE
      prompt_content TEXT;
      result TEXT;
    BEGIN
      -- Use the get_reranked_results function to obtain the reranked results and aggregate them into an array
      WITH rank_result AS (
        SELECT string_agg(value, ',') AS prompt_content_pre 
        FROM get_reranked_results(query)
      )
      -- Build the prompt from the reranked content returned by the rank_result query
      SELECT 
        'Based on the following content: ' || prompt_content_pre || ', answer my question: ' || query
      INTO prompt_content
      FROM rank_result;
    
      -- Use the rds_ai.prompt function to obtain the AI model's answer
      SELECT 
        rds_ai.prompt(escape_for_json(prompt_content))
      INTO result;
    
      -- Return the result
      RETURN result;
    END;
    $$ LANGUAGE plpgsql;
  3. Invoke the rag function to get the answer obtained by using multimodal retrieval.

    SELECT rag('Introduce the origins of the PostgreSQL database.');
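
You can also check the escape_for_json helper created in this procedure on its own. The sample text is only illustrative:

SELECT escape_for_json(E'He said "hello".\nSecond line.');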

Return embedding-based retrieval result

To get the embedding-based retrieval result, execute the following statement to invoke the rds_ai.rag function:

SELECT *
FROM rds_ai.rag(
    'Introduce the origins of the PostgreSQL database.', 
    'public', 
    'chunk', 
    'text', 
    'embedding'
);