All Products
Search
Document Center

ApsaraDB RDS:Implementasikan RAG menggunakan ekstensi rds_ai

Last Updated:Mar 29, 2026

Ekstensi rds_ai untuk ApsaraDB RDS for PostgreSQL mengintegrasikan model Alibaba Cloud Model Studio — termasuk Qwen dan model text embedding — langsung ke dalam database Anda. Tutorial ini memandu Anda membangun pipeline Retrieval-Augmented Generation (RAG) lengkap sepenuhnya dalam SQL, tanpa perlu menulis kode aplikasi atau mengelola framework RAG eksternal.

Prasyarat

Sebelum memulai, pastikan Anda telah memiliki:

Cara kerja

Pipeline ini berjalan sepenuhnya di dalam instans RDS Anda melalui lima tahap:

  1. Document chunking — Membagi dokumen menjadi chunk berukuran tetap menggunakan SQL. (Berjalan secara lokal di instans PostgreSQL Anda.)

  2. Text embedding — Mengonversi chunk menjadi vektor embedding dan tsvectors menggunakan rds_ai. (Memanggil model embedding jarak jauh.)

  3. Multimodal retrieval — Mengambil kandidat berdasarkan kemiripan vektor dan pencocokan kata kunci. (Berjalan secara lokal.)

  4. Result reranking — Menilai ulang kandidat untuk menampilkan hasil yang paling relevan. (Algoritma lokal atau model reranking jarak jauh.)

  5. LLM invocation — Mengirim hasil teratas ke LLM dan mengembalikan jawaban. (Memanggil model chat jarak jauh.)

Fungsi utama

FunctionDescription
rds_ai.update_model(model_name, 'token', 'sk-...')Atur Kunci API untuk suatu model
rds_ai.embed(text)::vectorKonversi teks menjadi vektor embedding (pemanggilan jarak jauh)
rds_ai.retrieve(query, schema, table, text_col, embedding_col)Ambil chunk berdasarkan kemiripan vektor
rds_ai.rank(query, result_array)Susun ulang chunk yang diambil berdasarkan skor relevansi (pemanggilan jarak jauh)
rds_ai.prompt(prompt_text)Kirim prompt ke LLM default dan kembalikan respons (pemanggilan jarak jauh)
rds_ai.rag(query, schema, table, text_col, embedding_col)Jalankan pengambilan vektor dan pemanggilan LLM dalam satu panggilan
rds_ai.add_model(...)Daftarkan model kustom
http.http_set_curlopt(option, value)Atur opsi curl HTTP untuk sesi saat ini

Langkah 1: Instal dan konfigurasi rds_ai

Instal ekstensi

  1. Buka halaman Instances di Konsol ApsaraDB RDS. Di bilah navigasi atas, pilih wilayah tempat instans Anda berada, lalu klik ID instans.

  2. Di panel navigasi sebelah kiri, klik Plug-ins.

  3. Di tab Extension Marketplace, klik Install di bagian rds_ai.

  4. Pada dialog yang muncul, pilih database dan akun Anda, lalu klik Install.

  5. Tunggu hingga status instans berubah dari Maintaining Instance menjadi Running.

Catatan

Instalasi memerlukan waktu sekitar satu menit. Muat ulang halaman untuk memeriksa status terkini.

Konfigurasi ekstensi

Hubungkan ke instans RDS for PostgreSQL Anda dan jalankan pernyataan berikut di skema public.

  1. Atur Kunci API untuk semua model default:

    -- Konfigurasikan Kunci API untuk semua model yang tercantum di rds_ai.model_list.
    SELECT rds_ai.update_model(model_name, 'token', 'sk-****') FROM rds_ai.model_list;
  2. Konfigurasi timeout permintaan. Ekstensi rds_ai memanggil model melalui HTTP menggunakan ekstensi pgsql-http, sehingga pemanggilan yang memakan waktu lama memerlukan timeout eksplisit.

    Penting

    Pengaturan ini hanya berlaku untuk sesi saat ini. Konfigurasi ulang setiap kali Anda membuka sesi baru.

    -- Atur timeout permintaan dalam milidetik.
    SET http.timeout_msec TO 200000;
    
    SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');
    
    -- Atur timeout koneksi dalam milidetik.
    SELECT http.http_set_curlopt('CURLOPT_CONNECTTIMEOUT_MS', '200000');

Langkah 2: Siapkan data untuk RAG

Langkah ini menyiapkan tabel, logika chunking, dan trigger yang secara otomatis membuat embedding dokumen saat dimasukkan.

  1. Buat tabel doc untuk dokumen mentah dan tabel chunk untuk chunk yang diproses, lalu tambahkan indeks:

    CREATE TABLE doc (
        id SERIAL PRIMARY KEY,
        title VARCHAR(255),
        content TEXT
    );
    
    CREATE TABLE chunk (
        id SERIAL PRIMARY KEY,
        doc_id INTEGER NOT NULL,
        text TEXT,
        embedding VECTOR(1024),
        ts_vector_extra tsvector
    );
    
    -- Indeks untuk menggabungkan chunk kembali ke dokumen sumbernya.
    CREATE INDEX idx_doc_id ON chunk (doc_id);
    
    -- Indeks HNSW untuk pencarian vektor tetangga terdekat aproksimasi.
    -- Gunakan indeks default untuk sebagian besar beban kerja.
    -- Gunakan indeks dengan parameter m dan ef_construction eksplisit
    -- untuk menyesuaikan pertukaran antara kecepatan dan recall pada dataset besar.
    CREATE INDEX ON chunk USING hnsw (embedding vector_cosine_ops);
    CREATE INDEX ON chunk USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
    
    -- Indeks GIN untuk pencarian kata kunci berbasis full-text.
    CREATE INDEX chunk_text_gin ON chunk USING gin (ts_vector_extra);
  2. Buat fungsi split_text untuk membagi teks panjang menjadi chunk berukuran tetap dengan overlap yang dapat dikonfigurasi. Untuk dokumen kompleks, gunakan framework chunking teks khusus sebagai gantinya.

    CREATE OR REPLACE FUNCTION split_text(
        input_text TEXT,
        chunk_size INT,
        chunk_overlap INT
    ) RETURNS SETOF TEXT AS $$
    DECLARE
        current_idx INT;
        start_idx INT;
        chunk TEXT;
    BEGIN
        -- Validasi bahwa overlap lebih kecil dari ukuran chunk.
        IF chunk_overlap >= chunk_size THEN
            RAISE EXCEPTION 'chunk_overlap must be less than chunk_size';
        END IF;
    
        current_idx := 1;
    
        LOOP
            -- Terapkan overlap dengan memindahkan posisi awal ke belakang.
            start_idx := current_idx - chunk_overlap;
    
            IF start_idx < 1 THEN
                start_idx := 1;
            END IF;
    
            chunk := substr(input_text, start_idx, chunk_size);
    
            IF chunk IS NULL OR length(chunk) = 0 THEN
                EXIT;
            END IF;
    
            RETURN NEXT chunk;
            current_idx := current_idx + chunk_size - chunk_overlap;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;
  3. Buat trigger untuk mengotomatiskan chunking, embedding, dan pembuatan tsvector setiap kali dokumen dimasukkan:

    Catatan

    Lewati langkah ini dan langsung ke Langkah 3 jika dokumen Anda kompleks dan sulit dibagi menggunakan fungsi berukuran tetap.

    -- Bagi dokumen yang dimasukkan menjadi chunk dan tulis ke tabel chunk.
    CREATE OR REPLACE FUNCTION insert_into_chunk()
    RETURNS TRIGGER AS $$
    BEGIN
        INSERT INTO chunk (doc_id, text)
        SELECT NEW.id, result
        FROM split_text(NEW.content, 300, 50) AS result;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER after_insert_doc
    AFTER INSERT ON doc
    FOR EACH ROW
    EXECUTE FUNCTION insert_into_chunk();
    
    -- Hapus chunk saat dokumen sumbernya dihapus.
    CREATE OR REPLACE FUNCTION delete_from_chunk()
    RETURNS TRIGGER AS $$
    BEGIN
        DELETE FROM chunk WHERE doc_id = OLD.id;
        RETURN OLD;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER after_delete_doc
    AFTER DELETE ON doc
    FOR EACH ROW
    EXECUTE FUNCTION delete_from_chunk();
    
    -- Panggil rds_ai.embed untuk menghasilkan vektor embedding untuk setiap chunk baru atau yang diperbarui.
    CREATE OR REPLACE FUNCTION update_chunk_embedding()
    RETURNS TRIGGER AS $$
    BEGIN
        NEW.embedding := rds_ai.embed(NEW.text)::vector;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    
    CREATE TRIGGER set_chunk_embedding
    BEFORE INSERT OR UPDATE ON chunk
    FOR EACH ROW
    EXECUTE FUNCTION update_chunk_embedding();
    
    -- Bangun tsvectors untuk pengambilan berbasis kata kunci.
    CREATE TRIGGER embedding_tsvector_update
    BEFORE UPDATE OR INSERT ON chunk
    FOR EACH ROW
    EXECUTE PROCEDURE tsvector_update_trigger(
        'ts_vector_extra',
        'pg_catalog.english',
        'text'
    );
  4. Masukkan dokumen ke dalam tabel doc. Trigger yang dibuat pada langkah sebelumnya secara otomatis membagi, membuat embedding, dan mengindeks setiap dokumen. Data sampel berikut hanya sebagai referensi:

    INSERT INTO doc (title, content) VALUES
      ('A Brief History of PostgreSQL', 'The object-relational database management system now known as PostgreSQL is derived from the POSTGRES package written at the University of California, Berkeley. After more than two decades of development, PostgreSQL is the most advanced open-source database available in the world.\n\n2.1. The POSTGRES Project at Berkeley\nThe POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foundation (NSF), and ESL, Inc. The implementation of POSTGRES began in 1986. The original concepts of the system are detailed in [ston86]. The initial data model definition is found in [rowe87]. The design of the rule system at that time is described in [ston87a]. The theoretical foundation and architecture of the storage manager are described in detail in [ston87b].\n\nSince then, POSTGRES has undergone several major version updates. The first demonstration system was available in 1987 and was exhibited at the ACM-SIGMOD conference in 1988. Version 1 was released in June 1989 (see [ston90a]) for some external users. In response to criticism of the first rule system ([ston89]), the rule system was redesigned ([ston90b]), and version 2, using the new rule system, was released in June 1990. Version 3 appeared in 1991, adding support for multiple storage managers and improving the query executor and rewriting the rule system. Subsequent versions, until the release of Postgres95 (see below), focused mainly on portability and reliability.\n\nPOSTGRES has been used to implement many different research and production applications. These applications include a financial data analysis system, a jet engine performance monitoring package, an asteroid tracking database, a medical information database, and some geographic information systems. POSTGRES has also been used by many universities for teaching purposes. Finally, Illustra Information Technologies (later merged into Informix, which is now owned by IBM) commercialized the code. In late 1992, POSTGRES became the primary data manager for the Sequoia 2000 scientific computing project.\n\nIn 1993, the number of external users nearly doubled. As the number of users increased, the time spent on source code maintenance increased and took up too much time that should have been used for database research. To reduce the support burden, the POSTGRES project at Berkeley officially ended at version 4.2.\n\n2.2. Postgres95\nIn 1994, Andrew Yu and Jolly Chen added an SQL language interpreter to POSTGRES. They subsequently released the source code under the new name Postgres95 on the Internet for everyone to use, becoming the open-source successor to the original POSTGRES Berkeley code.\n\nThe source code of Postgres95 is entirely ANSI C, and the code size was reduced by 25%. Many internal modifications improved performance and maintainability. The 1.0.x version of Postgres95 was about 30-50% faster than version 4.2 of POSTGRES when running the Wisconsin Benchmark test. In addition to fixing some bugs, the following are some major improvements:\n\nThe original query language PostQUEL was replaced by SQL (implemented on the server side). The interface library libpq was named after PostQUEL. Before PostgreSQL, subqueries were not supported (see below), but they could be simulated in Postgres95 by user-defined SQL functions. Aggregate functions were re-implemented. Support for GROUP BY query clauses was also added.\n\nA new program for interactive SQL queries using the Readline of GNU was added (psql). This program largely replaced the old monitor program.\n\nA new frontend library (libpgtcl) was added to support Tcl-based clients. A sample shell (pgtclsh) provided new Tcl commands for interaction between Tcl programs and the Postgres95 server.\n\nThe interface for large objects was completely rewritten. The inversion file system was removed, retaining inversion as the only mechanism for storing large objects.\n\nThe instance-level rule system was removed. However, rules still exist in the form of rewrite rules.\n\nA brief tutorial introducing SQL and Postgres95 features was added to the released source code.\n\nPostgres95 could be compiled using unpatched GCC (fixing double-precision data alignment issues).\n\n\n2.3. PostgreSQL\nBy 1996, it was clear that the name Postgres95 was outdated. We chose a new name, PostgreSQL, to reflect the relationship between the original POSTGRES and the latest version with SQL capabilities. The version number also started from 6.0, returning to the sequence initially started by the Berkeley POSTGRES project.\n\nMany people continue to use Postgres to refer to PostgreSQL (now rarely in all uppercase letters) due to tradition or easier pronunciation. This usage is widely accepted as a nickname or alias.\n\nThe development of Postgres95 focused on identifying and understanding existing issues in the backend code. The development of PostgreSQL shifted to some controversial features and functions, with work in all areas being carried out simultaneously.\n\n Changes to PostgreSQL since then can be found in Appendix E.'),
      ('Overview of MySQL Database Management System', 'MySQL is the most popular open-source SQL database management system, developed, released, and supported by MySQL AB. MySQL AB is a commercial company founded by several MySQL developers. It is a second-generation open-source company that combines open-source values, methods, and a successful business model.\n\nThe MySQL website (http://www.mysql.com/) provides the latest information about MySQL and MySQL.\n\n·         MySQL is a database management system.\n\nA database is a structured collection of data. It can be anything from a simple shopping list to an art gallery or a vast amount of information in a corporate network. To add data to a database or access and process data stored in a computer database, you need to use a database management system such as the MySQL server. Computers are ideal tools for handling large amounts of data, so database management systems play a critical central role in computing, either as standalone utilities or as components of other applications.\n\nMySQL is a relational database management system.\n\nA relational database stores data in different tables rather than putting all data in one large repository. This increases speed and flexibility. SQL in MySQL refers to Structured Query Language. SQL is the most commonly used standardized language for accessing databases, defined by the ANSI/ISO SQL standard. The SQL standard has evolved since 1986, with several versions. In this manual, SQL-92 refers to the standard released in 1992, SQL:1999 refers to the standard released in 1999, and SQL:2003 refers to the current version of the standard. We use the term SQL standard to denote the current version of the SQL standard.\n\nMySQL software is open-source software.\n\nOpen source means that anyone can use and change the software. Anyone can download MySQL software from the Internet without paying any fees. If you want, you can study the source code and make appropriate changes to meet your own needs. MySQL software is licensed under the GPL (GNU General Public License), http://www.mysql.com/company/legal/licensing/).\n\nThe MySQL database server is fast, reliable, and easy to use.\n\nIf it is what you are looking for, give it a try. The MySQL server also has a set of practical features developed through close cooperation with our users. Our benchmark homepage provides comparison results of the MySQL server and other database managers. See 7.1.4 MySQL Benchmark Suite.\n\nThe MySQL server was initially developed to handle large databases, and it is faster than existing solutions. Over the years, it has been successfully used in many demanding production environments. Although MySQL is constantly evolving, the MySQL server currently offers rich and useful features. It has good connectivity, speed, and security, making MySQL very suitable for accessing databases on the Internet.\n\nThe MySQL server operates in client/server mode or embedded systems.\n\nThe MySQL database software is a client/server system consisting of a multithreaded SQL server that supports different backends, several different client programs and libraries, numerous management tools, and extensive application programming interfaces (APIs).\n\nWe can also provide the MySQL server as an embedded multithreaded library that you can link to your application, resulting in a smaller, faster, and more manageable product.\n\nThere is a large amount of shared MySQL software available.\n\nIt is very likely that your favorite applications and languages support the MySQL database server.\n\nThe official pronunciation of MySQL is My Ess Que Ell (not my sequel), but we do not mind if you pronounce it as my sequel or in any other local way.'),
      ('What is SQL Server?', 'Microsoft SQL Server is a relational database management system (RDBMS). Applications and tools connect to SQL Server instances or databases and communicate using Transact-SQL (T-SQL).\n\nDeployment Options\nSQL Server can be installed on Windows or Linux, deployed in Linux containers, or deployed on Azure virtual machines or other virtual machine platforms. You may have previously referred to it as a boxed product.\n\nWhich SQL Server versions are supported depends on your licensing agreement, but for the purposes of this document, we refer to SQL Server 2016 (13.x) and later versions. For documentation on SQL Server 2014 (12.x) and earlier versions, see the documentation for previous versions of SQL Server. To find out which SQL Server versions are currently supported, see SQL Server End of Support Options.\n\nThe following products and services also use the underlying SQL Server database engine:\n\nAzure SQL Database\nAzure SQL Managed Instance\nMicrosoft Analytics Platform System (PDW)\nAzure Synapse Analytics\nAzure SQL Edge\nFor a list of features supported by various versions of SQL Server on Windows, see:\n\nSQL Server 2022 versions and their supported features\nSQL Server 2019 versions and their supported features\nVersions and features supported by SQL Server 2017\nVersions and features supported by SQL Server 2016\nSQL Server Components and Technologies\nThis section introduces some key technologies available in SQL Server.\n\nComponent\tDescription\nDatabase Engine\tThe database engine is the core service for storing, processing, and protecting data. The database engine provides controlled access and transaction processing to meet the requirements of the most demanding data consumption applications within an enterprise. The database engine also provides comprehensive support for business continuity and database recovery (SQL Server) to maintain business continuity.\nMachine Learning Services (MLS)\tSQL Server Machine Learning Services supports integrating machine learning into enterprise workflows using popular R and Python languages.\n\nMachine Learning Services (in-database) integrates R and Python with SQL Server, allowing users to easily generate, retrain models, and score models by calling stored procedures. Machine Learning Server provides enterprise-level support for R and Python, allowing users to use them without SQL Server.\nIntegration Services (SSIS)\tSQL Server Integration Services is a platform for building high-performance data integration solutions, including packages for extract, transform, and load (ETL) processing for data warehouses.\nAnalysis Services (SSAS)\tSQL Server Analysis Services is an analytical data platform and toolset for personal, team, and corporate business intelligence. Server and client designers support traditional OLAP solutions, new tabular modeling solutions, and self-service analytics and collaboration by using Power Pivot, Excel, and SharePoint Server environments. Analysis Services also includes data mining, allowing you to discover patterns and relationships hidden in large amounts of data.\nReporting Services (SSRS)\tSQL Server Reporting Services provides web-enabled enterprise-level reporting capabilities. It allows users to create reports that extract content from multiple data sources, publish reports in various formats, and centrally manage security and subscriptions.\nReplication\tSQL Server Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases to maintain consistency. Replication allows you to distribute data to different locations and distribute it to remote or mobile users over local area networks, wide area networks, dial-up connections, wireless connections, and the Internet.\nData Quality Services (DQS)\tData Quality Services provides a knowledge-driven data cleansing solution. DQS allows you to generate a knowledge base and then use this knowledge base, along with computer-assisted and interactive methods, to perform data correction and deduplication. You can use cloud-based reference data services and generate a data management solution that integrates DQS with SQL Server Integration Services and Master Data Services.\nMaster Data Services (MDS)\tMaster Data Services is the master data management solution for SQL Server. Solutions built on Master Data Services help ensure that reports and analyses are based on the appropriate information. With Master Data Services, you can create a central repository for master data and maintain an auditable, secure record of master data as it changes over time.\n')
    ;
  5. Verifikasi bahwa tabel chunk berisi chunk yang telah dibagi, vektor embedding-nya, dan tsvectors:

    SELECT * FROM chunk;

Langkah 3: Hasilkan embedding secara batch

Lewati langkah ini jika Anda telah membuat trigger embedding di Langkah 2. Trigger tersebut secara otomatis menghasilkan embedding saat dokumen dimasukkan.

Untuk dokumen kompleks yang tidak dapat dibagi oleh fungsi berukuran tetap, gunakan framework seperti LangChain untuk menyiapkan chunk terlebih dahulu. Setelah chunk berada di tabel chunk tanpa embedding, gunakan salah satu metode berikut untuk menghasilkannya.

Penting

Gunakan model embedding yang sama di sini seperti yang Anda gunakan saat membuat chunk. Membandingkan embedding yang dihasilkan oleh model berbeda menghasilkan hasil pengambilan yang tidak bermakna.

Opsi 1: Rekursif (disarankan)

Perbarui semua baris dalam satu pernyataan. rds_ai memanggil API embedding sekali per baris:

UPDATE chunk
SET embedding = rds_ai.embed(text)::vector;

Opsi 2: Batch

Gunakan mode batch saat Anda memerlukan kontrol lebih ketat atas batching pemanggilan API.

  1. Daftarkan model embedding batch:

    SELECT rds_ai.add_model(
        'text-embedding-v3-batch',          -- Nama model
        'POST',                             -- Metode permintaan
        ARRAY[('Authorization', 'Bearer %s')]::http.http_header[],  -- Header permintaan
        'https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings', -- URL permintaan
        'application/json',                 -- Jenis konten
        '{
            "model": "text-embedding-v3",
            "input": %s
        }',                                 -- Isi permintaan
        'SELECT %L::jsonb->''data'''        -- SQL untuk memproses respons
    );
  2. Atur Kunci API untuk model batch:

    SELECT rds_ai.update_model(
        'text-embedding-v3-batch',
        'token',
        'sk-xxxxxx'
    );
  3. Uji model:

    SELECT *
    FROM rds_ai.invoke_model(
        'text-embedding-v3-batch',
        ARRAY['["01","02","03"]']
    );
  4. Jalankan loop embedding batch:

    DO $$
    DECLARE
        batch_size INT := 20;         -- Jumlah baris per panggilan API; sesuaikan berdasarkan batas model
        pointer INT := 0;             -- Offset saat ini ke dalam tabel
        record_count INT;             -- Total jumlah baris yang akan diproses
        input_text TEXT;              -- Array JSON teks: ["item1", "item2", ..., "item n"]
        json_results JSON;            -- Respons API embedding
        json_item JSON;               -- Elemen tunggal dari array respons
        idx INT;                      -- Indeks dalam batch saat ini
    BEGIN
        SELECT COUNT(*) INTO record_count FROM chunk;
    
        WHILE pointer < record_count LOOP
            -- Bangun array input untuk batch ini.
            SELECT json_agg(text::TEXT) INTO input_text
            FROM (
                SELECT text
                FROM chunk
                ORDER BY id
                LIMIT batch_size OFFSET pointer
            ) AS subquery;
    
            -- Panggil model embedding.
            json_results := rds_ai.invoke_model('text-embedding-v3-batch', ARRAY[input_text]);
    
            -- Tulis setiap embedding kembali ke baris yang sesuai.
            FOR idx IN 0..json_array_length(json_results) - 1 LOOP
                json_item := json_results->idx;
                UPDATE chunk
                SET embedding = (SELECT (json_item->>'embedding')::VECTOR(1024))
                WHERE id = (SELECT id FROM chunk ORDER BY id LIMIT 1 OFFSET pointer + idx);
            END LOOP;
    
            pointer := pointer + batch_size;
            RAISE NOTICE '%/% rows processed.', pointer, record_count;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;
Catatan

Anda dapat memilih metode embedding mana pun sesuai kebutuhan. Namun, metode rekursif (Opsi 1) disarankan untuk sebagian besar kasus penggunaan.

Langkah 4: Implementasikan pengambilan multimodal

Pencarian vektor menemukan konten yang terkait secara semantis meskipun kata-kata persisnya berbeda. Pencarian kata kunci menemukan hasil yang mengandung istilah spesifik dalam kueri. Menggabungkan keduanya meningkatkan recall untuk berbagai jenis kueri: untuk pertanyaan tentang kode kesalahan tertentu, pencarian kata kunci menemukan istilah persisnya; untuk pertanyaan konseptual di mana redaksinya bervariasi, pencarian vektor menemukan bagian yang relevan.

Buat fungsi multi_retrieve yang menjalankan kedua metode pengambilan dan mengembalikan hasil gabungannya:

CREATE OR REPLACE FUNCTION multi_retrieve(query TEXT)
RETURNS TABLE(ret_chunk text, score numeric, method text, rank int) AS $$
DECLARE
    rec RECORD;
BEGIN
    -- Pengambilan berbasis kata kunci menggunakan pencarian full-text tsvector.
    FOR rec IN
        SELECT
            text,
            subquery.score,
            'retrieve_by_key_word' AS method,
            RANK() OVER (ORDER BY subquery.score DESC) AS rank_id
        FROM (
            SELECT
                text,
                ts_rank(
                    ts_vector_extra,
                    to_tsquery(replace(
                        text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
                    ))
                ) AS score
            FROM chunk
            WHERE ts_vector_extra @@ to_tsquery(replace(
                text(plainto_tsquery('pg_catalog.english', query)), '&', '|'
            ))
        ) AS subquery
        ORDER BY subquery.score DESC
        LIMIT 5
    LOOP
        ret_chunk := rec.text;
        score := rec.score;
        method := rec.method;
        rank := rec.rank_id;
        RETURN NEXT;
    END LOOP;

    -- Pengambilan vektor menggunakan jarak kosinus pada vektor embedding.
    FOR rec IN
        SELECT
            *,
            'retrieve_by_vector' AS method,
            RANK() OVER (ORDER BY distance) AS rank_id
        FROM rds_ai.retrieve(
            query,
            'public',
            'chunk',
            'text',
            'embedding',
            distance_type => 'cosine',
            topn => 5
        )
    LOOP
        ret_chunk := rec.chunk;
        score := rec.distance;
        method := rec.method;
        rank := rec.rank_id;
        RETURN NEXT;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
Catatan

Untuk mengambil hanya berdasarkan kemiripan vektor, panggil rds_ai.retrieve secara langsung alih-alih menggunakan multi_retrieve.

Uji pengambilan dengan kueri sampel:

SELECT *
FROM multi_retrieve('Introduce the origins of the PostgreSQL database.')
ORDER BY method, rank;

Langkah 5: Susun ulang hasil

Penyusunan ulang menilai kembali hasil pengambilan gabungan untuk menampilkan chunk yang paling relevan sebelum diteruskan ke LLM. Tersedia dua pendekatan.

Gunakan rds_ai.rank (penyusunan ulang berbasis model)

rds_ai.rank memanggil model reranking jarak jauh untuk memberi skor setiap chunk kandidat terhadap kueri. Pendekatan ini lebih akurat tetapi menambahkan panggilan API jarak jauh:

DROP FUNCTION get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
    result_array TEXT[];
BEGIN
    SELECT array_agg(ret_chunk)
    INTO result_array
    FROM multi_retrieve(query);

    RETURN QUERY
    SELECT *
    FROM rds_ai.rank(query, result_array)
    ORDER BY score_value DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;

SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');

Gunakan fungsi RRF (penyusunan ulang berbasis algoritma)

Reciprocal Rank Fusion (RRF) menggabungkan peringkat vektor dan kata kunci menggunakan formula skor, tanpa panggilan API jarak jauh. Gunakan pendekatan ini saat latensi rendah lebih penting daripada akurasi penyusunan ulang.

Skor RRF untuk setiap chunk adalah 1/(60 + vector_rank) + 1/(60 + keyword_rank). Chunk yang mendapat peringkat tinggi di kedua metode pengambilan akan mendapatkan skor gabungan yang lebih tinggi.

Buat fungsi agregat:

CREATE TYPE score_agg_state AS (
    vector_score numeric,
    keyword_score numeric
);

CREATE OR REPLACE FUNCTION score_agg_transfn(state score_agg_state, rank numeric, method text)
RETURNS score_agg_state AS $$
BEGIN
    CASE method
        WHEN 'retrieve_by_vector' THEN
            state.vector_score := COALESCE(1 / (60+rank), 0);
        WHEN 'retrieve_by_key_word' THEN
            state.keyword_score := COALESCE(1 / (60+rank), 0);
    END CASE;
    RETURN state;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION score_agg_finalfn(state score_agg_state)
RETURNS numeric AS $$
BEGIN
    RETURN COALESCE(state.vector_score, 0) + COALESCE(state.keyword_score, 0);
END;
$$ LANGUAGE plpgsql;

CREATE AGGREGATE rff_function(numeric, text)(
    SFUNC = score_agg_transfn,
    STYPE = score_agg_state,
    FINALFUNC = score_agg_finalfn,
    INITCOND = '(0, 0)'
);

Terapkan fungsi RRF untuk menyusun ulang hasil pengambilan:

CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score numeric, value TEXT) AS $$
BEGIN
    RETURN QUERY
    SELECT
        rff_function(rank, method) AS score,
        ret_chunk AS value
    FROM
        multi_retrieve(query)
    GROUP BY
        value
    ORDER BY
        score DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;

Susun ulang hasil pengambilan berbasis embedding saja

Jika Anda menggunakan pengambilan vektor tanpa komponen kata kunci, panggil rds_ai.retrieve secara langsung dan teruskan hasilnya ke rds_ai.rank:

DROP FUNCTION get_reranked_results(text, integer);
CREATE OR REPLACE FUNCTION get_reranked_results(query TEXT, top_n INT DEFAULT 1)
RETURNS TABLE(score FLOAT, value TEXT) AS $$
DECLARE
    result_array TEXT[];
BEGIN
    SELECT array_agg(chunk)
    INTO result_array
    FROM rds_ai.retrieve(query, 'public', 'chunk', 'text', 'embedding');

    RETURN QUERY
    SELECT *
    FROM rds_ai.rank(query, result_array)
    ORDER BY score_value DESC
    LIMIT top_n;
END $$ LANGUAGE plpgsql;

-- Uji
SELECT * FROM get_reranked_results('Introduce the origins of the PostgreSQL database.');

Langkah 6: Kembalikan jawaban dari LLM

Dengan pengambilan multimodal

Gunakan hasil multimodal yang telah disusun ulang sebagai konteks untuk tanya jawab berbasis LLM.

  1. Buat fungsi pembantu untuk meng-escape karakter khusus dalam teks sebelum menyematkannya dalam prompt JSON. Ini diperlukan karena rds_ai.prompt menyusun objek JSON dari parameternya:

    CREATE OR REPLACE FUNCTION escape_for_json(input TEXT)
    RETURNS TEXT AS $$
    BEGIN
        RETURN replace(replace(input, '"', '\\"'), E'\n', '\\n');
    END;
    $$ LANGUAGE plpgsql;
  2. Buat fungsi rag, yang menggabungkan hasil yang telah disusun ulang menjadi prompt dan meneruskannya ke LLM:

    CREATE OR REPLACE FUNCTION rag(query TEXT) RETURNS TEXT AS $$
    DECLARE
      prompt_content TEXT;
      result TEXT;
    BEGIN
      -- Gabungkan chunk teratas yang telah disusun ulang menjadi satu prompt.
      WITH rank_result AS (
        SELECT string_agg(value, ',') AS prompt_content_pre
        FROM get_reranked_results(query)
      )
      SELECT
        'Based on the following content' || prompt_content_pre || 'answer my question,' || query
      INTO prompt_content
      FROM rank_result;
    
      -- Kirim prompt ke LLM dan kembalikan jawaban.
      SELECT
        rds_ai.prompt(escape_for_json(prompt_content))
      INTO result;
    
      RETURN result;
    END;
    $$ LANGUAGE plpgsql;
  3. Uji fungsi RAG:

    SELECT rag('Introduce the origins of the PostgreSQL database.');

Dengan pengambilan vektor saja

Untuk pipeline yang lebih sederhana berbasis pengambilan vektor saja, gunakan fungsi bawaan rds_ai.rag:

SELECT *
FROM rds_ai.rag(
    'Introduce the origins of the PostgreSQL database.',
    'public',
    'chunk',
    'text',
    'embedding'
);