
AnalyticDB: What Is RAG Service?

Last Updated: Mar 28, 2026

AnalyticDB for PostgreSQL provides Retrieval-Augmented Generation (RAG) as a managed service. It handles document ingestion, embedding, hybrid search, and reranking so you can build RAG applications without managing the underlying infrastructure.

At a glance:

  • Ingest documents in 10+ formats — PDF, DOCX, images with OCR, and more — with automatic chunking and embedding.

  • Run hybrid search (dense vectors, sparse vectors, and full-text search in a single query) to maximize recall.

  • Rerank results with BAAI General Embedding (BGE) models or large language models (LLMs) before the generation step.

  • Integrate with LlamaIndex and LangChain, or call the APIs directly from Python, Java, Go, Node.js, PHP, or C#.

  • Store all document data in your own AnalyticDB for PostgreSQL instance with disk encryption, SSL encryption, and namespace-level multi-tenancy isolation.

How the service works

The RAG service consists of three modules:

  1. Augmented data processing — Documents are parsed by format-specific extractors and split into chunks. Each chunk is converted into vectors (dense, sparse, or multimodal) and stored in your instance.

  2. Augmented semantic search — Incoming queries are analyzed semantically and matched against stored chunks using hybrid search, two-way retrieval, or fusion query, depending on your configuration.

  3. Augmented retrieval — Retrieved chunks are scored and reranked using fine-grained ranking algorithms to improve relevance and diversity before being returned to the generation step.
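The three modules can be sketched as a toy end-to-end pipeline. All function names, the dictionary layout, and the trivial embedding function below are illustrative stand-ins, not the service's real API:

```python
def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model such as M3E or Tongyi.
    return [len(text) / 100.0, text.lower().count("retrieval") / 2.0]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Module 1: augmented data processing -- parse, chunk, and embed.
def process(text: str, chunk_size: int = 40) -> list[dict]:
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    return [{"text": c, "vector": embed(c)} for c in chunks]

# Module 2: augmented semantic search -- match the query against stored chunks.
def search(query: str, index: list[dict], top_k: int = 3) -> list[dict]:
    q = embed(query)
    return sorted(index, key=lambda c: -dot(q, c["vector"]))[:top_k]

# Module 3: augmented retrieval -- rescore candidates before generation.
def rerank(query: str, hits: list[dict]) -> list[dict]:
    # A real deployment would call a BGE reranker or an LLM here.
    q = embed(query)
    return sorted(hits, key=lambda c: -dot(q, c["vector"]))

index = process("Retrieval-augmented generation grounds model answers in retrieved text.")
answer_context = rerank("retrieval", search("retrieval", index))
```

In the managed service, each module runs inside your AnalyticDB for PostgreSQL instance rather than in application code; the sketch only shows how the three stages hand data to one another.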


Key benefits

  1. Quality of augmented generation — By grounding language model output in retrieved data, the RAG service produces answers that are more accurate and closer to your actual data, improving both the quality and the credibility of answers.

  2. Large-scale knowledge fusion — The RAG service can draw on multiple data sources, such as enterprise knowledge bases and public web resources, so generated content covers a wider range of knowledge and meets the information requirements of different scenarios.

  3. Flexible APIs — The RAG service provides easy-to-use APIs that integrate into existing systems without requiring deep AI expertise, enabling scenarios such as intelligent chatbots, automatic report generation, and content creation.

  4. Continuous optimization and learning — The RAG service incorporates new data and customer feedback and adjusts its optimization policies automatically, maintaining answer quality and user experience as requirements and environments change.

Key capabilities

Compatibility

Connect to the RAG service using the method that fits your stack:

  • Direct API access — Read and write vector data, upload pre-split documents, or upload raw documents for automatic processing.

  • SDK support — Python, Java, Go, Node.js, PHP, and C#.

  • RAG framework support — LlamaIndex and LangChain.

Document processing

The service extracts text using format-specific extractors, then splits each document into chunks for embedding.

Supported input formats:

Format                                       Extractor
PNG, JPG, JPEG, BMP                          Optical Character Recognition (OCR)
Scanned PDF                                  OCR (image-with-text metadata is added automatically)
PDF (non-scanned)                            Python bindings such as PyMuPDF
HTML, MARKDOWN, JSON, CSV, DOCX, PPTX, TXT   Text extractor

Chunking options:

  • Control chunk size and overlap using the ChunkSize and ChunkOverlap parameters to keep each chunk within the embedding model's token limit and avoid inefficient embeddings of oversized chunks.

  • Specify custom separators to split text at logical boundaries.
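The chunking options above can be sketched as a minimal local splitter. The logic (size cap, overlap, preferring a custom separator near the window end) mirrors what the ChunkSize, ChunkOverlap, and separator parameters configure server-side; the function itself is illustrative, not the service's implementation:

```python
def chunk(text: str, chunk_size: int = 200, chunk_overlap: int = 40,
          separators: tuple[str, ...] = ("\n\n", "\n", ". ")) -> list[str]:
    """Split text into overlapping chunks, preferring logical boundaries."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer to cut at the last separator inside the current window.
            window = text[start:end]
            cut = max((window.rfind(s) + len(s) for s in separators
                       if window.rfind(s) != -1), default=-1)
            if cut > 0:
                end = start + cut
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = max(end - chunk_overlap, start + 1)  # overlap, but always advance
    return chunks
```

With a ". " separator, sentences stay intact inside chunks; without any separator hit, the splitter falls back to a hard cut at the size limit.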

Embedding

Text embedding models: M3E, Text2Vec, Tongyi

Supported dimensions: 512, 768, 1024, 1536

Multimodal image embedding models: CLIP, Tongyi

Supported dimensions: 512, 640, 768, 1024, 1536
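Whichever model you pick, a query vector and a stored chunk vector are only comparable if they come from the same model and dimension. A minimal cosine-similarity reference (the standard comparison for dense embeddings; this helper is illustrative):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embeddings of equal dimension."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions must match (e.g. both 1024)")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```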

Search

Basic search modes:

Mode                Description
Hybrid search       Searches dense vectors and sparse vectors simultaneously
Two-way retrieval   Runs full-text search and vector search simultaneously
Fusion query        Filters by conditions first, then performs vector search
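Fusion query's filter-then-search order can be sketched as a toy in-memory version (the service applies the same ordering inside the database; the data layout here is illustrative):

```python
def fusion_query(items: list[dict], predicate, query_vec: list[float],
                 top_k: int = 5) -> list[dict]:
    """Apply condition filters first, then vector search on the survivors."""
    candidates = [it for it in items if predicate(it)]
    return sorted(
        candidates,
        key=lambda it: -sum(q * v for q, v in zip(query_vec, it["vector"])),
    )[:top_k]
```

Filtering first shrinks the candidate set, so the (more expensive) vector comparison only runs over rows that already satisfy the conditions.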

Multi-way ranking algorithms:

Algorithm                      How it ranks
Reciprocal rank fusion (RRF)   Uses result positions, not scores
Weight                         Uses scores, not positions
Cascaded                       Uses full-text search as a filter, then runs top-K vector search
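Reciprocal rank fusion merges result lists using positions only, so it needs no score normalization across retrieval paths. The standard formulation (with the usual smoothing constant k = 60) looks like:

```python
def rrf(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists by summing 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked well by both full-text search and vector search accumulates two large reciprocal terms and rises to the top, even though the two searches score on incomparable scales.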

Augmented search:

  • Fine-grained ranking — Retrieves more than top-K chunks, then scores and reranks them using BGE models or LLMs.

  • Window retrieval — Returns several chunks before and after each matched chunk to preserve context that chunking may have split across boundaries.
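Window retrieval reduces to a simple slice over the ordered chunk list: given a matched chunk's position, return its neighbors on both sides. A toy version (the window size is an assumed parameter name, not the service's):

```python
def window_retrieve(chunks: list[str], hit_index: int, window: int = 1) -> list[str]:
    """Return the matched chunk plus `window` chunks before and after it."""
    lo = max(hit_index - window, 0)
    hi = min(hit_index + window + 1, len(chunks))
    return chunks[lo:hi]
```

Because chunking can split a sentence or argument across a boundary, handing the generator the surrounding chunks restores context the matched chunk alone would lose.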

Security

  • Data privacy — Document data is stored in your AnalyticDB for PostgreSQL instance. Protect it with disk encryption, SSL encryption, and IP address whitelists. You can destroy the data or disable access to it at any time.

  • Multi-tenancy isolation — Use namespaces — similar to database schemas — to isolate document collections within a single instance across organizational boundaries.

  • Authentication — Resource Access Management (RAM) authentication and instance username/password authentication.

Use cases

  • Intelligent customer service — Generate professional, personalized answers by grounding responses in your enterprise knowledge base.

  • Content creation — Produce articles, news summaries, and product descriptions based on preset themes or styles.

  • Knowledge management — Automatically organize and summarize documents to surface key knowledge points and accelerate team knowledge sharing.

  • Education and training — Generate custom teaching materials — exercises, case studies — tailored to student needs and course content.

  • Patent search — Apply text processing optimizers to patent documents for high-quality patent similarity search.