AnalyticDB for PostgreSQL provides Retrieval-Augmented Generation (RAG) as a managed service. It handles document ingestion, embedding, hybrid search, and reranking so you can build RAG applications without managing the underlying infrastructure.
At a glance:
Ingest documents in 10+ formats — PDF, DOCX, images with OCR, and more — with automatic chunking and embedding.
Run hybrid search (dense vectors, sparse vectors, and full-text search in a single query) to maximize recall.
Rerank results with BAAI General Embedding (BGE) models or large language models (LLMs) before the generation step.
Integrate with LlamaIndex and LangChain, or call the APIs directly from Python, Java, Go, Node.js, PHP, or C#.
Store all document data in your own AnalyticDB for PostgreSQL instance with disk encryption, SSL encryption, and namespace-level multi-tenancy isolation.
How the service works
The RAG service consists of three modules:
Augmented data processing — Documents are parsed by format-specific extractors and split into chunks. Each chunk is converted into vectors (dense, sparse, or multimodal) and stored in your instance.
Augmented semantic search — Incoming queries are analyzed semantically and matched against stored chunks using hybrid search, two-way retrieval, or fusion query, depending on your configuration.
Augmented retrieval — Retrieved chunks are scored and reranked using fine-grained ranking algorithms to improve relevance and diversity before being returned to the generation step.
Key benefits
Quality of augmented generation — By grounding language model output in retrieved search results, the service keeps generated content accurate and close to the underlying data, improving the quality and credibility of answers.
Large-scale knowledge fusion — The service draws on a variety of data sources, such as enterprise knowledge bases and public web resources, so generated content covers a wider range of knowledge and meets the information requirements of different scenarios.
Flexible APIs — Easy-to-use APIs integrate into existing systems without requiring deep AI expertise, enabling scenarios such as intelligent chatbots, automatic report generation, and content creation.
Continuous optimization and learning — The service continuously learns from new data and customer feedback and automatically adjusts its optimization policies, maintaining answer quality and user experience as requirements and environments change.
Key capabilities
Compatibility
Connect to the RAG service using the method that fits your stack:
Direct API access — Read and write vector data, upload pre-split documents, or upload raw documents for automatic processing.
SDK support — Python, Java, Go, Node.js, PHP, and C#.
RAG framework support — LlamaIndex and LangChain.
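For direct API access, requests are ordinary JSON bodies. The sketch below builds such a body for uploading a raw document for automatic processing; the field names other than `ChunkSize` and `ChunkOverlap` (which are documented chunking parameters) are illustrative assumptions, not the actual API schema — consult the API reference for the real request format.

```python
import json


def build_upload_request(collection: str, namespace: str, file_name: str,
                         chunk_size: int = 200, chunk_overlap: int = 40) -> str:
    """Assemble a hypothetical document-upload request body as JSON."""
    body = {
        "Collection": collection,      # illustrative field name
        "Namespace": namespace,        # illustrative field name
        "FileName": file_name,         # illustrative field name
        "ChunkSize": chunk_size,       # documented chunking parameter
        "ChunkOverlap": chunk_overlap, # documented chunking parameter
    }
    return json.dumps(body)
```

The same body shape would be produced for you by the language SDKs; the raw-JSON view is shown only to make the request structure concrete.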
Document processing
The service extracts text using format-specific extractors, then splits each document into chunks for embedding.
Supported input formats:
| Format | Extractor |
|---|---|
| PNG, JPG, JPEG, BMP | Optical Character Recognition (OCR) |
| Scanned PDF | OCR (image-with-text metadata is added automatically) |
| PDF (non-scanned) | Python bindings such as PyMuPDF |
| HTML, MARKDOWN, JSON, CSV, DOCX, PPTX, TXT | Text extractor |
Chunking options:
Control chunk size and overlap with the `ChunkSize` and `ChunkOverlap` parameters to keep each chunk within the embedding model's token limits.
Specify custom separators to split text at logical boundaries.
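To make the two chunking controls concrete, here is a simplified sketch of how size/overlap and separator-based splitting typically behave. This is an illustration of the concepts, not the service's internal chunker:

```python
def chunk_text(text: str, chunk_size: int = 200, chunk_overlap: int = 40) -> list[str]:
    """Sliding-window chunking: each chunk starts chunk_size - chunk_overlap
    characters after the previous one, so neighbouring chunks share context."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def chunk_by_separator(text: str, separator: str = "\n\n",
                       chunk_size: int = 200) -> list[str]:
    """Split at a logical boundary (e.g. paragraph breaks), then pack
    consecutive pieces together without exceeding chunk_size."""
    chunks: list[str] = []
    current = ""
    for part in text.split(separator):
        if current and len(current) + len(separator) + len(part) > chunk_size:
            chunks.append(current)
            current = part
        else:
            current = current + separator + part if current else part
    if current:
        chunks.append(current)
    return chunks
```

Overlap trades some storage and embedding cost for robustness: a sentence that straddles a chunk boundary still appears whole in at least one chunk.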
Embedding
Text embedding models: M3E, Text2Vec, Tongyi
Supported dimensions: 512, 768, 1024, 1536
Multimodal image embedding models: CLIP, Tongyi
Supported dimensions: 512, 640, 768, 1024, 1536
Search
Basic search modes:
| Mode | Description |
|---|---|
| Hybrid search | Searches dense vectors and sparse vectors simultaneously |
| Two-way retrieval | Runs full-text search and vector search simultaneously |
| Fusion query | Filters by conditions first, then performs vector search |
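The fusion-query mode (filter first, then vector search) can be sketched as follows. Cosine similarity stands in for whatever distance metric the instance is configured with, and the in-memory list stands in for the stored collection:

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def fusion_query(docs, query_vec, predicate, top_k=3):
    """Step 1: apply the scalar filter (e.g. metadata conditions).
    Step 2: rank the surviving chunks by vector similarity."""
    candidates = [d for d in docs if predicate(d)]
    candidates.sort(key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return candidates[:top_k]
```

Filtering before the vector search shrinks the candidate set, which is why fusion query is the natural mode when queries always carry structured conditions (tenant, date range, document type).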
Multi-way ranking algorithms:
| Algorithm | How it ranks |
|---|---|
| Reciprocal rank fusion (RRF) | Uses result positions, not scores |
| Weight | Uses scores, not positions |
| Cascaded | Uses full-text search as a filter, then runs top-K vector search |
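A minimal sketch of the first two fusion algorithms, assuming the standard RRF formula (score = Σ 1/(k + rank), with k = 60 as a common default — not necessarily the service's constant):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: combine ranked ID lists using positions only."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def weighted_fusion(score_maps: list[dict], weights: list[float]) -> list[str]:
    """Weight-based fusion: combine per-retriever scores with fixed weights."""
    scores: dict[str, float] = {}
    for smap, w in zip(score_maps, weights):
        for doc_id, s in smap.items():
            scores[doc_id] = scores.get(doc_id, 0.0) + w * s
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization because it ignores scores entirely, which makes it robust when the full-text and vector retrievers score on incompatible scales; weight-based fusion gives finer control when the scores are comparable.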
Augmented search:
Fine-grained ranking — Retrieves more than top-K chunks, then scores and reranks them using BGE models or LLMs.
Window retrieval — Returns several chunks before and after each matched chunk to preserve context that chunking may have split across boundaries.
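Window retrieval amounts to expanding each hit into a neighbourhood of adjacent chunks. A sketch of the idea (indices stand in for the service's internal chunk ordering):

```python
def window_retrieve(chunks: list[str], hit_indices: list[int],
                    window: int = 1) -> list[str]:
    """For each matched chunk, also return `window` neighbours on each side,
    deduplicated and in document order."""
    keep: set[int] = set()
    for i in hit_indices:
        keep.update(range(max(0, i - window), min(len(chunks), i + window + 1)))
    return [chunks[i] for i in sorted(keep)]
```

This recovers context that chunking may have split across boundaries: an answer whose supporting sentence landed in the chunk just before or after the matched one is still passed to the generation step.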
Security
| Area | Details |
|---|---|
| Data privacy | Document data is stored in your AnalyticDB for PostgreSQL instance. Protect it with disk encryption, SSL encryption, and IP address whitelists. You can destroy the data or disable access to it at any time. |
| Multi-tenancy isolation | Use namespaces — similar to database schemas — to isolate document collections within a single instance across organizational boundaries. |
| Authentication | Resource Access Management (RAM) authentication and instance username/password authentication. |
Use cases
Intelligent customer service — Generate professional, personalized answers by grounding responses in your enterprise knowledge base.
Content creation — Produce articles, news summaries, and product descriptions based on preset themes or styles.
Knowledge management — Automatically organize and summarize documents to surface easy-to-learn knowledge points and accelerate team knowledge sharing.
Education and training — Generate custom teaching materials — exercises, case studies — tailored to student needs and course content.
Patent search — Apply text processing optimizers to patent documents for high-quality patent similarity search.