The AI data preparation feature of Data Transmission Service (DTS) transfers unstructured and structured data to Data+AI infrastructure, such as vector databases and data lakehouses. This feature prepares data for retrieval-augmented generation (RAG) applications such as enterprise knowledge bases, AI-assisted content creation, and intelligent customer service.
Scenarios
Prepare data for RAG applications in production environments.
Fetch full and incremental data from the source database.
Process unstructured and structured data and associate the two.
Use an integrated process to parse, segment, and vectorize complex data formats.
Connect directly to data sources to fetch data, which eliminates the need for downloads and uploads.
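The parse, segment, and vectorize steps named above can be illustrated with a minimal Python sketch. It is not the DTS implementation: the function names, the word-window chunking with overlap, and the hashed bag-of-words "embedding" are all assumptions chosen to keep the example self-contained. A production pipeline would call a real embedding model in place of `vectorize`.

```python
import hashlib
import math
import re


def parse(raw: str) -> str:
    """Normalize raw text: collapse runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def segment(text: str, chunk_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_words words that share overlap words."""
    if not 0 <= overlap < chunk_words:
        raise ValueError("overlap must be in [0, chunk_words)")
    words = text.split()
    step = chunk_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def vectorize(chunk: str, dim: int = 64) -> list[float]:
    """Toy stand-in for an embedding model (assumption, not a real model):
    hash each token into one of dim buckets, then L2-normalize."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", chunk.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def prepare(raw: str, chunk_words: int = 50, overlap: int = 10, dim: int = 64):
    """Run parse -> segment -> vectorize and return (chunk, vector) pairs."""
    chunks = segment(parse(raw), chunk_words, overlap)
    return [(chunk, vectorize(chunk, dim)) for chunk in chunks]
```

Each `(chunk, vector)` pair corresponds to one row that would be written to a vector database. The overlap between neighboring chunks is a common way to avoid cutting a sentence off from its context at a chunk boundary.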
Billing
For more information, see AI data preparation billing methods.
Limits
Data preparation tasks
Cross-region tasks are not supported.
You must create the required table schemas in the destination database beforehand.
Overwriting existing data in the destination database is not supported.
RAGFlow knowledge bases
Only the virtual private cloud (VPC) network type is supported.
The VPC, vector database, and OSS bucket must be in the same region.
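The two RAGFlow knowledge base limits can be checked before a knowledge base is created. The following Python sketch is one possible pre-flight check. The `KnowledgeBaseConfig` class and its field names are hypothetical and do not correspond to DTS parameters.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBaseConfig:
    # All field names are hypothetical; they are not DTS parameters.
    network_type: str
    vpc_region: str
    vector_db_region: str
    oss_bucket_region: str


def check_limits(cfg: KnowledgeBaseConfig) -> list[str]:
    """Return a list of violated limits; an empty list means the config passes."""
    problems = []
    # Limit 1: only the VPC network type is supported.
    if cfg.network_type.upper() != "VPC":
        problems.append("network type must be VPC")
    # Limit 2: the VPC, vector database, and OSS bucket share one region.
    regions = {cfg.vpc_region, cfg.vector_db_region, cfg.oss_bucket_region}
    if len(regions) != 1:
        problems.append("VPC, vector database, and OSS bucket must be in the same region")
    return problems
```

For example, a configuration whose VPC and vector database are in `cn-hangzhou` but whose OSS bucket is in `cn-shanghai` yields one violation for the region limit.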
Supported regions
Data preparation tasks: For more information, see List of supported regions.
RAGFlow knowledge bases: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), and China (Hong Kong).
Supported data flows
Data preparation tasks
Source database type | Destination database type | References |
MySQL | AnalyticDB for PostgreSQL | Transmit data from RDS MySQL to cloud-native AnalyticDB for PostgreSQL |
RAGFlow knowledge bases
Vector database | Configuration document | Tutorials |
AnalyticDB for PostgreSQL | | |
Lindorm | | |