Large language models (LLMs) have no access to private or real-time knowledge. Retrieval-Augmented Generation (RAG) addresses this by retrieving information from external sources based on user input and using it to improve the accuracy of LLM responses. The knowledge base feature of Model Studio uses RAG to retrieve private and real-time knowledge.
Only users who created Model Studio applications before April 21, 2025 can access the Applications tab and use all its features, including applications (Agent application, Workflow application, Agent orchestration application), components (Prompt, Plug-in), data (Knowledge base, Application data) and related APIs.
Application without a private knowledge base | Application with a private knowledge base
Without a private knowledge base, the LLM cannot accurately answer questions about "Bailian phone". | With a private knowledge base, the LLM can provide accurate answers to questions about "Bailian phone".
Supported formats
The knowledge base currently supports the following document formats as knowledge sources: pdf, docx, doc, txt, markdown, pptx, ppt, xlsx, xls, png, jpg, jpeg, bmp, and gif. This list may be incomplete. Refer to the description on the Data Import page.
You can import data from local uploads, Alibaba Cloud Object Storage Service (OSS), or Alibaba Cloud ApsaraDB RDS (data sources outside Alibaba Cloud such as online webpages, GitHub, and Notion are not currently supported).
Supported models
You can use knowledge bases in applications that use the following models.
Qwen-Max/Plus/Turbo
Open source Qwen2.5
Open source Qwen2
Currently, the DeepSeek models are not available. Stay tuned for future updates.
The above list is not exhaustive and may change at any time. For the actual information, refer to the models that can be selected when editing models in My Applications.
Create and use a knowledge base
Step 1: Import data
Before creating a knowledge base, import your documents into Model Studio as knowledge sources.
Use the API: You can use the API to import unstructured data. To import structured data, you must use the console. To automatically update a structured knowledge base, build it on an ApsaraDB RDS data table.
Import from RDS: If you want to build a knowledge base based on an RDS data table, see Create a knowledge base.
The Model Studio console supports importing Unstructured Data and Structured Data. Unstructured Data is not organized according to a predefined table structure, while Structured Data is.
Select Unstructured Data for:
Documents in formats such as PDF, DOCX, DOC, TXT, Markdown, PPTX, PPT, PNG, JPG, JPEG, BMP, or GIF.
Multiple XLSX or XLS documents, but their table structures may be different.
Importing documents from Object Storage Service (OSS).
Select Structured Data for:
Multiple XLSX or XLS documents with identical table structures.
Documents in XLSX or XLS format that will be used for FAQ scenarios. For example, an Excel document contains two columns: question and answer. A structured knowledge base lets you limit retrieval to the question column and use the answer column as the reference for the reply. An unstructured knowledge base can hardly achieve this effect.
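The selection rules above can be summarized in a small helper. This is only an illustrative sketch: the function name and its flags are hypothetical, the file extensions for Markdown (.md, .markdown) are assumed, and the Data Import page in the console remains the authoritative list of formats.

```python
from pathlib import Path

# Formats taken from the lists above. The Markdown extensions are an assumption.
DOCUMENT_FORMATS = {".pdf", ".docx", ".doc", ".txt", ".md", ".markdown",
                    ".pptx", ".ppt", ".png", ".jpg", ".jpeg", ".bmp", ".gif"}
SPREADSHEET_FORMATS = {".xlsx", ".xls"}


def suggest_data_type(filenames, same_table_structure=False, faq=False):
    """Suggest "Unstructured Data" or "Structured Data" for a batch of files."""
    if not filenames:
        raise ValueError("no files given")
    suffixes = {Path(name).suffix.lower() for name in filenames}
    unknown = suffixes - DOCUMENT_FORMATS - SPREADSHEET_FORMATS
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    only_spreadsheets = suffixes <= SPREADSHEET_FORMATS
    # Structured Data fits spreadsheets that share one table structure or serve FAQ use.
    if only_spreadsheets and (same_table_structure or faq):
        return "Structured Data"
    return "Unstructured Data"
```

For example, a batch that mixes a PDF with a spreadsheet, or spreadsheets with different table structures, maps to Unstructured Data under these rules.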
Unstructured data
Go to the Application Data page and select the Unstructured Data tab.
Under Category Management on the left, select the desired category for data import.
Select the default category or create a new one. Each workspace can have up to 500 categories.
Each workspace can have up to 100,000 documents.
Click Import Data to go to the Import Data page.
For Document Recognition, the default is Intelligent Document Parsing (currently cannot be changed). However, you can configure parsing rules for different document formats through Data Parsing Settings for better results.
(Optional) Configure Tags for documents.
When calling applications through the API, you can specify tags in the request parameter tags. When the application retrieves from the knowledge base, it first filters documents by tag, which improves retrieval efficiency. For agent applications, you can also set tags when editing the application in the console.
Click Confirm. The system begins parsing and importing the documents. This may take some time.
Document parsing converts uploaded documents into a format that Model Studio can process. During peak periods, parsing may take longer.
After parsing and importing are complete, click Details to the right of the corresponding document to view the imported document.
You can view documents imported within 90 days. Your documents will not be deleted after this period, but you cannot view them.
Structured data
Go to the Application Data page and select the Structured Data tab.
Create a new data table or select an existing one.
Each workspace can have up to 1,000 data tables, and each table can have up to 100,000 rows (including the header). Exceeding this limit will result in a failed import, so you may need to split the data in advance.
Create a new data table
Click the create icon to add a data table.
Enter a Table Name.
Configure the table structure by selecting Upload Excel File or Custom Header.
Option | Description
Upload Excel File | Model Studio automatically identifies the header in the uploaded document and creates the data table structure accordingly. Then, it imports the remaining content as data records into the table.
Custom Header | Column Name and Type are required. Description is optional.
Important: Once the data table is created, you cannot modify the Column Name, Description, or Type.
Make sure the table schema matches the schema of the data to be imported. For example, if the data table to be imported has 2 columns, the structure here must also have 2 fields with corresponding column names. Click New Columns or Delete in the Actions column to adjust the fields.
Upload your documents.
Select and upload documents (XLSX or XLS format).
The documents must have a header that matches the structure of the data table. Otherwise, the import will fail.
Then, click Preview to view the imported data.
Click Confirm. The new data table will appear under Table Management on the left.
Select an existing data table
Select an existing data table under Table Management on the left and click Import Data.
For Import Type, select Upload and Overwrite or Incremental Upload.
You can click Download Template to download a blank document with the table header. Then, add data to the template and upload it directly.
Select and upload documents (XLSX or XLS format).
The documents must have a header that matches the structure of the data table. Otherwise, the import will fail.
Then, click Preview to view the imported data.
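Two of the import constraints in this section, a header that matches the table structure and at most 100,000 rows per table including the header, can be checked locally before uploading. The following is a minimal sketch under stated assumptions: the function names are hypothetical, rows are plain Python lists, and an order-sensitive comparison of column names is assumed because the page does not say whether column order matters.

```python
MAX_ROWS_PER_TABLE = 100_000  # limit stated above, header row included


def header_matches(table_columns, file_header):
    """True when the file header has the same column names in the same order.

    Order-sensitive comparison is an assumption made for this sketch.
    """
    return [name.strip() for name in file_header] == list(table_columns)


def split_rows(header, data_rows, max_rows=MAX_ROWS_PER_TABLE):
    """Split data rows into parts that each fit one table, header included."""
    per_part = max_rows - 1  # one row of every part is reserved for the header
    if per_part < 1:
        raise ValueError("max_rows must leave room for at least one data row")
    return [[header] + data_rows[start:start + per_part]
            for start in range(0, len(data_rows), per_part)]
```

Each returned part repeats the header, so every part can be saved as a separate XLSX or XLS file and imported on its own.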
Step 2: Create a knowledge base
Maximum number: Each Alibaba Cloud account can create up to 5 knowledge bases associated with ApsaraDB RDS, with no other limitations.
For RAM users: If your RAM user needs the full functionality of knowledge bases, you must first grant it data permissions (AliyunBailianDataFullAccess). If you are not familiar with the concept of RAM users, read Permissions first.
Console
Go to the Knowledge Base page. Click Create Knowledge Base.
Enter a Name and Description. For Data Type, select Unstructured Data or Structured Data.
After the knowledge base is created, the data type cannot be changed. A single knowledge base cannot support both unstructured and structured data.
Unstructured data
Configure the knowledge base.
Parameter | Description
Configuration Mode | You can select Recommended, which is based on Model Studio's best practices. If you select Custom, you can configure parameters for retrieval and recall.
After the knowledge base is created, all parameters in Configuration Mode, except for Similarity Threshold, cannot be changed.
Click Next Step to select documents to import.
If you have already imported documents, select them here directly. Otherwise, you must first go to the Unstructured Data tab and import the documents.
Select Category: Import all documents under this category. You can select multiple categories.
Select File: Select the files you want to import.
You can select up to 50 documents at a time. Each document can be up to 100 MB in size or contain up to 1,000 pages. If a document exceeds these limits, you need to split it into multiple documents before importing.
Click Next Step to configure the Data Processing strategy.
Parameter
Description
Metadata Extraction (Optional)
Metadata is a series of additional attributes related to the content of unstructured documents, integrated into the chunks as key-value pairs.
Role of metadata: Metadata provides context for documents, significantly enhancing the precision of knowledge base retrieval. For example, search for "Feature Overview of Product A" in the knowledge base. If all documents include "Feature Overview" but none mention "Product A", the knowledge base may recall numerous unrelated chunks. However, if you associate product name as metadata with all documents and their related chunks, the knowledge base can accurately filter out chunks related to "Product A" and containing "Feature Overview", thereby improving retrieval accuracy and reducing input token consumption.
How to use: When calling applications through the API, specify metadata_filter in the request. When retrieving from the knowledge base, the application first filters relevant documents based on the metadata.
Note: You cannot configure Metadata Extraction again after the knowledge base is created.
Table Header Assemble for Excel Files (Optional)
We recommend that you enable this when all imported documents are of the xlsx or xls formats and contain table headers.
When enabled, the knowledge base considers the first rows of all xlsx or xls documents as headers. The headers are then appended to all chunks (rows). This prevents the LLM from mistakenly processing headers as ordinary data rows.
Document Splitting
Select Intelligent Splitting (recommended) or Custom Splitting.
Role of document splitting: The knowledge base splits your documents into chunks and converts these chunks into vectors through the embedding model. The chunks and vectors are then stored in a vector database as key-value pairs. View the content of each chunk in the knowledge base.
Note: You cannot configure Document Splitting after the knowledge base is created. An inappropriate splitting strategy may reduce retrieval and recall performance. Check text chunk quality.
Intelligent Splitting: Uses the built-in chunking strategy, evaluated to deliver the best retrieval performance for most documents.
Custom Splitting: If intelligent splitting does not work properly, you can customize the document splitting strategy.
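The effect of Metadata Extraction described above, filtering by metadata before matching content, can be illustrated locally. This is a simplified sketch and not how Model Studio implements retrieval: the Chunk class and function are hypothetical, the sample chunks are made up, and a plain keyword match stands in for semantic vector search.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)  # key-value pairs attached to the chunk


def filter_then_search(chunks, metadata_filter, keyword):
    """Keep chunks whose metadata matches every filter pair, then match the text."""
    candidates = [c for c in chunks
                  if all(c.metadata.get(k) == v for k, v in metadata_filter.items())]
    # Keyword containment is a stand-in for semantic retrieval.
    return [c for c in candidates if keyword.lower() in c.text.lower()]


chunks = [
    Chunk("Feature Overview: long battery life", {"product": "Product A"}),
    Chunk("Feature Overview: fast charging", {"product": "Product B"}),
    Chunk("Warranty terms", {"product": "Product A"}),
]
hits = filter_then_search(chunks, {"product": "Product A"}, "feature overview")
print([c.text for c in hits])  # ['Feature Overview: long battery life']
```

Without the metadata filter, both "Feature Overview" chunks would match, which is the situation the Product A example describes.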
Structured data
Configure the knowledge base.
Parameter | Description
Configuration Mode | You can select Recommended, which is based on Model Studio's best practices. If you select Custom, you can configure parameters for retrieval and recall.
After the knowledge base is created, all parameters in Configuration Mode, except for Similarity Threshold, cannot be changed.
Click Next Step to select Data Source of structured data.
If you import structured data on the Structured Data tab, you will need to manually synchronize updates to the knowledge base.
If you import structured data through ApsaraDB RDS, data updates in the RDS table will be automatically synchronized to the knowledge base (generally within seconds, but slight delays may occur during peak periods).
Data Source
Description
Application Data
Select the table you want to import from Application Data. If no table is available, you must first import your table to Application Data.
Associate RDS
Synchronize data from specific data tables in the RDS instance to your knowledge base.
Instance limitations:
Only RDS instances with MySQL Engine (no version restrictions) are supported. PostgreSQL and other engines are not supported.
No restrictions on instance regions.
Only Basic Edition and High-availability Edition are supported.
Database proxy is not supported.
Database and table limitations:
The knowledge base has no limit on the amount of data in the associated RDS database and data table, but the size of each row must be less than 10 MB.
DDL operations on the source table (for example, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, ADD COLUMN, DROP COLUMN) are not recommended after the knowledge base is created, because they may cause data synchronization failures between RDS and the knowledge base. For more information, see DDL operations.
To ensure the knowledge base can import data from RDS, you need to configure a whitelist for the RDS instance.
You must add all IP addresses of DTS and Model Studio to the whitelist of your RDS instance. Otherwise, the indexing will fail.
Click Next Step to configure the index. Index Configuration cannot be changed after the knowledge base is created.
Parameter
Description
Used for Retrieval
When enabled, the knowledge base is allowed to search within this column's data.
Used for Model Reply
When enabled, the retrieval results from this column will be used as references for the LLM. In the example below, Used for Retrieval is enabled for "Name", "Sex", "Position", and "Age". Used for Model Reply is enabled for "Name" and "Position." The knowledge base will search across all column data, but only provide the "Name" and "Position" columns to the LLM as references.
Because "Age" is not used for reply, the model cannot answer questions about Zhang's age.
Click Import.
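The difference between Used for Retrieval and Used for Model Reply can be pictured as a search over some columns followed by a projection onto others. The sketch below is illustrative only: the function is hypothetical, the row values are made up for the example, and a substring match stands in for the knowledge base's actual retrieval.

```python
def search_rows(rows, query, retrieval_columns, reply_columns):
    """Match the query against retrieval columns and expose only reply columns."""
    needle = query.lower()
    references = []
    for row in rows:
        if any(needle in str(row[col]).lower() for col in retrieval_columns):
            # Only columns enabled for model reply reach the LLM.
            references.append({col: row[col] for col in reply_columns})
    return references


# Made-up rows for illustration.
rows = [
    {"Name": "Zhang", "Sex": "Male", "Position": "Engineer", "Age": 30},
    {"Name": "Li", "Sex": "Female", "Position": "Designer", "Age": 28},
]
refs = search_rows(rows, "zhang",
                   retrieval_columns=["Name", "Sex", "Position", "Age"],
                   reply_columns=["Name", "Position"])
print(refs)  # [{'Name': 'Zhang', 'Position': 'Engineer'}]
```

The row for Zhang is found, but "Age" never appears in the references, which is why the model cannot answer a question about Zhang's age.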
API
We recommend that you use the latest version of GenAI Service Platform to call the following APIs. You can debug the APIs online and generate code samples in multiple languages, such as Java and Node.js.
RAM users must first obtain data permissions
Alibaba Cloud accounts do not need authorization and can skip this note. If you are not familiar with concepts such as Alibaba Cloud account and RAM user, read Permissions first.
Before your RAM user calls the following APIs (CreateIndex, SubmitIndexJob, and others), make sure that the user has the AliyunBailianDataFullAccess system policy and belongs to at least one Model Studio workspace. For more information, see Grant data permissions to a RAM user.
Take the following steps to create an unstructured knowledge base:
You cannot use the API to create structured knowledge bases. Use the console instead.
Call CreateIndex.
The returned Data.Id is the knowledge base ID. Keep this value safe because it will be used for all subsequent API operations related to the knowledge base.
In the StructureType field, specify the data structure type used to create the knowledge base. For unstructured data, enter "unstructured".
In the RerankModelName field, specify the ranking model name. For official ranking, enter "gte-rerank-hybrid". The rerank model reorders the text recalled from the knowledge base by semantic relevance. Official Reranking is recommended.
In the SinkType field, specify the vector storage type for the knowledge base. The built-in vector database can meet basic needs. For advanced features such as management, auditing, and monitoring, use ADB-PG (AnalyticDB for PostgreSQL).
To specify the built-in vector database, enter "BUILT_IN".
To specify the ADB-PG database, enter "ADB".
The previous CreateIndex call only initializes the knowledge base construction. You need to call SubmitIndexJob to complete the process. Otherwise, you will get an empty knowledge base. The task takes some time. To query the task status, call GetIndexJobStatus. If Data.Status in the response is "COMPLETED", the knowledge base creation is complete.
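The three calls above follow a create, submit, and poll pattern. The sketch below shows only the parts this page spells out: the three CreateIndex field values and the wait for Data.Status to reach "COMPLETED". How each API call is actually made is left to a caller-supplied function, because SDK details are not covered here. The function names, the polling interval, and the attempt limit are assumptions, other required request fields are omitted, and any status other than "COMPLETED" is treated as still running.

```python
import time

# Field values named in the steps above. Other required fields are not shown.
CREATE_INDEX_FIELDS = {
    "StructureType": "unstructured",
    "RerankModelName": "gte-rerank-hybrid",
    "SinkType": "BUILT_IN",
}


def wait_for_index_job(get_status, interval_seconds=5.0, max_attempts=60,
                       sleep=time.sleep):
    """Poll until Data.Status is "COMPLETED" or the attempts run out.

    `get_status` is a hypothetical caller-supplied function that performs the
    GetIndexJobStatus call and returns the response as a dict.
    """
    for attempt in range(max_attempts):
        response = get_status()
        if response["Data"]["Status"] == "COMPLETED":
            return response
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # assumed interval, tune as needed
    raise TimeoutError(f"index job not completed after {max_attempts} checks")
```

Injecting `get_status` and `sleep` keeps the loop testable without network access and independent of any particular SDK version.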
After the knowledge base is created, you can associate it with your agent or workflow applications under the same workspace in My Applications. If you use the API to call your application, you can pass the knowledge base ID through the rag_options parameter.
Also, you can call the Retrieve operation to retrieve from the knowledge base directly without using applications.
Step 3: (Optional) Test the knowledge base
Hit Test is used to evaluate the semantic retrieval performance of a knowledge base under a given Similarity Threshold, for example, to check whether chunks are correctly recalled. The test helps determine whether further adjustment of the similarity threshold is needed to ensure that the LLM can obtain valid knowledge from the knowledge base. To perform a hit test, expand Hit Test (Optional) and take the steps.
Similarity Threshold: The minimum similarity score required for recalled chunks, used to filter the chunks returned by the ranking model. Lowering this threshold may recall more chunks, including less relevant ones. Increasing it reduces the number of recalled chunks, and may discard relevant ones.
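The trade-off described for Similarity Threshold can be seen with a few made-up scores. This is a minimal sketch: the function name is hypothetical, and the scores do not come from a real ranking model.

```python
def apply_similarity_threshold(scored_chunks, threshold):
    """Keep (text, score) pairs whose score is at least the threshold."""
    return [(text, score) for text, score in scored_chunks if score >= threshold]


# Made-up scores for illustration.
scored = [("return policy", 0.82), ("shipping times", 0.55), ("company history", 0.21)]
print(len(apply_similarity_threshold(scored, 0.2)))  # 3
print(len(apply_similarity_threshold(scored, 0.6)))  # 1
```

At 0.2 the loosely related chunk is recalled as well. At 0.6 only one chunk survives, and a chunk that might still be useful ("shipping times") is discarded.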
RAM users must first obtain data permissions
Alibaba Cloud accounts do not need authorization and can skip this note. If you are not familiar with concepts such as Alibaba Cloud account and RAM user, read Permissions first.
Before hit testing, make sure that the user has the AliyunBailianDataFullAccess system policy and belongs to at least one Model Studio workspace. For more information, see Grant data permissions to a RAM user.
Step 4: Use the knowledge base
Now you can associate the knowledge base you created with agent or workflow applications under the same workspace in My Applications. If you use the API to call your application, you can pass the knowledge base ID through the rag_options parameter. Both application types can use up to 5 knowledge bases at the same time, based on the multi-channel recall strategy. Currently, custom retrieval order is not supported.
Multi-channel recall strategy: If the application is associated with three knowledge bases, the system retrieves chunks related to the input from these bases, ranks them with the Rank model, and selects the top K most relevant ones as the reference for the LLM.
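The multi-channel recall strategy can be sketched as merge, rank, and cut. The code below is an illustration, not the platform's implementation: the function names are hypothetical, the sample chunks are made up, and a word-overlap score stands in for the Rank model.

```python
def multi_channel_recall(knowledge_bases, query, retrieve, rerank_score, top_k):
    """Collect chunks from every knowledge base, rank them together, keep top K."""
    candidates = []
    for kb in knowledge_bases:
        candidates.extend(retrieve(kb, query))
    ranked = sorted(candidates, key=lambda chunk: rerank_score(query, chunk),
                    reverse=True)
    return ranked[:top_k]


# Toy stand-ins: three knowledge bases with made-up chunks.
SAMPLE_KBS = {
    "manuals": ["reset the phone by holding power", "charge with usb-c"],
    "faq": ["how to reset password", "phone warranty is two years"],
    "news": ["new store opening"],
}


def toy_retrieve(kb, query):
    return SAMPLE_KBS[kb]


def word_overlap(query, chunk):
    return len(set(query.lower().split()) & set(chunk.lower().split()))


top = multi_channel_recall(SAMPLE_KBS, "reset the phone", toy_retrieve, word_overlap, top_k=2)
print(top[0])  # reset the phone by holding power
```

Because all candidates are ranked in one pool, a chunk's position depends on its relevance score rather than on which knowledge base it came from.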
Also, you can call the Retrieve operation to retrieve from the knowledge base directly without using applications.
Agent application
Scenario
This is an example of a Q&A agent application based on a knowledge base. The knowledge base provides the LLM with private and up-to-date information. Such an application is suitable for scenarios such as personal assistants, customer service, and technical support.
Use a knowledge base in an agent application
Go to My Applications and click Manage on the desired agent application card. Enable Knowledge Base Retrieval Augmentation. The corresponding prompt is automatically filled in Prompt. Click Configure Knowledge Base and add one or more knowledge bases.
Retrieve Configuration (Optional)
Workflow application
Scenario
This is an example of a Q&A workflow application based on a knowledge base. The execution logic of the process is: First, perform knowledge retrieval in the knowledge base based on user query. The recalled chunks are then passed into the LLM node along with the query for answer generation.
Use a knowledge base in a workflow application
Go to My Applications and click Manage on the desired workflow application card.
Configure upstream node: Create a Knowledge Base node and connect it to the Start node.
Select query variable: In the Input dropdown list of the Knowledge Base node, select query. For Q&A workflow applications, the sys.query variable of the Start node is usually selected as the query variable.
Select Knowledge Base: In the Select Knowledge Base dropdown list, select the knowledge base to be referenced.
(Optional) Adjust topK: The K value of the multi-channel recall strategy. It specifies the number of text chunks that the Rank model passes to the LLM. The value must not exceed the maximum length. Increasing the K value can enhance the precision of LLM responses at the cost of increased token consumption.
Configure downstream node: Create an LLM node and set it as the downstream node of the Knowledge Base node. In the Prompt of the large model node, guide the LLM to refer to the knowledge base.
System Prompt:
# Knowledge base
Remember the following materials that may help you answer questions: ${Retrieval_xxxx.result}
User Prompt:
${sys.query}
Enter / to replace ${Retrieval_xxxx.result} and ${sys.query} with the actual variables in your workflow.
Click Test or Publish. When the user asks a question, if the Knowledge Base node matches related chunks, the chunks are passed to the LLM node together with the system variable sys.query to help it generate a response. If no related chunks are matched, the LLM node responds directly to sys.query.
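What the LLM node receives can be approximated by filling the two placeholders by hand. This is a rough sketch of the substitution only: the function name is hypothetical, the workflow engine performs this step itself, and joining chunks with newlines is an assumption.

```python
SYSTEM_TEMPLATE = (
    "# Knowledge base\n"
    "Remember the following materials that may help you answer questions:\n"
    "${Retrieval_xxxx.result}"
)
USER_TEMPLATE = "${sys.query}"


def render_prompts(query, chunks):
    """Fill both placeholders. `chunks` may be empty when nothing was matched."""
    materials = "\n".join(chunks)  # assumed joining format
    system_prompt = SYSTEM_TEMPLATE.replace("${Retrieval_xxxx.result}", materials)
    user_prompt = USER_TEMPLATE.replace("${sys.query}", query)
    return system_prompt, user_prompt
```

With an empty chunk list the system prompt carries no materials, which corresponds to the case where the LLM node answers from sys.query alone.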
Manage and maintain knowledge base
View knowledge base
Modify knowledge base
Update knowledge base
Delete knowledge base
Billing details
The knowledge base feature is free. However, when you call applications associated with knowledge bases, fees may be incurred.
Step | Billing
Import data | Free.
Create a knowledge base | Free.
Test the knowledge base | Free.
Use the knowledge base | When calling applications, the chunks recalled from the knowledge bases increase the input token count. This may result in an increase of model inference fees. Note: If you are only retrieving from knowledge bases by calling Retrieve, no fees will be incurred.
Manage and maintain the knowledge base | Free.
API reference
For a complete list of knowledge base APIs and parameters, see API Catalog (Knowledge index). We recommend that you use the latest version of GenAI Service Platform. You can debug the APIs online and generate code samples in multiple languages, such as Java and Node.js.
RAM users must first obtain data permissions
Alibaba Cloud accounts do not need authorization and can skip this note. If you are not familiar with concepts such as Alibaba Cloud account and RAM user, read Permissions first.
Before your RAM user calls APIs to import data, create a knowledge base, or retrieve from a knowledge base, make sure that the user has the AliyunBailianDataFullAccess system policy and belongs to at least one Model Studio workspace. For more information, see Grant data permissions to a RAM user.
Related operations:
Use API to import data: Import your original document from local storage or OSS to Model Studio as the knowledge source of knowledge bases.
The API does not support structured data. Use the console instead.
Use API to create knowledge base.
The API does not support structured knowledge base. Use the console instead.
Use API to retrieve from knowledge base. Use one of the following methods:
When calling an application, pass the knowledge base ID in rag_options.
Without using applications, call Retrieve directly.
Use API to update knowledge base.
To update a structured knowledge base automatically, associate it with an RDS data table.
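When an application is called through the API, the knowledge base ID travels in rag_options. The helper below sketches how such a value might be assembled. Treat the layout as an assumption to verify against the API reference: the key name pipeline_ids and the placement of tags and metadata_filter inside rag_options are not stated on this page, and the function name is hypothetical. Only the limit of 5 knowledge bases per application comes from the text above.

```python
MAX_KNOWLEDGE_BASES = 5  # per-application limit stated on this page


def build_rag_options(knowledge_base_ids, tags=None, metadata_filter=None):
    """Assemble a rag_options dict for an application call.

    Assumptions: the key "pipeline_ids" and the nesting of "tags" and
    "metadata_filter" under rag_options. Check the API reference before use.
    """
    ids = list(knowledge_base_ids)
    if not 1 <= len(ids) <= MAX_KNOWLEDGE_BASES:
        raise ValueError(f"pass between 1 and {MAX_KNOWLEDGE_BASES} knowledge base IDs")
    options = {"pipeline_ids": ids}  # assumed key name
    if tags:
        options["tags"] = list(tags)
    if metadata_filter:
        options["metadata_filter"] = dict(metadata_filter)
    return options
```

Validating the ID count locally gives an early error instead of a failed request when more than 5 knowledge bases are passed.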
FAQ
RAG optimization
If you encounter issues such as incomplete knowledge recall or inaccurate content while using the RAG feature of Model Studio, see Optimize RAG performance.