Alibaba Cloud Model Studio: Knowledge base

Last Updated: Jun 24, 2026

A knowledge base supplements an LLM with private data and up-to-date information. Using retrieval-augmented generation (RAG), the LLM retrieves relevant content from the knowledge base to generate more accurate answers.

Application without a dedicated knowledge base

An LLM without a dedicated knowledge base cannot answer domain-specific questions accurately.

Application with a dedicated knowledge base

An LLM with a dedicated knowledge base can answer domain-specific questions accurately.

Supported models

The following models support knowledge bases:

  • Qwen-Max/Plus/Turbo

  • QwenVL-Max/Plus

  • Qwen open-source version (e.g., Qwen2.5)

This list is subject to change. For the latest list, see the Application Management page when you create an application.

Quick start

This section shows how to quickly build an LLM Q&A application that answers domain-specific questions without writing any code. This guide uses "Alibaba Cloud Model Studio phones" as an example.

1. Build a knowledge base

  1. Go to the knowledge base page, click Create Knowledge Base, fill in the Name and Description, leave the other settings as default, and click Next Step.

  2. Select the Default Category and upload the Alibaba Cloud Model Studio Phone Series Product Introduction.docx file. Click Next Step, then click Complete.

2. Integrate with business applications

After creating a knowledge base, associate it with an Alibaba Cloud Model Studio application or an external application in the same workspace to process retrieval requests.

Agent application

  1. Go to the App Center page, find the target agent application, click Configure on its card, and select a model for the application.

  2. Click the + button to the right of Document Knowledge Base to add the knowledge base you created. You can leave the similarity threshold and weight at their default values.

    (Optional) Similarity threshold: Filter retrieval results

    A knowledge base uses semantic search to find text in your private data or files that is semantically relevant to a query, even if the keywords are completely different.

    For example, a user submits the following query: Which Alibaba Cloud phone is best for photography?

    The most relevant result (for example, text about the Qwen Vivid 7) may not contain any keywords from the query.

    In the table below, keyword similarity is calculated using the Jaccard index, and semantic similarity is the cosine similarity calculated by the text-embedding-v4 model.

    Retrieved text                                                        | Keyword similarity | Semantic similarity
    Qwen Vivid 7: A new experience in smart photography                   | 0                  | 0.43
    Alibaba Cloud Model Studio Ace Ultra: The choice for gamers           | 0.17               | 0.32
    Alibaba Cloud Model Studio Flex Fold+: A new era of foldable screens  | 0.25               | 0.24

    Similarity threshold: Only text chunks with a semantic similarity score higher than this threshold are retrieved. Setting the threshold too high can filter out relevant text chunks.
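
The two similarity measures and the threshold filter can be sketched in a few lines of Python. This is a minimal illustration, not the knowledge base's implementation: the whitespace tokenization in `jaccard` is an assumption (the tokenizer behind the keyword scores is not documented here), and `apply_threshold` is a hypothetical helper name. The semantic scores in the example are the ones listed for the three phone texts.

```python
import math


def jaccard(a: str, b: str) -> float:
    """Keyword similarity: shared words divided by all distinct words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def cosine(u: list[float], v: list[float]) -> float:
    """Semantic similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norms if norms else 0.0


def apply_threshold(scored_chunks, threshold):
    """Keep only chunks whose score is strictly higher than the threshold."""
    return [(text, score) for text, score in scored_chunks if score > threshold]


# Semantic similarity scores for the three example texts.
scored = [
    ("Qwen Vivid 7", 0.43),
    ("Ace Ultra", 0.32),
    ("Flex Fold+", 0.24),
]
print(apply_threshold(scored, 0.30))  # keeps Qwen Vivid 7 and Ace Ultra
print(apply_threshold(scored, 0.50))  # keeps nothing: the threshold is too high
```

The second call shows the risk described above: with a threshold of 0.50, even the most relevant text is filtered out.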

    (Optional) Weight: Influences the retrieval order for multiple knowledge bases

    When an agent application is associated with multiple knowledge bases, you can assign weights to them based on the importance of each information source. During multi-path retrieval, if text chunks from different knowledge bases have the same similarity score, the system prioritizes the text chunks from the knowledge base with a higher weight.

    • Key limitation: Weights only take effect between knowledge bases of the same type. For example, the weight of a document search knowledge base does not affect the retrieval order of a data query knowledge base, and vice versa.

    • How it works: The system first calculates the relevance between the user's query and the content in each knowledge base to filter for the most relevant text chunks. It then multiplies the similarity score of each text chunk by the weight of its corresponding knowledge base. After weighted reranking, the results are passed to the LLM as context. Text chunks with higher weighted scores are prioritized by the LLM.
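
The weighting step described above (multiply each chunk's similarity score by the weight of its knowledge base, then rerank) can be sketched as follows. The `Chunk` class, the knowledge base names, and the weight values are hypothetical and chosen only for illustration. The sketch also assumes all knowledge bases are of the same type, since weights do not apply across types.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    kb: str            # name of the knowledge base the chunk came from
    similarity: float  # similarity score between the query and the chunk


def weighted_rerank(chunks, kb_weights):
    """Return (weighted_score, chunk) pairs, highest weighted score first."""
    scored = [(c.similarity * kb_weights.get(c.kb, 1.0), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


chunks = [
    Chunk("Camera specifications", kb="manuals", similarity=0.40),
    Chunk("Camera FAQ entry", kb="faq", similarity=0.45),
]
weights = {"manuals": 1.0, "faq": 0.8}  # hypothetical weights

for score, chunk in weighted_rerank(chunks, weights):
    print(f"{chunk.kb}: {score:.2f}")
```

Here the FAQ chunk has the higher raw similarity, but after weighting (0.45 × 0.8 = 0.36 versus 0.40 × 1.0 = 0.40) the chunk from the higher-weighted knowledge base is ranked first.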

  3. In the input box on the right side of the page, enter a question. The LLM will use the knowledge base you created to generate an answer.

    For example: "Help me choose the Alibaba Cloud Model Studio phone with the best camera for under 3,000 CNY."

Workflow application

  1. Go to the App Center page, find the target workflow application, and click Configure on its card. Drag a knowledge base node onto the canvas and connect it after the Start node.

  2. Configure the knowledge base node:

    1. Input: To the right of the content variable, click the Value drop-down list and select the query variable under Built-in Variable. You may need to expand the Built-in Variable group to find it.

    2. Select Knowledge Base: The knowledge base node offers two selection methods:

      • Select a fixed knowledge base: Select the knowledge base that you created from the drop-down menu. Use this method when the same knowledge base is required for every call.

      • Dynamic Selection: Configure the CodeList variable to dynamically specify which knowledge bases to use based on the output of upstream nodes. Use this method to retrieve from different knowledge bases based on varying inputs.

    3. Set TopK (Optional): Determines the number of text chunks returned to downstream nodes (typically LLM nodes).

      Increasing this value usually improves the accuracy of the LLM's answers but also increases the LLM's input token consumption.

  3. Drag an LLM node onto the canvas and connect it after the knowledge base node and before the end node.

  4. Configure the LLM node:

    1. In the Model configuration list, select a model for the node.

    2. In the Prompt field, enter a prompt that instructs the LLM to use the knowledge base. Enter "/" to insert the result variable, which represents the results from the knowledge base retrieval.

      image

  5. Configure the end node: Enter "/" and select the LLM node's output to set the LLM's response as the final output.

  6. Click Test in the upper-right corner of the page. In the input box on the right, enter a question. The LLM will use the knowledge base you created to generate an answer.

    For example: "Help me choose the Alibaba Cloud Model Studio phone with the best camera for under 3,000 CNY."

External application

Besides building applications in Alibaba Cloud Model Studio, you can use the Alibaba Cloud Model Studio SDK to integrate knowledge base retrieval into external AI applications.

For detailed integration steps, see the Knowledge Base API Guide.

3. Optimize RAG performance (Optional)

If retrieval results are incomplete or inaccurate during the Q&A process, see RAG performance optimization.

Actions

On the knowledge base page, you can view and manage all knowledge bases in the current workspace.

Knowledge base ID: The value of the ID field on each knowledge base card, used for API calls.

Create a knowledge base

To create a knowledge base, click Create Knowledge Base and complete three steps: provide basic information and select a knowledge base type, configure a data source, and set indexing parameters.

  1. From the knowledge base page, click Create Knowledge Base.

  2. Provide basic information

    Select the Knowledge Base Type based on your use case. Each knowledge base supports only one type. If you select the document search type, you must also select a use case, such as Basic document Q&A or Rich-text reply:

    • Basic document Q&A: Ideal for semantic retrieval of plain-text documents.

    • Rich-text reply: Ideal for responses that contain rich text.

    The knowledge base type cannot be changed after creation.

    • Document search (retrieval scenario)

      • Use cases

      • Data source: You can upload local files or import them from Object Storage Service (OSS).

        Creation instructions (document search)

        1. Select data: Specify a data source, which can include files or content, to import into the knowledge base for retrieval. You can use local upload or cloud import (by selecting an existing category or file).

          • Local upload: Upload files directly from your computer. Expand the collapsible panel below to learn how to select a parsing method.

            Parsing methods (custom settings)

            Configure the parsing strategy as needed. If you are unsure which to choose, we recommend using the default settings.

            • Digital Parsing: Does not parse illustrations or charts in files. This is the fastest parsing method, typically taking a few seconds to a minute for a 10- to 20-page plain-text document.

            • Intelligent Document Parsing: Recognizes and extracts text from illustrations to generate summaries. These summaries, along with other non-image content, are chunked and vectorized for knowledge base retrieval. This method is relatively fast, typically taking one to five minutes for a 10- to 20-page document with illustrations.

            • LLM Parsing: Applications that use the Qwen-VL model can answer questions about the content of illustrations and charts. To recognize and understand this content, select LLM Parsing. Because this method requires calls to an LLM for in-depth understanding, it typically takes two to ten minutes for a 10- to 20-page document with charts.

            • Qwen-VL parsing: This method is designed for image files. You can specify a Qwen-VL model and provide a prompt to guide the recognition and extraction of the image layout and elements. Parsing a single image typically takes a few seconds to a minute.

          • Cloud import: Import existing files from Object Storage Service (OSS).

        2. Index configuration: Define how imported data is processed and stored, which directly affects retrieval performance.

          Among the following settings, only vector storage with AnalyticDB for PostgreSQL (ADB-PG) may incur fees. All other configurations are free.

          Metadata extraction

          Metadata consists of additional attributes associated with unstructured data. These attributes are integrated into chunks as key-value pairs.

          • Purpose: Metadata provides important context for chunks and can significantly improve retrieval accuracy. For example, consider a knowledge base that contains thousands of product introduction files where the file name is the product name. If a user searches for "functional overview of Product A," and the body of every file contains "functional overview" but none mention "Product A," the knowledge base might retrieve many irrelevant chunks. However, if you add the product name as metadata to all chunks, the knowledge base can accurately filter for chunks that are related to "Product A" and also contain "functional overview." This improves retrieval accuracy and reduces the LLM's input token consumption.

          • Usage: When you call an application via an API, you can specify metadata in the metadata_filter request parameter. When the application retrieves information from the knowledge base, it first filters for relevant files based on the specified metadata.

          • Note: You cannot configure metadata extraction after a knowledge base is created.
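
The effect of metadata filtering in the "Product A" example can be illustrated with a small local sketch. This is not the Model Studio API: the chunk layout and the exact-match dictionary used for `metadata_filter` here are assumptions for illustration, and `filter_by_metadata` is a hypothetical helper.

```python
def filter_by_metadata(chunks, metadata_filter):
    """Keep chunks whose metadata matches every key-value pair in the filter."""
    return [
        chunk
        for chunk in chunks
        if all(chunk["metadata"].get(key) == value
               for key, value in metadata_filter.items())
    ]


# Every file body mentions "functional overview", but none mentions the product name.
chunks = [
    {"text": "Functional overview: supports fast charging.",
     "metadata": {"file_name": "Product A"}},
    {"text": "Functional overview: supports wireless charging.",
     "metadata": {"file_name": "Product B"}},
]

# Hypothetical filter; the real request format may differ.
metadata_filter = {"file_name": "Product A"}

matches = filter_by_metadata(chunks, metadata_filter)
print([chunk["text"] for chunk in matches])
```

Only the chunk from the Product A file remains, so fewer irrelevant chunks are passed to the LLM as context.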

          How to configure metadata

          Enable Metadata extraction, and then click Settings to attach uniform or personalized metadata to all files in the knowledge base. During chunking, the metadata for each file is integrated into its respective chunks. The following figure shows the metadata template used in the preceding example:

          image

          Create a metadata template

          Value extraction methods:

          • Constant: Attaches a fixed attribute to all files in the knowledge base.

            As shown in the preceding example, if all files in the knowledge base have the same author, you can set a constant for a field named author.

          • Variable: Attaches a variable attribute to each file in the knowledge base. The currently supported attributes are file_name and cat_name. If you select file_name, Alibaba Cloud Model Studio attaches the name of the file to its metadata, as shown in the preceding example. If you select cat_name, Alibaba Cloud Model Studio attaches the name of the category that contains the file to the file's metadata.

          • LLM: The system matches the text content of each file in the knowledge base against the configured Entity Description rule to automatically identify and extract relevant information, which is then attached as attributes to the file's metadata.

            As shown in the metadata template in the preceding example, to extract all years that appear in each file as file attributes, you can configure an LLM field named date. The entity description is configured as follows:

            image

          • RegEx: The system matches the text content of each file in the knowledge base against the specified regular expression. Content that matches the expression is extracted and added as an attribute to the file's metadata.

            As shown in the metadata template in the preceding example, to extract all references that appear in each file (assuming that the references start with 《 and end with 》), you can configure a RegEx field named reference. The regular expression is configured as follows:

            image

          • Keyword search: The system searches each file for preset keywords and adds the matched keywords as attributes to the file's metadata.

            For example, in the metadata template in the preceding example, the preset keywords are:

            image

            Because the file contains only the keywords "financing," "industry," "green," and "capital," the system extracts only these four keywords as the value for the file's keywords attribute.
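
The RegEx and Keyword search extraction methods can be approximated locally with Python's `re` module. The pattern, the sample text, and the preset keyword list below are assumptions for illustration (the actual configuration is shown only in the figures), and case-insensitive substring matching is an assumed matching rule. This sketch also returns the title without the surrounding 《 》 marks, which may differ from the platform's behavior.

```python
import re

# Assumed pattern: capture the text between 《 and 》.
REFERENCE_PATTERN = re.compile(r"《([^》]+)》")


def extract_references(text):
    """Return every title enclosed in 《 》, without the enclosing marks."""
    return REFERENCE_PATTERN.findall(text)


def extract_keywords(text, preset_keywords):
    """Return the preset keywords that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [kw for kw in preset_keywords if kw.lower() in lowered]


# Hypothetical file content and preset keyword list.
text = "See 《Green Finance Report》 and 《Capital Markets》 for industry financing data."
preset = ["financing", "industry", "green", "capital", "policy"]

print(extract_references(text))
print(extract_keywords(text, preset))
```

With this sample text, the keyword "policy" is not extracted because it does not occur in the file, which mirrors the four-keyword result described above.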

          Used for Retrieval: When enabled, both metadata (fields and values) and chunk content are used for retrieval. When disabled, only chunk content is used for retrieval.

          Used for Model Reply: When enabled, both metadata (fields and values) and chunk content are provided to the LLM to generate responses. When disabled, only chunk content is provided.

          Excel header assembly

          When enabled, the knowledge base treats the first row of all XLSX and XLS files as the header and automatically appends it to each chunk (data row). This prevents the LLM from misinterpreting the header as a regular data row.

          You do not need to enable this setting if the knowledge base contains files in other formats, such as PDF.
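
The idea behind Excel header assembly can be sketched as follows: the first row is treated as the header and attached to every data row, so each chunk carries its own column names. The `"column: value"` text format and the sample rows are assumptions for illustration; the platform's actual chunk format is not documented here.

```python
def assemble_chunks(rows):
    """Treat the first row as the header and attach it to every data row."""
    header, *data_rows = rows
    return [
        "; ".join(f"{column}: {value}" for column, value in zip(header, row))
        for row in data_rows
    ]


# Hypothetical spreadsheet content: the first row is the header.
rows = [
    ["Name", "Position", "Age"],
    ["Zhang San", "Engineer", "30"],
    ["Li Si", "Designer", "28"],
]

for chunk in assemble_chunks(rows):
    print(chunk)
```

Each resulting chunk is self-describing, and the header row itself never appears as a separate data chunk.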

          Chunking method

          Select smart chunking (recommended).

          Purpose: A knowledge base splits files into chunks and converts these chunks into vectors using an embedding model. The chunks and vectors are then stored as key-value pairs in a vector database. After a knowledge base is created, you can view or edit the specific content (text and images) of each chunk.

          Note: Once a knowledge base is created, the document chunking settings cannot be changed. An inappropriate chunking strategy may reduce retrieval and recall performance.

          Multi-turn conversation rewriting

          When enabled, the system uses a dedicated lightweight model to rewrite the user's current query into a standalone query with complete context by incorporating the conversation history. The system then uses this rewritten query to retrieve information from the knowledge base.

          Embedding model

          An embedding model converts original input prompts and knowledge text into numerical vectors to calculate semantic similarity. The default Official Vector (text-embedding-v2) model supports multiple languages in addition to Chinese and English and normalizes the resulting vectors. This setting cannot be changed.

          Vector dimensions generated by each embedding model (cannot be modified):

          • Official Vector (text-embedding-v2): 1,536 dimensions

          • qwen3 multimodal embedding (qwen3-vl-embedding): Automatically enabled when the Visual understanding use case is selected. It supports generating vectors for images and rich text documents after visual understanding.

          Reranking model

          A reranking model is external to the knowledge base. It reranks candidate chunks from the initial vector search and returns the top K chunks with the highest similarity scores. The recommended official reranker, qwen3-rerank (hybrid), considers both semantic relevance and text-matching features (such as BM25 scores) to better handle queries that require precise keyword hits. For semantic ranking only, select qwen3-rerank.

          Reranking model mode

          When you create a knowledge base, you can select one of the following modes for the Reranking model mode setting:

          • Q&A mode (default): Ranks candidate chunks based on their "Q&A match score" with the query. This mode is suitable for scenarios where a user asks a complete question and expects to find the answer within a chunk.

          • Similarity mode: Ranks candidate chunks based on their "semantic similarity score" with the query. This mode is suitable for scenarios where the query and the chunk have similar phrasing.

          • Custom advanced mode: Allows you to provide a natural language instruction of up to 200 characters to influence the reranking process. This mode is suitable for scenarios with special ranking requirements.

          Warning

          The reranking model mode can only be selected when you create a knowledge base and cannot be modified after creation. Before you configure this setting, note the following limitations:

          • Knowledge base type limitation: This setting applies only to document search, data query, and audio and video search knowledge bases. Image Q&A knowledge bases are not supported.

          • Use case limitation: This setting is supported only for the Basic document Q&A and Rich-text Reply use cases. The Visual understanding (rich text documents) and Rapid Q&A use cases are not supported.

          Similarity threshold

          This threshold sets the minimum similarity score for recalling a chunk from the results returned by the reranking model. Only chunks with scores that exceed this value are recalled.

          Note

          This is the default similarity threshold for the knowledge base. When you associate the knowledge base with a specific Alibaba Cloud Model Studio application, you can also set a separate threshold for that application, which overrides the knowledge base's default similarity threshold.

          Lowering this threshold recalls more chunks but may include less relevant content. Raising it reduces the number of recalled chunks. If set too high, the knowledge base may discard relevant chunks.

          You can use hit testing to fine-tune the similarity threshold to balance recall and precision.
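
The trade-off between recall and precision can also be explored offline before changing the setting. The sketch below is a rough analogue of that tuning process, not the hit testing feature itself: the chunk IDs, scores, and relevance labels are made up for illustration, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(scores, relevant, threshold):
    """Compute precision and recall for chunks scoring above the threshold."""
    retrieved = {cid for cid, score in scores.items() if score > threshold}
    hits = retrieved & relevant
    # By convention in this sketch, precision is 0.0 when nothing is retrieved.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical similarity scores and hand-labeled relevant chunks.
scores = {"c1": 0.62, "c2": 0.48, "c3": 0.35, "c4": 0.21}
relevant = {"c1", "c2", "c3"}

for threshold in (0.2, 0.4, 0.7):
    p, r = precision_recall(scores, relevant, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

In this made-up data, a low threshold recalls every relevant chunk but also an irrelevant one, while a very high threshold discards everything.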

          Maximum recall count

          Suppose an Alibaba Cloud Model Studio application is associated with three knowledge bases: A1, A2, and A3. The system retrieves chunks related to the input from these bases, reranks them using a reranking model, and selects the top K most relevant chunks to use as context for the LLM. This K value is the maximum recall count (up to 20), which determines the number of chunks the reranking model provides to the LLM.

          Increasing this value can improve the LLM's response accuracy but also increases the LLM's input token consumption.
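
The interaction between reranking, the similarity threshold, and the maximum recall count can be sketched as a small pipeline. This is an illustration under stated assumptions: `select_context` is a hypothetical function, the rerank scores are hard-coded stand-ins for a real reranking model, and the chunk names are made up. Only the upper limit of 20 comes from the description above.

```python
MAX_RECALL_LIMIT = 20  # upper limit stated for the maximum recall count


def select_context(candidates_by_kb, rerank_score, threshold, max_recall):
    """Merge candidates from all knowledge bases, rerank, filter, keep top K."""
    if max_recall > MAX_RECALL_LIMIT:
        raise ValueError(f"max_recall must not exceed {MAX_RECALL_LIMIT}")
    merged = [chunk for chunks in candidates_by_kb.values() for chunk in chunks]
    scored = [(rerank_score(chunk), chunk) for chunk in merged]
    # Drop chunks that do not exceed the similarity threshold.
    kept = [(score, chunk) for score, chunk in scored if score > threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in kept[:max_recall]]


# Hypothetical candidates retrieved from knowledge bases A1, A2, and A3.
candidates = {"A1": ["a1-1", "a1-2"], "A2": ["a2-1"], "A3": ["a3-1"]}
# Stand-in rerank scores; a real system would call a reranking model here.
scores = {"a1-1": 0.72, "a1-2": 0.15, "a2-1": 0.55, "a3-1": 0.31}

context = select_context(candidates, scores.get, threshold=0.2, max_recall=2)
print(context)
```

With a maximum recall count of 2, only the two highest-scoring chunks above the threshold are passed to the LLM, regardless of which knowledge base they came from.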

          Vector storage

          Select a vector database to store text vectors. The Built-in vector database meets the basic functional needs of a knowledge base. For advanced features like database management, auditing, or monitoring, we recommend selecting AnalyticDB for PostgreSQL (ADB-PG).

          When you purchase an ADB-PG instance, you must enable Vector Engine Optimization. Otherwise, Alibaba Cloud Model Studio cannot use the instance.

        Creation instructions (visual understanding)

        When you select the Visual understanding (rich text documents) use case, the knowledge base uses a multimodal embedding model for visual understanding of the document. This method preserves the original layout information instead of using traditional text chunking methods.

        File format restrictions

        In the file upload area of the Select data tab, hover over View format requirements to view the requirements.

        Index configuration differences

        The index configuration for the visual understanding use case differs from that for basic document Q&A:

        • Embedding model: The qwen3 multimodal embedding (qwen3-vl-embedding) model is automatically selected and cannot be changed after creation.

        • Multi-turn conversation rewriting: Can be enabled or disabled.

        • Similarity threshold: Default is 0.20.

        • Maximum recall count: Default is 5.

        • Chunking method: Visual understanding does not use traditional text chunking methods (such as smart chunking or custom chunking). Instead, it understands each page of the document based on visual indexing.

        Editing restrictions

        • The embedding model (qwen3 multimodal embedding) and vector storage type (Built-in) cannot be changed after creation.

        • The knowledge base edition can be changed only once per day.

    • Data query (chatbot or NL2SQL scenarios)

      • Use cases:

        • Ideal for building Q&A systems on structured data (data organized in a predefined table schema) to create assistants that query FAQs, product data, or personnel information.

        • If your data consists of complete FAQ question-and-answer pairs, select Data Query. For example, if an Excel file contains two columns, Question and Answer, a data query knowledge base can use the Question column for retrieval and the Answer column as context for the LLM's response.

          A document search knowledge base does not support this functionality.

        • You can import multiple Excel files, but their table schemas must be identical.

      • Data source integration: You can upload local XLS or XLSX files.

        Creation instructions (Data Query)

        1. Select data: Specify the data source (files or content) to import into the knowledge base for retrieval. Supported methods include local upload and cloud import.

          Note

          After a knowledge base is created, its data source cannot be changed. Each knowledge base supports only one data source.

          • Local upload: Upload data tables (XLS or XLSX format) from your local computer. The first row must be the table header.

          • Cloud import (select data table): Select an existing data table from a Model Studio .

        2. Index configuration: Define how imported data is processed and stored, which directly affects retrieval performance.

          Among the following settings, only vector storage with AnalyticDB for PostgreSQL (ADB-PG) may incur fees. All other configurations are free.

          Retrieval and model reply

          • Used for Retrieval: When enabled, the knowledge base retrieves data from this column.

          • Used for Model Reply: When enabled, the LLM uses the retrieval results from this column to generate a response. For example, if you enable Used for Retrieval for the "Name," "Gender," "Position," and "Age" columns, but enable Used for Model Reply only for the "Name" and "Position" columns, the knowledge base will retrieve from all four columns. However, only the retrieval results from the "Name" and "Position" columns are provided to the LLM to generate its response.

            As shown in the following figure, because Used for Model Reply is not enabled for the "Age" column, the LLM associated with this knowledge base cannot answer the question "What is Zhang San's age?".

            image
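
The difference between the two column settings can be sketched with a toy example. The row values and the substring-based `retrieve` function are assumptions for illustration; the real knowledge base uses vector retrieval rather than substring matching.

```python
def retrieve(rows, query, retrieval_columns):
    """Toy retrieval: a row matches if any retrieval-column value appears in the query."""
    lowered = query.lower()
    return [
        row for row in rows
        if any(str(row[col]).lower() in lowered for col in retrieval_columns)
    ]


def build_context(row, reply_columns):
    """Pass only the columns enabled for model reply to the LLM."""
    return {col: row[col] for col in reply_columns}


# Hypothetical table row.
rows = [{"Name": "Zhang San", "Gender": "Male", "Position": "Engineer", "Age": "30"}]
retrieval_columns = ["Name", "Gender", "Position", "Age"]  # Used for Retrieval
reply_columns = ["Name", "Position"]                       # Used for Model Reply

hits = retrieve(rows, "What is Zhang San's age?", retrieval_columns)
context = [build_context(row, reply_columns) for row in hits]
print(context)
```

The row is found, but the context passed to the LLM contains no "Age" value, so the question about Zhang San's age cannot be answered from it.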

          Multi-turn conversation rewriting

          When enabled, the system uses a dedicated lightweight model to rewrite the user's current query into a standalone query with complete context by incorporating the conversation history. The system then uses this rewritten query to retrieve information from the knowledge base.

          Embedding model

          An embedding model converts original input prompts and knowledge text into numerical vectors to calculate semantic similarity. The default Official Vector (text-embedding-v2) model supports multiple languages in addition to Chinese and English and normalizes the resulting vectors. This setting cannot be changed.

          Vector dimensions generated by each embedding model (cannot be modified):

          • Official Vector (text-embedding-v2): 1,536 dimensions

          • qwen3 multimodal embedding (qwen3-vl-embedding): Automatically enabled when the Visual understanding use case is selected. It supports generating vectors for images and rich text documents after visual understanding.

          Reranking model

          A reranking model is external to the knowledge base. It reranks candidate chunks from the initial vector search and returns the top K chunks with the highest similarity scores. The recommended official reranker, qwen3-rerank (hybrid), considers both semantic relevance and text-matching features (such as BM25 scores) to better handle queries that require precise keyword hits. For semantic ranking only, select qwen3-rerank.

          Reranking model mode

          When you create a knowledge base, you can select one of the following modes for the Reranking model mode setting:

          • Q&A mode (default): Ranks candidate chunks based on their "Q&A match score" with the query. This mode is suitable for scenarios where a user asks a complete question and expects to find the answer within a chunk.

          • Similarity mode: Ranks candidate chunks based on their "semantic similarity score" with the query. This mode is suitable for scenarios where the query and the chunk have similar phrasing.

          • Custom advanced mode: Allows you to provide a natural language instruction of up to 200 characters to influence the reranking process. This mode is suitable for scenarios with special ranking requirements.

          Warning

          The reranking model mode can only be selected when you create a knowledge base and cannot be modified after creation. Before you configure this setting, note the following limitations:

          • Knowledge base type limitation: This setting applies only to document search, data query, and audio and video search knowledge bases. Image Q&A knowledge bases are not supported.

          • Use case limitation: This setting is supported only for the Basic document Q&A and Rich-text Reply use cases. The Visual understanding (rich text documents) and Rapid Q&A use cases are not supported.

          Similarity threshold

          This threshold sets the minimum similarity score for recalling a chunk from the results returned by the reranking model. Only chunks with scores that exceed this value are recalled.

          Note

          This is the default similarity threshold for the knowledge base. When you associate the knowledge base with a specific Alibaba Cloud Model Studio application, you can also set a separate threshold for that application, which overrides the knowledge base's default similarity threshold.

          Lowering this threshold recalls more chunks but may include less relevant content. Raising it reduces the number of recalled chunks. If set too high, the knowledge base may discard relevant chunks.

          You can use hit testing to fine-tune the similarity threshold to balance recall and precision.

          Maximum recall count

          Suppose an Alibaba Cloud Model Studio application is associated with three knowledge bases: A1, A2, and A3. The system retrieves chunks related to the input from these bases, reranks them using a reranking model, and selects the top K most relevant chunks to use as context for the LLM. This K value is the maximum recall count (up to 20), which determines the number of chunks the reranking model provides to the LLM.

          Increasing this value can improve the LLM's response accuracy but also increases the LLM's input token consumption.

          Vector storage

          Select a vector database to store text vectors. The Built-in vector database meets the basic functional needs of a knowledge base. For advanced features like database management, auditing, or monitoring, we recommend selecting AnalyticDB for PostgreSQL (ADB-PG).

          When you purchase an ADB-PG instance, you must enable Vector Engine Optimization. Otherwise, Alibaba Cloud Model Studio cannot use the instance.

    • Image Q&A (image search)

      • Use cases:

        • Ideal for building multimodal retrieval applications for search-by-image and search by image and text, such as product discovery assistants or visual Q&A assistants.

      • Data source integration: You can upload local XLS or XLSX files.

        XLS and XLSX files must contain publicly accessible image URLs to build an image index. For more information, see the creation instructions below.

        Creation instructions (Image Q&A)

        1. Select data: Specify a data source to import into the knowledge base for retrieval. Supported methods include Local upload and Cloud import (selecting an existing data table from a data connector).

          Note

          The data source cannot be changed after the knowledge base is created. A knowledge base can support only one data source.

          • Local upload: Upload data tables in XLS or XLSX format from your local computer.

            Note

            • Field requirement: The data table must contain at least one image_url field to generate an image index.

            • Build process: The knowledge base accesses the image URLs in the image_url field, extracts visual features, and stores them as vectors.

            • Retrieval process: The knowledge base compares the vector generated from a user's uploaded image with the stored image vectors and returns the most relevant records.
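
Before uploading a data table, it can help to check the image_url requirement locally. The sketch below is a hypothetical pre-upload check, not part of Model Studio: it assumes the table has already been loaded into a header list and row lists, and it verifies only that each value looks like an HTTP(S) URL. It cannot confirm that a URL is publicly accessible.

```python
from urllib.parse import urlparse


def check_image_table(header, rows):
    """Return the spreadsheet row numbers whose image_url is not an HTTP(S) URL."""
    if "image_url" not in header:
        raise ValueError("The data table must contain an image_url field")
    index = header.index("image_url")
    problems = []
    # Row 1 is the header, so data rows start at row 2.
    for row_number, row in enumerate(rows, start=2):
        parsed = urlparse(str(row[index]))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(row_number)
    return problems


# Hypothetical table content.
header = ["product_name", "image_url"]
rows = [
    ["Shirt", "https://example.com/shirt.png"],
    ["Jacket", "jacket.png"],  # a local file name, not a URL
]

print(check_image_table(header, rows))
```

Rows reported by this check would fail the build process described above, because the knowledge base cannot fetch an image from a value that is not a URL.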

          • Cloud import (select a data table): Select an existing data table from your application data in Alibaba Cloud Model Studio.

        2. Index configuration: Configure how imported data is processed and stored, which directly affects retrieval performance.

          Among the following settings, only vector storage with AnalyticDB for PostgreSQL (ADB-PG) may incur fees. All other configurations are free.

          Retrieval and model reply

          • Used for Retrieval: When enabled, the knowledge base retrieves data from this column.

          • Used for Model Reply: When enabled, the LLM uses the retrieval results from this column to generate a response. For example, if you enable Used for Retrieval for the "Name," "Gender," "Position," and "Age" columns, but enable Used for Model Reply only for the "Name" and "Position" columns, the knowledge base will retrieve from all four columns. However, only the retrieval results from the "Name" and "Position" columns are provided to the LLM to generate its response.

            As shown in the following figure, because Used for Model Reply is not enabled for the "Age" column, the LLM associated with this knowledge base cannot answer the question "What is Zhang San's age?".

            image

          Multi-turn conversation rewriting

          When enabled, the system uses a dedicated lightweight model to rewrite the user's current query into a standalone query with complete context by incorporating the conversation history. The system then uses this rewritten query to retrieve information from the knowledge base.

          Embedding model

          An embedding model converts input prompts, knowledge text, and images into numerical vectors for similarity comparison. For more information, see Text and Multimodal Vectorization.

          • qwen2.5 multimodal embedding (qwen2.5-vl-embedding): Represents single-modal or mixed-modal inputs as a unified vector, suitable for cross-modal retrieval and image search scenarios. For example, if you input an image of a shirt with the text "find a similar style that looks younger," the model fuses the image and text instructions into a single vector.

          • Multimodal Embedding v1 (multimodal-embedding-v1): Generates a separate vector for each part of the input (image and text).

          • qwen3 multimodal embedding (qwen3-vl-embedding): An upgraded version of qwen2.5-vl-embedding that further improves image-text fusion understanding and cross-modal retrieval accuracy.

          Reranking model

          A reranking model is external to the knowledge base. It reranks candidate chunks from the initial vector search and returns the top K chunks with the highest similarity scores. The recommended official reranker, qwen3-rerank (hybrid), considers both semantic relevance and text-matching features (such as BM25 scores) to better handle queries that require precise keyword hits. For semantic ranking only, select qwen3-rerank.

          Reranking model mode

          When you create a knowledge base, you can select one of the following modes for the Reranking model mode setting:

          • Q&A mode (default): Ranks candidate chunks based on their "Q&A match score" with the query. This mode is suitable for scenarios where a user asks a complete question and expects to find the answer within a chunk.

          • Similarity mode: Ranks candidate chunks based on their "semantic similarity score" with the query. This mode is suitable for scenarios where the query and the chunk have similar phrasing.

          • Custom advanced mode: Allows you to provide a natural language instruction of up to 200 characters to influence the reranking process. This mode is suitable for scenarios with special ranking requirements.

          Warning

          The reranking model mode can only be selected when you create a knowledge base and cannot be modified after creation. Before you configure this setting, note the following limitations:

          • Knowledge base type limitation: This setting applies only to document search, data query, and audio and video search knowledge bases. Image Q&A knowledge bases are not supported.

          • Use case limitation: This setting is supported only for the Basic document Q&A and Rich-text Reply use cases. The Visual understanding (rich text documents) and Rapid Q&A use cases are not supported.

          Similarity threshold

          This threshold sets the minimum similarity score for recalling a chunk from the results returned by the reranking model. Only chunks with scores that exceed this value are recalled.

          Note

          This is the default similarity threshold for the knowledge base. When you associate the knowledge base with a specific Alibaba Cloud Model Studio application, you can also set a separate threshold for that application, which overrides the knowledge base's default similarity threshold.

          Lowering this threshold recalls more chunks but may include less relevant content. Raising it reduces the number of recalled chunks. If set too high, the knowledge base may discard relevant chunks.

          You can use hit testing to fine-tune the similarity threshold to balance recall and precision.
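          As a rough illustration of how the threshold and the application-level override interact, the following Python sketch filters reranked chunks by score. It assumes scores on a 0 to 1 scale and reads "exceed" as strictly greater than. The function names and sample chunks are hypothetical.

```python
from typing import List, Optional, Tuple

Chunk = Tuple[str, float]  # (chunk text, reranker score on an assumed 0-1 scale)


def effective_threshold(kb_default: float, app_override: Optional[float] = None) -> float:
    """An application-level threshold, if set, overrides the knowledge base default."""
    return kb_default if app_override is None else app_override


def recall(chunks: List[Chunk], threshold: float) -> List[Chunk]:
    """Keep only the chunks whose score exceeds the threshold."""
    return [(text, score) for text, score in chunks if score > threshold]


# Hypothetical reranker output, for illustration only.
reranked = [("price list", 0.82), ("warranty terms", 0.55), ("company history", 0.21)]

default_hits = recall(reranked, effective_threshold(0.5))      # two chunks pass
strict_hits = recall(reranked, effective_threshold(0.5, 0.7))  # one chunk passes
loose_hits = recall(reranked, effective_threshold(0.5, 0.2))   # all three pass
```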

          Maximum recall count

          Suppose an Alibaba Cloud Model Studio application is associated with three knowledge bases: A1, A2, and A3. The system retrieves chunks related to the input from these bases, reranks them using a reranking model, and selects the top K most relevant chunks to use as context for the LLM. This K value is the maximum recall count (up to 20), which determines the number of chunks the reranking model provides to the LLM.

          Increasing this value can improve the LLM's response accuracy but also increases the LLM's input token consumption.
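          The merge, rerank, and top-K flow described above can be sketched as follows. This is a simplified model under stated assumptions: `toy_score` is a stand-in word-overlap scorer, not the actual reranking model, and the chunk texts are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

MAX_RECALL_LIMIT = 20  # upper bound for K stated in the text


def select_context(candidates_by_kb: Dict[str, List[str]],
                   rerank_score: Callable[[str], float],
                   k: int) -> List[Tuple[str, float]]:
    """Merge candidates from all knowledge bases, rerank them, and keep the top K."""
    if not 1 <= k <= MAX_RECALL_LIMIT:
        raise ValueError(f"k must be between 1 and {MAX_RECALL_LIMIT}")
    merged = [chunk for chunks in candidates_by_kb.values() for chunk in chunks]
    scored = [(chunk, rerank_score(chunk)) for chunk in merged]
    # Python's sort is stable, so ties keep their original merge order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


QUERY_WORDS = {"phone", "price"}


def toy_score(chunk: str) -> float:
    """Stand-in scorer: number of query words that appear in the chunk."""
    return float(len(QUERY_WORDS & set(chunk.lower().split())))


candidates = {
    "A1": ["X1 phone price details", "X1 phone colors"],
    "A2": ["return policy"],
    "A3": ["price of the Flex phone"],
}
top_chunks = select_context(candidates, toy_score, k=2)
```

A larger `k` passes more chunks to the LLM, which is where the extra input tokens come from.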

          Vector storage

          Select a vector database to store text vectors. The Built-in vector database meets the basic functional needs of a knowledge base. For advanced features like database management, auditing, or monitoring, we recommend selecting AnalyticDB for PostgreSQL (ADB-PG).

          When you purchase an ADB-PG instance, you must enable Vector Engine Optimization. Otherwise, Alibaba Cloud Model Studio cannot use the instance.

    Choose a use case: basic document Q&A, rich-text reply.

During peak request periods, the creation process can take several hours, depending on the data volume. Please be patient.

Update a knowledge base

Changes to a knowledge base automatically synchronize with any application that uses it.

Document search

  • Automatic update (recommended)

    You can set up automatic updates by combining Object Storage Service (OSS), Function Compute (FC), and the Model Studio knowledge base APIs. Follow these steps:

    1. Create a bucket: Go to the OSS console and create an OSS bucket to store your source files.

    2. Create a knowledge base: Create an unstructured knowledge base to store your private content.

    3. Create a user-defined function: Go to the FC console and create a function to handle file change events, such as file creation and deletion. For more information, see Create a function. The function calls the relevant APIs from the Knowledge Base API Guide to synchronize your knowledge base with file changes in OSS.

    4. Create an OSS trigger: In FC, associate an OSS trigger with the previously created user-defined function. When a file change event such as an upload occurs, the trigger activates the function.

  • Manual update

    On the Knowledge Base page, find the knowledge base that you want to update and click View Details on its card.

    • To add a new file: Click Upload Data and select existing files from the data connector.

    • To delete a file: Find the file and click Delete to its right.

    • To modify file content: In-place updates and overwrites are not supported. First, delete the old version from the knowledge base, and then import the updated version.

      Note: Failure to remove the old version can lead to outdated search results.
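For the automatic update flow in steps 3 and 4 above, the function body might look like the following Python sketch. It assumes the OSS trigger delivers a JSON payload with an `events` list whose entries carry `eventName`, `oss.bucket.name`, and `oss.object.key`. Verify this layout against the FC trigger documentation before relying on it. The `add_file` and `delete_file` hooks are hypothetical placeholders for the calls you would make to the knowledge base APIs.

```python
import json
from typing import Callable, List, Optional, Tuple

Action = Tuple[str, str, str]  # (action, bucket name, object key)


def plan_sync_actions(raw_event) -> List[Action]:
    """Translate an OSS trigger payload (assumed layout) into add/delete actions."""
    payload = json.loads(raw_event)
    actions: List[Action] = []
    for item in payload.get("events", []):
        name = item.get("eventName", "")
        bucket = item["oss"]["bucket"]["name"]
        key = item["oss"]["object"]["key"]
        if name.startswith("ObjectCreated:"):
            actions.append(("add", bucket, key))
        elif name.startswith("ObjectRemoved:"):
            actions.append(("delete", bucket, key))
    return actions


def handler(event, context,
            add_file: Optional[Callable[[str, str], None]] = None,
            delete_file: Optional[Callable[[str, str], None]] = None) -> str:
    """Entry point sketch; the hooks are hypothetical wrappers around the knowledge base APIs."""
    for action, bucket, key in plan_sync_actions(event):
        hook = add_file if action == "add" else delete_file
        if hook is not None:
            hook(bucket, key)
    return "ok"
```

Keeping the event parsing in `plan_sync_actions` separate from the API calls makes the logic testable locally without deploying the function.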

Data query and image Q&A

Note: On the details page for an Image Q&A knowledge base, there is no direct Upload Data button. To update data, click the View Data Source link to open the data connector details page.
  • Automatic update

    Not supported.

  • Manual update

    If the data source for your knowledge base is a data table in Application Data, follow these two steps for manual updates.

    1. Step 1: Update the data table

      Go to the Application Data tab. In the left pane, select the target data table and click Upload Data.

      • To insert new data: Set the import type to Incremental Upload. Upload an Excel file that contains only the header row and the new data rows.

        The header row of the file must match the current table schema. You can click Download Template to get a standard template file, and then add your new data to it.
      • To delete data: Set the import type to Upload and Overwrite. Upload an Excel file that contains the header row and the latest full dataset, with the unwanted records removed.

        To get the full dataset, click the download icon to download the data in XLSX format.
      • To modify data: Set the import type to Upload and Overwrite. Upload an Excel file that contains the header row and the full, modified dataset.

    2. Step 2: Synchronize changes to the knowledge base

      Return to the Knowledge Base list, find the target knowledge base, and click View Details on its card. Click the image icon in the upper-left corner of the data table, and then confirm the prompt to synchronize the knowledge base.

      You must repeat these steps for each manual update.

Audio and video search

  • Automatic update

    Not supported.

  • Manual update

    On the Knowledge Base page, find the knowledge base that you want to update and click View Details on its card.

    • To add a new file: Click Upload Data and select existing files from Application Data.

    • To delete a file: Find the file and click Delete to its right.

      This action removes the file only from the knowledge base. This action does not affect the source file in Application Data.
    • To modify file content: In-place updates and overwrites are not supported. First, delete the old version from the knowledge base, and then import the updated version.

      Note: Failure to remove the old version can lead to outdated search results.

Edit knowledge base

After you create a knowledge base, you can modify only its name, description, and similarity threshold. To change any other setting, you must delete and recreate the knowledge base. Editing is available only in the console and has no corresponding API.

Procedure: On the Knowledge Base page, find the target knowledge base, click the More (...) icon on its card, and then click Edit.

Note: You can modify a knowledge base's configuration only once per calendar day. The system silently rejects any subsequent attempts on the same day.

Delete a knowledge base

Warning

This action cannot be undone. Proceed with caution.

Before you delete a knowledge base, we recommend that you disassociate it from all published Model Studio applications.

You can still delete a knowledge base associated with unpublished applications.

Procedure

  1. For each published application associated with the knowledge base:

    1. On the My Applications page, find the application and click Configure.

    2. Remove the knowledge base from the list, and then click Publish in the upper-right corner to republish the application.

  2. On the Knowledge Base page, find the knowledge base, click the More (...) icon on its card, and then click Delete.

Change configuration

The Enterprise Edition uses RCUs for high retrieval performance at high QPS and offers larger storage capacity. The Standard Edition is suitable for development, testing, or low-concurrency scenarios.

Note

You can switch between the Standard and Enterprise Editions and change the RCU count for the Enterprise Edition.

You can change a Knowledge Base's configuration only once per calendar day.

RCU: A Retrieval Compute Unit (RCU) is a measure of the retrieval concurrency of a Knowledge Base. One RCU supports approximately 50 QPS for online retrieval. Higher RCU counts support greater concurrency.
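As a quick sizing aid, the approximate figure of 50 QPS per RCU translates into the following arithmetic. This is a back-of-the-envelope sketch: the function name is hypothetical, and actual capacity may differ from the approximate figure.

```python
import math

QPS_PER_RCU = 50  # approximate figure stated above


def rcus_needed(peak_qps: float) -> int:
    """Estimate the RCU count for a given peak QPS, rounding up."""
    if peak_qps <= 0:
        raise ValueError("peak_qps must be positive")
    return max(1, math.ceil(peak_qps / QPS_PER_RCU))


estimate = rcus_needed(120)  # 120 / 50 = 2.4, rounded up to 3
```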
  • Note:

    • To downgrade an Enterprise Edition Knowledge Base that uses platform storage to the Standard Edition, you must first reduce its used storage space to less than 80 GB.

      You can free up storage space by deleting files or data from the Knowledge Base.
  • Procedure:

    1. On the Knowledge Base page, find the knowledge base that you want to edit, click the More (...) icon on its card, and then click Edit.

    2. In the dialog box, select an action based on the current edition:

      • Standard Edition: Select Upgrade.

      • Enterprise Edition: Select Downgrade or Change RCU Count.

    3. Follow the on-screen instructions. The new configuration takes effect immediately after you click OK.

Hit testing

Use hit testing to verify that your knowledge base provides accurate knowledge input to your AI application. Simulate user queries, evaluate the retrieval results, and fine-tune the similarity threshold.

The reranking model in hit testing supports three modes: Q&A mode (default), designed for queries that do not perfectly match the document content; Similarity mode, suited to queries that are highly similar to the document content; and Custom advanced mode. Ranking scores for the same query can vary significantly between modes. For example, the same text segment might score 47% in Q&A mode but 69% in Similarity mode.

With hit testing, you can:

  • Verify that the knowledge base provides effective knowledge input to your AI application

  • Fine-tune the similarity threshold to balance the recall rate and accuracy

  • Identify content gaps or quality issues in your knowledge base

Use cases

  • Use case 1: Querying product pricing

    Test input: "How much does your Model Studio phone cost?"
    Expected result: Retrieve relevant text segments that contain price information.
  • Use case 2: Troubleshooting a technical issue

    Test input: "What should I do if my device can't connect to Wi-Fi?"
    Expected result: Retrieve relevant text segments about troubleshooting Wi-Fi connection issues.
  • Use case 3: Retrieval with visual understanding (visual understanding knowledge base)

    A visual understanding knowledge base supports three query modes: text-only, image-only, and image+text.
    Mode 1 (text-only): Enter "Object Storage Service" to retrieve relevant segments from documents and images.
    Mode 2 (image-only): Upload a product screenshot. The system uses visual understanding to match semantically similar segments.
    Mode 3 (image+text): Upload an image and enter descriptive text. A combined query can improve retrieval similarity.
  • Use case 4: Rapid Q&A (Rapid Q&A knowledge base)

    A Rapid Q&A knowledge base supports text-only queries (image input is not supported) and is ideal for fast retrieval from structured documents:
    Test input: "What is the price of the Qwen Pro 8?"
    Expected result: Quickly retrieve relevant FAQ segments that include price information.

Procedure

  1. On the knowledge base page, find the knowledge base you want to test and click Hit Test on its card.

  2. In the test interface, enter a question (we recommend using common user queries) and review the retrieval results.

    • Retrieval results: This section displays the hit results from the current test, sorted by similarity in descending order. Click any segment to view its content.

    • Image input: For image Q&A knowledge bases, the system converts the input image to a vector to retrieve records, and then sends these records together with the question to an LLM to generate an answer. By contrast, document search and data query knowledge bases do not use the uploaded image for retrieval. The exception is a document search knowledge base whose use case is set to visual understanding: it uses the uploaded image for retrieval and supports text-only, image-only, and image+text query modes. In this case, a combined image and text query can improve retrieval similarity.

  3. Verify that the relevant text segments are correctly retrieved. If not, adjust the similarity threshold and repeat the previous step.

  4. Click View Recall History to compare retrieval performance with different threshold settings.
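One way to reason about steps 3 and 4 offline is to record the scores from a hit test, mark which segments you consider relevant, and compare several thresholds side by side. The sketch below is a hypothetical helper, not a console feature. The segment IDs, scores, and relevance labels are made up for illustration.

```python
from typing import Dict, List, Set


def sweep_thresholds(scored: Dict[str, float],
                     relevant: Set[str],
                     thresholds: List[float]) -> List[dict]:
    """For each threshold, report how many segments are recalled and how precise the recall is."""
    rows = []
    for threshold in thresholds:
        recalled = {seg_id for seg_id, score in scored.items() if score > threshold}
        hits = recalled & relevant
        rows.append({
            "threshold": threshold,
            "recalled": len(recalled),
            "precision": len(hits) / len(recalled) if recalled else 0.0,
            "recall": len(hits) / len(relevant) if relevant else 0.0,
        })
    return rows


# Hypothetical hit-test scores and manual relevance labels.
scores = {"c1": 0.69, "c2": 0.47, "c3": 0.30, "c4": 0.12}
relevant_ids = {"c1", "c2"}
report = sweep_thresholds(scores, relevant_ids, [0.1, 0.4, 0.6])
```

In this made-up data, a threshold of 0.4 keeps both relevant segments and drops the irrelevant ones, while 0.6 starts discarding relevant content.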

image

Quotas and limits

  • For information about supported data sources, capacity, and other limits, see Knowledge base quotas and limits.

  • Maximum number of knowledge bases per Model Studio application:

    • Document search: Up to 5

    • Data query: Up to 5

    • Image Q&A: Up to 1

    In total, you can associate up to 11 knowledge bases of different types with one application (5 + 5 + 1).

Billing

The knowledge base feature is free, but you may be charged for calling a Model Studio application that uses a knowledge base.

Billing by step:

  • Build a knowledge base: Free of charge.

  • Integrate with business applications: When a Model Studio application retrieves text chunks from a knowledge base, it increases the LLM's input token count and can increase model inference fees. For more information, see Billable items and pricing.

    Note: You are not charged if you only use the Retrieve API to retrieve from a knowledge base and do not use a Model Studio application to generate a response.

  • Management and O&M: Free of charge.

API reference

FAQ

Building a knowledge base

  • Q: Can I delete a file or data table from Application Data after importing it into a knowledge base?

    • For document search knowledge bases: Yes. Files in Application Data and files imported into a knowledge base are independent. Deleting a source file in Application Data does not affect the imported file.

    • For data query and Image Q&A knowledge bases: No. Deleting the source data causes features like data synchronization and knowledge base viewing to fail.

  • Q: When I call a knowledge base API, I receive the error code BailianIndexServiceNotOpen. What should I do?

    The BailianIndexServiceNotOpen error code indicates that the Model Studio knowledge base service is not activated. Log in to the Model Studio console, go to the Data > Knowledge Base page, and click Activate Now to activate the service. Then, call the API again.

Handling images and multimodal content

  • Q: My file contains illustrations that I want a Model Studio application to include in its response. What should I do?

    Document search

    Method 1 (For agent applications only)

    1. When you create a knowledge base, select document search as the Knowledge base type and With Illustrations as the use case.

      When you select With Illustrations, the knowledge base extracts summaries from the illustrations in the file. The large language model (LLM) then decides whether to insert an image based on the summary's relevance to the user's question.
      Important

      Do not select electronic document parsing when you upload documents. This parsing method cannot extract image content, which prevents the With Illustrations feature from working correctly.

      image

    2. When you create or edit an agent application, select the Qwen-Plus or Qwen-Plus-Latest model (these models are recommended for optimal performance). Click the + button to the right of Document knowledge base, and add the knowledge base that you created in the previous step.

      Note

      The configured recall length must be less than the actual document length. If the recall length is greater than the document length, the system returns the entire document and bypasses the logic for the With Illustrations feature.

      Note: The "With Illustrations" and "Show Answer Source" features cannot be enabled simultaneously.
    3. Actual Q&A result:

      image

    Method 2 (For agent applications and workflow applications)

    1. Upload an image to a publicly accessible location and get its full URL. We recommend using OSS. For instructions, see Upload an image to OSS and use its file URL.

    2. Insert the full URL of the image into the file. Relative paths are not supported. Do not embed image files directly in a document (for example, by copying and pasting or inserting a local image from a menu). You must reference images using their publicly accessible URLs.

      If an image fails to display even after following these instructions, verify that the URL in the chunk is complete. Check for and remove any extra spaces or special characters that could cause parsing errors. You can edit the chunk directly to make corrections.

      Example of correctly referencing an image in a file

      image

      Sample prompt template:

      # Knowledge Base
      Please remember the following materials. They may be helpful for answering questions.
      ${documents}
      
      # Requirements
      If there are images, please display them.

      Actual Q&A result:

      image

      Example of incorrectly referencing an image in a file

      image

      Sample prompt template:

      # Knowledge Base
      Please remember the following materials. They may be helpful for answering questions.
      ${documents}
      
      # Requirements
      If there are images, please display them.

      Actual Q&A result:

      image

      Explanation: If you embed an image directly in a file, the Model Studio application does not display it in its response.
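Before uploading a file, you could run a quick check on each image reference to catch the problems described in Method 2 (relative paths, stray spaces). The Python sketch below is a hypothetical pre-flight helper that uses only the standard library. It cannot confirm that a URL is publicly accessible, and the example URLs are placeholders.

```python
from urllib.parse import urlparse


def is_full_image_url(url: str) -> bool:
    """Return True if the reference is an absolute http(s) URL without whitespace."""
    if any(ch.isspace() for ch in url):
        return False  # extra spaces can break parsing of the chunk
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Placeholder references, for illustration only.
checks = {
    "https://example.com/images/x1.png": is_full_image_url("https://example.com/images/x1.png"),
    "./images/x1.png": is_full_image_url("./images/x1.png"),
    "https://example.com/images/x 1.png": is_full_image_url("https://example.com/images/x 1.png"),
}
```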

    Image Q&A

    1. Upload an image to a publicly accessible location and get its full URL. We recommend using OSS. For instructions, see Upload an image to OSS and use its file URL.

    2. On the Table tab, create a new data table and add a field of type image_url to store the full URL of the image.

      Note
      • The image_url field does not support relative paths.

      • A single image_url field cannot store multiple image URLs. To associate a record with multiple images, create a separate image_url field for each image, such as image_1 and image_2.

      • Each image referenced by an image_url field must be no larger than 3 MB. If this limit is exceeded, the knowledge base creation fails.

      • After a data table is created, you cannot add new fields of type image_url or change an existing field's type to image_url. Include all required image fields in the initial table schema.

      image

    3. When you create a knowledge base, select Image Q&A as the Knowledge base type.

    4. When you create or edit an agent application, click the + button to the right of Image (Image Q&A knowledge base), add the knowledge base that you created in the previous step, and then change the prompt template to:

      # Knowledge Base
      Please remember the following materials. They may be helpful for answering questions.
      ${documents}
      
      # Requirements
      If there are images, please display them.
    5. Ask a question in the input box on the right.

      For example: "Briefly introduce the Model Studio X1 phone."

      Example of correctly referencing an image

      image

      Sample prompt template:

      # Knowledge Base
      Please remember the following materials. They may be helpful for answering questions.
      ${documents}
      
      # Requirements
      If there are images, please display them.

      User prompt and the Model Studio application's response:

      image
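The image_url field rules listed in step 2 of the Image Q&A procedure can be checked before you upload a data table. The following Python sketch is a hypothetical validator. It assumes you already know each image's size in bytes and treats 3 MB as 3 × 1024 × 1024 bytes, which is an assumption about how the limit is measured.

```python
import re
from typing import List

MAX_IMAGE_BYTES = 3 * 1024 * 1024  # 3 MB limit from the text; binary megabytes assumed


def check_image_field(value: str, size_bytes: int) -> List[str]:
    """Return a list of rule violations for one image_url field value."""
    problems: List[str] = []
    if len(re.findall(r"https?://", value)) > 1:
        problems.append("multiple URLs in one image_url field")
    elif not re.match(r"https?://", value):
        problems.append("not a full URL (relative paths are not supported)")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("image larger than 3 MB")
    return problems


# Placeholder value: two URLs packed into a single field.
issues = check_image_field("https://example.com/a.png;https://example.com/b.png", 10)
```

A record that needs several images would instead use separate fields such as image_1 and image_2, each holding one URL.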

Permissions and security

  • Q: I received a "Missing permissions for this module" error when trying to manage a knowledge base. What should I do?

    By default, a RAM user cannot perform write operations such as creating, updating, or deleting a knowledge base. An Alibaba Cloud account must grant the RAM user page permissions for Administrator or, at a minimum, for both Application Data-Operations and Knowledge Base-Operations.

  • Q: Is a knowledge base private? Can other organizations or users access it?

    A knowledge base is private to its workspace and can be accessed and managed only by members of that workspace.

  • Q: Will Alibaba Cloud use the knowledge bases in my account to answer other users' questions?

    Alibaba Cloud is committed to data privacy and will not use your knowledge base data for model training or to answer other users' questions. See the Compliance & Privacy Statement for our data security and privacy commitments.

Migration and export

  • Q: How do I export a knowledge base to my local machine?

    One-click export is not currently supported. As an alternative, you can write a script that calls the ListChunks API to retrieve the document and chunk data in batches.
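    A batch export script could be structured like the sketch below. The actual ListChunks request and response fields are not shown here, so the paging logic takes a hypothetical `fetch_page(page_number, page_size)` callable that you would implement on top of the ListChunks API. The stop condition (an empty or short page) is also an assumption.

```python
import json
from typing import Callable, Dict, Iterator, List

FetchPage = Callable[[int, int], List[Dict]]  # hypothetical wrapper around ListChunks


def iter_chunks(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Dict]:
    """Yield chunks page by page until an empty or short page is returned."""
    page = 1
    while True:
        batch = fetch_page(page, page_size)
        if not batch:
            return
        yield from batch
        if len(batch) < page_size:
            return
        page += 1


def export_chunks(fetch_page: FetchPage, path: str, page_size: int = 100) -> int:
    """Write every chunk as one JSON line and return the number of chunks written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for chunk in iter_chunks(fetch_page, page_size):
            out.write(json.dumps(chunk, ensure_ascii=False) + "\n")
            count += 1
    return count
```

    Writing one JSON object per line keeps memory use flat and lets you resume or inspect a partial export.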