
Vector Retrieval Service for Milvus: Build a RAG system with Vector Retrieval Service for Milvus and Dify

Last Updated: Dec 05, 2025

This topic describes how to build a retrieval-augmented generation (RAG) system using Vector Retrieval Service for Milvus (Milvus) and the Dify platform.

Background

RAG

Large language models (LLMs) often hallucinate because their built-in knowledge is limited and static. Retrieval-augmented generation (RAG) mitigates this by grounding model responses in external knowledge bases. An efficient RAG system, in turn, requires a powerful vector database.

This topic focuses on Milvus and Dify, a low-code artificial intelligence (AI) platform, and shows how to combine them to quickly build an enterprise-grade RAG application. The result demonstrates the core value of a vector database: bridging the "last mile" between a general-purpose LLM and your domain-specific data.

Dify

Dify is an open source platform for developing AI applications. It features low-code pipelines and a user-friendly interface. Its core mission is to simplify and accelerate the process of building AI applications by integrating backend-as-a-service (BaaS) and LLM Operations (LLMOps) concepts.

Dify is a full-stack solution. On the backend, it provides reliable infrastructure, including API services and data management. This lets developers avoid building from scratch. For LLMOps, Dify offers an intuitive visual interface for prompt engineering, which simplifies complex prompt workflows. The built-in, high-quality RAG engine connects to private knowledge bases, such as company documents and databases. This connection allows the large language model to answer questions using domain-specific knowledge, which reduces hallucinations and ensures that answers are accurate and traceable.

Prerequisites

Procedure

Step 1: Install Dify

  1. Clone the open-source Dify project from GitHub to your local machine.

    git clone https://github.com/langgenius/dify.git
  2. Go to the Docker deployment directory and create the .env configuration file from the provided template.

    cd dify/docker/
    cp .env.example .env
  3. Modify the following configuration information in the .env file.

    # Vector storage engine configuration
    VECTOR_STORE=milvus  # Specifies Milvus as the vector storage engine
    
    # Milvus connection information
    MILVUS_URI=http://YOUR_ALIYUN_MILVUS_ENDPOINT:19530
    MILVUS_USER=YOUR_ALIYUN_MILVUS_USER
    MILVUS_PASSWORD=YOUR_ALIYUN_MILVUS_PASSWORD

    This example uses the following parameters. Replace them with your actual values.

    MILVUS_URI: The endpoint of the Milvus instance, in the format http://<Public IP address>:<Port>.

    • <Public IP address>: You can view this on the Instance Details page of your Milvus instance.

    • <Port>: You can view this on the same page. The default port is 19530.

    MILVUS_USER: The username that you specified when you created the Milvus instance.

    MILVUS_PASSWORD: The password of the user that you specified when you created the Milvus instance.

  4. Start Dify.

    docker compose up -d --build


  5. In a browser, navigate to http://127.0.0.1/ to access the Dify service. Set the administrator account and password, and then log on to the console.

    Note

    If Dify is running on a remote server, such as an Elastic Compute Service (ECS) instance or a virtual machine, replace 127.0.0.1 with the public IP address or domain name of the server. Make sure that the server is accessible over the network.

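Before (or after) starting the stack, you can sanity-check the Milvus entries in the .env file with a short script. The sketch below is illustrative and not part of Dify; the parser is deliberately simplified and ignores quoting edge cases.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment after the value, if any.
        settings[key.strip()] = value.split("#", 1)[0].strip()
    return settings

def check_milvus_settings(settings: dict) -> list:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    if settings.get("VECTOR_STORE") != "milvus":
        problems.append("VECTOR_STORE must be set to 'milvus'")
    uri = settings.get("MILVUS_URI", "")
    if not uri.startswith("http://") and not uri.startswith("https://"):
        problems.append("MILVUS_URI should look like http://<host>:19530")
    for key in ("MILVUS_USER", "MILVUS_PASSWORD"):
        if not settings.get(key):
            problems.append(f"{key} is empty")
    return problems

# Example values only; substitute your own endpoint and credentials.
sample = """
VECTOR_STORE=milvus  # Specifies Milvus as the vector storage engine
MILVUS_URI=http://192.0.2.10:19530
MILVUS_USER=root
MILVUS_PASSWORD=example
"""
print(check_milvus_settings(parse_env(sample)))  # []
```

An empty problem list means the values are at least well-formed; it does not verify that the instance is reachable.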

Step 2: Set up the model

  1. After you log on, click your profile picture in the upper-right corner and select Settings from the drop-down menu.


  2. In the navigation pane on the left, select Model Provider. On the right, find Qwen and install it.

  3. After the model is installed, click API-Key Setup. Enter your API key from Alibaba Cloud Model Studio.


  4. Configure the System Model Settings as needed.


Step 3: Create a knowledge base

  1. At the top of the page, click Knowledge, and then click Create Knowledge.


  2. Click Import from file. Click README.md to download the sample data, and then upload the downloaded file.


  3. Modify the parameters and click Save & Process.


    This example modifies the following parameters:

    • Maximum chunk length: Set it to 1024.

    • Embedding Model: Select text-embedding-v1.

    After the process is complete, the knowledge base is created.

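The Maximum chunk length setting caps how much text goes into each embedded chunk. The following simplified, character-based splitter illustrates the idea; Dify's actual chunker additionally handles custom delimiters, chunk overlap, and token counting, so this is a sketch only.

```python
def chunk_text(text: str, max_len: int = 1024) -> list:
    """Greedily split text at paragraph boundaries, capping chunk size."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Hard-split any paragraph that alone exceeds the limit.
        while len(para) > max_len:
            chunks.append(para[:max_len])
            para = para[max_len:]
        if len(current) + len(para) + 2 <= max_len:
            current = f"{current}\n\n{para}" if current else para
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks

# Toy document: one short paragraph plus one long paragraph.
doc = "Milvus is a vector database.\n\n" + "Dify builds RAG apps. " * 80
chunks = chunk_text(doc, max_len=256)
print(all(len(c) <= 256 for c in chunks))  # True
```

Each chunk is embedded and stored as one vector, so a smaller limit yields more, finer-grained vectors, while a larger limit keeps more context per retrieved passage.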

Step 4: Verify vector retrieval

Log on to the Vector Retrieval Service for Milvus console and select your Milvus instance. In the upper-right corner, click Attu Manager to open the Attu page. You can see that the corresponding collection data is imported. For more information, see Manage Milvus instances with Attu.
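If you prefer a programmatic check over the Attu UI, the pymilvus SDK can list the collections on the instance. This is a hypothetical sketch: the endpoint and credentials below are placeholders, and you should reuse the same values as in the .env file from Step 1.

```python
def client_kwargs(endpoint: str, user: str, password: str) -> dict:
    """Assemble MilvusClient keyword arguments from the .env values."""
    return {"uri": endpoint, "user": user, "password": password}

# Placeholder values; substitute your own endpoint and credentials.
kwargs = client_kwargs("http://192.0.2.10:19530", "root", "example")
print(sorted(kwargs))  # ['password', 'uri', 'user']

# To run the actual check (requires pymilvus and a reachable instance):
#   from pymilvus import MilvusClient
#   client = MilvusClient(**kwargs)
#   print(client.list_collections())  # the Dify-created collection should appear here
```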

Step 5: Verify the RAG performance

  1. At the top of the page, click Studio to return to the home page. Select Create from Template.


  2. In the search box, search for and use the Knowledge Retrieval + Chatbot template.


  3. In the dialog box that appears, click Create.


  4. Select the KNOWLEDGE RETRIEVAL node. Set the knowledge base to the one that you created in Step 3.


  5. Select the LLM node and set the model to qwen-max.


  6. In the upper-right corner, click Publish, and select Update.


  7. Run the workflow to open the test page. Enter a question related to the content in the knowledge base, and the chatbot returns an answer grounded in that content.
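Beyond the test page, a published app can also be queried through Dify's chat-messages REST API. The sketch below only assembles the request; the base URL and API key are placeholders, and you would generate a real key for the app in the Dify console.

```python
import json

def build_chat_request(base_url: str, api_key: str, query: str, user: str = "doc-demo"):
    """Build the (url, headers, body) triple for a blocking chat call."""
    url = f"{base_url.rstrip('/')}/v1/chat-messages"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "inputs": {},                 # workflow variables; none needed here
        "query": query,               # the question for the knowledge base
        "response_mode": "blocking",  # wait for the full answer
        "user": user,                 # any stable end-user identifier
    }
    return url, headers, json.dumps(body)

# Placeholder base URL and API key.
url, headers, payload = build_chat_request(
    "http://127.0.0.1", "app-XXXXXXXX", "What does the README say about Milvus?"
)
print(url)  # http://127.0.0.1/v1/chat-messages

# Send it with any HTTP client, for example:
#   import requests
#   resp = requests.post(url, headers=headers, data=payload)
#   print(resp.json().get("answer"))
```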