Vector Retrieval Service for Milvus: Build a RAG system with Milvus and Dify

Last Updated: Jun 04, 2026

This topic describes how to build a retrieval-augmented generation (RAG) system with Vector Retrieval Service for Milvus (referred to as Milvus in this topic) and the Dify platform.

Background

RAG principles

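Large language models can "hallucinate", that is, generate incorrect information, when their internal knowledge is limited. Retrieval-augmented generation (RAG) addresses this by retrieving relevant passages from an external knowledge base and supplying them to the model as context. Because retrieval relies on similarity search over embeddings, an efficient RAG system depends on a vector database that can run this search quickly at scale.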

This topic shows how to integrate Milvus and Dify to build an enterprise-grade RAG application that demonstrates the value of a vector database in solving the "last mile" problem in AI.

Dify

Dify is an open-source AI application development platform with low-code workflows. It simplifies the process of building AI applications by integrating backend-as-a-service (BaaS) and LLMOps.

Dify provides backend infrastructure (API services, data management) so developers do not have to build from scratch. Its visual prompt orchestration interface simplifies prompt engineering. The built-in RAG engine connects to private knowledge bases such as enterprise documents and databases, enabling the LLM to generate domain-specific answers that are accurate and traceable, with reduced hallucinations.

Prerequisites

Procedure

Step 1: Install Dify

  1. Clone the open-source Dify project from GitHub to your local machine.

    git clone https://github.com/langgenius/dify.git
  2. Navigate to the deployment directory and create the .env configuration file by copying the provided .env.example file.

    cd dify/docker/
    cp .env.example .env
  3. Modify the following settings in the .env file.

    # Vector storage engine configuration
    VECTOR_STORE=milvus  # Specifies Milvus as the vector storage engine
    # Milvus connection information
    MILVUS_URI=http://YOUR_ALIYUN_MILVUS_ENDPOINT:19530
    MILVUS_USER=YOUR_ALIYUN_MILVUS_USER
    MILVUS_PASSWORD=YOUR_ALIYUN_MILVUS_PASSWORD

    Replace the placeholder values with your actual information.

    Parameters:

    • MILVUS_URI: The endpoint of the Milvus instance, in the format http://<public IP address>:<port>. Both the public IP address and the port are available on the Details page of your Milvus instance. The default port is 19530.

    • MILVUS_USER: The username you set when creating the Milvus instance.

    • MILVUS_PASSWORD: The password for the user you set when creating the Milvus instance.

  4. Start Dify.

    docker compose up -d --build

    Output similar to the following shows that the networks were created and the containers started:

    [root@xxx /docker]# docker compose up -d --build
    [+] Running 15/15
    ✔ Network docker_default              Created
    ✔ Network docker_milvus               Created
    ✔ Network docker_ssrf_proxy_network    Created
    ✔ Container docker-db-1               Healthy
    ✔ Container docker-redis-1            Started
    ✔ Container docker-sandbox-1          Started
    ✔ Container milvus-etcd               Started
    ✔ Container milvus-minio              Started
    ✔ Container docker-ssrf_proxy-1       Started
    ✔ Container docker-web-1              Started
    ✔ Container docker-plugin_daemon-1    Started
    ✔ Container docker-worker-1           Started
    ✔ Container docker-api-1              Started
    ✔ Container milvus-standalone         Started
    ✔ Container docker-nginx-1            Started
  5. Open http://127.0.0.1/ in a browser to access Dify. Set the administrator account and password, and then log in.

    Note

    If Dify runs on a remote server (ECS instance or virtual machine), replace 127.0.0.1 with the server's public IP address or domain name. Ensure the server is publicly accessible.

    Enter your Email, Username, and Password (at least 8 characters, with both letters and numbers), and click Set up.
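
Before running docker compose up in step 4, the .env edits from step 3 can be sanity-checked from the shell. The following is a minimal sketch, not part of Dify: env_value and check_milvus_env are hypothetical helper names, and the parsing assumes simple unquoted KEY=value lines like the ones shown above.

```shell
#!/bin/sh
# Sketch: sanity-check the Milvus settings in Dify's .env file.
# env_value and check_milvus_env are hypothetical helpers, not Dify commands.

# Print the value of a key from an env file. The last assignment wins, and an
# inline comment introduced by whitespace followed by "#" is stripped.
env_value() {
    # $1 = env file, $2 = key
    sed -n "s/^$2=//p" "$1" | tail -n 1 | sed 's/[[:space:]][[:space:]]*#.*$//'
}

# Return 0 if the Milvus settings look filled in, 1 otherwise.
check_milvus_env() {
    file=$1
    status=0
    if [ "$(env_value "$file" VECTOR_STORE)" != "milvus" ]; then
        echo "VECTOR_STORE is not set to milvus"
        status=1
    fi
    for key in MILVUS_URI MILVUS_USER MILVUS_PASSWORD; do
        value=$(env_value "$file" "$key")
        case $value in
            ''|*YOUR_ALIYUN_*)
                echo "$key is empty or still a placeholder"
                status=1
                ;;
        esac
    done
    return "$status"
}
```

For example, running check_milvus_env .env in the dify/docker/ directory prints one line per problem it finds and returns a non-zero status if any setting still needs attention.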

Step 2: Configure models

  1. Click your profile picture in the upper-right corner and select Settings.

  2. In the left-side navigation pane, select Model Provider. Find Qwen and click Install.

  3. After the model is installed, select it and enter the API key from Alibaba Cloud Model Studio.

  4. In the System Model Settings panel, configure the System Inference Model, Embedding Model, Rerank Model, speech-to-text model, and text-to-speech model, and then click Save.

Step 3: Create a knowledge base

  1. At the top of the page, click Knowledge, and then click Create Knowledge.

  2. For Data source, select Import Existing Text. Download the sample data (README.md) and upload it.

  3. Modify the parameters as needed and click Save & Process.

    Use the default values for key parameters: indexing method High quality, retrieval setting vector retrieval, Rerank Model gte-rerank enabled, Top K 3, Score Threshold 0.5.

    In this example, modify the following parameters:

    • Maximum chunk length: Set to 1024.

    • Embedding Model: Select text-embedding-v1.

    After processing completes, the knowledge base is created.

    The summary confirms the settings: chunking mode Custom, text preprocessing replaces consecutive spaces/newlines/tabs, indexing method High quality, retrieval setting vector retrieval. Click Go to Documentation to view details.
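
To make the Top K and Score Threshold settings above concrete, the following sketch mimics their effect on a list of scored candidate chunks. It is an illustration only, not Dify's implementation: filter_chunks is a hypothetical helper, and the scores and chunk names in the usage note are made up.

```shell
#!/bin/sh
# Illustration only: drop candidates below the score threshold, then keep the
# K highest-scoring ones. Input on stdin is "score<TAB>chunk", one per line.
filter_chunks() {
    # $1 = score threshold (for example 0.5), $2 = top K (for example 3)
    awk -F '\t' -v min="$1" '$1 + 0 >= min + 0' |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1nr |
        head -n "$2"
}
```

With Score Threshold 0.5 and Top K 3, five candidates scored 0.91, 0.42, 0.77, 0.63, and 0.55 would be reduced to the three scored 0.91, 0.77, and 0.63: the 0.42 candidate falls below the threshold, and the 0.55 candidate is cut by the Top K limit.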

Step 4: Verify vector retrieval

Log on to the Vector Retrieval Service for Milvus console. Select your Milvus instance and click Attu Manager in the upper-right corner. On the Attu page, verify that the corresponding collection is created.
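
As an alternative to the Attu page, the collection list can also be checked from the command line. The following sketch rests on two assumptions that are not confirmed by this topic: that your instance exposes the Milvus RESTful API (v2) on the same endpoint as MILVUS_URI, and that Dify names its collections with the prefix Vector_index_. The function names list_collections and dify_collections are hypothetical.

```shell
#!/bin/sh
# Sketch, assuming the Milvus RESTful API (v2) is reachable on MILVUS_URI.

# Request the list of collections and print the raw JSON response.
list_collections() {
    # $1 = MILVUS_URI, $2 = MILVUS_USER, $3 = MILVUS_PASSWORD
    curl -sS -X POST "$1/v2/vectordb/collections/list" \
        -H "Authorization: Bearer $2:$3" \
        -H "Content-Type: application/json" \
        -d '{}'
}

# Read that JSON on stdin and print the names that carry the assumed
# Dify prefix, one per line.
dify_collections() {
    grep -o '"Vector_index_[^"]*"' | tr -d '"'
}
```

For example, list_collections "$MILVUS_URI" "$MILVUS_USER" "$MILVUS_PASSWORD" | dify_collections prints the matching collection names. If nothing is printed, check the raw response before concluding that the collection is missing, because either assumption above may not hold for your instance.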

Step 5: Verify the RAG performance

  1. Click Studio at the top of the page, then select Create from Template.

  2. Search for and select the Knowledge Retrieval + Chatbot template.

  3. In the dialog box, click Create.

  4. Select the Knowledge Retrieval node and set the knowledge base to the one you created in Step 3.

    The workflow connects nodes in this order: START → KNOWLEDGE RETRIEVAL → LLM (qwen-max) → ANSWER. The query variable is sys.query.

  5. Select the LLM node and set the model to qwen-max.

  6. In the upper-right corner, click Publish, and then click Publish Update.

  7. Click Run to open the test page. Enter a question related to the knowledge base content to verify the answer.
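
After the app is published, the same check can be scripted instead of typed into the test page. The following is a minimal sketch, assuming the app is reachable through Dify's chat API at /v1/chat-messages on your Dify host and that you have generated an API key for the app in the Dify console. The helper names and the user identifier rag-smoke-test are hypothetical, and the escaping handles single-line questions only.

```shell
#!/bin/sh
# Sketch, assuming the published app is exposed through Dify's chat API.

# Escape backslashes and double quotes so text can sit inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the request body for one blocking chat turn.
chat_payload() {
    # $1 = question, $2 = arbitrary end-user identifier
    printf '{"inputs":{},"query":"%s","response_mode":"blocking","user":"%s"}' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the question to the app and print the raw JSON response.
ask_dify() {
    # $1 = Dify base URL (for example http://127.0.0.1), $2 = app API key, $3 = question
    curl -sS -X POST "$1/v1/chat-messages" \
        -H "Authorization: Bearer $2" \
        -H "Content-Type: application/json" \
        -d "$(chat_payload "$3" "rag-smoke-test")"
}
```

For example, ask_dify http://127.0.0.1 "$DIFY_APP_KEY" "What is Milvus?" sends one question, where DIFY_APP_KEY is a placeholder for your own key. An answer grounded in the uploaded README.md suggests that retrieval from the knowledge base is working.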