Platform For AI: Use LangStudio to create a DeepSeek- and RAG-based Q&A application flow for finance and healthcare

Last Updated: May 09, 2025

Large language models (LLMs) may lack enterprise-specific or real-time data. Retrieval-Augmented Generation (RAG) technology enhances the accuracy and relevance of model responses by providing LLMs with access to private knowledge bases. This topic describes how to develop and deploy a RAG-based application in LangStudio.

Background information

In the realm of modern information retrieval, RAG models combine the advantages of information retrieval and generative artificial intelligence to deliver more accurate and relevant answers in specific scenarios. For example, in specialized fields such as finance and healthcare, users often require accurate and pertinent information for decision-making. Traditional generative models excel in natural language understanding and generation but may lack accuracy in specialized knowledge. RAG models effectively improve the accuracy and contextual relevance of answers by integrating retrieval and generation technologies. This topic provides a RAG-based application for the finance and healthcare fields by using Platform for AI (PAI) as the core platform.

Prerequisites

  • LangStudio supports Faiss or Milvus as its vector database. If you want to use Milvus, you must first create a Milvus database.

    Note

    In most cases, Faiss is used in test environments without the need to create an additional database. In production environments, we recommend that you use Milvus, which can process larger volumes of data.

  • The data required for the RAG knowledge base has been uploaded to OSS. (If you still need to upload the data, see the upload sketch after this list.)
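
If the corpus has not been uploaded yet, you can upload it in the OSS console, with ossutil, or with the Python SDK. The following is a minimal sketch based on the oss2 SDK; the AccessKey pair, endpoint, bucket name rag-kb-bucket, and object prefix rag-corpus/ are placeholders that you must replace with your own values.

    # Upload a local corpus directory to OSS with the oss2 SDK.
    # All names below are placeholders: replace the AccessKey pair, endpoint,
    # bucket name, and prefix with your own values.
    import os
    import oss2

    auth = oss2.Auth("<yourAccessKeyId>", "<yourAccessKeySecret>")
    bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "rag-kb-bucket")

    local_dir = "./corpus"  # local folder that holds the knowledge base files
    for name in os.listdir(local_dir):
        local_path = os.path.join(local_dir, name)
        if os.path.isfile(local_path):
            # Objects under the rag-corpus/ prefix later serve as the Data Source OSS Path.
            bucket.put_object_from_file(f"rag-corpus/{name}", local_path)
            print(f"uploaded {name}")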

1. (Optional) Deploy LLM and embedding model

The RAG-based application flow requires both an LLM and an embedding model. This section describes how to quickly deploy the required model services through Model Gallery. If you already have model services that meet your business requirements and support OpenAI APIs, you can skip this step and use them directly.

Choose QuickStart > Model Gallery and deploy the models for the following two scenarios.

Important

Make sure that you select an instruction-tuned LLM. Base models cannot reliably follow user instructions to answer questions.

  • Select large-language-model in the Scenarios section and deploy the DeepSeek-R1-Distill-Qwen-7B model.

  • Select embedding in the Scenarios section and deploy the bge-m3 embedding model.
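
After the two services are deployed, you can obtain their endpoints and tokens on the EAS service details pages and run a quick smoke test. The following is a minimal sketch that assumes the LLM service exposes an OpenAI-compatible chat completions API; <LLM_EAS_ENDPOINT> and <LLM_EAS_TOKEN> are placeholders.

    # Smoke test for the deployed LLM service, assuming an OpenAI-compatible API.
    # <LLM_EAS_ENDPOINT> and <LLM_EAS_TOKEN> are placeholders from the EAS service details page.
    from openai import OpenAI

    client = OpenAI(base_url="<LLM_EAS_ENDPOINT>/v1", api_key="<LLM_EAS_TOKEN>")

    response = client.chat.completions.create(
        model="DeepSeek-R1-Distill-Qwen-7B",
        messages=[{"role": "user", "content": "What is Retrieval-Augmented Generation?"}],
        max_tokens=256,
    )
    print(response.choices[0].message.content)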

2. Create a connection

The LLM and embedding connections created in this topic are based on the Elastic Algorithm Service (EAS) model services deployed in QuickStart > Model Gallery. For information about other connection types, see Create a connection.

2.1 Create an LLM connection

Go to LangStudio, select a workspace, and then choose Connection > Model Service. On the tab that appears, click New Connection to create a general LLM model service connection.

The key parameters are described as follows.

  • Name: If you deploy a model through Model Gallery, you can obtain the model name on the model details page. To go to the model details page, click the related model card on the Model Gallery page. For more information, see the "Model service" section of Create a connection.

  • Service Provider:

    • PAI-EAS Model Service: In this example, PAI-EAS model services are used. For EAS Service, select the LLM service deployed in 1. (Optional) Deploy LLM and embedding model. After you select the service, base_url and api_key are automatically filled in: base_url is set to the VPC endpoint of the deployed LLM, and api_key is set to the service token.

    • Third-party Model Service: You can also use a third-party model service. For example, if you use a DeepSeek model service, set base_url to https://api.deepseek.com and enter the api_key obtained from the DeepSeek official website.

2.2 Create an embedding connection

You can create an embedding connection by referring to 2.1 Create an LLM connection.
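
If the embedding service also exposes an OpenAI-compatible embeddings endpoint (an assumption that depends on how the service was deployed), you can verify it in the same way. The endpoint and token placeholders come from the EAS service details page of the embedding service.

    # Quick check of the embedding service, assuming an OpenAI-compatible /v1/embeddings endpoint.
    from openai import OpenAI

    client = OpenAI(base_url="<EMBEDDING_EAS_ENDPOINT>/v1", api_key="<EMBEDDING_EAS_TOKEN>")

    result = client.embeddings.create(model="bge-m3", input=["What is RAG?"])
    print(len(result.data[0].embedding))  # dimensionality of the returned vector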

2.3 Create a vector database connection

On the Application Development (LangStudio) page, choose Connection > Database. On the tab that appears, click New Connection to create a Milvus database connection.

The key parameters are described as follows.

  • uri: The endpoint of the Milvus instance, in the format of http://<Milvus internal endpoint>. Example: http://c-b1c5222fba****-internal.milvus.aliyuncs.com.

  • token: The username and password used to log on to the Milvus instance, in the format of <yourUsername>:<yourPassword>.

  • database: The name of the database. In this example, the default database named default is used.
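
Before you create the connection, you can optionally verify the uri, token, and database values from a machine in the same VPC as the Milvus instance. The following is a minimal sketch with the pymilvus SDK; the values shown are the same placeholders as in the list above.

    # Connectivity check for the Milvus instance, run from a machine in the same VPC.
    from pymilvus import MilvusClient

    client = MilvusClient(
        uri="http://c-b1c5222fba****-internal.milvus.aliyuncs.com",  # Milvus internal endpoint
        token="<yourUsername>:<yourPassword>",
        db_name="default",
    )
    print(client.list_collections())  # should succeed if the connection details are correct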

3. Create a knowledge base index

You must create a knowledge base index to parse, chunk, and vectorize the corpus and to store the resulting vectors in the vector database. The key parameters are described as follows. For information about other configurations, see Create a knowledge base index.

Basic Configurations

  • Data Source OSS Path: Set the value to the OSS path of the RAG knowledge base described in Prerequisites.

  • Output OSS Path: Set the value to the path used to store the intermediate results and index information generated during document parsing.

    Important

    If you use Faiss as the vector database, the application flow saves the generated index files to OSS. If you use a default role of PAI (that is, you set Instance RAM Role to Default Roles of PAI on the Start Runtime page), the application flow can access the default storage bucket of your workspace. In this case, we recommend that you set this parameter to a directory in the OSS bucket that serves as the storage path of the workspace. If you use a custom role, you must grant the role OSS access permissions. We recommend that you attach the AliyunOSSFullAccess policy to the role.

Embedding Model and Databases

  • Embedding Type: Select General Embedding Model.

  • Embedding Connection: Select the embedding connection created in 2.2 Create an embedding connection.

  • Vector Database Type: Select Vector Database Milvus.

  • Vector Database Connection: Select the Milvus database connection created in 2.3 Create a vector database connection.

  • Table Name: Set the value to the name of the collection in the Milvus database created in Prerequisites.

VPC Configuration

  • VPC: Select the same VPC as that of the Milvus instance, or a VPC that is connected to the VPC in which the Milvus instance resides.
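
When the index job runs, LangStudio parses the documents, splits them into chunks, vectorizes the chunks with the embedding model, and writes the vectors to the vector database. For readers who want to see what this pipeline does conceptually, the following is a simplified sketch, not the LangStudio implementation. It assumes an OpenAI-compatible embedding service, the pymilvus SDK, a hypothetical local file sample_report.txt, and a collection named rag_collection that already exists with a matching schema.

    # Illustrative sketch of an indexing pipeline: read, chunk, embed, and store in Milvus.
    # Not the LangStudio implementation; endpoints, tokens, the file name, and the
    # collection name rag_collection are placeholders.
    from openai import OpenAI
    from pymilvus import MilvusClient

    embed_client = OpenAI(base_url="<EMBEDDING_EAS_ENDPOINT>/v1", api_key="<EMBEDDING_EAS_TOKEN>")
    milvus = MilvusClient(uri="<MILVUS_URI>", token="<yourUsername>:<yourPassword>", db_name="default")

    def chunk(text, size=500, overlap=50):
        # Naive fixed-size chunking with a small overlap between neighboring chunks.
        step = size - overlap
        return [text[i:i + size] for i in range(0, len(text), step)]

    with open("sample_report.txt", encoding="utf-8") as f:
        chunks = chunk(f.read())

    vectors = embed_client.embeddings.create(model="bge-m3", input=chunks)
    rows = [
        {"id": i, "text": chunks[i], "vector": vectors.data[i].embedding}
        for i in range(len(chunks))
    ]
    # The collection must already exist with fields that match these keys.
    milvus.insert(collection_name="rag_collection", data=rows)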

4. Create and run a RAG-based application flow

  1. Go to LangStudio, select a workspace, and then click the Application Flow tab. On the tab that appears, click Create Application Flow to create a RAG-based application flow.

  2. On the application flow details page, click Create Runtime to create a runtime and start it. Note: Make sure that the runtime is started before the system parses Python nodes or before you use More Tools.

    Key parameter:

    VPC: Select the same VPC as that of the Milvus instance in Prerequisites or a VPC that is connected to the VPC where the Milvus instance resides.

  3. Develop the application flow.

    Retain the default settings for the nodes or configure them based on your business requirements. Use the following settings for the key nodes:

    • Knowledge Retrieval: retrieves text relevant to user questions from the knowledge base.

    • LLM: uses the retrieved documents as context, sends the documents together with user questions to the LLM, and then generates an answer.

      • Model Configuration: Select the connection created in 2.1 Create an LLM connection.

      • Chat History: If you turn on this switch, the chat history feature is enabled, and previous conversations are used as input variables.

    For more information about each component, see Develop an application flow. A simplified sketch of the retrieve-then-generate pattern that these nodes implement is provided after this procedure.

  4. Debug or run the application flow. Click Run in the upper-right corner of the details page to run the application flow. For common issues during application flow runtime, see FAQ.

  5. View the traces. Click View Traces below the generated answer to view the trace details or topology view.

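The Knowledge Retrieval and LLM nodes together implement the standard retrieve-then-generate RAG pattern. The following is a simplified, illustrative sketch of that pattern, not the LangStudio implementation. It assumes the OpenAI-compatible LLM and embedding services from step 1 and the pymilvus SDK; the endpoints, tokens, collection name rag_collection, and field name text are placeholders.

    # Illustrative sketch of the retrieve-then-generate pattern behind the application flow.
    # Not the LangStudio implementation; all endpoints, tokens, and names are placeholders.
    from openai import OpenAI
    from pymilvus import MilvusClient

    embed_client = OpenAI(base_url="<EMBEDDING_EAS_ENDPOINT>/v1", api_key="<EMBEDDING_EAS_TOKEN>")
    llm_client = OpenAI(base_url="<LLM_EAS_ENDPOINT>/v1", api_key="<LLM_EAS_TOKEN>")
    milvus = MilvusClient(uri="<MILVUS_URI>", token="<yourUsername>:<yourPassword>", db_name="default")

    question = "What disclosures does the annual report require for credit risk?"

    # Knowledge Retrieval: embed the question and search the vector database for similar chunks.
    query_vector = embed_client.embeddings.create(model="bge-m3", input=[question]).data[0].embedding
    hits = milvus.search(
        collection_name="rag_collection",
        data=[query_vector],
        limit=3,
        output_fields=["text"],
    )
    context = "\n\n".join(hit["entity"]["text"] for hit in hits[0])

    # LLM: send the retrieved context together with the user question to the model.
    prompt = f"Answer the question based on the following context.\n\nContext:\n{context}\n\nQuestion: {question}"
    answer = llm_client.chat.completions.create(
        model="DeepSeek-R1-Distill-Qwen-7B",
        messages=[{"role": "user", "content": prompt}],
    )
    print(answer.choices[0].message.content)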

5. Deploy the application flow

On the development page of the application flow, click Deploy in the upper-right corner to deploy the application flow as an EAS service. Retain the default settings for the parameters or configure them based on your business requirements. Use the following settings for the key parameters:

  • Instances in the Resource Information section: Enter the number of service instances. Set this parameter to 1 for testing purposes. In production environments, we recommend that you configure multiple instances to mitigate the risk of a single point of failure (SPOF).

  • VPC in the VPC section: Select the same VPC as that of the Milvus instance, or a VPC that is connected to the VPC in which the Milvus instance resides.

For more information about deployment, see Deploy an application flow.

6. Call the service

After the deployment is successful, you are redirected to the Elastic Algorithm Service (EAS) page of PAI. On the Online Debugging tab, configure and send a request. The key in the request body must be consistent with the Chat Input field in the Start Node of the application flow. In this example, the default field question is used.
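
You can also send the same request from code. The following is a minimal sketch with the requests library; <SERVICE_ENDPOINT> and <SERVICE_TOKEN> are placeholders copied from the EAS service details page, and the exact request format may differ depending on how the flow is deployed.

    # Minimal sketch of calling the deployed application flow service over HTTP.
    # <SERVICE_ENDPOINT> and <SERVICE_TOKEN> are placeholders from the EAS service details page.
    import requests

    response = requests.post(
        "<SERVICE_ENDPOINT>",
        headers={"Authorization": "<SERVICE_TOKEN>", "Content-Type": "application/json"},
        json={"question": "What are the capital adequacy requirements for commercial banks?"},
    )
    print(response.json())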

For more methods to call the service (such as API operations) and detailed instructions, see Call a service.
