After you develop an application flow, you can deploy it as an Elastic Algorithm Service (EAS) service. EAS provides features such as auto scaling and comprehensive O&M monitoring, which help your application flexibly respond to business changes and growth, improve system stability and performance, and better meet production environment requirements.
Prerequisites
You have created and debugged an application flow. For more information, see Develop an application flow.
Deploy an application flow
Go to LangStudio and select a workspace. On the Application Flow tab, click the debugged application flow, and then click Deploy in the upper-right corner. Before you deploy, make sure that the runtime is started. The following table describes the key parameters.

| Parameter | Description |
| --- | --- |
| Resource Information | |
| Resource Type | Select public resources or a dedicated resource group that you have created. |
| Instances | Configure the number of service instances. In the production stage, configure multiple instances to reduce the risk of single points of failure. |
| Deployment Resources | If you use the application flow only for business flow scheduling, you can select appropriate CPU resources based on the complexity of the business flow. Compared with GPU resources, CPU resources are usually more cost-effective. After deployment, you are charged for the resources. For more information, see Billing of EAS. |
| VPC | The application flow is actually deployed as an EAS service. To ensure that the client can access it, select a virtual private cloud (VPC). EAS services cannot access the Internet by default. To access the Internet, configure a VPC that can access the Internet. For more information, see Configure network connectivity. Note: If an application flow includes a vector database connection (such as Milvus), make sure that the configured VPC is the one where the vector database instance resides, or that the two VPCs are connected. |
| History | |
| Enable History | This parameter applies only to chat-type application flows. When enabled, the system can store and transmit multiple rounds of chat history. This feature must be used together with the request header parameters. |
| History Storage | Local storage does not support multi-instance deployment. If you deploy services for production use, use external storage instead, such as ApsaraDB RDS. For more information, see Appendix: Chat history. Important: If you use local storage, multi-instance deployment is not supported, and scale-out from a single instance to multiple instances is not supported either. Otherwise, the chat history feature may not work properly. |
| Enable Tracing | When enabled, you can view trace records to evaluate the effect of the application flow after deployment. |
| Roles and Permissions | If the application flow uses a Faiss vector database (a Faiss or Milvus vector database was selected when the knowledge base was created) or Alibaba Cloud IQS Search (required by the IQS web search-based chatbot template), you must select an appropriate role. |
For more information about parameter configurations, see Parameters for custom deployment in the console.
Call the service
Online debugging
After the deployment succeeds, you are redirected to PAI-EAS. On the Online Debugging tab, configure and send a request. The key in the request body must be the same as the value of the Chat Input parameter in the Start node of the application flow. In this topic, the default field question is used.
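For example, with the default question field, the request body might look like the following. This is a minimal sketch: the sample query is illustrative, and the key must match the Chat Input configuration of your own flow.
```json
{
    "question": "What is LangStudio?"
}
```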

Make API calls
On the Overview tab, obtain the endpoint and token.

Send an API request.
You can call the service in simple mode or complete mode. The following table describes the differences between the two modes.
| Property | Simple Mode | Complete Mode |
| --- | --- | --- |
| Request path | <Endpoint> | <Endpoint>/run |
| Feature description | Directly returns the output results of the application flow. | Returns a complex structure, including the node status, error messages, and output messages of the application flow. |
| Scenario | You need only the final output results of the application flow and do not care about the internal processing or status of the flow. Suitable for simple queries or operations to quickly obtain results. | You need to understand the execution process of the application flow in detail, including the status of each node and possible error messages. Suitable for debugging, monitoring, or analyzing the execution of the application flow. |
| Advantages | Simple to use, no need to parse complex structures. | Provides comprehensive information to help you understand the execution process of the application flow in depth. Helps troubleshoot and optimize the performance of the application flow. |
Simple mode
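The following cURL command is a minimal sketch of a simple-mode call. It assumes that the flow's input field is the default question, and that <Endpoint> and <Token> are the endpoint and token obtained on the Overview tab; the response contains only the flow's final output.
```shell
# Simple-mode call (sketch): POST the inputs directly to the endpoint.
# Replace <Endpoint> and <Token> with the values from the EAS Overview tab.
curl --location '<Endpoint>' \
--header 'Authorization: <Token>' \
--header 'Content-Type: application/json' \
--data '{
    "question": "What is LangStudio?"
}'
```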
Complete mode
LangStudio supports Server-Sent Events (SSE), which can output the status, error messages, and output messages of each node while the application flow is executed. You can also customize the content of the node_run_infos field in the events. The following example uses online debugging. Append /run to the call address and then edit the request body.
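For example, the following cURL sketch sends a complete-mode streaming request. It assumes the input field is named question; the -N flag disables output buffering so that SSE events are printed as they arrive.
```shell
# Complete-mode call (sketch): append /run to the endpoint and stream SSE events.
curl -N --location '<Endpoint>/run' \
--header 'Authorization: <Token>' \
--header 'Content-Type: application/json' \
--data '{
    "inputs": {"question": "What is LangStudio?"},
    "stream": true
}'
```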
The following table describes the request body parameters.
| Field Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| inputs | Mapping[str, Any] | None | The input data dictionary. Keys should match the input field names defined in the application flow. If the flow has no inputs, this field is ignored. |
| stream | bool | True | Controls the response format. Valid values:<br>True: responds with SSE streaming. The Content-Type in the response header is text/event-stream, and the data is returned in DataOnly format, divided into the RunStarted, NodeUpdated, RunOutput, and RunTerminated events. For more information, see the tables below.<br>False: responds with a single JSON body. The Content-Type in the response header is application/json. You can refer to the response information in Online debugging. |
| response_config | Dict[str, Any] | - | Controls the detailed node information included in the streaming response (when stream=True). |
| ∟ include_node_description | bool | False | (Within response_config) Whether to include node descriptions in the SSE event stream. |
| ∟ include_node_display_name | bool | False | (Within response_config) Whether to include node display names in the SSE event stream. |
| ∟ include_node_output | bool | False | (Within response_config) Whether to include node outputs in the SSE event stream. |
| ∟ exclude_nodes | List[str] | [] | (Within response_config) List of node names to exclude from the SSE event stream. |
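For example, the following request body is a sketch that streams the run and enriches the SSE events with node display names and outputs. The question input field and the skipped_node node name are placeholders; replace them with the fields and node names of your own flow.
```json
{
    "inputs": {"question": "What is LangStudio?"},
    "stream": true,
    "response_config": {
        "include_node_display_name": true,
        "include_node_output": true,
        "exclude_nodes": ["skipped_node"]
    }
}
```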
The returned data is divided into the following events: RunStarted, NodeUpdated, RunOutput, and RunTerminated.
OpenAI-compatible calling method
Deployed chat-type application flows support OpenAI-compatible calls and can be used by clients that support the OpenAI API.
OpenAI API-based method
This example demonstrates streaming calls using cURL commands. Here are the request and response examples:
Sample request:
```shell
curl --location '<Endpoint>/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "default",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Who are you?"
        }
    ],
    "stream": true
}'
```
The following table describes the request parameters.
| Parameter | Description |
| --- | --- |
| --location '<Endpoint>/v1/chat/completions' | The destination URL of the request. Replace <Endpoint> with the endpoint of the deployed service. For more information, see Obtain the endpoint and token. |
| --header "Authorization: Bearer $DASHSCOPE_API_KEY" | The HTTP request header. Replace $DASHSCOPE_API_KEY with the token of the deployed service. |
| "model": "default" | The model name, which is fixed as default. |
| "stream": true | Specifies whether the returned information is streaming. Note: Streaming is supported only when an LLM node is used as the output node of the application flow (an LLM node is the direct input to the end node). |
Sample response:
```
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"finish_reason":null,"delta":{"content":"I am"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":"a large"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":"language model"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":"created by Alibaba Cloud"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":". I am called Qwen."},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: {"choices":[{"delta":{"content":""},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715931028,"system_fingerprint":null,"model":"qwen-plus","id":"chatcmpl-3bb05cf5cd819fbca5f0b8d67a025022"}
data: [DONE]
```
Integration with other clients
This example demonstrates integration with Chatbox v1.13.4 on the Windows platform.
Download and install Chatbox.
Open Chatbox and configure the model provider name, such as LangStudio, as follows.

Select the configured model provider and configure the service request parameters.

The following table describes the key parameters.
| Parameter | Description |
| --- | --- |
| API Mode | Fixed as OpenAI API Compatible. |
| API Key | Set to the token of the deployed service. For more information, see Obtain the endpoint and token on the Overview tab. |
| API Host | Set to the endpoint of the deployed service (see Obtain the endpoint and token on the Overview tab) and append the /v1 suffix. This example uses an Internet endpoint. Therefore, the API host is http://langstudio-20250319153409-xdcp.115770327099****.cn-hangzhou.pai-eas.aliyuncs.com/v1. |
| API Path | Fixed as /chat/completions. |
| Model | Click New and enter a custom Model ID, such as qwen3-8b. |
Call the deployed service in the chat dialog box.

View trace records
After you call a service, the system automatically generates a trace record. On the Tracing Analysis tab, find the trace record that you want to manage and click View Trace in the Actions column.

The trace data allows you to view the input and output of each node in the application flow, such as the recall results of the vector database or the input and output of the LLM node.
Appendix: Chat history
For chat-type application flows, LangStudio can store the history of multi-round conversations. You can use local storage or external storage to save the chat history.
Storage types
- Local storage: The service uses the local disk to automatically create an SQLite database named chat_history.db on the EAS instance where the application flow is deployed to save the chat history. The default storage path is /langstudio/flow/. Note that local storage does not support multi-instance deployment. Regularly check the usage of the local disk. You can also view or delete the chat history by using the API described below. If an EAS instance is removed, the related chat history is also cleared.
- External storage: Supports ApsaraDB RDS for MySQL. To use external storage, you must configure an RDS MySQL connection for storing the chat history when you deploy a service. For more information, see Service connection configuration - Database. The service automatically creates tables suffixed with the service name in the RDS MySQL database that you configure. For example, the service creates the langstudio_chat_session_<Service name> table to store chat sessions and the langstudio_chat_history_<Service name> table to store the chat history.
Session or user support
Each chat request to an application flow is stateless. If you want multiple requests to be treated as the same conversation, you must manually set the following request headers. For information about how to make calls, see Make API calls.
| Request header | Data type | Description | Note |
| --- | --- | --- | --- |
| Chat-Session-Id | String | The session ID. For each service request, the system automatically assigns a unique identifier to the session to distinguish between different sessions, and returns it in the response header. | Custom session IDs are supported. To ensure uniqueness, a session ID must be 32 to 255 characters in length and can contain letters, digits, underscores (_), hyphens (-), and colons (:). |
| Chat-User-Id | String | The user ID, which identifies the user to whom the chat belongs. The system does not automatically assign a user ID. Custom user IDs are supported. | - |
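The following cURL sketch shows how the headers might be set so that multiple requests share one conversation. The session and user IDs are illustrative custom values (note the 32-character minimum for session IDs), and the question input field is assumed.
```shell
# Reuse the same Chat-Session-Id across requests so they share chat history (sketch).
curl --location '<Endpoint>' \
--header 'Authorization: <Token>' \
--header 'Content-Type: application/json' \
--header 'Chat-Session-Id: demo-session-0000000000000000000001' \
--header 'Chat-User-Id: user_001' \
--data '{
    "question": "What did I ask you in my previous question?"
}'
```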
Chat history API
The application flow service also provides chat history management API operations that allow you to view and delete chat history data. You can obtain the complete API schema by sending a GET request to {Endpoint}/openapi.json. The schema is built based on the Swagger standard. To explore these API operations in a more intuitive way, we recommend that you load the schema into Swagger UI for visualization.
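For example, the following cURL sketch retrieves the schema; passing the service token in the Authorization header is assumed.
```shell
# Fetch the OpenAPI (Swagger) schema of the chat history API (sketch).
curl --location '<Endpoint>/openapi.json' \
--header 'Authorization: <Token>'
```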