The DataWorks Agent is based on the Model Context Protocol (MCP). It connects to the DataWorks MCP Server and other big data MCP servers, such as the Hologres MCP Server, to provide capabilities like data development, Task O&M, and Data Integration in DataWorks using natural language.
This feature requires a third-party client. For a more streamlined agent experience, see DataWorks Agent.
How it works
The DataWorks Agent allows you to perform big data development tasks through a conversational interface. It uses a Large Language Model (LLM) to parse your requests and calls MCP Server capabilities to execute them.
For example, if you ask, "How many workspaces do I have?", the agent uses the LLM to parse the request and call the ListProjects tool. This tool, provided by the DataWorks MCP Server through the built-in DataWorks OpenAPI, queries and returns the result. For more complex tasks, the LLM may interact with the MCP Server multiple times.
The DataWorks Agent not only integrates with the DataWorks MCP Server but also connects to other MCP Servers. You can also choose your own LLM, such as Qwen, DeepSeek, or OpenAI.
Try the following prompts to explore more of the DataWorks Agent's capabilities.
|
Scope |
Example prompts |
|
Data Development |
[Query tasks] Find all paused |
|
[Rename tasks] Rename the nodes found above to "invalid_node_to_delete". Use sequence numbers to distinguish between multiple nodes. |
|
|
[Create tasks] Create five |
|
|
Task O&M |
[Rerun failed tasks] In this workspace, find tasks that failed on |
|
[Query failed instances] In the project directory of this workspace, find instances that failed on |
|
|
[Analyze rerun properties] Analyze the rerun properties of these tasks. If a task is rerunnable, rerun it. |
|
|
Data Integration |
[Synchronize a single MySQL table to MaxCompute] Create a batch Data Integration task in the current workspace with the following settings:
|
|
[Sales analysis] Analyze the sales trends of the top 10 best-selling products this month from the `order` table. |
Limitations
This feature is only available in workspaces with Data Studio (New Version) enabled, and can only be used in a personal development environment.
After you restart a personal development environment, you must reinstall the MCP Server feature. Proceed with caution.
Billing
Using the DataWorks Agent incurs the following charges:
-
DataWorks OpenAPI call charges
When the agent calls DataWorks OpenAPI through the MCP Server, you are billed according to the OpenAPI billing standards.
-
LLM token charges
When the agent parses user intent and generates natural language responses, it calls the LLM you have configured, such as Qwen. This process consumes input and output tokens, and you are billed by your chosen model provider. For example, if you use the
qwen-coder-plusmodel from Model Studio, charges are calculated based on the Model Studio billing description.
Quick start
After you configure the DataWorks Agent, click the
icon in the upper-right corner of the Cline page to open the chat interface and try it out by querying for members of the current workspace.
Enter this prompt: Query members of the current workspace.
The agent executes the task in the following steps:
-
Request parsing and confirmation: The agent parses your intent, such as "query members of the current workspace", automatically identifies that it needs to call the
ListProjectMembersAPI, and prompts you to confirm required parameters such as the target workspace (ProjectId). -
API call and response: After you approve the action, the agent calls the
ListProjectMembersOpenAPI, retrieves the member list for the workspace, and returns it in a structured format that includes roles and account types. For more information about theListProjectMembersOpenAPI, see ListProjectMembers - Query the member list of a workspace.Note-
During execution, the system prompts you to confirm relevant operations and obtain necessary information. You can click Approve to proceed or Reject to deny the operation.
-
The specific steps may vary slightly depending on the task's complexity, the chosen LLM, and the model version. The execution flow is determined by the agent's real-time parsing and interaction.
-
Configure DataWorks Agent
The DataWorks Agent uses an MCP client plug-in, such as Cline, to build the front-end chat interface and connects to the DataWorks MCP Server and other Alibaba Cloud MCP servers through the MCP Server configuration.
You can connect to more open source MCP servers as needed to enhance the capabilities of the DataWorks Agent.
Before you begin
-
You have created a workspace and selected Use Data Studio (New Version).
-
(Optional. Required for RAM users.) The RAM user for task development must be added to the workspace and granted the Development or Workspace Manager role. The Workspace Manager role has extensive permissions, so grant it with caution. For details on adding members, see Add workspace members.
If you are using an Alibaba Cloud account, you can skip this step.
-
You have created a personal development environment instance.
NoteIf your personal development environment is bound to a Virtual Private Cloud (VPC), you must configure internet access for the personal development environment.
Step 1: Enter personal development environment
Follow these steps to enable and enter your personal development environment.
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose in the Actions column.
-
Click the
icon next to Personal Development Environment in the top navigation bar to check the status of your personal development environment instance and enter it.-
If the instance status is Running: Click the running personal development environment instance under Personal Development Environment to enter it.
-
If the instance is in any other state: Click Management Environment in the pop-up window. On the Personal Development Environment Instances page, find the instance you created, click Start in the Actions column, and wait for the Instance Status to change to Running. Then, click the instance to enter the personal development environment.
NoteWhen an icon similar to
appears in the Personal Development Environment area, it means you have entered the personal development environment. -
Step 2: Install Cline
After entering the personal development environment, follow these steps to configure the DataWorks Agent. This document uses Cline as an example.
If you selected the dataworks-mcp:py3.11-ubuntu22.04 image when you created the personal development environment instance, you do not need to upgrade the program engine or install the Cline extension.
Upgrade program engine
Install Cline extension
Step 3: Configure LLM API key
After installing the Cline extension, follow these steps to configure your API key. This example shows how to connect to the Model Studio API using the OpenAI Compatible mode.
-
To connect to a model in a different mode, configure the parameters as provided in the interface.
-
Only the Use your own API key method is currently supported. The Get Started for Free method is not supported.
-
On the Data Studio personal development environment page, click
in the upper-right corner of the top navigation bar to open Copilot Chat, and then click
to switch to Cline. -
On the Cline page, click Use your own API key and configure the parameters as described in the following table.
Parameter
Description
API Provider
Specifies the API service provider. Select
OpenAI Compatible. This indicates that you will use an OpenAI-compatible interface to connect to the Model Studio API.Base URL
The base URL of the API service, which specifies the root address for API requests.
For example, the OpenAI-compatible API endpoint address provided by the Model Studio API is
https://dashscope-intl.aliyuncs.com/compatible-mode/v1.API key
The key used for authentication. You can obtain this API key from the Model Studio console.
Model ID
Specifies the model you want to use. Different models have different functions and performance characteristics.
Select
qwen-coder-plusorqwen-plus:-
qwen-coder-plus: Suitable for code generation and programming tasks. -
qwen-plus: Suitable for general text generation and processing tasks.
-
-
Click the Let's go! button below to complete the API key configuration.
Step 4: Configure MCP Server
After configuring the API key, follow these steps to connect to and configure the DataWorks MCP Server. For more information about the DataWorks MCP Server, see Appendix: DataWorks MCP Server.
-
On the Cline page, click the
icon in the upper-right corner to go to the Marketplace tab of MCP Server. -
Switch to the Installed tab to view installed MCP servers.
-
Click Configure MCP Servers to open the
cline_mcp_settings.jsonconfiguration file. DataWorks provides a default configuration foralibabacloud-dataworks-mcp-server. The configuration is as follows:{ "mcpServers": { "alibabacloud-dataworks-mcp-server": { "command": "npx", "args": [ "alibabacloud-dataworks-mcp-server" ], "env": { "REGION": "cn-shanghai", "ALIBABA_CLOUD_CREDENTIALS_URI": "http://localhost:7002/api/v1/credentials/0", "TOOL_CATEGORIES": "SERVER_IDE_DEFAULT" }, "disabled": false, "autoApprove": [], "timeout": 60 } } }Parameter
Description
command
npx, which indicates the command method provided by dataworks-mcp-server.
args
alibabacloud-dataworks-mcp-server, which indicates the command argument for dataworks-mcp-server.
env
REGION
The region where the current DataWorks workspace is located. The example uses cn-shanghai.
ALIBABA_CLOUD_CREDENTIALS_URI
Specifies the URI for Alibaba Cloud credentials.
ImportantThis parameter is effective only in the DataWorks personal development environment and is used to obtain Alibaba Cloud user authentication.
TOOL_CATEGORIES
Configures the allowlist of
Toolcategories. Enter the OpenAPI categories here, separated by commas.Example:
"TOOL_CATEGORIES":"Data Sources,Workspace Management,Resource Group Management,Data Map,Data Integration,Data Studio (New Version),Task O&M,Data Service,Open Platform,Data Quality,Tag Management,Security Center,SERVER_IDE_DEFAULT".Note-
SERVER_IDE_DEFAULTrefers to the defaultToolsin the personal development environment. The other service categories can be found in the left-side directory tree on the DataWorks - OpenAPI Overview page. -
To improve model loading performance and user experience,
TOOL_CATEGORIESis set toSERVER_IDE_DEFAULTin the default configuration. To enable all OpenAPI tools, you can remove this configuration item.
TOOL_NAMES
Configures the allowlist of
Toolnames. Enter the OpenAPI names here, separated by commas.Example:
"TOOL_NAMES":"ListProjects,CreateNode,UpdateNode".NoteYou can find the
TOOL_NAMESon the DataWorks - OpenAPI Overview page. -
-
After you save the configuration, when the list of available Tools loads, this confirms that
alibabacloud-dataworks-mcp-serveris installed and configured. You can now start using the DataWorks MCP Server features.NoteIf the information fails to load, confirm whether you have upgraded the program engine.

-
You can extend the capabilities of the DataWorks Agent by directly editing the
cline_mcp_settings.jsonconfiguration file or by installing other MCP servers from the Marketplace. For example, to use Hologres-related features in the DataWorks Agent, you can connect to the Hologres MCP Server.
FAQ
-
Q: When I run a preset prompt on the MCP Server, the API request is unresponsive or remains in a running state for a long time. What could be the cause and how can I fix it?
A: If an API request remains in a running state for a long time without returning a result, it may be due to a compatibility issue with an older version of the program engine. We recommend that you upgrade the program engine.
-
Q: How can I optimize the model's response speed?
A: To improve response performance, you can take the following measures:
-
Reduce the number of simultaneously enabled MCP servers to lower system resource overhead.
-
In the MCP Server's configuration file, explicitly specify
TOOL_CATEGORIESorTOOL_NAMESthrough theenvparameter to load only the necessary toolsets, thereby reducing the number of importedTools.
-
Appendix: DataWorks MCP Server
MCP (Model Context Protocol) is a protocol that provides standardized context for Large Language Models (LLMs). It defines a standard way for large models to connect to different data sources and tools, enabling them to understand and process information more effectively. An MCP client can call the capabilities of various MCP servers through the MCP protocol.
The DataWorks MCP Server encapsulates the DataWorks OpenAPI and provides the big data capabilities of DataWorks. You can integrate the DataWorks MCP Server into third-party products, programs, or agents to quickly call DataWorks capabilities.
When you use the agent outside the DataWorks personal development environment, you need to configure ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET (obtain them here) in the env parameter and delete the ALIBABA_CLOUD_CREDENTIALS_URI configuration.
icon to Configure Keybindings, 