DataWorks Agent is based on the Model Context Protocol (MCP). It connects to the DataWorks MCP Server and other big data MCP servers, such as the Hologres MCP Server, to perform tasks like data development, Task O&M, and data integration in DataWorks using natural language.
This feature is accessed through a third-party client. For a more streamlined agent experience, see DataWorks Agent.
How it works
DataWorks Agent allows you to perform big data development tasks using natural language in an intelligent chat window. The agent uses a Large Language Model (LLM) to parse your requests and intelligently calls MCP Server capabilities to execute tasks.
For example, you can ask "How many workspaces do I have?" in the DataWorks Agent chat window. The agent uses the LLM to parse this request and calls the ListProjects tool provided by the DataWorks MCP Server, which is built on DataWorks OpenAPI, to retrieve the result. For more complex tasks, the LLM may interact with the MCP Server multiple times.
DataWorks Agent not only integrates with the DataWorks MCP Server but also supports connecting to other MCP servers. You can also choose your own LLM, such as Qwen, DeepSeek, or OpenAI.
You can also try the following prompts to explore more task scenarios supported by DataWorks Agent.
Scope | Example prompt |
Data development | [Query tasks] Find all |
[Rename tasks] Rename the nodes found above to "invalid_node_to_delete". Use sequential numbers to differentiate multiple nodes. | |
[Create tasks] Create five | |
Task O&M | [Rerun failed tasks] Find the tasks that failed to run on |
[Query failed instances] Find all instances that failed on | |
[Analyze task rerun properties] Analyze the rerun properties of these tasks. If they are rerunnable, rerun them. | |
Data integration | [Sync a single MySQL table to MaxCompute] Create a batch data integration task in the current workspace with the following settings:
|
[Sales analysis] Analyze the sales trends of the top 10 best-selling products this month from the order table. |
Limitations
This feature is only available in a personal development environment within a workspace that has the new version of Data Studio enabled.
After a personal development environment is restarted, you must reinstall the MCP Server. Proceed with caution.
Billing
Using DataWorks Agent incurs the following fees:
DataWorks OpenAPI call fees
When the agent calls DataWorks OpenAPI through an MCP Server, fees are charged according to the OpenAPI billing standards.
LLM token fees
The agent calls the Large Language Model (LLM) you configure (such as Qwen) to parse user intent and generate natural language responses. This process consumes input and output tokens, which are billed according to the pricing rules of your chosen model provider. For example, if you use the
qwen-coder-plusmodel in Model Studio (Bailian), fees are calculated based on the Model Studio (Bailian) billing details.
Quick start with DataWorks Agent
After you configure DataWorks Agent, click the
icon in the upper-right corner of the Cline page to open the DataWorks Agent chat interface and try a quick example: query the members of the current workspace.
Enter the following prompt: Query the members of the current workspace.
The agent breaks down the steps and runs them:
Parse and confirm request: The agent parses your request to "Query the members of the current workspace", automatically identifies the
ListProjectMembersAPI to call, and prompts you to confirm necessary parameters, such as the target workspace (ProjectId).Call API and get response: After you grant permission, the agent calls the
ListProjectMembersOpenAPI to retrieve a structured list of members in the workspace, including their roles and account types. For more information about theListProjectMembersOpenAPI, see ListProjectMembers - Query the members of a workspace.NoteDuring execution, the system prompts you to confirm relevant actions and provide necessary information. You can click Approve to proceed or Reject to cancel the action.
The breakdown logic for key steps may vary slightly depending on task complexity, LLM selection, and model version. The actual execution flow depends on the agent's real-time parsing and interaction during the session.
Configure DataWorks Agent
DataWorks Agent uses an MCP client extension, such as Cline, to create the chat interface. It connects to the DataWorks MCP Server and other Alibaba Cloud MCP servers through MCP Server configurations.
You can connect to more open source MCP servers as needed to enhance the capabilities of DataWorks Agent.
Prerequisites
You have created a workspace and selected Use Data Studio (New Version).
(Optional, for RAM users) You have added the development RAM user to the workspace with the Development or Workspace Manager role. The Workspace Manager role has extensive permissions, so assign it with caution. For details on adding members, see Add members to a workspace.
If you are using an Alibaba Cloud account, you can skip this step.
You have created a personal development environment instance.
NoteIf you need to bind your personal development environment to a Virtual Private Cloud (VPC), you must configure public network access for the personal development environment.
Step 1: Enter the personal development environment
Follow these steps to start and enter your personal development environment.
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose in the Actions column.
Click the
icon next to Personal Development Environment in the top navigation bar to check the status of your personal development environment instance and enter it.If the instance status is Running: Click the running instance under Personal Development Environment to enter the environment.
If the instance is in another state: Click Management Environment in the pop-up window. On the Personal Development Environment Instances page, find your instance and click Start in the Actions column. Wait for the Instance Status to change to Running, then click the instance to enter the environment.
NoteAn icon similar to
in the Personal Development Environment area indicates that you have entered the personal development environment.
Step 2: Install Cline
After entering the personal development environment, follow these steps to configure DataWorks Agent. This guide uses Cline as an example.
If you selected the dataworks-mcp:py3.11-ubuntu22.04 image when creating your personal development environment instance, you do not need to upgrade the engine or install the Cline extension.
Upgrade the engine
Install the Cline extension
Step 3: Configure LLM API key
After installing the Cline extension, follow these steps to configure your API key. This example demonstrates how to connect to the Model Studio (Bailian) API using the OpenAI Compatible mode.
To connect to other models, configure the relevant parameters according to the user interface.
Currently, only the Use your own API key configuration method is supported. The Get Started for Free method is not supported.
On the Data Studio personal development environment page, click the
icon in the upper-right corner of the top navigation bar to open Copilot Chat, then click the
icon to switch to Cline.On the Cline page, click Use your own API key and configure the parameters as described in the following table.
Parameter
Description
API provider
The API service provider you want to use. Select
OpenAI Compatible. This indicates that you will use an OpenAI-compatible interface to connect to the Model Studio (Bailian) API.Base URL
The base URL for the API service, which specifies the root address for API requests.
For example, the OpenAI-compatible API endpoint provided by Model Studio (Bailian) is
https://dashscope-intl.aliyuncs.com/compatible-mode/v1.API key
The key used for authentication. You can obtain this API key from the Model Studio (Bailian) console.
Model ID
The model to use.
Models vary in function and performance. Select
qwen-coder-plusorqwen-plus:qwen-coder-plus: Suitable for code generation and programming tasks.qwen-plus: Suitable for general text generation and processing tasks.
Click the Let's go! button at the bottom to complete the API key configuration.
Step 4: Configure the MCP Server
After configuring the API key, follow these steps to connect to and configure the DataWorks MCP Server. For more information about the DataWorks MCP Server, see Appendix: DataWorks MCP Server.
In the upper-right corner of the Cline page, click the
icon to go to the Marketplace tab in the MCP Servers panel.Switch to the Installed tab to view installed MCP servers.
Click Configure MCP Servers to open the
cline_mcp_settings.jsonconfiguration file. DataWorks provides a default configuration for thealibabacloud-dataworks-mcp-server. The configuration is as follows:{ "mcpServers": { "alibabacloud-dataworks-mcp-server": { "command": "npx", "args": [ "alibabacloud-dataworks-mcp-server" ], "env": { "REGION": "cn-shanghai", "ALIBABA_CLOUD_CREDENTIALS_URI": "http://localhost:7002/api/v1/credentials/0", "TOOL_CATEGORIES": "SERVER_IDE_DEFAULT" }, "disabled": false, "autoApprove": [], "timeout": 60 } } }Parameter
Description
command
npx. This is the command provided by dataworks-mcp-server.
args
alibabacloud-dataworks-mcp-server. This is the command argument for dataworks-mcp-server.
env
REGION
cn-shanghai in the example. This indicates the region where the current DataWorks workspace is located.
ALIBABA_CLOUD_CREDENTIALS_URI
Specifies the URI for the Alibaba Cloud credential.
ImportantThis parameter is only effective in the DataWorks personal development environment and is used to obtain Alibaba Cloud user authentication.
TOOL_CATEGORIES
Configures an allowlist of
Toolcategories. Enter the OpenAPI categories here, separated by commas.Example:
"TOOL_CATEGORIES":"Data Source,Workspace Management,Resource Group Management,Data Map,Data Integration,Data Development (New),Operation Center,DataService,Open Platform,Data Quality,Label Management,Security Center,SERVER_IDE_DEFAULT".NoteSERVER_IDE_DEFAULTrefers to the defaultToolsin the personal development environment. The other service categories can be found in the left-side directory tree on the DataWorks - OpenAPI Overview page.To improve model loading performance and user experience, the default configuration sets
TOOL_CATEGORIEStoSERVER_IDE_DEFAULT. To enable all OpenAPI tools, you can remove this configuration item.
TOOL_NAMES
Configures an allowlist of
Toolnames. Enter the OpenAPI names here, separated by commas.Example:
"TOOL_NAMES":"ListProjects,CreateNode,UpdateNode".NoteYou can find
TOOL_NAMESon the DataWorks - OpenAPI Overview page.After you save the configuration, if the page loads and displays a list of available Tools, the
alibabacloud-dataworks-mcp-serveris installed and configured correctly. You can now use its features.NoteIf this information fails to load, confirm that you have upgraded the engine.

You can extend the capabilities of DataWorks Agent by editing the
cline_mcp_settings.jsonfile directly or by installing other MCP servers from the Marketplace. For example, to use Hologres-related functions in DataWorks Agent, you can connect to the Hologres MCP Server.
FAQ
Q: When I run a preset prompt on the MCP Server, an API request takes too long to respond. What could be the cause and how can I fix it?
A: This may be caused by a compatibility issue with an outdated engine version. We recommend that you upgrade the engine.
Q: What can I do to optimize the model's response speed when it is slow?
A: To improve response performance, try the following:
Reduce the number of simultaneously enabled MCP servers to lower system resource overhead.
In the MCP Server's configuration file, explicitly specify
TOOL_CATEGORIESorTOOL_NAMESin theenvparameter to load only the necessary toolsets, thereby reducing the number of importedTools.
Appendix: DataWorks MCP Server
MCP (Model Context Protocol) is a protocol that provides a standardized context for Large Language Models (LLMs). It defines a standard way for LLMs to connect to different data sources and tools, enabling them to understand and process information more effectively. An MCP client can call the capabilities of various MCP servers through the MCP protocol.
The DataWorks MCP Server, as a type of MCP server, encapsulates the DataWorks OpenAPI and provides DataWorks' big data processing capabilities. You can integrate the DataWorks MCP Server into third-party products, programs, or agents to quickly call DataWorks capabilities.
Outside the DataWorks personal development environment, you must configure ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET in the env parameter (obtain them from here) and remove the ALIBABA_CLOUD_CREDENTIALS_URI configuration.
icon to Configure Keybindings, 