All Products
Search
Document Center

DataWorks:DataWorks Agent on third-party clients

Last Updated:Feb 14, 2026

DataWorks Agent is based on the Model Context Protocol (MCP). It connects to the DataWorks MCP Server and other big data MCP servers, such as the Hologres MCP Server, to perform tasks like data development, Task O&M, and data integration in DataWorks using natural language.

Important

This feature is accessed through a third-party client. For a more streamlined agent experience, see DataWorks Agent.

How it works

DataWorks Agent allows you to perform big data development tasks using natural language in an intelligent chat window. The agent uses a Large Language Model (LLM) to parse your requests and intelligently calls MCP Server capabilities to execute tasks.

For example, you can ask "How many workspaces do I have?" in the DataWorks Agent chat window. The agent uses the LLM to parse this request and calls the ListProjects tool provided by the DataWorks MCP Server, which is built on DataWorks OpenAPI, to retrieve the result. For more complex tasks, the LLM may interact with the MCP Server multiple times.

DataWorks Agent not only integrates with the DataWorks MCP Server but also supports connecting to other MCP servers. You can also choose your own LLM, such as Qwen, DeepSeek, or OpenAI.

image

You can also try the following prompts to explore more task scenarios supported by DataWorks Agent.

Scope

Example prompt

Data development

[Query tasks] Find all MaxCompute SQL data development nodes in the current workspace's project directory with a scheduling type of 'Paused'.

[Rename tasks] Rename the nodes found above to "invalid_node_to_delete". Use sequential numbers to differentiate multiple nodes.

[Create tasks] Create five MaxCompute SQL nodes in the current workspace's project directory. The names should start with MC_Demo, be followed by an underscore, and end with an auto-incrementing number starting from 01.

Task O&M

[Rerun failed tasks] Find the tasks that failed to run on 20250330 in this workspace and rerun their instances. Note that bizdate is a timestamp in milliseconds.

[Query failed instances] Find all instances that failed on 20250331 in the current workspace's project. Note that bizdate is a timestamp in milliseconds.

[Analyze task rerun properties] Analyze the rerun properties of these tasks. If they are rerunnable, rerun them.

Data integration

[Sync a single MySQL table to MaxCompute] Create a batch data integration task in the current workspace with the following settings:

  • Source data source: mc_test_mysql (Table name: users)

  • Destination data source: mc_test_maxcompute (Destination table name: users, same as the source table)

  • Resource group: mc_test_res

  • Field mapping method: Automatic mapping by name (columns with the same name in the source and destination tables will be mapped to each other).

Data analysis

[Sales analysis] Analyze the sales trends of the top 10 best-selling products this month from the order table.

Limitations

This feature is only available in a personal development environment within a workspace that has the new version of Data Studio enabled.

Important

After a personal development environment is restarted, you must reinstall the MCP Server. Proceed with caution.

Billing

Using DataWorks Agent incurs the following fees:

  • DataWorks OpenAPI call fees

    When the agent calls DataWorks OpenAPI through an MCP Server, fees are charged according to the OpenAPI billing standards.

  • LLM token fees

    The agent calls the Large Language Model (LLM) you configure (such as Qwen) to parse user intent and generate natural language responses. This process consumes input and output tokens, which are billed according to the pricing rules of your chosen model provider. For example, if you use the qwen-coder-plus model in Model Studio (Bailian), fees are calculated based on the Model Studio (Bailian) billing details.

Quick start with DataWorks Agent

After you configure DataWorks Agent, click the image icon in the upper-right corner of the Cline page to open the DataWorks Agent chat interface and try a quick example: query the members of the current workspace.

Enter the following prompt: Query the members of the current workspace.

The agent breaks down the steps and runs them:

  1. Parse and confirm request: The agent parses your request to "Query the members of the current workspace", automatically identifies the ListProjectMembers API to call, and prompts you to confirm necessary parameters, such as the target workspace (ProjectId).

  2. Call API and get response: After you grant permission, the agent calls the ListProjectMembers OpenAPI to retrieve a structured list of members in the workspace, including their roles and account types. For more information about the ListProjectMembers OpenAPI, see ListProjectMembers - Query the members of a workspace.

    Note
    • During execution, the system prompts you to confirm relevant actions and provide necessary information. You can click Approve to proceed or Reject to cancel the action.

    • The breakdown logic for key steps may vary slightly depending on task complexity, LLM selection, and model version. The actual execution flow depends on the agent's real-time parsing and interaction during the session.

Configure DataWorks Agent

DataWorks Agent uses an MCP client extension, such as Cline, to create the chat interface. It connects to the DataWorks MCP Server and other Alibaba Cloud MCP servers through MCP Server configurations.

Note

You can connect to more open source MCP servers as needed to enhance the capabilities of DataWorks Agent.

Prerequisites

Step 1: Enter the personal development environment

Follow these steps to start and enter your personal development environment.

  1. Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose Shortcuts > Data Studio in the Actions column.

  2. Click the image icon next to Personal Development Environment in the top navigation bar to check the status of your personal development environment instance and enter it.

    • If the instance status is Running: Click the running instance under Personal Development Environment to enter the environment.

    • If the instance is in another state: Click Management Environment in the pop-up window. On the Personal Development Environment Instances page, find your instance and click Start in the Actions column. Wait for the Instance Status to change to Running, then click the instance to enter the environment.

    Note

    An icon similar to image in the Personal Development Environment area indicates that you have entered the personal development environment.

Step 2: Install Cline

After entering the personal development environment, follow these steps to configure DataWorks Agent. This guide uses Cline as an example.

Important

If you selected the dataworks-mcp:py3.11-ubuntu22.04 image when creating your personal development environment instance, you do not need to upgrade the engine or install the Cline extension.

Upgrade the engine

If you are using an older personal development environment or have already installed the Cline extension, you must upgrade the underlying engine to use the extension's features. If you have already performed the upgrade, you can skip this step.

One-click upgrade: After entering the personal development environment, if a pop-up appears prompting you to upgrade the underlying engine for compatibility, click the One-click Upgrade button to complete the upgrade.

Command-line upgrade: Click the image icon in the bottom-left corner of the toolbar to open the terminal. Run the following command in the terminal.

wget https://nodejs.org/dist/v20.19.0/node-v20.19.0-linux-x64.tar.xz
tar xf node-v20.19.0-linux-x64.tar.xz
mv /etc/dsw/node /etc/dsw/node14
mv node-v20.19.0-linux-x64 /etc/dsw/node

bash <(curl -s https://dataworks-notebook-${REGION}.oss-${REGION}.aliyuncs.com/public-datasets/aone-release/dwcode-server/scripts/update.sh)  0.2.169
Note

The ${REGION} variable in the command is automatically replaced with the current region. You do not need to replace it manually. You can also run echo ${REGION} in the terminal to view the current region.

After the upgrade is complete, click Reload in the pop-up window to apply the latest changes.

Install the Cline extension

Follow these steps to install the Cline extension in your personal development environment to serve as your agent chat window.

  1. Click the image icon in the left navigation pane of the personal development environment page to go to the Extensions page.

  2. Enter Cline in the search box on the Extensions page.

  3. Find the Cline extension that appears below.

  4. Click Install in the lower-right corner of the Cline extension and wait for the installation to complete.

  5. After installation, click the image icon in the upper-right corner of the top navigation bar on the Data Studio page to open Copilot Chat, then click the image icon to switch to Cline.

  6. You can right-click the image icon to Configure Keybindings, Move to Secondary Sidebar, or Move to Panel.

image

Step 3: Configure LLM API key

After installing the Cline extension, follow these steps to configure your API key. This example demonstrates how to connect to the Model Studio (Bailian) API using the OpenAI Compatible mode.

Note
  • To connect to other models, configure the relevant parameters according to the user interface.

  • Currently, only the Use your own API key configuration method is supported. The Get Started for Free method is not supported.

  1. On the Data Studio personal development environment page, click the image icon in the upper-right corner of the top navigation bar to open Copilot Chat, then click the image icon to switch to Cline.

  2. On the Cline page, click Use your own API key and configure the parameters as described in the following table.

    Parameter

    Description

    API provider

    The API service provider you want to use. Select OpenAI Compatible. This indicates that you will use an OpenAI-compatible interface to connect to the Model Studio (Bailian) API.

    Base URL

    The base URL for the API service, which specifies the root address for API requests.

    For example, the OpenAI-compatible API endpoint provided by Model Studio (Bailian) is https://dashscope-intl.aliyuncs.com/compatible-mode/v1.

    API key

    The key used for authentication. You can obtain this API key from the Model Studio (Bailian) console.

    Model ID

    The model to use.

    Models vary in function and performance. Select qwen-coder-plus or qwen-plus:

    • qwen-coder-plus: Suitable for code generation and programming tasks.

    • qwen-plus: Suitable for general text generation and processing tasks.

  3. Click the Let's go! button at the bottom to complete the API key configuration.

Step 4: Configure the MCP Server

After configuring the API key, follow these steps to connect to and configure the DataWorks MCP Server. For more information about the DataWorks MCP Server, see Appendix: DataWorks MCP Server.

  1. In the upper-right corner of the Cline page, click the image icon to go to the Marketplace tab in the MCP Servers panel.

  2. Switch to the Installed tab to view installed MCP servers.

  3. Click Configure MCP Servers to open the cline_mcp_settings.json configuration file. DataWorks provides a default configuration for the alibabacloud-dataworks-mcp-server. The configuration is as follows:

    {
      "mcpServers": {
        "alibabacloud-dataworks-mcp-server": {
          "command": "npx",
          "args": [
            "alibabacloud-dataworks-mcp-server"
          ],
          "env": {
            "REGION": "cn-shanghai",
            "ALIBABA_CLOUD_CREDENTIALS_URI": "http://localhost:7002/api/v1/credentials/0",
            "TOOL_CATEGORIES": "SERVER_IDE_DEFAULT"
          },
          "disabled": false,
          "autoApprove": [],
          "timeout": 60
        }
      }
    }

    Parameter

    Description

    command

    npx. This is the command provided by dataworks-mcp-server.

    args

    alibabacloud-dataworks-mcp-server. This is the command argument for dataworks-mcp-server.

    env

    REGION

    cn-shanghai in the example. This indicates the region where the current DataWorks workspace is located.

    ALIBABA_CLOUD_CREDENTIALS_URI

    Specifies the URI for the Alibaba Cloud credential.

    Important

    This parameter is only effective in the DataWorks personal development environment and is used to obtain Alibaba Cloud user authentication.

    TOOL_CATEGORIES

    Configures an allowlist of Tool categories. Enter the OpenAPI categories here, separated by commas.

    Example: "TOOL_CATEGORIES":"Data Source,Workspace Management,Resource Group Management,Data Map,Data Integration,Data Development (New),Operation Center,DataService,Open Platform,Data Quality,Label Management,Security Center,SERVER_IDE_DEFAULT".

    Note
    • SERVER_IDE_DEFAULT refers to the default Tools in the personal development environment. The other service categories can be found in the left-side directory tree on the DataWorks - OpenAPI Overview page.

    • To improve model loading performance and user experience, the default configuration sets TOOL_CATEGORIES to SERVER_IDE_DEFAULT. To enable all OpenAPI tools, you can remove this configuration item.

    TOOL_NAMES

    Configures an allowlist of Tool names. Enter the OpenAPI names here, separated by commas.

    Example: "TOOL_NAMES":"ListProjects,CreateNode,UpdateNode".

    Note

    You can find TOOL_NAMES on the DataWorks - OpenAPI Overview page.

  4. After you save the configuration, if the page loads and displays a list of available Tools, the alibabacloud-dataworks-mcp-server is installed and configured correctly. You can now use its features.

    Note

    If this information fails to load, confirm that you have upgraded the engine.

    image

  5. You can extend the capabilities of DataWorks Agent by editing the cline_mcp_settings.json file directly or by installing other MCP servers from the Marketplace. For example, to use Hologres-related functions in DataWorks Agent, you can connect to the Hologres MCP Server.

FAQ

  • Q: When I run a preset prompt on the MCP Server, an API request takes too long to respond. What could be the cause and how can I fix it?

    A: This may be caused by a compatibility issue with an outdated engine version. We recommend that you upgrade the engine.

  • Q: What can I do to optimize the model's response speed when it is slow?

    A: To improve response performance, try the following:

    • Reduce the number of simultaneously enabled MCP servers to lower system resource overhead.

    • In the MCP Server's configuration file, explicitly specify TOOL_CATEGORIES or TOOL_NAMES in the env parameter to load only the necessary toolsets, thereby reducing the number of imported Tools.

Appendix: DataWorks MCP Server

MCP (Model Context Protocol) is a protocol that provides a standardized context for Large Language Models (LLMs). It defines a standard way for LLMs to connect to different data sources and tools, enabling them to understand and process information more effectively. An MCP client can call the capabilities of various MCP servers through the MCP protocol.

The DataWorks MCP Server, as a type of MCP server, encapsulates the DataWorks OpenAPI and provides DataWorks' big data processing capabilities. You can integrate the DataWorks MCP Server into third-party products, programs, or agents to quickly call DataWorks capabilities.

Important

Outside the DataWorks personal development environment, you must configure ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET in the env parameter (obtain them from here) and remove the ALIBABA_CLOUD_CREDENTIALS_URI configuration.

image