DataWorks: DataWorks Agent

Last Updated: Feb 12, 2026

The DataWorks Copilot Agent combines natural language interaction with the reasoning and planning capabilities of a Large Language Model (LLM) to automate complex tasks in Data Integration, Data Development, and Data Governance. It provides an end-to-end workflow from requirements to results and improves your productivity. This topic describes the use cases and core features of the DataWorks Copilot Agent.

Overview

The DataWorks Copilot Agent is built on a proprietary client. Unlike DataWorks Agent for third-party clients, it does not require you to install extra software or perform complex configurations. You can use it directly from the relevant module pages in DataWorks.

Describe your requirements in natural language to complete Data Development and other tasks. The DataWorks Copilot Agent operates as follows:

[Figure: DataWorks Copilot Agent workflow]

Access the feature

  1. Log on to the DataWorks console. In the left-side navigation pane, choose Data Development and O&M > DataStudio. Select the target Workspace and open Data Studio.

  2. On the Data Studio page, click the Copilot icon in the upper-right corner of the top navigation bar to open Copilot Chat. Ask mode is enabled by default. In the lower-left corner of the dialog box, switch to Agent mode.

Quick start

Step 1: Open Copilot Chat and enter Agent mode

On the Data Studio page, click the Copilot icon in the upper-right corner of the top navigation bar to open Copilot Chat. In the lower-left corner of the dialog box, switch to Agent mode.

Step 2: Select an Agent

You can enter / in the input box or click / to quickly open the Agent menu. From there, select a specialized Agent for your current task. The Agent types include: Data Integration Agent, Data Map Agent, Data Development Agent, Data Governance Agent, and Data O&M Agent.

Step 3: Add context (Optional)

You can add Context to help Copilot better understand your requirements. Enter @ in the dialog box or click the @ icon in the lower-right corner to open the Add Context menu. You can then select the type of Context to add.

The supported types are:

  • Table: Adds metadata from one or more tables as Context.

  • Node/Code Files: Adds code from a specific node as Context.

  • Data Album: References a data album from Data Map as Context.

  • Rules: Temporarily applies one or more specified rules to the current conversation.

  • Upload File: Uploads a local document to use as Context.

Step 4: Switch the LLM (Optional)

By default, Copilot uses the DataWorks default model. To use a different supported LLM, click the model icon at the bottom of the dialog box and select one from the menu.

Step 5: Submit your question and start a conversation

Engage in a multi-turn conversation to refine your request. Ask follow-up questions or provide more details until the Agent understands your intent and delivers the desired result.

Core use cases

Leveraging the deep understanding and task orchestration capabilities of an LLM, the DataWorks Copilot Agent covers use cases across Data Integration, Data Development, Data Governance, Data Map, and Data O&M. The specific capabilities are described in the following table.

| Scenario | Description |
| --- | --- |
| Data Integration | Describe data synchronization requirements in natural language, and the Agent automatically generates the corresponding data synchronization task configuration. This includes the source and destination Data Source types, Table Schema mappings, Column filtering conditions, partitioning strategies, and scheduling parameters. |
| Data Development | Provides a natural language-based ETL development experience, covering the end-to-end process from requirements analysis and code generation to Workflow creation and deployment. |
| Data Governance | The DataWorks Data Governance Agent transitions enterprise data governance from a proactive to an "autonomous" model. The system converts your natural language commands into precise governance actions, then configures and executes them automatically. |
| Data Map | Improves the efficiency of data discovery and understanding. Through AI-driven natural language interaction, you can quickly explore metadata in various scenarios across massive datasets. |
| Data O&M | Provides comprehensive health assessments and issue diagnosis for task instances. It generates structured diagnostic reports by automatically analyzing multiple dimensions, including dependency lineage, resource utilization, historical performance trends, change impacts, log anomalies, and data quality. |

Use case 1: Data Integration Agent

Description: Describe data synchronization requirements in natural language, and the Agent automatically generates the corresponding data synchronization task configuration. This includes the source and destination Data Source types, Table Schema mappings, Column filtering conditions, partitioning strategies, and scheduling parameters.

Steps:

  1. Enter / in the dialog box and select Data Integration Agent.

  2. Describe your data synchronization requirements, including the source, destination, table name, and synchronization method. For example: "Create an Offline Synchronization Task to sync the ods_user_info_d table from MySQL to the ods_user_info_d table in MaxCompute".

  3. The Agent parses your request and automatically populates information such as the Data Source and table mappings to create a data synchronization node.

  4. After the node is created, you can click it to view and modify it.
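
To make the generated configuration more concrete, the following Python sketch assembles the kinds of settings listed in the description for the example request above. It is an illustration only: the dictionary layout, the field names, the column names, the daily schedule, and the partition value are all assumptions made for this example and do not reflect the actual format that DataWorks produces.

```python
import json


def build_sync_task(source_table, target_table, columns, where="", partition=""):
    """Assemble an illustrative offline sync task description.

    The schema used here is hypothetical; it only mirrors the categories
    of settings named in the description (data sources, column mappings,
    filter condition, partitioning, scheduling).
    """
    return {
        "type": "offline_sync",  # hypothetical label, not a DataWorks field
        "source": {
            "datasource_type": "mysql",
            "table": source_table,
            "columns": list(columns),
            "where": where,  # optional filter condition
        },
        "destination": {
            "datasource_type": "maxcompute",
            "table": target_table,
            "columns": list(columns),  # same-name column mapping
            "partition": partition,
        },
        "schedule": {"cycle": "daily"},  # assumed scheduling setting
    }


# Column names and the partition expression are made up for this example.
task = build_sync_task(
    "ods_user_info_d",
    "ods_user_info_d",
    ["uid", "gender", "age_range"],
    partition="dt=${bizdate}",
)
print(json.dumps(task, indent=2))
```

Comparing a structure like this with the node that the Agent creates can help you confirm that the source, destination, and column mappings match your intent before you run the task.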

Use case 2: Data Development Agent

Description: Provides a natural language-based ETL development experience, covering the end-to-end process from requirements analysis and code generation to Workflow creation and deployment.

Steps:

  1. Enter / in the dialog box and select Data Development Agent.

  2. Describe your data development requirements in natural language and add Context as needed. For example: "Build a user profile analysis Workflow".

  3. The Agent breaks down the task into multiple steps, such as creating nodes, generating code, and configuring dependencies, and then executes them.

  4. You can review the generated node code and choose to keep or discard it.
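
The task breakdown in step 3 can be pictured as a dependency graph of nodes. The following Python sketch shows one way to derive a valid execution order from such a graph by using the standard-library `graphlib` module (Python 3.9 or later). The node names are hypothetical, and the sketch does not describe how the Agent plans or executes tasks internally.

```python
from graphlib import TopologicalSorter

# Hypothetical nodes for a "user profile analysis" Workflow.
# Each key maps to the set of upstream nodes that it depends on.
dependencies = {
    "ods_user_info_d": set(),
    "dwd_user_behavior_d": {"ods_user_info_d"},
    "dws_user_profile_d": {"dwd_user_behavior_d"},
    "ads_user_profile_report": {"dws_user_profile_d"},
}


def execution_order(deps):
    """Return the nodes in an order in which every upstream node comes first."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
for position, node in enumerate(order, start=1):
    print(position, node)
```

Because the example graph is a single chain, there is exactly one valid order, from the ODS node to the report node. A graph with a cycle would raise `graphlib.CycleError`, which is a useful check when you review dependencies between generated nodes.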

Use case 3: Data Governance Agent

Description: The DataWorks Data Governance Agent transitions enterprise data governance from a proactive to an "autonomous" model. The system converts your natural language commands into precise governance actions, then configures and executes them automatically.

Core capabilities:

  • Quality rule configuration: Use natural language to automatically configure quality monitoring rules for specified key tables. The Data Governance Agent can intelligently analyze the Column types, business semantics, and importance of a specified table to automatically recommend and configure appropriate monitoring rules, such as primary key uniqueness, non-null constraints, and value range checks. This streamlines the work of data exploration and rule configuration.

    • Example: "Automatically generate quality rules for the core user dimension table dim_user_info."

    • Example: "Automatically configure table row count quality rules for tables that start with ods_."

  • Quality issue remediation: For quality issues that the system has already identified in the data asset governance module, such as "Frequently accessed tables with no quality rules" or "Tables produced by high-priority baseline tasks with no quality rules," you can provide governance requirements in natural language. The system then automatically analyzes the issue and performs the corresponding remediation.

    • Example: "Find frequently accessed tables that have no quality rules, and then recommend and configure rules for them."

    • Example: "Help me resolve issues in the data quality dimension."
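
The rule types named above (primary key uniqueness, non-null constraints, and value range checks) can be illustrated with a short Python sketch. The function names, sample rows, and thresholds are hypothetical and are evaluated in memory. Actual quality rules are configured and run by DataWorks, not by code like this.

```python
def check_unique(rows, column):
    """Primary key uniqueness: no value appears more than once."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def check_not_null(rows, column):
    """Non-null constraint: every row has a value for the column."""
    return all(row.get(column) is not None for row in rows)


def check_range(rows, column, low, high):
    """Value range check: every non-null value lies within [low, high]."""
    return all(
        low <= row[column] <= high
        for row in rows
        if row.get(column) is not None
    )


# Made-up sample rows: user_id 2 is duplicated and one age is implausible.
rows = [
    {"user_id": 1, "age": 34},
    {"user_id": 2, "age": 27},
    {"user_id": 2, "age": 190},
]

results = {
    "user_id is unique": check_unique(rows, "user_id"),
    "user_id is not null": check_not_null(rows, "user_id"),
    "age is within 0-150": check_range(rows, "age", 0, 150),
}
for rule, passed in results.items():
    print(rule, "PASS" if passed else "FAIL")
```

With the sample rows, the uniqueness and range rules fail and the non-null rule passes, which is the kind of result a monitoring rule would surface for follow-up.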

Use case 4: Data Map Agent

Description: Improves the efficiency of data discovery and understanding. Through AI-driven natural language interaction, you can quickly explore metadata in various scenarios across massive datasets.

Core capabilities:

  • Natural language search: Use natural language Q&A to quickly locate data based on business intent without needing precise keywords. For example, "Find the summary table related to user activity."

  • Automatic scope adjustment: Allows you to specify a scope in the conversation. The Agent automatically understands the semantics and quickly locates data within that scope. For example, "In the adm_bi project, find tables related to business operations."

  • In-depth data understanding: Supports follow-up questions about target data to quickly get details such as Data Lineage, Owner, and Column definitions. For example, "What are the direct downstream dependencies of the @dws_bi_metric_di table? Which owners will be affected if it changes?"
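
The lineage question in the last example amounts to a lookup over a dependency graph. The following Python sketch illustrates the idea with made-up lineage and owner data. Apart from dws_bi_metric_di, the table names, the owner names, and both helper functions are hypothetical, and the sketch does not represent how Data Map stores or queries metadata.

```python
# Hypothetical lineage: each table maps to the tables that read directly from it.
downstream = {
    "dws_bi_metric_di": ["ads_bi_dashboard_di", "ads_bi_report_di"],
    "ads_bi_dashboard_di": [],
    "ads_bi_report_di": [],
}

# Hypothetical table owners.
owners = {
    "dws_bi_metric_di": "carol",
    "ads_bi_dashboard_di": "alice",
    "ads_bi_report_di": "bob",
}


def direct_downstream(table):
    """Return the tables that depend directly on the given table."""
    return downstream.get(table, [])


def affected_owners(table):
    """Return the owners of the direct downstream tables, without duplicates."""
    return sorted({owners[t] for t in direct_downstream(table)})


print(direct_downstream("dws_bi_metric_di"))
print(affected_owners("dws_bi_metric_di"))
```

This sketch covers only direct downstream tables, which matches the wording of the example question. Tracing indirect dependencies would require walking the graph further.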

Use case 5: Data O&M Agent

Description: Provides comprehensive health assessments and issue diagnosis for task instances. It generates structured diagnostic reports by automatically analyzing multiple dimensions, including dependency lineage, resource utilization, historical performance trends, change impacts, log anomalies, and data quality.

For more information about the Data O&M Agent, see AI-powered O&M.

Related documents

To learn about the customizable Agent feature, see DataWorks Agent for third-party clients.