Copilot AI Assistant for Intelligent DataWorks Automation

How it works

Copilot integrates three capabilities into DataWorks:

Capability	What it does	Best for
Agent	Autonomously plans and executes multi-step tasks across DataWorks modules	Data integration pipelines, ETL workflows, governance remediation, O&M diagnostics
AI coding assistant	Generates, completes, refactors, debugs, and explains SQL and Python	Single coding tasks: write a query, fix a bug, add comments
Quick AI actions	Embedded one-click actions inside specific DataWorks modules	Visualize query results, create tables, generate publish descriptions, diagnose task failures

Prerequisites

Before you begin, ensure that you have:

A DataWorks account at Basic Edition or higher
Access to the Alibaba Cloud account owner, a tenant administrator, or a user with equivalent permissions, who can activate Copilot for your organization

Some features are only available in the new Data Studio.

Activate Copilot

Copilot requires a one-time activation per Alibaba Cloud account. Once activated, all users under the account can start using it.

Click the Copilot icon in the upper-right corner of the DataWorks interface.
Read the DataWorks Copilot Service Agreement.
Click Confirm Participation.

Copilot is free of charge during public preview. After the public preview ends, it will become a paid service. Pricing details will be announced later.

Open Copilot

Access Copilot from three entry points:

Global entry point: Click the Copilot icon in the upper-right corner of the DataWorks interface to open the Copilot Chat window.
In the editor: In the intelligent code editor for code-based nodes, right-click or use a keyboard shortcut to open Copilot.
Embedded in modules: Look for quick action buttons marked with the Copilot icon in specific product modules.

From the global entry point, Copilot shows predefined scenario cards for data synchronization, intelligent table discovery, Data Development, and data governance. Click a card to load a sample prompt for that scenario.

Agent: Automate complex tasks

The DataWorks Agent goes beyond Q&A. Using the reasoning and planning capabilities of a large language model (LLM), it interprets your goal, breaks it into steps, creates an execution plan, and calls the relevant tools in the DataWorks MCP Server (Model Context Protocol Server) to carry out the task automatically.

Tip: Use Agent mode for multi-step tasks — such as building an ETL workflow end to end, configuring quality rules across a set of tables, or diagnosing a failing task instance. For single-step tasks like generating a SQL snippet or explaining a function, Ask mode is faster.
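
The plan-then-execute behavior described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the tool names (`create_node`, `create_workflow`, `deploy`), the `Step` type, and the fixed plan are hypothetical, and the real tools exposed by the DataWorks MCP Server are not described in this document.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    tool: str        # name of a tool registered in TOOLS (hypothetical)
    argument: str    # a single string argument, to keep the sketch small


# Hypothetical stand-ins for tools; these are not real DataWorks MCP Server tools.
TOOLS: Dict[str, Callable[[str], str]] = {
    "create_node": lambda name: f"created node {name}",
    "create_workflow": lambda name: f"created workflow {name}",
    "deploy": lambda name: f"deployed {name}",
}


def make_plan(goal: str) -> List[Step]:
    """Stand-in for the LLM planning step: returns the same fixed plan for any goal."""
    return [
        Step("create_node", "filter_recent_orders"),
        Step("create_workflow", "orders_weekly"),
        Step("deploy", "orders_weekly"),
    ]


def run_agent(goal: str) -> List[str]:
    """Execute each planned step in order by calling the matching tool."""
    results = []
    for step in make_plan(goal):
        tool = TOOLS[step.tool]              # look up the tool by name
        results.append(tool(step.argument))  # call it and record the outcome
    return results
```

Calling `run_agent("Build a weekly orders ETL workflow")` returns one result string per step. In a real agent, the plan would come from the model and could change between runs, which is why Agent mode suits multi-step tasks while Ask mode suits single-step ones.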

Switch to Agent mode

In the Copilot Chat window, switch from Ask mode to Agent mode.
Type / and select the appropriate Agent type.
Enter your request to start the task.

Agent types

Data Studio Agent

Lets you develop ETL workflows in natural language, from requirements analysis and code generation through workflow creation and deployment.

Sample prompts to get started:

"Create an ETL workflow that reads from the ods_orders table, filters records from the last 7 days, and writes the results to dws_orders_weekly."
"Generate a scheduled workflow for daily incremental data loading into the sales summary table."

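To make the first sample prompt concrete, here is a minimal sketch of the filtering logic it asks for. It runs on SQLite from the Python standard library only so that it can be tried locally; the Agent would generate code for your own compute engine. The column names (`order_id`, `order_date`, `amount`) and the reading of "last 7 days" as seven calendar days ending today are assumptions.

```python
import sqlite3
from datetime import date, timedelta


def load_weekly_orders(conn: sqlite3.Connection, today: date) -> int:
    """Copy orders from the 7 days ending on `today` into dws_orders_weekly."""
    start = (today - timedelta(days=6)).isoformat()   # 7 calendar days, inclusive
    end = today.isoformat()
    conn.execute("DELETE FROM dws_orders_weekly")     # overwrite the previous result
    conn.execute(
        "INSERT INTO dws_orders_weekly (order_id, order_date, amount) "
        "SELECT order_id, order_date, amount FROM ods_orders "
        "WHERE order_date BETWEEN ? AND ?",
        (start, end),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM dws_orders_weekly").fetchone()[0]


def demo() -> int:
    """Build two in-memory tables with hypothetical columns and run the load once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ods_orders (order_id INTEGER, order_date TEXT, amount REAL)")
    conn.execute("CREATE TABLE dws_orders_weekly (order_id INTEGER, order_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO ods_orders VALUES (?, ?, ?)",
        [
            (1, "2024-09-18", 10.0),   # inside the window
            (2, "2024-09-12", 20.0),   # first day of the window
            (3, "2024-09-11", 30.0),   # one day too old
        ],
    )
    return load_weekly_orders(conn, date(2024, 9, 18))
```

Stating the window boundaries in the prompt (for example, "the 7 days up to and including yesterday") removes the ambiguity that this sketch resolves by assumption.
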
Data Integration Agent

Describe your data synchronization requirements in natural language. The Agent parses your intent and generates a full task configuration, including source and destination data source types, table schema mappings, field filtering conditions, partitioning strategies, and scheduling parameters.

Sample prompts to get started:

"Sync the user_profile table from MySQL to MaxCompute daily at 2 AM, filtering out test accounts."
"Create a full-load synchronization task from the PostgreSQL order database to OSS in Parquet format."

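The elements the Agent extracts from a request can be pictured as a small record. The sketch below fills one in for the first sample prompt. It is illustrative only: `SyncTaskSpec`, its field names, the column mapping, and the filter expression are all hypothetical and do not represent the DataWorks task configuration format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SyncTaskSpec:
    """Hypothetical summary of a sync task; not the DataWorks configuration format."""
    source_type: str
    source_table: str
    destination_type: str
    destination_table: str
    column_mapping: Dict[str, str] = field(default_factory=dict)
    row_filter: Optional[str] = None   # condition applied when reading the source
    partition: Optional[str] = None    # destination partition expression
    schedule: Optional[str] = None     # human-readable schedule, not a cron string


# Values below are assumed for illustration; only the table name, the data
# source types, and the 2 AM daily schedule come from the sample prompt.
spec = SyncTaskSpec(
    source_type="MySQL",
    source_table="user_profile",
    destination_type="MaxCompute",
    destination_table="user_profile",
    column_mapping={"id": "user_id", "name": "user_name"},
    row_filter="is_test = 0",
    partition="ds=<business date>",
    schedule="daily at 02:00",
)


def describe(task: SyncTaskSpec) -> str:
    """Render a one-line summary that can be checked against the original prompt."""
    return (
        f"{task.source_type}.{task.source_table} -> "
        f"{task.destination_type}.{task.destination_table} "
        f"({task.schedule}; filter: {task.row_filter})"
    )
```

Checking each field against your prompt before you run the generated task is a quick way to catch a misread filter or schedule.
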
Data Map Agent

Lets you find and understand data across large numbers of tables by searching Data Map in natural language.

Sample prompts to get started:

"Find the summary tables related to user activity." — Search by business intent without needing exact keywords.
"In the adm_bi project, find tables related to business operations." — Scope the search to a specific project.
"What are the direct downstream dependencies of the dws_bi_metric_di table? Which owners will be affected if it changes?" — Get data lineage and ownership details.

Data Governance Agent

Issue natural language commands that are converted into governance actions and executed automatically — from configuring quality rules to remediating identified issues.

Sample prompts to get started:

Configure quality rules:

"Automatically generate quality rules for the core user dimension table dim_user_info."
"For tables starting with ods_, automatically configure quality rules related to table row counts."

Remediate quality issues:

"Find frequently accessed tables that have no quality rules, then recommend and configure them."
"Help me resolve issues in the data quality dimension."

Data O&M Agent

Provides a comprehensive health assessment and issue diagnosis for task instances. The Agent analyzes dependency chains, resource levels, historical run trends, change impacts, log anomalies, and data quality, then generates a structured diagnostic report.

For more information, see AI-powered O&M.

AI coding assistant

The AI coding assistant handles SQL and Python coding tasks inside the intelligent code editor and Copilot Chat.

Switch between multiple models — the DataWorks default model, Qwen, and DeepSeek — to get the best results for your task.

Use it in the editor

Code completion — While developing a code-based node, Copilot predicts and suggests subsequent code based on the context (code already written, referenced table schemas, and more). Suggestions appear automatically. Press Tab to accept.

Right-click actions — Select code in the intelligent code editor, right-click, and choose Copilot from the context menu to access quick commands.

Use it in Copilot Chat (Ask mode)

Ask mode is the default mode for Copilot Chat. Select code in the editor to use it as context, then ask Copilot to act on it.

Supported tasks:

Task	How to trigger	Example prompt or action
Generate ETL scripts	Type your requirement	"Based on `dwd_ec_trd_create_ord_di`, calculate sales amount, volume, SKU count, buyer count, and seller count per SPU from September 1–18, 2024."
Refactor code	Describe the change	"Transpose the SQL results from columns to rows using unpivot."
Debug code	Select the code, use quick command	Click Perform Diagnostics or select the failed code and right-click
Explain code	Reference the code	"Explain this SQL."
Generate comments	Reference the code	"Add a comment for each field."
Code Q&A	Ask a question	"How do I write a mapjoin in MaxCompute?"
Optimize performance	Select the code, use quick command	Select the code, open the chat, and ask for optimization
Generate test cases	Reference the code	"Generate SQL test cases and explain the testing steps."
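
For readers unfamiliar with the "unpivot" request in the Refactor code row, the sketch below shows the transformation itself in plain Python: each value column of a wide row becomes its own output row. The sample data and column names are made up for illustration, and the SQL that Copilot generates for your engine will look different.

```python
from typing import Dict, List, Tuple


def unpivot(rows: List[Dict[str, object]], id_column: str,
            value_columns: List[str]) -> List[Tuple[object, str, object]]:
    """Turn each value column of each row into its own (id, column, value) row."""
    result = []
    for row in rows:
        for column in value_columns:
            result.append((row[id_column], column, row[column]))
    return result


# Hypothetical wide-format result: one row per SPU, one column per metric.
wide = [
    {"spu": "A", "sales_amount": 100, "buyer_count": 7},
    {"spu": "B", "sales_amount": 250, "buyer_count": 12},
]

# Long format: one row per (SPU, metric) pair.
long_rows = unpivot(wide, "spu", ["sales_amount", "buyer_count"])
```

Two wide rows with two metrics each yield four long rows, which is a useful sanity check when you review refactored SQL.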

Quick AI actions

DataWorks modules embed one-click AI actions for common operations.

Visualize query results

After a node runs or a SQL query completes, switch to the visualization tab in the results area to generate charts and data insights.

Create tables intelligently

In the Data Studio catalog, enter keywords for a table name to get AI-recommended field names and descriptions with a single click.

Generate publish descriptions

During the publishing process in Data Studio, generate the publish description with a single click.

Diagnose task failures

When a task fails in the Operation Center, click Perform Diagnostics. Copilot — integrated with LLMs including Qwen and DeepSeek-R1 (671B) — extracts key information from the logs, provides an error analysis and solution, and recommends quick actions.

To access: On the Operation Center page, go to Auto Triggered Node O&M > Auto Triggered Instances. Click a failed instance, select the failed node, and click Perform Diagnostics in the lower-right corner.

Recommend data quality rules

On the Data Quality page, go to Configure Rules > Configure By Table. Select the target table and click Create Monitor to open Copilot and generate quality rules for that table.

Create DataService Studio APIs

In the DataService Studio module, create a new API and select the code editor mode. Copilot generates a SQL script based on your requirements and automatically parses it into request and response parameters.

Improve accuracy with context

Copilot gives more accurate responses when it has the right context. Two mechanisms supply it: Rules for persistent knowledge, and context references for per-conversation inputs.

Custom knowledge (Rules)

Rules are guidelines, standards, and background knowledge you define once and apply to every Copilot interaction.

Configure Rules: In the upper-right corner of the Copilot Chat window, click the icon.

Rule type	Who configures it	Scope	Use it for
Enterprise-level Rules	Administrators	Organization-wide (configurable scope)	Company-wide business terminology, coding standards
Personal-level Rules	Individual users	Current user only	Personal preferences, frequently used code snippets

Specify context per conversation

In each conversation, add context to focus Copilot on the relevant data and provide more accurate results.

In the Copilot Chat input box, type @ or click + to open the context selector.

Context type	What Copilot can access
Table	Metadata from one or more tables
Node/Code file	The code within a specific node
Data collections	Data collections from Data Map
Rules	One or more Rules, applied only for this conversation
Local file	Documents you upload as background information

Manage conversations

Conversation history

Copilot automatically saves your recent conversations. View up to 100 records from the last 7 days.

To access: In the Copilot Chat window, click History in the upper-right corner.

Start a new chat per task

Start a new chat for each independent task. This keeps context from one task from leaking into another, so Copilot stays focused on the current task.

Availability

Attribute	Details
Eligible users	DataWorks Basic Edition or higher
Current stage	Public preview
Billing	Free during public preview; paid after public preview ends
Available regions	China (Zhangjiakou), China (Beijing), China (Ulanqab), China (Hangzhou), China (Shanghai), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Japan (Tokyo)

FAQ

Why are Copilot's responses inaccurate or not what I expected?

The most common cause is insufficient context. Add relevant tables, code files, or Rules using @ or + in the input box. The more precise the context, the more accurate the response. See Specify context per conversation.

What's the difference between Ask mode and Agent mode?

Ask mode handles single-step tasks: generate a snippet, explain a function, debug a block of code. Agent mode is for multi-step tasks that require planning, tool use, and autonomous execution across DataWorks — such as building a full ETL pipeline or remediating data quality issues across multiple tables. If your task has more than one step, switch to Agent mode.

How do I get Copilot to respond in English?

Add a clear instruction to your prompt, such as "Please respond in English" or "Explain in English". Switching the DataWorks interface language to English also improves the consistency and accuracy of English responses.