All Products
Search
Document Center

:Intelligent coding assistant

Last Updated:Oct 27, 2025

Through the experiments in this topic, you will experience the capabilities of DataWorks Copilot as a programming assistant, including code generation, code rewriting, code interpretation, code completion, one-click error correction, and interpretation of syntax and functions. You will also explore various clever agent capabilities, such as AI-aided table creation, intelligent chart generation, and description generation for node and function deployment.

Background information

DataWorks overview

DataWorks is an intelligent, lakehouse-based data development and governance platform that leverages the big data development methodology of Alibaba Group based on 15 years of big data experience. DataWorks is deeply compatible with dozens of big data and AI computing services provided by Alibaba Cloud, such as MaxCompute, E-MapReduce (EMR), Hologres, Realtime Compute for Apache Flink, and Platform for AI (PAI). DataWorks supports intelligent extract, transform, load (ETL) development, data analysis, and proactive data asset governance for data warehouses, data lakes, and the OpenLake lakehouse architecture to facilitate data management throughout the Data+AI lifecycle. Since 2009, DataWorks has continuously productized and refined the Alibaba data system to serve various industries such as public service sectors, finance, retail, Internet, automobile, and manufacturing. DataWorks has earned the trust of tens of thousands of customers who choose DataWorks for digital transformation and value creation.

DataWorks Copilot is your intelligent assistant. You can choose to use the default DataWorks model, Qwen3-235B-A22B, DeepSeek-R1-0528, or Qwen3-Coder to perform Copilot operations. With the advanced reasoning power of DeepSeek-R1, DataWorks Copilot enables you to perform complex tasks, such as SQL code generation, optimization, and testing, based on natural language interactions. This greatly enhances the efficiency of ETL development and data analysis.

Precautions

  • The regions and DataWorks editions in which the DataWorks Copilot public preview is available are limited. For more information, see the Public preview description section in the "DataWorks Copilot" topic.

  • DataWorks Copilot supports SQL and Python. SQL is used in the experiments.

  • The features of DataWorks Copilot do not vary based on compute engine types. DataWorks Copilot supports all code-type nodes. In the experiments, MaxCompute nodes are used.

Environment preparation

  • DataWorks is activated. For more information, see Activate DataWorks.

  • A DataWorks workspace is created. For more information, see Create a workspace.

  • Your account meets the public preview requirements of DataWorks Copilot and has joined the public preview. For more information, see DataWorks Copilot Ask mode.

Experiment procedure

Note

The experiments in the following sections are not in any particular order. You can try them out as you choose.

NL2Code

DataWorks Copilot returns SQL code based on natural languages.

  1. In the DATA STUDIO pane, click the image icon to the right of Workspace Directories and choose Create Node > MaxCompute > MaxCompute SQL.

    image

  2. In the popover that appears, enter a node name and press the Enter key.

    Preset node name: ads_ec_ec360_gmv_kpi_overview

    image

  3. In the code editor, click to ask DataWorks Copilot to do something and open the Copilot Chat interface. In the interface, enter a command, click Send, and wait for Copilot to return the result.

    Preset instruction: Collect statistics on the following information about each Standard Product Unit (SPU) based on the dwd_ec_trd_create_ord_di table in the time period from September 1 to 18, 2024: sales revenue, sales volume, number of Stock Keeping Units (SKUs), number of buyers, and number of sellers.

image

image

Intelligent code completion

DataWorks Copilot can complete your code.

In the DATA STUDIO pane, create a MaxCompute node. In the code editor, click Copilot to open the Copilot Chat interface. Enter any code snippet that contains a table name, press Space or Enter, and wait for Copilot to return the result.

Preset code:

-- Conduct an analysis of e-commerce website sales orders in September 2024.
SELECT  ds
        ,spu_id
        ,SUM(sales_amt) AS total_sales
        ,COUNT(DISTINCT order_id) AS total_orders
        ,COUNT(DISTINCT sku_id) AS total_skus
        ,COUNT(DISTINCT buyer_id) AS total_buyers
FROM    default.dwd_ec_trd_create_ord_di
WHERE   order_date BETWEEN '2024-09-01' AND '2024-09-18'
;

image

Intelligent Q&A

In DataWorks Copilot, you can perform the following operations by using natural languages, such as correcting code errors, generating code comments, and interpreting code. DataWorks Copilot can also answer questions about SQL syntax and functions.

SQL rewriting

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, enter and select a code snippet, and click Copilot in the upper-right corner of the Data Studio page. In the DataWorks Copilot chat window, enter your rewrite request and click Send. Then wait for DataWorks Copilot to return the result.

  • Preset instruction: Modify the SQL statement to transform the query results from columns to rows by using the UNPIVOT operation.

  • Preset code:

    -- Conduct an analysis of e-commerce website sales orders in September 2024.
    SELECT  ds
            ,spu_id
            ,SUM(sales_amt) AS total_sales
            ,COUNT(DISTINCT order_id) AS total_orders
            ,COUNT(DISTINCT sku_id) AS total_skus
            ,COUNT(DISTINCT buyer_id) AS total_buyers
            ,COUNT(DISTINCT buyer_id) / COUNT(DISTINCT order_id) AS avg_buyers_per_order
            ,COUNT(DISTINCT buyer_id) / COUNT(DISTINCT sku_id) AS avg_buyers_per_sku
            ,SUM(sales_amt) / COUNT(DISTINCT order_id) AS avg_sales_per_order
            ,SUM(sales_amt) / COUNT(DISTINCT sku_id) AS avg_sales_per_sku
    FROM    default.dwd_ec_trd_create_ord_di
    WHERE   order_date BETWEEN '2024-09-01' AND '2024-09-18'
    GROUP BY ds
    ,spu_id
    ORDER BY total_sales DESC
    LIMIT   10
    ;

image

SQL error correction

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, select and right-click a code snippet and choose Copilot > SQL Rectify. In the DataWorks Copilot chat popover, click Send, and wait for DataWorks Copilot to return the result.

Preset code:

-- Conduct an analysis of e-commerce website sales orders in September 2024.
SELECT  ds
        ,spu_id
        ,SUM(sales_amt) AS total_sales
        ,COUNT(DISTINCT order_id) AS total_orders
        ,COUNT(DISTINCT sku_id) AS total_skus
        COUNT(DISTINCT buyer_id) AS total_buyers
        COUNT(DISTINCT buyer_id) / COUNT(DISTINCT order_id) AS avg_buyers_per_order
        ,COUNT(DISTINCT buyer_id) / COUNT(DISTINCT sku_id)  AS avg_buyers_per_sku
        ,SUM(sales_amt) / COUNT(DISTINCT order_id) AS avg_sales_per_order
        ,SUM(sales_amt) / COUNT(DISTINCT sku_id) AS avg_sales_per_sku
FROM    default.dwd_ec_trd_create_ord_di
WHERE   order_date BETWEEN '2024-09-01' AND '2024-09-18'
ORDER BY total_sales DESC
LIMIT   10
;

image

SQL interpretation

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, select a code snippet, and click Copilot in the upper-right corner of the Data Studio page. In the DataWorks Copilot chat window, enter your instruction, and click Send. Then wait for DataWorks Copilot to return the result.

  • Preset instruction: Interpret the SQL statement.

  • Preset code:

    SELECT  season
            ,SUM(tran_amt) AS total
    FROM    mf_cop_sales
    PIVOT (SUM(tran_amt) FOR season IN ('Q1' AS spring,'Q2' AS summer,'Q3' AS autumn,'Q4' AS winter))AS pivot_tablel
    ;

image

Comment generation

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, select and right-click a code snippet and choose Copilot > SQL Comments. In the DataWorks Copilot chat popover, enter your instruction, click Send, and then wait for DataWorks Copilot to return the result.

  • Preset instruction: Add a comment for each field.

  • Preset code:

    CREATE TABLE ods_mbr_user_info
    (
      id                BIGINT
      ,gmt_create       STRING
      ,gmt_modified     STRING
      ,id_card_number   STRING
      ,id_card_type     STRING
      ,is_delete        STRING
      ,nick             STRING
      ,reg_address      STRING
      ,reg_birthdate    STRING
      ,reg_city_id      STRING
      ,reg_email        STRING
      ,reg_fullname     STRING
      ,reg_gender       STRING
      ,reg_mobile_phone STRING
      ,reg_nation_id    STRING
      ,reg_prov_id      STRING
      ,user_active_time STRING
      ,user_active_type STRING
      ,user_id          BIGINT
      ,user_regdate     STRING
      ,user_regip       STRING
      ,vip_level        STRING
    )
    COMMENT '';

image

Q&A about SQL syntax and functions

In the DATA STUDIO pane, click Copilot in the upper-right corner of the Data Studio page. In the DataWorks Copilot chat window, enter your instruction and click Send. Then wait for DataWorks Copilot to return the result.

  • Preset instruction 1: How to use the Common Table Expression (CTE) syntax in MaxCompute?

  • Preset instruction 2: How to use MapJoin in MaxCompute?

  • Preset instruction 3: How to use the MaxCompute PIVOT function?

  • Preset instruction 4: A table contains an auto-incrementing ID field whose values are not continuous. How to obtain the range of IDs by page in the table?

  • Preset instruction 5: I have tables named t1(name) and t2(name). Write an SQL statement to achieve the following purpose: If no query result is returned for t1.name, use the query result of t2.name.

image

AI Agent

DataWorks services, such as Data Studio, DataAnalysis, Data Map, and DataService Studio, let you use DataWorks Copilot to conveniently perform certain operations, making the services easier to use. For example, you can use the table creation assistant of DataWorks Copilot to create a table by entering the table name keyword. You can use the analysis assistant of DataWorks Copilot to view charts about data query results and obtain the data insights with a few clicks.

Intelligent table creation

In the left-side navigation pane of the Data Studio page, click Data Catalog. In the DATA CATALOG pane, select a data catalog and create a table in the catalog. In the top toolbar of the page that appears, click Use Copilot to Create Table. Enter your instruction in the displayed field, click Send, and then wait for DataWorks Copilot to return the result.

Preset instruction: Member registration information table

image

Generation of data charts and data insights

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, select an executable code snippet and click Run in the top toolbar. After the SQL statements are successfully run and the query results are generated, click the chart icon on the RESULT tab, which appears at the page bottom. In the area that appears on the right, click Copilot and wait for DataWorks Copilot to return the result.

Preset code:

-- Read sample data.
-- Open the MaxCompute three-layer model.
SET odps.namespace.schema = true
;

-- Make sure that the new implementation for data of the DECIMAL data type is used.
SET odps.sql.decimal.odps2 = true
;

SELECT  gross_regional_product -- Gross Regional Product (GRP) (CNY 100 million).
        ,value_added_of_the_primary_industry -- Value added of the primary industry (CNY 100 million). This refers to the market value of the final output of all resident units from primary-industry production activities in a country (or a region) during a period of time. The primary industry includes agriculture, forestry, animal husbandry, and fishery. 
        ,value_added_of_the_secondary_industry -- Value added of the second industry (CNY 100 million). This refers to the market value of the final output of all resident units from second-industry production activities in a country (or a region) during a period of time. The second industry includes mining, manufacturing, production and supply sector of electricity, gas, and water, and construction. 
        ,value_added_of_the_tertiary_industry -- Value added of the third industry (CNY 100 million). This refers to the market value of the final output of all resident units from third-industry production activities in a country (or a region) during a period of time. The third industry includes all sectors except the primary and second industries. 
        ,value_added_of_agriculture_forestry_animal_husbandry_and_fishery -- Value added of agriculture, forestry, animal husbandry, and fishery (CNY 100 million).
        ,value_added_of_industry -- Value added of industry (CNY 100 million). This refers to the final output of industrial enterprises from industrial production activities during the reporting period, expressed in monetary terms. 
        ,value_added_of_construction -- Value added of the construction industry (CNY 100 million). This refers to the final output of construction enterprises from production activities of the construction industry during the reporting period, expressed in monetary terms. 
        ,value_added_of_wholesale_and_retail_trades -- Value added of the wholesale and retail industries (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of the wholesale and retail industries in a country (or a region) during a period of time. 
        ,value_added_of_transport_storage_and_post -- Value added of the transportation, warehousing, and post industries (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of the transportation, warehousing, and post industries in a country (or a region) during a period of time. 
        ,value_added_of_hotels_and_catering_services -- Value added of the accommodation and catering industries (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of the accommodation and catering industries in a country (or a region) during a period of time. 
        ,value_added_of_financial_intermediation -- Value added of the financial industry (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of the financial industry in a country (or a region) during a period of time. 
        ,value_added_of_real_estate -- Value added of the real estate industry (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of the real estate industry in a country (or a region) during a period of time. 
        ,value_added_of_others -- Value added of other industries (CNY 100 million). This refers to the market value of the final output of all resident units from production activities of other industries in a country (or a region) during a period of time. 
        ,per_capital_gross_regional_product -- GRP per capita (yuan/person). GRP per capita refers to the ratio of the absolute value of GRP to the average population of that year. It measures the economic contribution or value created by each resident in a country or region. 
        ,region -- the region. The enumeration includes the 31 provinces, municipalities, and autonomous regions in the Chinese mainland, such as Beijing, Zhejiang Province, and Inner Mongolia Autonomous Region. 
        ,year -- the year. The enumeration includes years from 2003 to 2022.
FROM    bigdata_public_dataset.life_service.mainland_gross_regional_product
WHERE   year IS NOT NULL
LIMIT   100
;

image

Description generation

Description generation for node deployment

In the DATA STUDIO pane, create a MaxCompute node. In the code editor of the MaxCompute node, select an executable code snippet and click Properties in the right-side navigation pane. On the tab that appears, configure the parameters and click Deploy in the top toolbar. On the DEPLOY tab, which appears at the page bottom, click the DataWorks Copilot icon in the lower-right corner of the input box in Step 3: Deploy to Production Environment and wait for DataWorks Copilot to return the result.

Preset node name: ads_ec_ec360_gmv_kpi_overview

image

Description generation for functions

In the left-side navigation pane of the Data Studio page, click Resource Management. In the RESOURCE MANAGEMENT pane, find an existing MaxCompute function, or create one. On the configuration tab of the MaxCompute function, click the DataWorks Copilot icon in the lower-right corner of the Description field and wait for DataWorks Copilot to return the result.

Preset function name: udfjaccardsimilarity

image