DataWorks extensions let you intercept and respond to user actions — for example, blocking task publishing during a change freeze, enforcing SQL conventions, or triggering approval workflows. This guide walks through building an extension using a self-hosted service and registering it in DataWorks.
How it works
Extensions use an event-driven architecture. When a user triggers an extension point event in DataWorks (such as submitting a task for publishing), DataWorks publishes an event message to an EventBridge event bus. Your self-hosted service receives the message, runs its processing logic, and calls a DataWorks API with the result. Based on that result, DataWorks either allows or blocks the operation.
This push-based model means your service only needs to handle messages as they arrive — no polling required.
After an extension point event is triggered, the associated process enters the Checking state and waits for the callback result. DataWorks then decides whether to block the process based on that result.
Prerequisites
Before you begin, ensure that you have:
Enabled message subscription — DataWorks open event messages must be configured to send to an EventBridge event bus, and that event bus must route messages to your service
A DataWorks Enterprise Edition subscription
Sufficient permissions: only the Open Platform administrator, tenant administrator, Alibaba Cloud accounts, and RAM users with the AliyunDataWorksFullAccess policy attached have read and write permissions on the Developer Backend. For more information, see Global module permission control and Manage product-level and console access with RAM policies.
Limitations
The Extensions module is available only to DataWorks Enterprise Edition users. If your subscription expires, all extensions become invalid and can no longer trigger event checks. Any checks that have been triggered but have not reached a final state automatically pass.
Available regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Zhangjiakou), China (Shenzhen), China (Chengdu), US (Silicon Valley), US (Virginia), Germany (Frankfurt), Japan (Tokyo), China (Hong Kong), and Singapore.
Usage notes
Multiple extensions can be associated with the same extension point event, so a single event can trigger multiple extensions simultaneously.
Extension point events are either tenant-level or workspace-level. When you register an extension, you can select extension points of only one level.
When a check is triggered for a composite node that contains inner nodes — such as a machine learning (PAI) node, a do-while node, or a for-each node — all inner nodes must pass the check before subsequent operations can proceed.
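The composite-node rule can be pictured as a simple aggregation over the inner nodes' results. The sketch below is illustrative only: the class and method names are hypothetical and not part of any DataWorks SDK, and it assumes that WARN counts as a pass, in line with its meaning of "passed with warning".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of how a composite node's check could be aggregated.
// Names are illustrative and are not part of any DataWorks SDK.
class CompositeNodeCheck {

    enum CheckResult { OK, WARN, FAIL }

    // A composite node passes only if none of its inner nodes failed.
    // Assumption: WARN is treated as a pass.
    static boolean allInnerNodesPass(List<CheckResult> innerResults) {
        return innerResults.stream().noneMatch(r -> r == CheckResult.FAIL);
    }

    public static void main(String[] args) {
        List<CheckResult> results =
                Arrays.asList(CheckResult.OK, CheckResult.WARN, CheckResult.FAIL);
        // A single FAIL among the inner nodes blocks the composite node.
        System.out.println(allInnerNodesPass(results));
    }
}
```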
Step 1: Configure extension dependencies
Clone the DataWorks Open Platform sample code from GitHub, or see Appendix: DataWorks Open Platform sample code library to initialize the project.
Add the following dependencies to your pom.xml file.
DataWorks dependency
<dependency>
<groupId>com.aliyun</groupId>
<artifactId>dataworks_public20200518</artifactId>
<version>5.6.0</version>
</dependency>
Packaging configuration
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
EventBridge supports multiple endpoint types. Add additional dependencies based on the event target you configured in EventBridge.
Step 2: Develop the extension code
Your service receives event messages pushed by EventBridge and calls a DataWorks API to return the processing result. Development has three parts: parsing the message, writing the processing logic, and calling the callback API.
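As a rough illustration of the receiving side, the sketch below assumes the EventBridge event target is an HTTP endpoint that delivers each event as a POST request with the JSON message as the body. It uses the JDK's built-in com.sun.net.httpserver package. The class name, the /dataworks/events path, and the handleEvent placeholder are hypothetical, and other event target types would need a different receiver.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Sketch of a receiver, assuming EventBridge pushes events as HTTP POST requests.
class ExtensionReceiver {

    // Returns the HTTP status for a request: only POST with a non-empty body is accepted.
    static int statusFor(String method, String body) {
        if (!"POST".equalsIgnoreCase(method)) {
            return 405;
        }
        return (body == null || body.trim().isEmpty()) ? 400 : 200;
    }

    // Placeholder for the three parts: parse the message, process it, call back.
    static void handleEvent(String body) {
        System.out.println("Received event: " + body);
    }

    // Starts the server on the given port; the context path is a hypothetical choice.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/dataworks/events", ExtensionReceiver::handle);
        server.start();
        return server;
    }

    private static void handle(HttpExchange exchange) throws IOException {
        String body;
        try (InputStream in = exchange.getRequestBody()) {
            body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        int status = statusFor(exchange.getRequestMethod(), body);
        if (status == 200) {
            handleEvent(body);
        }
        exchange.sendResponseHeaders(status, -1); // -1 means no response body
        exchange.close();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java ExtensionReceiver <port>");
            return;
        }
        start(Integer.parseInt(args[0]));
    }
}
```

Keeping the accept/reject decision in statusFor, separate from the server wiring, makes that part easy to unit test without opening a socket.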
Parse the message
DataWorks event messages use the following format. The data field contains the DataWorks-specific payload; EventBridge adds the outer envelope.
{
"datacontenttype": "application/json;charset=utf-8",
"aliyunaccountid": "1111",
"aliyunpublishtime": "2024-07-10T07:25:34.915Z",
"data": {
"tenantId": 28378****10656,
"eventCode": "xxxx"
},
"aliyunoriginalaccountid": "11111",
"specversion": "1.0",
"aliyuneventbusname": "default",
"id": "45ef4dewdwe1-7c35-447a-bd93-fab****",
"source": "acs.dataworks",
"time": "2024-07-10T15:25:34.897Z",
"aliyunregionid": "cn-shanghai",
"type": "dataworks:ResourcesUpload:UploadDataToTable"
}
Key fields:
| Field | Description |
|---|---|
| datacontenttype | The format of the content in the data parameter. Supports only the application/json format. |
| data.eventCode | Identifies the event type; use this to route messages to the correct handler |
| id | The event ID (also referred to as MessageId); use this to retrieve event details via API |
| data.tenantId | The tenant ID; find this in the user information section in the upper-right corner of DataWorks Data Studio |
| source | Always acs.dataworks for DataWorks events |
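The following sketch shows one way to pull the key fields out of a message. To stay dependency-free it uses a regular expression rather than a JSON library, which is a simplification: it does not handle escaped quotes inside values and cannot tell apart identical key names at different nesting levels. The class and method names are hypothetical; a production service would more likely use a proper JSON parser.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: extracts string fields from the event JSON with a regular expression.
// A real service would normally use a JSON library instead.
class EnvelopeFields {

    // Finds the first string value for an exact key name, such as "id" or "eventCode".
    static Optional<String> stringField(String json, String key) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Abbreviated version of the sample message above.
        String json = "{\"data\":{\"eventCode\":\"xxxx\"},"
                + "\"id\":\"45ef4dewdwe1-7c35-447a-bd93-fab****\","
                + "\"source\":\"acs.dataworks\"}";
        System.out.println(stringField(json, "id").orElse("<missing>"));
        System.out.println(stringField(json, "eventCode").orElse("<missing>"));
    }
}
```

Because the pattern requires the key to be enclosed in quotes, looking up id does not accidentally match longer keys such as aliyunaccountid.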
The contents of the data field vary by event type. For the full schema of each event message, see Development Reference: Event list and message format.
Write the processing logic
Process incoming messages according to your use case. Two built-in capabilities are available:
Disable the extension for specific workspaces: Use the extension.project.disabled parameter. For details, see Configure extension parameters.
Retrieve a data snapshot for Data Studio events: Call the GetIDEEventDetail API with the MessageId (the id field) to get a snapshot of the state at the time the event was triggered. This is only available for Data Studio extension points.
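One common way to organize processing logic is to route each message to a handler keyed by data.eventCode. The sketch below illustrates that pattern under stated assumptions: the router class, the event code, and the SQL rule are all made up for illustration, and treating unhandled events as passed is a design choice of this sketch rather than documented DataWorks behavior.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of routing by data.eventCode. Event codes and rules are made up for illustration.
class EventRouter {

    enum CheckResult { OK, WARN, FAIL }

    private final Map<String, Function<String, CheckResult>> handlers = new HashMap<>();

    // Associates an event code with the logic that checks its payload.
    void register(String eventCode, Function<String, CheckResult> handler) {
        handlers.put(eventCode, handler);
    }

    // Events with no registered handler pass by default (an assumption of this sketch).
    CheckResult route(String eventCode, String payload) {
        Function<String, CheckResult> handler = handlers.get(eventCode);
        return handler == null ? CheckResult.OK : handler.apply(payload);
    }

    public static void main(String[] args) {
        EventRouter router = new EventRouter();
        // "hypothetical-commit-event" is a placeholder, not a real DataWorks event code.
        router.register("hypothetical-commit-event",
                payload -> payload.contains("SELECT *") ? CheckResult.FAIL : CheckResult.OK);
        System.out.println(router.route("hypothetical-commit-event", "SELECT * FROM t"));
        System.out.println(router.route("some-other-event", "anything"));
    }
}
```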
Call the callback API
After processing, call the appropriate DataWorks API based on which module triggered the event:
| Module | API |
|---|---|
| Data Studio | UpdateIDEEventResult |
| Operation Center | UpdateWorkbenchEventResult |
| All other modules | CallbackExtension |
Each callback includes three parameters:
| Parameter | Description |
|---|---|
| ExtensionCode | The unique code for your extension; available from the extension list page after registration |
| MessageId | The id field from the event message |
| CheckResult | The processing result: OK (passed), FAIL (failed; handle promptly to avoid blocking downstream execution), or WARN (passed with warning) |
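The module-to-API mapping and the three callback parameters can be captured in a small helper. This is a sketch, not the SDK's API: the CallbackSelector class and its enums are hypothetical, and only the API names and parameter names are taken from the tables above. The actual call through the DataWorks SDK client is deliberately left out.

```java
// Sketch of choosing the callback API by module. The class and enums are hypothetical;
// only the three API names and the parameter names come from the documentation tables.
class CallbackSelector {

    enum Module { DATA_STUDIO, OPERATION_CENTER, OTHER }

    enum CheckResult { OK, WARN, FAIL }

    // Maps the module that triggered the event to the callback API name.
    static String apiFor(Module module) {
        switch (module) {
            case DATA_STUDIO:
                return "UpdateIDEEventResult";
            case OPERATION_CENTER:
                return "UpdateWorkbenchEventResult";
            default:
                return "CallbackExtension";
        }
    }

    // Renders the intended call for logging; a real service would pass these
    // values to the SDK client instead of building a string.
    static String describe(Module module, String extensionCode,
                           String messageId, CheckResult result) {
        return apiFor(module) + "(ExtensionCode=" + extensionCode
                + ", MessageId=" + messageId
                + ", CheckResult=" + result + ")";
    }

    public static void main(String[] args) {
        // The extension code and message ID below are placeholders.
        System.out.println(describe(Module.DATA_STUDIO,
                "hypothetical-extension-code", "hypothetical-message-id", CheckResult.WARN));
    }
}
```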
Step 3: Deploy the extension
Package the code into a runnable .jar file, then deploy it as an application service on Alibaba Cloud ECS or another provider of your choice.
Step 4: Register the extension in DataWorks
After the service is deployed, register the extension in DataWorks to get its Extension Code — required for the callback in Step 2.
Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose More > Open Platform. On the page that appears, click Go to Open Platform. The Developer Backend tab appears.
In the left navigation pane, click Extensions.
Click Extension List > Register Extension. Select Deploy with a self-hosted service and fill in the registration form.
| Parameter | Description |
|---|---|
| Extension Name | A custom name for the extension, used for identification. |
| Extension Points to Process | The extension point events this extension handles. After you select extension points, the Event and Applicable Module fields are populated automatically. For a list of supported extension points, see List of supported extension point events. You can only select extension points of a single level (tenant-level or workspace-level) per extension. Note: Extensions deployed using Function Compute currently support only the Pre-Data-Download Event. |
| Owner | The owner of the extension; users can contact this person if they encounter issues. |
| Test Workspace | A workspace where the extension takes effect before publishing, for end-to-end testing. Trigger events in this workspace to verify that DataWorks sends messages via EventBridge and that your service receives, processes, and returns callbacks correctly. Not required if you selected tenant-level extension points. |
| Extension Details URL | URL of a page describing the extension. When the extension is triggered, users can visit this page to see the check path and the reason for any block. |
| Extension Document URL | URL of the help document for the extension; helps users understand the validation logic and properties. |
| Extension Parameter Settings | Parameters used in the extension code, in key=value format (one per line). For example, use the built-in extension.project.disabled parameter to disable the extension for a specific workspace. See Configure extension parameters. |
| Extension Option Settings | Configuration items for extension users, defined as a JSON string. These let users control extension behavior per workspace. See Define options for an extension. |
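The key=value format used by Extension Parameter Settings is straightforward to parse once the text is available to your code. The sketch below assumes the parameters arrive as a single multi-line string; how your service obtains that string is not covered here, and the parameter name hypothetical.freeze.enabled is invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of parsing "key=value" parameters, one per line.
// The parameter names used here are hypothetical examples.
class ExtensionParameters {

    // Blank lines and lines without a key before "=" are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                params.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return params;
    }

    // Reads a boolean flag, falling back to false when the key is absent.
    static boolean flag(Map<String, String> params, String key) {
        return Boolean.parseBoolean(params.getOrDefault(key, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> params =
                parse("hypothetical.freeze.enabled=true\nhypothetical.owner=data-team");
        System.out.println(flag(params, "hypothetical.freeze.enabled"));
        System.out.println(params.get("hypothetical.owner"));
    }
}
```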
Click OK to complete registration. The extension appears in the Extension List, and its Extension Code is now available for use in the callback API.
Step 5: Publish the extension
After testing and validating the extension in the test workspace, complete the publishing process. Once published, administrators (other than the extension owner) can enable the extension in the Management Center. For details, see Apply extensions.
Extension examples
The following topics provide end-to-end examples for common governance use cases:
What's next
For the message format of all supported events, see Development Reference: Event list and message format.
For an overview of OpenEvent and message subscription, see OpenEvent overview.
For the full list of extension points that support extensions, see Extensions overview.
To deploy an extension using Function Compute instead of a self-hosted service, see Develop and deploy extensions using the Function Compute method.