DataWorks extensions allow you to define custom logic to monitor user actions. For example, you can use extensions to intercept and block inappropriate actions, send notifications, or manage processes for specific events. This topic describes how to develop and deploy an extension using a self-hosted service.
Background information
Prerequisites
You must enable message subscription. The deployment of extensions using a self-hosted service relies on the message distribution capabilities of EventBridge. You must ensure that DataWorks open event messages are configured to be sent to an event bus in EventBridge. You must also ensure that messages from this event bus are routed to a service program that is deployed on-premises or in the cloud.
Limits
- Only users of DataWorks Enterprise Edition can use the Extensions module.
- The Extensions module is available in the following regions: China (Beijing), China (Hangzhou), China (Shanghai), China (Zhangjiakou), China (Shenzhen), China (Chengdu), US (Silicon Valley), US (Virginia), Germany (Frankfurt), Japan (Tokyo), China (Hong Kong), and Singapore.
Notes
- Only the Open Platform administrator, tenant administrator, Alibaba Cloud accounts, and RAM users to which the AliyunDataWorksFullAccess policy is attached have read and write permissions on the developer backend. For more information about permission management, see Global module permission control and Manage product-level and console access with RAM policies.
- Version limit: If your DataWorks Enterprise Edition subscription expires, all extensions become invalid and can no longer trigger event checks. Any checks that have been triggered but have not reached their final state will automatically pass.
- Node limit: When a check is triggered for a composite node that contains inner nodes, such as a machine learning (PAI) node, a do-while node, or a for-each node, all inner nodes must pass the check before subsequent operations can proceed.
- Trigger description: Multiple extensions can be associated with the same extension point event. This means a single event can trigger multiple extensions.
Processing flow
The following describes the basic flow in which an extension deployed on a self-hosted service consumes messages from EventBridge:
After an extension point event is triggered, the associated process enters the Checking state and waits for a result from the extension's callback API. DataWorks then decides whether to block the process based on the result.
User side
Before you deploy an extension in DataWorks using a self-hosted service, you must develop the extension and deploy it in the cloud or on-premises. To initialize the project code, see Appendix: DataWorks Open Platform sample code library and obtain the Open Platform sample code from GitHub. How you develop and deploy the extension depends on the type of service to which messages from the event bus are ultimately routed.
Step 1: Configure extension dependencies
When you develop an extension, add the following dependencies to the pom.xml file. EventBridge supports various types of endpoints to process and consume events. In addition to the following dependencies, you can configure other dependencies based on the event target set in EventBridge and the final message routing.
DataWorks dependency library
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>dataworks_public20200518</artifactId>
    <version>5.6.0</version>
</dependency>
Packaging dependency configuration
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.1</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <filters>
                            <filter>
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Step 2: Develop the extension code
Messages routed by the event bus in EventBridge are pushed to a service that is deployed on-premises or in the cloud. This service receives the DataWorks messages that are pushed by the event bus and uses a specific DataWorks API to send a callback with the processed result.
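As a rough illustration of the receiving side, the following sketch uses the JDK's built-in `com.sun.net.httpserver.HttpServer` to accept pushed messages. It assumes that the event bus routes messages to the service as HTTP POST requests, which this topic does not state explicitly. The class name `EventReceiver`, the path `/dataworks/events`, and the `received` acknowledgement body are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

public class EventReceiver {

    /**
     * Starts an HTTP server that passes every POSTed body to onMessage.
     * Passing port 0 lets the operating system pick a free port.
     * The path "/dataworks/events" is a hypothetical choice.
     */
    public static HttpServer start(int port, Consumer<String> onMessage) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/dataworks/events", exchange -> {
            try {
                if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) {
                    // -1 means the response carries no body.
                    exchange.sendResponseHeaders(405, -1);
                    return;
                }
                String body;
                try (InputStream in = exchange.getRequestBody()) {
                    body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
                // Hand the raw message to the extension logic.
                onMessage.accept(body);
                byte[] ack = "received".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, ack.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(ack);
                }
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }
}
```

A service could call `EventReceiver.start(8080, body -> handle(body))`, where `handle` is your own method. The HTTP response here only acknowledges receipt. The check result itself is reported to DataWorks separately through the callback API.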
Develop the program code.
Parse the message content
For the format of event messages that are pushed by DataWorks, see Appendix: Format of messages sent from DataWorks to EventBridge. In the message format, `data` contains the specific message content. During development, you can use the `data.eventCode` field to identify the message type and the `id` field to retrieve message details.
Note: OpenEvent uses EventBridge to distribute DataWorks event messages. Before you develop an extension, you must subscribe to DataWorks messages in EventBridge. For more information, see Enable message subscription.
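The field lookups described above can be sketched as follows. The sketch assumes that the JSON body has already been deserialized into nested `Map` objects by a JSON library of your choice (for example, Jackson's `ObjectMapper.readValue(body, Map.class)`). The class name `EventMessageFields` and the sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class EventMessageFields {

    /** Returns the top-level id field, which identifies the message. */
    public static String messageId(Map<String, Object> message) {
        Object id = message.get("id");
        if (!(id instanceof String)) {
            throw new IllegalArgumentException("message has no string 'id' field");
        }
        return (String) id;
    }

    /** Returns data.eventCode, which identifies the message type. */
    public static String eventCode(Map<String, Object> message) {
        Object data = message.get("data");
        if (!(data instanceof Map)) {
            throw new IllegalArgumentException("message has no 'data' object");
        }
        Object code = ((Map<?, ?>) data).get("eventCode");
        if (!(code instanceof String)) {
            throw new IllegalArgumentException("data has no string 'eventCode' field");
        }
        return (String) code;
    }

    public static void main(String[] args) {
        // Build a message shaped like the appendix sample (values are placeholders).
        Map<String, Object> data = new HashMap<>();
        data.put("eventCode", "xxxx");
        Map<String, Object> message = new HashMap<>();
        message.put("id", "example-message-id");
        message.put("data", data);

        System.out.println("eventCode = " + eventCode(message));
        System.out.println("messageId = " + messageId(message));
    }
}
```

Failing fast on a missing field keeps malformed messages from reaching the processing logic with null values.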
Write the processing logic
Process the messages that are pushed by the event bus as needed. During extension development, you can use the following methods to improve development efficiency and application performance.
- Use Advanced feature: Configure extension parameters, such as `extension.project.disabled`, to disable the extension for a specific workspace.
- When you process extension points that are related to the Data Studio module, you can call the GetIDEEventDetail API to retrieve a data snapshot from the time when the extension point event was triggered based on the `MessageId`.
Note: `MessageId` corresponds to the `id` field in the message. For more information, see Appendix: Format of messages sent from DataWorks to EventBridge.
Return the processing result to DataWorks
The extension service must return the processing result for the extension point to DataWorks through an OpenAPI. When you make the callback, you must select the appropriate OpenAPI based on the module where the extension point event occurred.
Extension point events in Data Studio: Use the UpdateIDEEventResult API to send the callback that contains the processing result.
Extension point events in Operation Center: Use the UpdateWorkbenchEventResult API to send the callback that contains the processing result.
Extension point events in other modules: Use the CallbackExtension API to send the callback that contains the processing result.
The callback request passes the extension code (ExtensionCode), the event message ID (MessageId), and the message processing result (CheckResult) of the current service to DataWorks.
CheckResult values:
- OK: The extension passed the check for this extension point event.
- FAIL: The extension failed the check for this extension point event. You must view and handle the error in a timely manner to prevent impacts on subsequent program execution.
- WARN: The extension passed the check for this extension point event, but with a warning.
ExtensionCode: You can obtain this code from the extension list page in DataWorks after you register the extension as described in the following sections.
MessageId: Corresponds to the `id` field in the message. For more information, see Appendix: Format of messages sent from DataWorks to EventBridge.
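The choice of callback API and the three callback parameters can be sketched as follows. This is not SDK code: `CallbackPlan`, `SourceModule`, `callbackApiFor`, and `callbackParams` are hypothetical names, and how a service determines the originating module for a given event is an assumption left to the caller. Only the API names, parameter names, and CheckResult values come from this topic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CallbackPlan {

    /** Hypothetical grouping of the modules named in this topic. */
    public enum SourceModule { DATA_STUDIO, OPERATION_CENTER, OTHER }

    /** The three CheckResult values listed in this topic. */
    public enum CheckResult { OK, FAIL, WARN }

    /** Maps the originating module to the name of the callback API to use. */
    public static String callbackApiFor(SourceModule module) {
        switch (module) {
            case DATA_STUDIO:
                return "UpdateIDEEventResult";
            case OPERATION_CENTER:
                return "UpdateWorkbenchEventResult";
            default:
                return "CallbackExtension";
        }
    }

    /** Collects the three callback parameters under the names used in this topic. */
    public static Map<String, String> callbackParams(
            String extensionCode, String messageId, CheckResult result) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ExtensionCode", extensionCode);
        params.put("MessageId", messageId);
        params.put("CheckResult", result.name());
        return params;
    }

    public static void main(String[] args) {
        // Placeholder values; a real ExtensionCode comes from the extension list page.
        String api = callbackApiFor(SourceModule.DATA_STUDIO);
        Map<String, String> params =
                callbackParams("example-extension-code", "example-message-id", CheckResult.WARN);
        System.out.println(api + " " + params);
    }
}
```

The resulting values would then be passed to the corresponding request in the DataWorks SDK. Keeping the routing in one method makes it easy to review which API each module uses.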
After the code is developed, you can package the program into a runnable `.jar` file for subsequent service deployment.
Step 3: Deploy the extension
After you develop and debug the extension code, you can deploy the packaged code as an application service on Alibaba Cloud ECS or another service provider.
DataWorks product side
After you finish code development, you can register and manage the extension in the DataWorks console.
Step 1: Register the extension
Before you can develop and use an extension, you must register it in DataWorks to obtain the corresponding Extension Code for subsequent development. The following procedure describes how to register an extension.
Go to the Developer Backend tab.
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, click Go to Open Platform. The Developer Backend tab appears.
Register the extension.
In the left navigation pane, click Extensions to open the Extensions page.
Click . Select Deploy with a self-hosted service and configure the extension details.
Complete the extension registration.
Click OK to finish.
Note: After a successful registration, you can view the extension in the Extension List.
Step 2: Publish the extension
After the extension is developed, deployed, and registered in DataWorks, you must complete the testing, approval, and publishing process. Then, administrators other than the extension owner can enable the extension in the Management Center. For more information, see Apply extensions.
Appendix: Format of messages sent from DataWorks to EventBridge
In the following content, the `data` field contains the content that is pushed by DataWorks to EventBridge. EventBridge adds other information to this base.
{
  "datacontenttype": "application/json;charset=utf-8", // The format of the content in the data parameter. datacontenttype supports only the application/json format.
  "aliyunaccountid": "1111", // The ID of the Alibaba Cloud account.
  "aliyunpublishtime": "2024-07-10T07:25:34.915Z", // The time when EventBridge received the event.
  "data": {
    "tenantId": 28378****10656, // The tenant ID. Each Alibaba Cloud account corresponds to a tenant in DataWorks, and each tenant has its own ID. You can find this value in the user information section in the upper-right corner of DataWorks Data Studio.
    "eventCode": "xxxx" // The event code.
  },
  "aliyunoriginalaccountid": "11111",
  "specversion": "1.0",
  "aliyuneventbusname": "default", // The name of the EventBridge event bus used to receive DataWorks event messages.
  "id": "45ef4dewdwe1-7c35-447a-bd93-fab****", // The event ID. A unique value that identifies the event.
  "source": "acs.dataworks", // The event source, which is the service that provides the event. This indicates that the message was pushed by DataWorks.
  "time": "2024-07-10T15:25:34.897Z", // The time when the event occurred.
  "aliyunregionid": "cn-shanghai", // The region where the event was received.
  "type": "dataworks:ResourcesUpload:UploadDataToTable" // The event type. You can use this event type in the EventBridge console to filter all messages pushed by DataWorks. The Type value is different for each event.
}
The content of the `data` field varies based on the message type. For more information about each event message, see Development Reference: Event list and message format.
Extension examples
After you understand the development notes for extensions, you can develop your own extension code as needed. The following topics provide examples of extension registration, development, and application in common scenarios.
References
For the message format of various events, see Development Reference: Event list and message format.
OpenEvent provides message subscription for some events through EventBridge. For more information, see OpenEvent overview.
For a list of extension points that support event processing using extensions, see Extensions overview.
You can also deploy extensions using Function Compute. For more information, see Develop and deploy extensions using the Function Compute method.