The data push feature is a DataWorks data service that retrieves data from a data source using SQL queries and pushes it to a Webhook or email address. You can easily configure periodic pushes of business data to multiple Webhooks or email addresses. This topic describes how to configure and use the data push feature.
Overview
You can schedule periodic tasks to push data to a target Webhook or email address.
Supported data sources and channels
-
Supported data source types:
-
MySQL (compatible with StarRocks and Doris)
-
PostgreSQL (compatible with Snowflake and Redshift)
-
Hologres
-
MaxCompute (ODPS)
-
ClickHouse
-
Supported push channels include DingTalk, Lark, WeCom, email, and Teams.
Limitations
-
Each SELECT statement in the data push service can return a maximum of 10,000 rows.
-
Data size limits for different destinations:
-
For DingTalk, the pushed data size must not exceed 20 KB.
-
For Lark, the pushed data size must not exceed 20 KB, and each image must be smaller than 10 MB.
-
For WeCom, each bot can send up to 20 messages per minute.
-
For Teams, the pushed content must not exceed 28 KB.
-
For email, each data push task supports only one email body. For more limits, see the SMTP restrictions of your email service.
-
The data push feature is available only in DataWorks workspaces in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Chengdu), China (Hong Kong), Singapore, Japan (Tokyo), US (Silicon Valley), US (Virginia), and Germany (Frankfurt).
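The size and row limits above can be checked before a task is configured. The following is a minimal sketch, not part of DataWorks: the helper names are hypothetical, and it assumes that 1 KB means 1,024 bytes of UTF-8 encoded content.

```python
# Hypothetical pre-flight check. The limits come from the list above;
# 1 KB is assumed to mean 1024 bytes of UTF-8 encoded text.
SIZE_LIMITS_KB = {"dingtalk": 20, "lark": 20, "teams": 28}
MAX_ROWS = 10_000


def payload_size_ok(channel: str, content: str) -> bool:
    """Return True if the content fits the documented size limit for the channel."""
    limit_kb = SIZE_LIMITS_KB.get(channel)
    if limit_kb is None:
        # No size limit is documented for this channel (for example, WeCom or email).
        return True
    return len(content.encode("utf-8")) <= limit_kb * 1024


def rows_ok(row_count: int) -> bool:
    """Return True if a single SELECT stays within the 10,000-row limit."""
    return row_count <= MAX_ROWS
```

Measuring encoded bytes rather than characters matters for non-ASCII content, where one character can occupy several bytes.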
Prerequisites
-
Ensure that a data source is created. For details, see Data Source Management.
-
Ensure that public network access is enabled for your resource group. For details, see Network connectivity solution overview.
Step 1: Create a push task
-
Go to Data Service.
Log on to the DataWorks console. In the top navigation bar, select the region where your data source resides. In the left-side navigation pane, choose . Select the desired workspace from the drop-down list and click Go to Data Service.
-
Create a data push task.
In the left-side navigation pane of Data Service, choose to go to the Data Push page. Click the icon, select Create Data Push Task, enter a name for the task, and click OK. The task configuration page opens.
Step 2: Configure the push task
Preparation (optional)
To help you quickly perform a data push, this topic uses an example to explain how to push query results from a MaxCompute table. In this example, you use the data push feature to send data from a table named sales to a specified channel. The data includes the daily sales amount for each department and the change in sales amount compared to the previous day. If you want to follow the steps in this example, you must first create the sales table in your environment. The following code provides the statements to create the sales table and insert data into it. For more information about how to create a table, see Create and use MaxCompute tables.
CREATE TABLE IF NOT EXISTS sales (
id BIGINT COMMENT 'Unique identifier',
department STRING COMMENT 'Department name',
revenue DOUBLE COMMENT 'Revenue amount'
) PARTITIONED BY (ds STRING);
-- Insert sample data into partitions
INSERT INTO TABLE sales PARTITION(ds='20240101')(id, department, revenue ) VALUES (1, 'Department 1', 12000.00);
INSERT INTO TABLE sales PARTITION(ds='20240101')(id, department, revenue ) VALUES (2, 'Department 2', 21000.00);
INSERT INTO TABLE sales PARTITION(ds='20240101')(id, department, revenue ) VALUES (3, 'Department 3', 5000.00);
INSERT INTO TABLE sales PARTITION(ds='20240102')(id, department, revenue ) VALUES (1, 'Department 1', 11000.00);
INSERT INTO TABLE sales PARTITION(ds='20240102')(id, department, revenue ) VALUES (2, 'Department 2', 20000.00);
INSERT INTO TABLE sales PARTITION(ds='20240102')(id, department, revenue ) VALUES (3, 'Department 3', 10000.00);
Select a data source
Select the Data Source Type, Data Source Name, and Data Source Environment to determine the environment of the data table for the data push. You can select the data source environment based on whether the data push is for a development table or a production table. If you are performing a hands-on exercise, confirm the environment where the sales table you created during the preparation phase is located.
For example, set Data Source Type to odps, Data Source Name to MaxCompute_Source, and Data Source Environment to the production environment. To create a data source, click the link below the Data Source Name field.
For a list of supported data source types, see Supported data sources and channels.
Write query SQL
-
Define the data scope and retrieve data.
In the Edit Query SQL section, use single-table or multiple-table SQL queries to define the data to be pushed. For example:
-- Get the sales revenue for each department on 20240102
SELECT id, department, revenue FROM sales WHERE ds='20240102';
-- Get the change in sales revenue compared to the previous day
SELECT a.revenue - b.revenue AS diff FROM sales a LEFT JOIN sales b ON a.id = b.id AND a.ds > b.ds WHERE a.ds = '20240102' AND b.ds = '20240101';
After you write the SQL, the result fields are automatically populated in the section. If the output parameters cannot be parsed or are parsed incorrectly, disable Automatically Parse Parameters and click Add Parameter to add them manually.
You can also define custom variables in SQL by using the ${variable_name} format. Each variable is an assignment parameter, which can be assigned a time expression or a constant, so that values are passed into your code dynamically. For more information, see Configure push content.
-- Use scheduling parameters to dynamically assign time variables.
-- Get the latest daily sales revenue for each department
SELECT id, department, revenue FROM sales WHERE ds='${date}';
-- Get the change in sales revenue compared to the previous day
SELECT a.revenue - b.revenue AS diff FROM sales a LEFT JOIN sales b ON a.id = b.id AND a.ds > b.ds WHERE a.ds = '${date}' AND b.ds = '${previous_date}';
-
Paginated query.
For large tables, data push supports paginated queries using a Next Token. Click on the code editor toolbar for usage instructions.
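To see what the day-over-day query in this section returns for the sample data, the following sketch reproduces it locally with Python's built-in sqlite3 module. This is only a stand-in for MaxCompute: the ds partition is modeled as an ordinary column, the function name revenue_diff is hypothetical, and an id column and ORDER BY are added to make the output easier to read.

```python
import sqlite3

# Local stand-in for the MaxCompute table. The ds partition is modeled as an
# ordinary column because SQLite has no partitioned tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, department TEXT, revenue REAL, ds TEXT)")
conn.executemany(
    "INSERT INTO sales (id, department, revenue, ds) VALUES (?, ?, ?, ?)",
    [
        (1, "Department 1", 12000.00, "20240101"),
        (2, "Department 2", 21000.00, "20240101"),
        (3, "Department 3", 5000.00, "20240101"),
        (1, "Department 1", 11000.00, "20240102"),
        (2, "Department 2", 20000.00, "20240102"),
        (3, "Department 3", 10000.00, "20240102"),
    ],
)


def revenue_diff(date: str, previous_date: str) -> list:
    """Return (id, diff) per department, mirroring the day-over-day query."""
    cursor = conn.execute(
        """
        SELECT a.id, a.revenue - b.revenue AS diff
        FROM sales a
        LEFT JOIN sales b ON a.id = b.id AND a.ds > b.ds
        WHERE a.ds = ? AND b.ds = ?
        ORDER BY a.id
        """,
        (date, previous_date),
    )
    return cursor.fetchall()


print(revenue_diff("20240102", "20240101"))
# prints [(1, -1000.0), (2, -1000.0), (3, 5000.0)]
```

Because the WHERE clause filters on b.ds, departments with no row for the previous day are dropped from the result, so the LEFT JOIN behaves like an inner join here.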
Configure push content
In the Content to Push section, you can edit the message content using Markdown and Table formats. This content will be pushed to the Webhook.
After you customize the message title in the Title field, click Add in the body area. Then, choose Markdown, Table, or Email Body to edit the content. The following example shows a sample configuration. You can click Preview on the toolbar to see the message format.
-
If the push destination is an email address, the content customized in the Markdown and Table sections is sent as attachments. The email body is rendered and displayed in the email message.
-
If the push destination is not an email address, the content customized in the Markdown and Table sections will be displayed as the main body of the Webhook message. The Email Body will be hidden in the Webhook push message.
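The two routing rules above can be summarized in a few lines. This is an illustrative sketch only; the function name, the section labels, and the returned structure are assumptions and do not reflect how DataWorks represents push content internally.

```python
def route_sections(destination_type: str, sections: list) -> dict:
    """Split configured sections into message body and attachments.

    `sections` is a list of (kind, content) pairs, where kind is one of
    "markdown", "table", or "email_body" (hypothetical labels).
    """
    rich = [content for kind, content in sections if kind in ("markdown", "table")]
    email_body = [content for kind, content in sections if kind == "email_body"]
    if destination_type == "email":
        # Email: the Email Body is rendered; Markdown and Table become attachments.
        return {"body": email_body, "attachments": rich}
    # Webhook: Markdown and Table form the message body; the Email Body is hidden.
    return {"body": rich, "attachments": []}
```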
Markdown content
-
Use parameter variables: When composing the push content, you can add Assignment Parameters and Output Parameters to the rich text by using the ${parameter_name} format. These variables are replaced with the corresponding assigned data or SQL query results when the data push task runs.
-
Assignment Parameters: You need to assign a Constant or a scheduling parameter's Time Expression to the variable in the section.
-
Output Parameters: These parameters correspond to the field names or aliases in your SQL query, such as A, B, ... in a statement like SELECT A, B, ... FROM TABLE. They represent the queried data.
-
@mention members: You can configure this when pushing to a Lark Webhook to automatically @mention specific users.
-
By default, Markdown mode uses rich text to configure message content. When pushing to Lark, you can use the @mention feature to notify relevant personnel. To do so, click the icon to switch to Markdown source mode, and then use <at id="all" /> or <at email="username@example.com" />.
-
In addition to the features above, Markdown also supports functions like Add Image and inserting DingTalk Emoji.
In the push content area, select Markdown as the template type. In the body, use the ${parameter_name} syntax to reference parameters defined in the Input Parameters panel on the right. For example, if you write ${creator} and ${subscriber} in the body, and set creator to "admin" and subscriber to "user" on the Input Parameters tab, the variables are replaced with their values when the task runs. Input parameters also support scheduling time variables. For example, you can set date to ${yyyymmdd} and previous_date to ${yyyymmdd-1}. The Write Query SQL section can also reference input parameters for dynamic values, as in SELECT id, department, revenue FROM sales WHERE ds='${date}';
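The substitution described above can be imitated locally to preview what a message or query looks like after parameters are filled in. The sketch below uses Python's string.Template, which happens to share the ${name} placeholder syntax. The helper names are hypothetical, and it assumes that ${yyyymmdd} formats a base date and ${yyyymmdd-1} the day before it; which base date the scheduler uses is not covered here.

```python
from datetime import date, timedelta
from string import Template


def resolve_dates(base: date) -> dict:
    """Resolve the two example time expressions against a base date.

    Assumption: ${yyyymmdd} formats the base date and ${yyyymmdd-1}
    formats the day before it.
    """
    return {
        "date": base.strftime("%Y%m%d"),
        "previous_date": (base - timedelta(days=1)).strftime("%Y%m%d"),
    }


def render(template: str, params: dict) -> str:
    """Replace ${name} placeholders; unknown names are left untouched."""
    return Template(template).safe_substitute(params)


params = {"creator": "admin", "subscriber": "user"}
params.update(resolve_dates(date(2024, 1, 2)))

print(render("Created by ${creator} for ${subscriber}", params))
# prints Created by admin for user
print(render("SELECT id, department, revenue FROM sales WHERE ds='${date}';", params))
# prints SELECT id, department, revenue FROM sales WHERE ds='20240102';
```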
Table content
-
Click Add Column to increase the number of columns in the table. You can then associate Parameters with the corresponding columns.
-
When the push destination is a Lark Webhook, click the icon to the right of a created table column to open the Modify Field dialog box. In this dialog box, you can adjust the Field, Display Name, Display Style, and Condition to create diverse display effects for the pushed content.
-
Field: Switch to another Output Parameters field.
-
Display Name: The name you want to show in the table header when pushing to collaboration tools.
-
Display Style: Add a fixed prefix or suffix before or after the Value in the table.
-
Condition: Compares the Value in the table with a configured comparison value. Enable the condition, then set an operator (such as >=) and a threshold (such as 60). You can customize the display color for values that meet or do not meet the condition, for example, Change to green if the condition is met and Change to red if it is not. You can also configure an Appended Identifier (Additional Unicode).
Note
-
The method for authoring tables varies by channel. Table content support for different channels is as follows:
-
DingTalk: Supports Markdown tables and the built-in tables of data push. It does not support rendering the Display Style and Condition settings configured in the Modify Field dialog box. Also, DingTalk mobile does not support displaying tables.
-
Lark: Supports both Markdown and built-in tables, including the rendering of custom display styles and conditions.
-
WeCom: Supports pushing Markdown tables but does not render them.
-
Teams mobile: Supports pushing Markdown tables and can render them.
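The Display Style and Condition settings in the Modify Field dialog box amount to a small formatting rule per cell. The following sketch shows that logic under stated assumptions: the function name, the set of operators, and the color names are illustrative and are not taken from the DataWorks implementation.

```python
import operator

# Hypothetical operator set; the operators actually offered in the dialog box may differ.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def style_cell(value, op, threshold, prefix="", suffix="",
               met_color="green", unmet_color="red"):
    """Return (display_text, color) for one table cell.

    The prefix and suffix model Display Style; the operator and threshold model Condition.
    """
    met = OPERATORS[op](value, threshold)
    return f"{prefix}{value}{suffix}", met_color if met else unmet_color


print(style_cell(75, ">=", 60, suffix="%"))
# prints ('75%', 'green')
```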
Email body
DataWorks data push supports adding an email body to the push content. When you edit the email body, note the following:
-
Each data push task supports only one email body.
-
The email body is rendered only when the push destination is an email address. If the push destination is not an email address, the Email Body is hidden in the Webhook push message.
Step 3: Configure push settings
Before you configure Push Settings, click the icon in the lower-left corner of the Service Development page to open the settings panel. Switch to the Destination Management tab, and click Create Destination to create a destination. Supported channel types include DingTalk, Lark, WeCom, Teams, and Email.
Create a Webhook destination
When you click Create Destination, configure the following parameters:
-
Type: Select a channel type. Options include DingTalk, Lark, WeCom, and Teams.
-
Destination Name: Enter a custom name for the new push destination.
-
Webhook: The Webhook URL of the selected push channel.
-
For how to obtain a Lark bot Webhook, see Configure a Lark Webhook trigger.
-
For how to obtain a Teams Webhook, see Use Microsoft Teams workflows to create an incoming Webhook.
The Type drop-down list also supports the Email channel. After you complete the configuration, click OK.
Create an email destination
Before you configure Push Settings, click the icon in the lower-left corner of the Service Development page to open the settings panel. Switch to the Destination Management tab, and click Create Destination to create a destination.
When you click Create Destination, you must configure the following parameters:
-
Type: Select Email.
-
Destination Name: Enter a custom name for the new push destination.
-
SMTP Host: The address of the mail server.
-
SMTP Port: The port number of the mail server. The default value is 465, which can be manually changed.
-
Sender Address: The email sending address.
-
SMTP Account: The full email account.
-
SMTP Password: The password for the email account.
-
Receiver Address: The destination email address.
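Before entering the SMTP values above, it can help to confirm that they work outside DataWorks. The following is a minimal sketch using Python's standard smtplib and email modules. The function names and all addresses are placeholders, and it assumes the mail server accepts implicit TLS, which is what port 465 conventionally uses.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, receiver: str, title: str, body: str) -> EmailMessage:
    """Build a plain-text message from the Sender Address and Receiver Address."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = receiver
    msg["Subject"] = title
    msg.set_content(body)
    return msg


def send_test_message(host: str, port: int, account: str, password: str,
                      msg: EmailMessage) -> None:
    """Send over implicit TLS using the SMTP Host, Port, Account, and Password."""
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(account, password)
        server.send_message(msg)


# Placeholder values; replace them with your own before calling send_test_message.
message = build_message("sender@example.com", "receiver@example.com",
                        "Data push test", "SMTP settings check")
```

If the server expects STARTTLS on a different port, a different connection method is needed, so treat a failure here as a prompt to check the mail provider's SMTP documentation.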
Push settings
Click Push Settings on the right side to configure the task's scheduling cycle, scheduling resources, and push destinations. The specific configuration items are as follows:
-
Scheduling cycle and run time configuration: Configure the scheduling cycle and specific time for the data push service to push the edited content.
| Scheduling cycle | Specified time | Scheduling time | Example |
| --- | --- | --- | --- |
| Month | Specify the days of the month on which to run the push task. | The scheduling time for the data push task on the push day. | Scheduling Frequency: Month; Specified Time: 1st of every month; Data Timestamp: 08:00. Actual run time: The push task runs at 08:00 on the 1st of every month. |
| Week | Specify the days of the week on which to run the push task. | The scheduling time for the data push task on the push day. | Scheduling Frequency: Week; Specified Time: Monday; Data Timestamp: 09:00. Actual run time: The push task runs at 09:00 every Monday. |
| Day | Note: The daily cycle schedules the task to run every day. | The scheduling time for the data push task on the push day. | Scheduling Frequency: Day; Data Timestamp: 08:00. Actual run time: The push task runs at 08:00 every day. |
| Hour | Note: You can choose between two push modes: push at a specified hourly interval, or push at specified hours and minutes. | | Push at an hourly interval: Start Time: 00:00; Time Interval: 1 hour; End Time: 23:59. Actual run time: Pushes once every hour from 00:00 to 23:59 daily. Push at specified hours and minutes: Hour: 0, 1; Specified Minute: 10. Actual run time: Pushes at 00:10 and 01:10 daily. |
-
Timeout Definition: Sets a time limit for task execution. The task is terminated if it exceeds this limit.
-
Default Value: With the Default Value setting, the task timeout is dynamically adjusted based on the system load, with a value ranging from 3 to 7 days. Timed-out tasks are terminated.
-
Example: If you set a Custom timeout of 1 hour, the push task is terminated if it runs for more than 1 hour after its scheduled start time.
-
Valid From: Configure the time range during which the data push task is active.
-
Permanent: The data push task remains effective permanently and is not limited by an effective date range.
-
Example: If you configure a Specified Time range from 2024-01-01 to 2024-12-31, the push task runs according to the configured scheduling cycle within this period.
-
Resource Group for Scheduling: You can configure an Exclusive resource group for scheduling or a serverless resource group (general-purpose resource group) to provide scheduling resources for the data push task. For more information about resource groups, see Resource Group Management.
-
Push Every Time: Controls whether to send a push notification when the SQL query returns no data.
-
Enabled (default): The push is executed on every scheduled run, regardless of whether the query returns data.
-
Disabled: If all variables used in the push content, except for input parameters, are empty, the message is not sent. You can use WHERE or HAVING clauses in SQL to filter data. If the filter conditions are not met and the query result is empty, the push is automatically skipped and no message is sent.
-
Destination: Select the destination to which the configured content is pushed. You can choose only from existing push destinations, which are created on the Destination Management tab.
Note: When pushing to a DingTalk Webhook, you must add a keyword in the section of the bot's configuration. Ensure that the push content includes this keyword for the push to succeed.
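The two hourly push modes in the scheduling table can be worked through by listing the run times each one produces. The sketch below is illustrative only; the function names are hypothetical, and it assumes the end time is inclusive.

```python
from datetime import datetime, timedelta


def interval_runs(start: str, end: str, interval_hours: int) -> list:
    """Run times (HH:MM) from start to end inclusive, stepping by whole hours."""
    if interval_hours < 1:
        raise ValueError("interval_hours must be at least 1")
    fmt = "%H:%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    runs = []
    while current <= last:
        runs.append(current.strftime(fmt))
        current += timedelta(hours=interval_hours)
    return runs


def fixed_runs(hours: list, minute: int) -> list:
    """Run times (HH:MM) for the given hours at one specified minute."""
    return [f"{hour:02d}:{minute:02d}" for hour in sorted(hours)]


print(len(interval_runs("00:00", "23:59", 1)))
# prints 24
print(fixed_runs([0, 1], 10))
# prints ['00:10', '01:10']
```

With a 1-hour interval between 00:00 and 23:59 the task runs 24 times a day, which is worth comparing against channel limits such as the WeCom per-minute message cap if several tasks share one bot.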
Step 4: Test the push task
After creating the data push task, click the Save button on the toolbar to save the current configuration. Then, click Test to perform a development-stage test to verify that the data push works correctly. You must manually assign constant values to the variables for the test.
A data push task must pass a test push in the development environment before it can be submitted and published.
Step 5: Publish the push task
Manage task versions
-
After you confirm that the tests during development are successful, click Submit. If the push task is not submitted, it remains in a draft state and no new version is generated.
-
After you submit the task, a new version is generated. In the Version panel on the right, find the submitted version whose status is Can Be Published and click Publish. Publishing the task activates its schedule as defined in the Push Settings.
In the Version panel, manage the data push task as follows.
| Status | Actions | Description |
| --- | --- | --- |
| Publish | Data Push Task Management | Goes to the Data Push Task Management page, where you can view detailed information about published tasks. For more information, see Manage data push tasks. |
| Can Be Published | Publish | Publishes the corresponding version of the task. |
| Can Be Published | Abandoned | Discards the corresponding version of the task and changes its status to Abandoned. |
| Off-Line, Abandoned | Version Details | View the configuration information and corresponding push content for that version of the data push task. |
| Off-Line, Abandoned | Roll Back | Restores this version, making it the current configuration. |
Note: The Version Details and Roll Back operations are available and function identically for tasks in all statuses.
Manage push tasks
After a data push task is successfully published, click Data Push Task Management in the Operation column of the Version panel, or navigate to the Data Push Tasks list page via the path.
This page lists all published Data Push Tasks and displays details such as their ID, Name, Data Source Name, Data Source Environment, Node Mode, Resource Group for Scheduling, Owner, Deployer, and Published Time. In the Operation column, perform the following operations on published data push tasks:
| Actions | Description |
| --- | --- |
| Unpublish | Takes the selected task offline. |
| Test | Goes to the Test Data Push Task page, where you can test a published task. |
Clicking the icon in the Name column takes you to the Version Details page for the selected task.
Test a published task
Go to the Data Push Test page in either of the following ways:
-
Method 1: Choose .
-
Method 2: Choose .
Testing a published task confirms that it runs correctly and that the destination receives the data as expected.
On the Data Push Test page, select or search for the target data push task from the drop-down list, select the Push to Destination check box as needed, and then click Start Test.
FAQ
Q: Does data push support on-demand pushes?
A: For occasional, on-demand pushes, use the Test function with the Push to Destination option selected. For conditional recurring pushes, disable the Push Every Time setting. The task still runs on schedule, but a message is sent only when your SQL query returns data.
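The conditional behavior in this answer follows from the Push Every Time rule in the push settings. As a rough sketch of that rule (the function name and data shape are hypothetical, and "empty" is assumed to mean that the query returned no values for a parameter):

```python
def should_push(push_every_time: bool, output_values: dict) -> bool:
    """Decide whether a scheduled run sends a message.

    `output_values` maps each output parameter used in the push content to the
    list of values the query returned for it (an empty list if no rows matched).
    """
    if push_every_time:
        # Enabled: push on every scheduled run, even with an empty result.
        return True
    # Disabled: push only if at least one output parameter has data.
    return any(values for values in output_values.values())


print(should_push(False, {"revenue": []}))
# prints False
print(should_push(False, {"revenue": [11000.0]}))
# prints True
```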