DataWorks supports two approaches for sending emails from a data pipeline: using a PyODPS node with custom Python SMTP code, or using the built-in data push feature.
Prerequisites
Before you begin, ensure that you have:
-
A DataWorks workspace
-
An exclusive resource group in the same region as your workspace (required for the PyODPS node approach)
-
SMTP credentials for your email service (host, port, username, and password)
Usage notes
-
TCP port 25 is disabled on Elastic Compute Service (ECS) instances by default. Use port 465 (SMTP over SSL) or port 587 (SMTP with STARTTLS) instead.
-
When a PyODPS node runs on an exclusive resource group, you cannot install additional third-party Python modules. Only standard library modules and pre-installed packages are available.
-
The PyODPS 2 node stores query results in a temporary file before sending. There is no limit on the number of data records in the email.
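The port guidance in these notes can be captured in a small helper. The following is a minimal sketch, not part of DataWorks or PyODPS: the function name `smtp_mode_for_port` and the labels it returns are hypothetical and chosen only for illustration.

```python
def smtp_mode_for_port(port):
    """Map an SMTP port to the connection style described in the usage notes.

    Hypothetical helper: the 'ssl' and 'starttls' labels are illustrative only.
    """
    if port == 465:
        return 'ssl'       # connect with smtplib.SMTP_SSL
    if port == 587:
        return 'starttls'  # connect with smtplib.SMTP, then call starttls()
    if port == 25:
        raise ValueError('TCP port 25 is disabled on ECS instances by default; '
                         'use port 465 or 587 instead')
    raise ValueError('unsupported SMTP port: %r' % (port,))
```

Calling the helper before opening a connection makes a misconfigured port fail with a clear message rather than a network timeout.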
Send emails using a PyODPS node
Step 1: Create an exclusive resource group
-
Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, click Resource Group.
-
On the Exclusive Resource Groups tab, click Create Resource Group.
-
Configure the resource group parameters. For details, see Create and use a serverless resource group.
Important: The resource group must be in the same region as your DataWorks workspace.
-
Click Buy Now.
Step 2: Associate the resource group with a workspace
-
Find the resource group you created and click Associate Workspace in the Actions column.
-
In the Associate Workspace panel, find your workspace and click Associate in the Actions column.
Step 3: Open DataStudio
Log on to the DataWorks console. In the top navigation bar, select the target region. In the left-side navigation pane, choose Data Development and O&M > Data Development. Select your workspace from the drop-down list and click Go to Data Development.
Step 4: Create a PyODPS 2 node
-
On the DataStudio page, hover over the icon and choose Create Node > MaxCompute > PyODPS 2. Alternatively, click the workflow name in the Business Flow section, right-click MaxCompute, and choose Create Node > PyODPS 2.
-
In the Create Node dialog box, set the Name and Path parameters.
Note: The node name can be up to 128 characters and can contain letters, digits, underscores (_), and periods (.).
-
Click Confirm.
-
On the node configuration tab, enter your email-sending code. Before you run the code, replace the placeholders in the following table with your actual values.

| Placeholder | Description | Example |
| --- | --- | --- |
| <yourHost> | SMTP server address of your email service | smtp.example.com |
| <yourUserName> | Username for authenticating with the mail server | user@example.com |
| <yourPassWord> | Password for the mail server account | - |
| <senderAddress> | Sender email address | sender@example.com |
| <receiverAddress> | Recipient email address | recipient@example.com |

Option 1: Port 465 (SMTP over SSL), recommended
Use this option when your email provider supports SSL on port 465.

    import smtplib
    from email.mime.text import MIMEText
    from odps import ODPS

    mail_host = '<yourHost>'
    mail_username = '<yourUserName>'
    mail_password = '<yourPassWord>'
    mail_sender = '<senderAddress>'
    mail_receivers = ['<receiverAddress>']

    # Read query results and build the email body.
    # Replace 'query_sql' and 'column_name' with your own SQL statement and column names.
    mail_content = ''
    with o.execute_sql('query_sql').open_reader() as reader:
        for record in reader:
            # Convert every value to str so that non-string columns do not raise a TypeError.
            mail_content += str(record['column_name']) + ' ' + str(record['column_name']) + '\n'

    message = MIMEText(mail_content, 'plain', 'utf-8')
    message['Subject'] = 'mail test'
    message['From'] = mail_sender
    message['To'] = mail_receivers[0]

    try:
        # Connect using SSL on port 465
        smtpObj = smtplib.SMTP_SSL(mail_host, 465)
        smtpObj.login(mail_username, mail_password)
        smtpObj.sendmail(mail_sender, mail_receivers, message.as_string())
        smtpObj.quit()
        print('mail send success')
    except smtplib.SMTPException as e:
        print('mail send error', e)

Option 2: Port 587 (SMTP with STARTTLS)
Use this option when your email provider requires STARTTLS on port 587.

    import smtplib
    from email.mime.text import MIMEText
    from odps import ODPS

    mail_host = '<yourHost>'
    mail_username = '<yourUserName>'
    mail_password = '<yourPassWord>'
    mail_sender = '<senderAddress>'
    mail_receivers = ['<receiverAddress>']

    # Read query results and build the email body.
    # Replace 'query_sql' and 'column_name' with your own SQL statement and column names.
    mail_content = ''
    with o.execute_sql('query_sql').open_reader() as reader:
        for record in reader:
            # Convert every value to str so that non-string columns do not raise a TypeError.
            mail_content += str(record['column_name']) + ' ' + str(record['column_name']) + '\n'

    message = MIMEText(mail_content, 'plain', 'utf-8')
    message['Subject'] = 'mail test'
    message['From'] = mail_sender
    message['To'] = mail_receivers[0]

    try:
        # Connect on port 587, then upgrade the connection to TLS
        smtpObj = smtplib.SMTP(mail_host, 587)
        smtpObj.ehlo()
        smtpObj.starttls()
        smtpObj.login(mail_username, mail_password)
        smtpObj.sendmail(mail_sender, mail_receivers, message.as_string())
        smtpObj.quit()
        print('mail send success')
    except smtplib.SMTPException as e:
        print('mail send error', e)

-
Click the icon in the top toolbar to save the node.
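The message-building part of the node code in this step can be separated from the query and the SMTP connection, which makes it easier to check locally. The following is a minimal sketch under stated assumptions: `build_message` is a hypothetical helper, and `rows` stands in for the values you would read from each record returned by `open_reader()`.

```python
from email.mime.text import MIMEText


def build_message(rows, sender, receivers, subject):
    """Build a plain-text email from an iterable of row tuples.

    Hypothetical helper: each row is a tuple of column values, which are
    converted to str and joined with a space, one row per line.
    """
    lines = [' '.join(str(value) for value in row) for row in rows]
    message = MIMEText('\n'.join(lines), 'plain', 'utf-8')
    message['Subject'] = subject
    message['From'] = sender
    # List every recipient in the To header, not only the first one.
    message['To'] = ', '.join(receivers)
    return message
```

In the node, the result of `build_message(...).as_string()` would be passed to `sendmail` in place of the inline message construction.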
Step 5: Commit the node
Before committing, go to the Properties tab and configure the Rerun and Parent Nodes parameters.
-
Click the icon in the top toolbar to commit the node.
-
In the Submit dialog box, enter a description in the Change description field.
-
Click Confirm.
If your workspace is in standard mode, click Deploy in the upper-right corner after committing to deploy the node. For details, see Deploy nodes.
Step 6: Switch the node to the exclusive resource group
-
In the upper-right corner of the node configuration tab, click Operation Center.
-
In the left-side navigation pane of Operation Center, choose Auto Triggered Node O&M > Auto Triggered Nodes.
-
Find the node, then in the Actions column choose More > Modify Scheduling Resource Group.
-
In the Modify Scheduling Resource Group dialog box, select the exclusive resource group from the New Resource Group drop-down list.
-
Click OK.
Step 7: Test the node
Run the node to verify that the email is delivered. For details, see View and manage auto triggered tasks.
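A test run fails late if a placeholder such as `<yourHost>` was left in the code. The following is a minimal sketch of a pre-flight check you could place before the SMTP connection; `unreplaced_placeholders` and the settings dictionary are hypothetical and not part of DataWorks or PyODPS.

```python
import re

# Matches values that still look like a documentation placeholder, e.g. '<yourHost>'.
PLACEHOLDER = re.compile(r'^<[A-Za-z]+>$')


def unreplaced_placeholders(settings):
    """Return the sorted names of settings whose values are still placeholders."""
    return sorted(name for name, value in settings.items()
                  if PLACEHOLDER.match(str(value)))
```

For example, passing `{'mail_host': mail_host, 'mail_sender': mail_sender}` and stopping the node when the returned list is not empty surfaces the problem before any connection attempt.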
Send emails using the data push feature
DataWorks provides two built-in options for pushing data to email addresses without writing custom SMTP code:
-
Data push node: Create a data push node in an auto triggered workflow in DataStudio. The node pushes the output parameters of an upstream node to a specified email address. For details, see Create a data push node.
-
Data push feature: Create a data push task in DataService Studio. Write SQL statements for single- or multi-table queries, format the output as rich text or a table, and configure a scheduling cycle to send the results to an email address on a regular basis. For details, see Data push.
Each data push node or task supports only one email body. After you add an email body, you cannot add another one.