DataWorks: Use a Shell node to run Python scripts

Last Updated: Mar 26, 2026

DataWorks Shell nodes support running Python scripts by uploading them as resources and referencing those resources from a Shell node. Both Python 2 and Python 3 are supported. You can run Python scripts on a common Shell node (backed by MaxCompute) or an EMR Shell node (backed by E-MapReduce).

Limitations

How it works

DataWorks uploads your Python script as a resource, then references that resource from a Shell node. The node runs the script using the interpreter path you specify:

Python version    Interpreter command
Python 3          /home/tops/bin/python3 <script>.py
Python 2          python <script>.py
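To confirm which interpreter a given command actually resolves to on your resource group, a small script such as the following may help. It is a minimal sketch, not part of DataWorks; the function name describe_interpreter is made up for this example, and the script is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function

import sys


def describe_interpreter():
    """Return a one-line summary of the interpreter running this script."""
    # sys.version_info holds (major, minor, micro, ...) on Python 2 and 3.
    version = "{0}.{1}.{2}".format(*sys.version_info[:3])
    return "Python {0} at {1}".format(version, sys.executable)


if __name__ == "__main__":
    print(describe_interpreter())
```

Running it once with each command from the table shows whether the two commands point at different interpreters in your environment.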

Prerequisites

  • A DataWorks workspace with DataStudio access.

  • A Shell node or EMR Shell node. See Create a Shell node or Create an EMR Shell node.

  • If your script requires third-party packages, install them on the resource group before running the node:

    • Serverless resource group (recommended): use the image management feature to install packages.

    • Exclusive resource group for scheduling: use the O&M Assistant feature to install packages.
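Before relying on third-party packages, it can be useful to verify that they are importable on the resource group. The following is a minimal sketch under that assumption; the REQUIRED_MODULES list and the find_missing helper are hypothetical names for illustration, and the listed modules are standard-library placeholders that you would replace with your own package names.

```python
from __future__ import print_function

import importlib

# Hypothetical list: replace with the modules your script imports.
REQUIRED_MODULES = ["json", "csv"]


def find_missing(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Python 3 raises ModuleNotFoundError, a subclass of ImportError.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If the script reports missing modules, install them on the resource group as described above and run the check again.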

Run a Python script on a common Shell node

Use this procedure when your Shell node runs on a MaxCompute resource group.

Step 1: Create a MaxCompute Python resource

  1. Log on to the DataWorks console. In the top navigation bar, select a region. In the left-side navigation pane, choose Data Development and O&M > Data Development, select a workspace, and click Go to Data Development.

  2. On the DataStudio page, right-click a workflow, then choose Create Resource > MaxCompute > Python. In the dialog box, set Name to mc.py and click Create.

    mc.py is an example name. Use any name that suits your project.
  3. On the resource configuration tab, write your Python script. The following examples show the same script for each Python version:

    Python 3

    print('This is a test text')

    Python 2

    print "This is a test text"
  4. Click the Save icon to save, then click the Submit icon to commit the resource.
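The two sample scripts in step 3 differ only in their print syntax. If you would rather keep one resource that runs under either interpreter, one possible approach is the standard `from __future__ import print_function` import, sketched below. The build_message function is a hypothetical helper introduced only to keep the example testable.

```python
from __future__ import print_function


def build_message():
    """Return the text that the sample scripts print."""
    return "This is a test text"


if __name__ == "__main__":
    # With the __future__ import, print is a function on Python 2.7 as well.
    print(build_message())
```

With this form, the same mc.py can be run with either interpreter command without editing the script.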

Step 2: Reference the resource in a Shell node

  1. On the DataStudio page, right-click a workflow, then choose Create Node > General > Shell. Configure the Name parameter and click Confirm.

  2. On the Shell node configuration tab, locate mc.py under Resource in the MaxCompute folder. Right-click the resource name and select Insert Resource Path. When the resource is referenced successfully, the configuration tab displays the resource reference directive, ##@resource_reference{"mc.py"}.

Step 3: Configure and run the node

Add the resource reference directive and the interpreter command to the configuration tab, then run the node.

Use Python 3 to run the referenced resource in the common Shell node

##@resource_reference{"mc.py"}
/home/tops/bin/python3 mc.py

Use Python 2 to run the referenced resource in the common Shell node

##@resource_reference{"mc.py"}
python mc.py

To run the node, click the Run icon. In the warning dialog box, click Continue to Run. In the Runtime Parameters dialog box, select a resource group, specify a custom image, and click OK.

When the run succeeds, the run log shows the script output, This is a test text.
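For scripts that do real work, it usually helps to make failures visible to the calling shell. The sketch below assumes the usual shell convention that a non-zero exit code marks the command as failed; how the Shell node reacts to specific exit codes is not covered here, so treat that as an assumption to verify. The run_job and main names are hypothetical.

```python
from __future__ import print_function

import sys


def run_job():
    """Placeholder for the real work; raise an exception to signal failure."""
    print("This is a test text")


def main(job=run_job):
    """Run the job and map the outcome to a process exit code."""
    try:
        job()
    except Exception as exc:
        # Write the error to stderr so it appears alongside the other output.
        print("Job failed: {0}".format(exc), file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    exit_code = main()
    if exit_code != 0:
        sys.exit(exit_code)
```

Returning the exit code from main, rather than calling sys.exit deep inside the job, keeps the failure path in one place and makes the function easy to test.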

Run a Python script on an EMR Shell node

Use this procedure when your Shell node runs on an E-MapReduce (EMR) resource group.

Step 1: Create an EMR file resource

  1. Log on to the DataWorks console. In the top navigation bar, select a region. In the left-side navigation pane, choose Data Development and O&M > Data Development, select a workspace, and click Go to Data Development.

  2. On the DataStudio page, right-click a workflow, then choose Create Resource > EMR > EMR File. In the dialog box, set File Source to Local, click Upload to upload the emr.py script, and click Create.

    emr.py is an example name. Use any name that suits your project. Sample script content for each Python version:

    Python 3

    print('This is a test text')

    Python 2

    print "This is a test text"
  3. Click the Submit icon in the top toolbar to commit the resource.

Step 2: Reference the resource in an EMR Shell node

  1. On the DataStudio page, right-click a workflow, then choose Create Node > EMR > EMR Shell. Configure the Name parameter and click Confirm.

  2. On the EMR Shell node configuration tab, locate emr.py under Resource in the EMR folder. Right-click the resource name and select Insert Resource Path. When the resource is referenced successfully, the configuration tab displays the resource reference directive, ##@resource_reference{"emr.py"}.

Step 3: Configure and run the node

Add the resource reference directive and the interpreter command to the configuration tab, then run the node.

Use Python 3 to run the referenced resource in the EMR Shell node

##@resource_reference{"emr.py"}
/home/tops/bin/python3 emr.py

Use Python 2 to run the referenced resource in the EMR Shell node

##@resource_reference{"emr.py"}
python emr.py

To run the node, click the Run icon. In the Parameters dialog box, select a resource group, specify a custom image, and click Run.

When the run succeeds, the run log shows the script output, This is a test text.
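Because the interpreter command is an ordinary shell command, a script can also read command-line arguments appended after the file name. The following is a minimal sketch under that assumption; the --bizdate option, its default value, and the helper names are hypothetical and chosen only for illustration.

```python
from __future__ import print_function

import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None reads from sys.argv."""
    parser = argparse.ArgumentParser(
        description="Sample script that accepts a run date."
    )
    # --bizdate is a hypothetical option name used only in this example.
    parser.add_argument(
        "--bizdate",
        default="unknown",
        help="Business date, for example 20260326",
    )
    # parse_known_args ignores any arguments this script does not define.
    args, _unknown = parser.parse_known_args(argv)
    return args


def build_message(bizdate):
    """Return the line that the script prints for the given date."""
    return "This is a test text for {0}".format(bizdate)


if __name__ == "__main__":
    args = parse_args()
    print(build_message(args.bizdate))
```

With a script like this, the node command could take the form python emr.py --bizdate 20260326, or the equivalent with /home/tops/bin/python3.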