DataWorks: Python nodes

Last Updated: Mar 26, 2026

Python nodes run Python 3 code as scheduled jobs in DataWorks. Use them to automate data processing tasks that run on a fixed schedule, from simple transformations to complex batch workflows.

Prerequisites

Before you begin, make sure you have:

Limitations

  • Python version: Python nodes support Python 3 only. Python 2 is not supported.

  • Third-party packages: Python nodes provide a basic runtime environment. To use third-party packages, create a custom image with the required dependencies installed, then configure the node to use that image.

  • Resource group: Debugging and scheduling Python nodes requires a serverless resource group. Make sure your workspace has one attached before running or scheduling the node.

  • Compute units: Tasks on a serverless resource group support a maximum of 64 CU (compute units). To avoid resource shortages at startup, keep your configuration at or below 16 CU.
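Because the base runtime includes only a basic environment, it can help to check at the start of a node whether the packages your code depends on are importable. The following is a minimal sketch, not an official DataWorks utility. The package list is hypothetical and should be replaced with your own dependencies.

```python
import importlib.util
import sys

# Hypothetical list of top-level packages this node's code depends on.
# "requests" stands in for any third-party package installed in a custom image.
REQUIRED_PACKAGES = ["json", "requests"]


def python3_or_later():
    """Return True if the interpreter is Python 3 or newer."""
    return sys.version_info >= (3,)


def missing_packages(names):
    """Return the top-level package names that cannot be found in this runtime."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python 3 runtime:", python3_or_later())
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        # These would need to be installed in the custom image used by the node.
        print("Missing packages:", missing)
    else:
        print("All required packages are available")
```

Running this in the debug panel before scheduling the node shows whether the selected image already provides what the code needs.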

Step 1: Develop the Python node

  1. Write your Python code in the node editor. The following example shows a bubble sort implementation:

    def bubble_sort(arr):
        n = len(arr)
        # Outer loop: controls each pass through the list
        for i in range(n):
            # Inner loop: compares and swaps adjacent elements
            for j in range(0, n-i-1):
                # Swap if the current element exceeds the next
                if arr[j] > arr[j+1]:
                    arr[j], arr[j+1] = arr[j+1], arr[j]
        return arr
    
    if __name__ == "__main__":
        example_list = [64, 34, 25, 12, 22, 11, 90]
        sorted_list = bubble_sort(example_list)
        print("Sorted list:", sorted_list)

  2. To test the code, open the debug configuration panel on the right, select your resource group and other test settings, then click Run.

  3. Configure the scheduling properties for the node to define how often and when the job runs.

  4. Save the node.
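For a node that runs unattended on a schedule, it can be useful to wrap the logic in a `main` function that logs progress and reports failure through the process exit code. The sketch below reuses the bubble sort from step 1. It assumes that a non-zero exit code marks the run as failed, which is common for job schedulers but is not confirmed on this page. The logger name is hypothetical.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("python_node_example")  # hypothetical logger name


def bubble_sort(arr):
    """Sort a list in place with bubble sort and return it."""
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


def main(values):
    """Run the job; return 0 on success and 1 on failure."""
    try:
        logger.info("Sorting %d values", len(values))
        result = bubble_sort(list(values))  # copy so the input is left untouched
        logger.info("Sorted list: %s", result)
        return 0
    except Exception:
        # Log the full traceback so it appears in the run log.
        logger.exception("Job failed")
        return 1


if __name__ == "__main__":
    exit_code = main([64, 34, 25, 12, 22, 11, 90])
    if exit_code != 0:
        # Assumption: a non-zero exit code signals a failed run to the scheduler.
        sys.exit(exit_code)
```

Keeping the work inside `main` also makes the logic easy to call from a test run in the debug panel with different inputs.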

Step 2: Publish the node and monitor runs

  1. Commit and publish the node to the production environment.

  2. Once published, the job runs automatically on the configured schedule. To view run status and perform operations, go to Operation Center > Task O&M > Auto Triggered Task O&M > Auto Triggered Tasks. For details, see Get started with Operation Center.

What's next