Lindorm Distributed Processing System (LDPS) integrates with the task orchestration feature of Data Management (DMS), letting you schedule Apache Spark jobs, monitor run history, and view logs—all from a visual interface. LDPS supports workloads across data production, interactive analytics, machine learning, and graph computing.
Prerequisites
Before you begin, ensure that you have:
-
An activated DMS account
-
LDPS activated for your Lindorm instance. See Activate LDPS and modify the configurations
-
A Spark job file (JAR, Python, or SQL). See Create a job in Java or Create a job in Python
-
The job file uploaded to Hadoop Distributed File System (HDFS) or Object Storage Service (OSS). See Upload files in the Lindorm console
Create a Lindorm Spark task flow
To create a task flow, you need: a task flow name, the path to your Spark job file in HDFS or OSS, and your Lindorm instance ID and region.
-
Log on to the DMS console V5.0.
-
Go to the Task Orchestration page.
-
Simple mode: In the Scene Guide section, click Data Transmission and Processing (DTS). Then click Task Orchestration in the Data processing section.
-
Normal mode: In the top navigation bar, choose DTS > Data Development > Task Orchestration.
-
-
Click Create Task Flow.
-
In the Create Task Flow dialog box, enter a Task Flow Name and optional Description, then click OK.
-
In the Task Type section on the left, drag Lindorm Spark nodes onto the canvas. Connect nodes to define dependencies between them.
-
Configure each Lindorm Spark node:
-
Double-click the node, or click the node and then click the icon.
-
In the Basic configuration section, set the following parameters:
| Parameter | Description |
| --- | --- |
| Region | The region where your Lindorm instance is deployed. |
| Lindorm Instance | The ID of your Lindorm instance. |
| Task Type | The Spark job type: JAR, Python, or SQL. |
-
In the Job configuration section, paste and edit the configuration template for your job type:
-
JAR job
{
  "mainResource": "oss://path/to/your/file.jar",
  "mainClass": "path.to.main.class",
  "args": ["arg1", "arg2"],
  "configs": {
    "spark.hadoop.fs.oss.endpoint": "",
    "spark.hadoop.fs.oss.accessKeyId": "",
    "spark.hadoop.fs.oss.accessKeySecret": "",
    "spark.hadoop.fs.oss.impl": "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem",
    "spark.sql.shuffle.partitions": "20"
  }
}

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| mainResource | String | Yes | Path to the JAR file in HDFS or OSS. | HDFS: hdfs:///path/spark-examples_2.12-3.1.1.jar; OSS: oss://testBucketName/path/spark-examples_2.12-3.1.1.jar |
| mainClass | String | Yes | Entry point class for the JAR job. | com.aliyun.ldspark.SparkPi |
| args | Array | No | Arguments passed to mainClass. | ["arg1", "arg2"] |
| configs | JSON | No | Spark system parameters. If the job is stored in OSS, configure the OSS keys below. | {"spark.sql.shuffle.partitions": "200"} |

If the JAR file is stored in OSS, set the following keys inside configs:

| Key | Description |
| --- | --- |
| spark.hadoop.fs.oss.endpoint | OSS endpoint where the job file is stored. |
| spark.hadoop.fs.oss.accessKeyId | AccessKey ID for OSS access. See Obtain an AccessKey pair. |
| spark.hadoop.fs.oss.accessKeySecret | AccessKey secret for OSS access. See Obtain an AccessKey pair. |
| spark.hadoop.fs.oss.impl | OSS file system class. Set to org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem. |
-
Python job
{
  "mainResource": "oss://path/to/your/file.py",
  "args": ["arg1", "arg2"],
  "configs": {
    "spark.hadoop.fs.oss.endpoint": "",
    "spark.hadoop.fs.oss.accessKeyId": "",
    "spark.hadoop.fs.oss.accessKeySecret": "",
    "spark.hadoop.fs.oss.impl": "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem",
    "spark.submit.pyFiles": "oss://path/to/your/project_file.py,oss://path/to/your/project_module.zip",
    "spark.archives": "oss://path/to/your/environment.tar.gz#environment",
    "spark.sql.shuffle.partitions": "20"
  }
}

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| mainResource | String | Yes | Path to the Python file in OSS or HDFS. | OSS: oss://testBucketName/path/spark-examples.py; HDFS: hdfs:///path/spark-examples.py |
| args | Array | No | Arguments passed to the Python script. | ["arg1", "arg2"] |
| configs | JSON | No | Spark system parameters. If the job is stored in OSS, configure the four OSS keys (same as JAR job) plus the Python-specific keys below. | {"spark.sql.shuffle.partitions": "200"} |

Python-specific keys in configs:

| Key | Description |
| --- | --- |
| spark.submit.pyFiles | Comma-separated OSS paths to additional Python files or ZIP modules. |
| spark.archives | OSS path to a Python environment archive (.tar.gz). |
-
-
Click Try Run in the upper-left corner to verify the job runs as expected.
-
-
After all nodes are configured, click Publish in the upper-left corner to publish the task flow.
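The JAR and Python job configuration templates above are plain JSON, so they can be generated and sanity-checked before you paste them into the Job configuration section. The following is a minimal Python sketch, not part of DMS or LDPS: the helper names `build_job_config` and `oss_configs` are hypothetical, and the checks only mirror the parameter tables above (mainResource is required, mainClass is required for JAR jobs, and the four OSS keys are expected when the file is stored in OSS).

```python
import json

OSS_IMPL = "org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem"


def oss_configs(endpoint, access_key_id, access_key_secret):
    """Return the four OSS keys listed in the configuration tables."""
    return {
        "spark.hadoop.fs.oss.endpoint": endpoint,
        "spark.hadoop.fs.oss.accessKeyId": access_key_id,
        "spark.hadoop.fs.oss.accessKeySecret": access_key_secret,
        "spark.hadoop.fs.oss.impl": OSS_IMPL,
    }


def build_job_config(job_type, main_resource, main_class=None, args=None, configs=None):
    """Build a job configuration dict for a JAR or Python job (hypothetical helper)."""
    if job_type not in ("JAR", "Python"):
        raise ValueError("job_type must be 'JAR' or 'Python'")
    if not main_resource.startswith(("hdfs://", "oss://")):
        raise ValueError("mainResource must be an HDFS or OSS path")

    config = {"mainResource": main_resource}
    if job_type == "JAR":
        # mainClass is marked as required only in the JAR job table.
        if not main_class:
            raise ValueError("mainClass is required for JAR jobs")
        config["mainClass"] = main_class
    if args:
        config["args"] = list(args)

    merged = dict(configs or {})
    if main_resource.startswith("oss://"):
        # Files in OSS need all four OSS keys to be set to non-empty values.
        missing = [key for key in oss_configs("", "", "") if not merged.get(key)]
        if missing:
            raise ValueError("missing OSS keys in configs: " + ", ".join(missing))
    if merged:
        config["configs"] = merged
    return config


if __name__ == "__main__":
    jar_config = build_job_config(
        "JAR",
        "hdfs:///path/spark-examples_2.12-3.1.1.jar",
        main_class="com.aliyun.ldspark.SparkPi",
        args=["arg1", "arg2"],
        configs={"spark.sql.shuffle.partitions": "200"},
    )
    # Paste the printed JSON into the Job configuration section.
    print(json.dumps(jar_config, indent=2))
```

Building the configuration in code keeps the AccessKey pair out of hand-edited text and catches a missing mainClass or OSS key before a Try Run.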
View publishing history and logs
-
On the Task Orchestration page, click the name of the task flow.
-
Click Go to O&M in the upper-right corner.
-
View the history or logs:
-
Publishing history: On the Task Flow Information page, click the Published Tasks tab to see all published versions of the task flow.

-
Run logs: On the Running History tab, select Scheduling Trigger or Triggered Manually from the drop-down list to filter runs. The list shows all nodes in the task flow and their execution status.
Click View for a node to see the submission logs of the Lindorm Spark job and to obtain the job ID and Spark UI URL for that node.
-
If job submission fails, record the job ID and the Spark UI URL before you submit a ticket.
Advanced settings
After changing any advanced setting in the DMS console, republish the task flow for the changes to take effect.
Configure scheduling
Set a scheduling policy to run the task flow automatically on a fixed schedule.
-
On the Task Orchestration page, click the name of the task flow.
-
In the lower-left corner, click Task Flow Information.

-
In the Scheduling Settings section on the right, turn on Enable Scheduling and configure the policy. The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| Scheduling Type | Cyclic scheduling: runs on a repeating schedule (for example, every week). Schedule once: runs once at a specific time. |
| Effective Time | The period during which the scheduling policy is active. Defaults to January 1, 1970 to January 1, 9999 (permanent). |
| Scheduling Cycle | Frequency of execution: Hour, Day, Week, or Month. |
| Timed Scheduling | How to define the trigger time within the cycle. Run at an interval: set Starting Time, Intervals (hours), and End Time. Run at the specified point in time: pick specific hours using the Specified Time field. |
| Specified Time | If Scheduling Cycle is Week, select days of the week. If Month, select days of the month. |
| Specific Point in Time | The exact time on specified days when the task flow runs (for example, 02:55). |
| Cron expression | Auto-generated cron expression based on the scheduling parameters above. |
Example: To run a task flow at 00:00 and 12:00 every day:
-
Set Scheduling Type to Cyclic scheduling.
-
Set Scheduling Cycle to Hour.
-
In Timed Scheduling, select Run at the specified point in time, then select 0Hour and 12Hour in the Specified Time field.
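To check that a policy like the one in the example fires when you expect, you can list the upcoming trigger times offline. The following Python sketch is illustrative only: `next_runs` is a hypothetical helper, it assumes the task flow fires at minute 0 of each selected hour every day, and it does not reproduce the cron expression that DMS generates.

```python
from datetime import datetime, timedelta


def next_runs(after, hours, count):
    """Return the next `count` run times strictly after `after`.

    Assumes one run at minute 0 of every hour in `hours`, each day.
    """
    runs = []
    day = after.replace(hour=0, minute=0, second=0, microsecond=0)
    while len(runs) < count:
        for hour in sorted(hours):
            candidate = day.replace(hour=hour)
            if candidate > after and len(runs) < count:
                runs.append(candidate)
        day += timedelta(days=1)
    return runs


if __name__ == "__main__":
    # Runs at 00:00 and 12:00 every day, starting from now.
    for run in next_runs(datetime.now(), [0, 12], 4):
        print(run.isoformat(sep=" ", timespec="minutes"))
```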
Configure variables
For task flows with cyclic scheduling, configure time variables to pass dynamic date values to your jobs. For example, the built-in bizdate variable resolves to the day before the scheduled execution time.
-
On the task flow page, double-click the Lindorm Spark node, or click it and then click the icon.
-
In the right-side navigation pane, click Variable Setting.
-
On the Node Variable or Task Flow Variable tab, add the variable.
-
Reference the variable in the Job configuration section. For all available variables, see Variables.
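To preview what a job receives when a time variable is resolved, you can mimic the substitution locally. This Python sketch rests on assumptions: the `${bizdate}` reference syntax and the `%Y-%m-%d` output format are placeholders (confirm both in the Variables topic), and `resolve_bizdate` and `render_args` are hypothetical helpers, not DMS functions. The only behavior taken from the text above is that bizdate resolves to the day before the scheduled execution time.

```python
from datetime import datetime, timedelta


def resolve_bizdate(scheduled_time, fmt="%Y-%m-%d"):
    """Return the day before the scheduled execution time, formatted with `fmt`."""
    return (scheduled_time - timedelta(days=1)).strftime(fmt)


def render_args(args, variables):
    """Replace ${name} placeholders (assumed syntax) in each argument string."""
    rendered = []
    for arg in args:
        for name, value in variables.items():
            arg = arg.replace("${" + name + "}", value)
        rendered.append(arg)
    return rendered


if __name__ == "__main__":
    scheduled = datetime(2024, 3, 1, 2, 55)
    variables = {"bizdate": resolve_bizdate(scheduled)}
    print(render_args(["--date", "${bizdate}"], variables))
```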
Manage notifications
Enable notifications to receive alerts based on task flow execution results.
-
In the lower-left corner of the task flow page, click Notification Configurations.
-
Turn on the notification types you need:
-
Basic Notifications
-
Success Notification: triggers when the task flow completes successfully.
-
Failure Notification: triggers when the task flow fails.
-
-
Timeout Notification: triggers when the task flow execution time exceeds the configured timeout threshold.
-
Alert Notification: triggers when the task flow is about to start.
-
-
(Optional) Configure notification recipients. See Manage notification rules.
Execute SQL statements
-
Log on to the DMS console V5.0.
-
Click the Home tab.
-
In the left-side navigation pane, click the icon to add an instance.
-
In the Add Instance dialog box, select Lindorm_Compute in the NoSQL Database section.

-
Enter the Instance Region, Instance ID, Database Account, and Database Password, then click Submit.
-
In the confirmation dialog box, click Submit to open the SQL Console.
-
On the SQLConsole tab, enter your SQL statement and click Execute.
What's next
-
For more about DMS task orchestration, see Overview.