This topic explains how to create a Shell-type offline computing task using Dataphin.
Procedure
On the Dataphin home page, navigate to the top menu bar and select Development > Data Development.
On the Development page, select a project from the top menu bar. In Dev-Prod mode, you must also select an environment.
In the left navigation pane, choose Data Processing > Script Task. In the Script Task list, click the icon and select Shell.
In the Create Shell Task dialog box, configure the following parameters.
Parameter
Description
Task Name
Enter the offline computing task name.
The length must not exceed 256 characters and must not include vertical lines (|), forward slashes (/), backslashes (\), colons (:), question marks (?), angle brackets (<>), asterisks (*), or quotation marks (").
Schedule Type
Choose the Schedule Type for the task. Available Schedule Types include the following:
Recurring Task: Automatically included in the system's periodic scheduling.
One-Time Task: Requires manual initiation to execute.
Select Directory
Select the directory in which to store the task.
If no directory exists, create a folder as follows:
Click the icon above the task list on the left to open the Create Folder dialog box.
In the Create Folder dialog box, enter the folder Name and choose a location under Select Directory.
Click Confirm.
Use Template
Turn on the Use Template switch to build the task from a code template. When the switch is on, you must also Select Template and specify the Template Version.
Code templates speed up development. A referenced template is read-only and cannot be modified; you only configure its parameters to complete the code. For more information, see create an offline computing template.
Python Third-Party Packages
Choose one or more third-party Python packages to include. For more information, see install Python module.
Note: After a module is added as a Python third-party package, you must declare it in the task before it can be imported in the code. You can change this setting later under Python Third-party Package in the task properties.
Description
Provide a brief description of the task, up to 1000 characters.
Click Confirm.
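The Task Name rules above can be checked before you type a name into the dialog box. The following is a minimal sketch, not part of Dataphin: `validate_task_name` is a hypothetical helper that only applies the two documented rules (at most 256 characters, none of the listed characters).

```shell
# Hypothetical helper: returns 0 if the name satisfies the documented
# Task Name rules, 1 otherwise.
validate_task_name() {
  local name="$1"
  # Rule 1: the length must not exceed 256 characters.
  if [ "${#name}" -gt 256 ]; then
    return 1
  fi
  # Rule 2: none of the characters | / \ : ? < > * " may appear.
  case "$name" in
    *[\|/\\:?\<\>*\"]*) return 1 ;;
  esac
  return 0
}
```

For example, `validate_task_name "daily_shell_job"` succeeds, while `validate_task_name "daily/shell"` fails because of the forward slash.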
In the code editor on the current Shell task tab, write the code for the Shell offline computing task and click Run above the code editor.
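As an illustration of what might go into the code editor, here is a minimal sketch of a Shell task body. The helper names `log` and `count_records` and the temporary sample file are hypothetical; the sketch only assumes that the task is considered failed when the script exits with a nonzero status, which is the usual shell convention.

```shell
# Exit on the first failing command or unset variable so that the
# script's exit status reflects the failure.
set -eu

# Hypothetical helper: prefix each message with a UTC timestamp.
log() {
  printf '%s %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
}

# Hypothetical step: print the number of non-empty lines in a file.
count_records() {
  local input="$1"
  [ -r "$input" ] || { log "cannot read $input" >&2; return 1; }
  # grep -c exits with status 1 when the count is 0; tolerate that case.
  grep -c . "$input" || [ $? -eq 1 ]
}

log "task started"
tmp_file="$(mktemp)"              # sample input created only for this sketch
printf 'a\nb\n\n' > "$tmp_file"
log "records: $(count_records "$tmp_file")"
rm -f "$tmp_file"
log "task finished"
```

Keeping the work in small functions makes it easier to test each step with Run before the task is submitted.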
In the sidebar on the right, click Property. In the Property panel, you can configure the Basic Information, Running Resources, Python Third-party Package, Runtime Parameter, Scheduling Properties (for recurring tasks), Schedule Dependency (for recurring tasks), Running Configuration, and Resource Configuration.
Basic Information
Define the name, owner, description, and other basic details of the scheduling task. For guidance on configuration, see configure basic task information.
Running Resources
Allocate the CPU and memory resources used to run the current computing task. The default allocation is 0.1 CPU cores and 256 MB of memory. For guidance on configuring these resources, see configure offline task running resources.
Python Third-party Package
Select the required Python third-party packages. For more information, see install Python module.
Runtime Parameter
If the task code uses parameter variables, assign values to them here. During node scheduling, each parameter variable is automatically replaced with its configured value. For guidance on setting this up, see parameter configuration and node parameter usage.
Scheduling Properties (for recurring tasks)
For offline computing tasks set as Recurring Task, you must configure the task's scheduling properties in addition to its Basic Information. For guidance on configuration, see configure scheduling properties.
Schedule Dependency (for recurring tasks)
For offline computing tasks set as Recurring Task, you must configure the task's schedule dependency in addition to its Basic Information. For guidance on configuration, see configure schedule dependency.
Running Configuration
You can set a task-level running timeout and rerun policy for the offline computing task to suit your business needs. If you do not configure them, the tenant-level settings apply. For guidance on setting these parameters, see configure computing task running settings.
Resource Configuration
You can assign a scheduling resource group to the task. When the task runs, it uses the resource quota from this group. For more information, see Configure resources for a computing task.
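For the Runtime Parameter setting described above, a script can guard against a parameter that was left unconfigured. The following is a sketch under stated assumptions: `run_date` is a hypothetical parameter name, its value is assumed to be an eight-digit date, and the literal assignment stands in for the value that scheduling would substitute. The actual placeholder syntax is covered in the parameter configuration topic.

```shell
# Assumption: a hypothetical parameter named run_date is configured under
# Runtime Parameter, and scheduling replaces it with an eight-digit date.
# The literal below stands in for the substituted value.
run_date='20240115'

# Hypothetical check: accept only a value made of exactly eight digits.
is_valid_run_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

if is_valid_run_date "$run_date"; then
  echo "processing data for $run_date"
else
  # A placeholder that was never replaced ends up here.
  echo "unexpected run_date value: $run_date" >&2
  exit 1
fi
```

Failing early with a nonzero exit status makes a missing parameter value visible in the task's run result instead of letting the script continue with a bad date.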
Save and submit the task.
Click the icon above the code editor to save the code.
Click the icon to submit the code.
On the Submitting Log page, confirm the Submission Content and the Pre-check results, and enter remarks. For more information, see offline computing task submission instructions.
Once confirmed, click Confirm And Submit.
What to do next
In Dev-Prod mode, upon successful task submission, proceed to the release list to promote the task to the production environment. For more information, see manage release tasks.
In Basic mode, the Shell task is scheduled in the production environment after you submit it. You can go to the Operation Center to view the published task. For more information, see View and manage script tasks and View and manage one-time tasks.