On the Data Development page, you design data computing processes according to your business needs, and the scheduling system automatically runs tasks that depend on one another.
In the data development stage, DataWorks provides four types of objects that you can choose from as needed: tasks, scripts, resources, and functions. These objects relate to one another as follows:
Task: Tasks are the main objects of data development and the main carrier of data computing. They have periodic properties and dependencies. Various types of tasks and nodes are supported for different scenarios. For more information, see Task type overview.
Script: Scripts are auxiliary objects of data development and have no periodic properties or dependencies. Scripts are mainly used for non-periodic, temporary data operations, such as adding, deleting, and modifying temporary tables. For more information, see Script development.
Function and resource: Files and computing functions that the code in a task references must be uploaded to the computing space (MaxCompute) before the task is run. For more information, see Resource management and Function management.
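As a rough illustration of the distinction between these objects, the following Python sketch models a task and a script and checks whether the resources a task references have been uploaded. It is not a DataWorks API: the class names, field names, schedule string, and file names are all hypothetical and exist only to make the relationship concrete.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Script:
    """Auxiliary object: no periodic properties and no dependencies."""
    name: str
    code: str


@dataclass
class Task:
    """Main object: has periodic properties and dependencies."""
    name: str
    code: str
    schedule: str                                        # hypothetical, e.g. "daily"
    depends_on: List[str] = field(default_factory=list)  # names of upstream tasks
    resources: List[str] = field(default_factory=list)   # referenced files and functions


def missing_resources(task: Task, uploaded: Set[str]) -> List[str]:
    """Return the referenced resources that are not yet in the computing space."""
    return [r for r in task.resources if r not in uploaded]


# Hypothetical task that references two resources, only one of which is uploaded.
task = Task(
    name="daily_report",
    code="SELECT 1;",
    schedule="daily",
    depends_on=["load_orders"],
    resources=["my_udf.jar", "dim_city.txt"],
)
print(missing_resources(task, uploaded={"my_udf.jar"}))
```

In this sketch, a non-empty result means the task should not be run yet, because something it references has not been uploaded.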
The following figure shows how a task is developed and used:
For details on how to proceed with these steps, see Instructions.
As the preceding process shows, DataWorks provides four running modes for executing the computing statements in a task. Their use cases and limitations are as follows:
|Running mode||Trigger mode||Instances generated in the O&M center||Scheduling property||Use case||Note|
|Page direct run||Manual||No||Not subject to scheduling period or dependency||Suitable for debugging code. The code does not need to be saved or submitted first||Scripts and tasks are supported. Only the ODPS_SQL, OPEN_MR, ODPS_MR, and SHELL task types are supported.|
|Test Run||Manual||Yes||Subject to scheduling period but not to dependency||Suitable for checking parameter replacement and code execution||Only tasks are supported, and the latest submitted version is used.|
|System automatic run||Automatic||Yes||Subject to scheduling period and dependency||The main method for computing data automatically with DataWorks. Maintainers must use the O&M center to maintain all periodic instances and make sure that they run in sequence||Only tasks are supported, and the latest submitted version is used.|
|Data completing run||Manual||Yes||Subject to scheduling period and dependency||A supplement to system automatic run. Used when newly created or faulty tasks need to compute data for a period before the current day||Only tasks are supported, and the latest submitted version is used.|
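The table can also be read as a small lookup structure. The following Python sketch encodes the four rows and answers the question "which running modes are available for a script or a task?". The `RunMode` class, its field names, and the `modes_for` helper are hypothetical and are not part of DataWorks; only the values come from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunMode:
    name: str
    trigger: str              # "manual" or "automatic"
    creates_instances: bool   # instances generated in the O&M center
    honors_period: bool       # subject to scheduling period
    honors_dependency: bool   # subject to dependency
    supports_scripts: bool    # all four modes support tasks


RUN_MODES = [
    RunMode("Page direct run", "manual", False, False, False, True),
    RunMode("Test Run", "manual", True, True, False, False),
    RunMode("System automatic run", "automatic", True, True, True, False),
    RunMode("Data completing run", "manual", True, True, True, False),
]


def modes_for(object_type: str) -> List[str]:
    """Return the names of the running modes available for 'task' or 'script'."""
    if object_type == "task":
        return [m.name for m in RUN_MODES]
    if object_type == "script":
        return [m.name for m in RUN_MODES if m.supports_scripts]
    raise ValueError(f"unknown object type: {object_type!r}")


print(modes_for("script"))
print(modes_for("task"))
```

Read this way, a script can only be run directly from the page, while a task can use any of the four modes. The sketch does not model the task-type restriction on page direct run.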