Creates a job that runs in a cluster. You can configure the data source, code source, startup command, and computing resources of each node on which a job runs.
Operation description
Before you call this operation, make sure that you understand the billing methods and pricing of Deep Learning Containers (DLC) of Platform for AI (PAI).
Authorization information
The following table shows the authorization information for this API operation. You can use this information in the Action element of a RAM policy to grant a RAM user or RAM role the permissions to call this operation. Column descriptions:
- Operation: the value that you use in the Action element to specify the operation on a resource.
- Access level: the access level of the operation. Valid levels: read, write, and list.
- Resource type: the type of resource on which you can authorize the RAM user or RAM role to perform the operation. Take note of the following items:
- Required resource types are marked with an asterisk (*).
- If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
- Condition key: the condition key that is defined by the cloud service.
- Associated operation: other operations that the RAM user or RAM role must be authorized to perform before it can complete this operation.
| Operation | Access level | Resource type | Condition key | Associated operation |
|---|---|---|---|---|
| paidlc:CreateJob | create | *All Resources | None | None |
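For reference, a minimal identity-based policy that allows this operation might look like the following sketch, shown here as a Python snippet that prints the policy JSON. The unrestricted Resource value is an assumption; tighten it to meet your own security requirements.

```python
import json

# A minimal sketch of an identity-based RAM policy that allows CreateJob.
# Assumption: Resource is left as "*" because this operation is authorized
# at the All Resources level; adjust Effect/Action/Resource as needed.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "paidlc:CreateJob",
            "Resource": "*",
        }
    ],
}

print(json.dumps(policy, indent=2))
```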
Request syntax
POST /api/v1/jobs HTTP/1.1
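The operation is a plain HTTPS POST with a JSON request body. The following minimal Python sketch shows only the request shape: the endpoint host is an assumption for the cn-hangzhou region, and the signing of Alibaba Cloud API requests (handled automatically by the official SDKs) is omitted.

```python
import requests

# Sketch of the request shape only. The endpoint host below is an assumption
# (substitute the DLC endpoint for your region), and the authentication
# headers required by Alibaba Cloud's request-signing process are omitted,
# so this call would be rejected as unauthorized if sent as-is.
ENDPOINT = "https://pai-dlc.cn-hangzhou.aliyuncs.com"

body = {
    "DisplayName": "tf-mnist-test",
    "JobType": "TFJob",
    # ... remaining fields as described under "Request parameters" below
}

response = requests.post(f"{ENDPOINT}/api/v1/jobs", json=body)
print(response.status_code, response.text)
```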
Request parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| body | object | No | The request body. | |
| DisplayName | string | Yes | The job name. | tf-mnist-test |
| JobType | string | Yes | The job type. The value is case-sensitive. Supported types include TFJob (TensorFlow), PyTorchJob (PyTorch), MPIJob (Horovod), XGBoostJob (XGBoost), and ElasticBatchJob (ElasticBatch). | TFJob |
| JobSpecs | array | Yes | The configurations for running the job, such as the image address, startup command, node resource declaration, and number of replicas. A DLC job consists of different types of nodes; nodes of the same type that share exactly the same configuration are described by one JobSpec. JobSpecs specifies the configurations of all node types. See the sketch after this table. | |
| | JobSpec | No | The runtime configuration of one type of node. | |
| UserCommand | string | Yes | The startup command for all nodes of the job. | python /root/code/mnist.py |
| DataSources | array&lt;object&gt; | No | The data sources for job running. | |
| | object | No | A data source of the job. Each data source is a distributed file system that is mounted to the local path (specified by MountPath) of the container on each node; the process started by the job's startup command accesses the file system directly through that path. | |
| DataSourceId | string | No | The data source ID. | d-cn9dl******* |
| MountPath | string | No | The local path to which the data source is mounted. This parameter is optional. By default, the mount path configured in the data source is used. | /root/data |
| Uri | string | No | The data source path. | oss://bucket.oss-cn-hangzhou-internal.aliyuncs.com/path/ |
| Options | string | No | The mount attributes of the custom dataset. Only OSS data sources are supported. | { "fs.oss.download.thread.concurrency": "10", "fs.oss.upload.thread.concurrency": "10", "fs.jindo.args": "-oattr_timeout=3 -oentry_timeout=0 -onegative_timeout=0 -oauto_cache -ono_symlink" } |
| CodeSource | object | No | The code source of the job. Before a node of the job runs, DLC automatically downloads the configured code from the code source and mounts it to the local path of the container. | |
| CodeSourceId | string | No | The code source ID. | code-20210111103721-xxxxxxx |
| Branch | string | No | The branch of the referenced code repository. This parameter is optional. By default, the branch configured in the code source is used. | master |
| Commit | string | No | The commit ID of the code to download. This parameter is optional. By default, the commit ID configured in the code source is used. | 44da109b5****** |
| MountPath | string | No | The local path to which the code is mounted. This parameter is optional. By default, the mount path configured in the code source is used. | /root/data |
| UserVpc | object | No | The virtual private cloud (VPC) settings. | |
| VpcId | string | No | The VPC ID. | vpc-abcdef**** |
| SwitchId | string | No | The vSwitch ID. This parameter is optional. | vs-abcdef**** |
| SecurityGroupId | string | No | The security group ID. | sg-abcdef**** |
| ExtendedCIDRs | array | No | The extended CIDR blocks. | |
| | string | No | An extended CIDR block. | 192.168.0.1/24 |
| DefaultRoute | string | No | The default route. | eth0 |
| ThirdpartyLibs | array | No | The third-party Python libraries to install. | |
| | string | No | A third-party Python library and its version. | numpy==1.16.1 |
| ThirdpartyLibDir | string | No | The folder that contains the requirements.txt file for third-party Python libraries. Before the startup command specified by UserCommand runs on each node, DLC fetches the requirements.txt file from this folder and runs pip install -r requirements.txt to install the listed libraries. | /root/code/ |
| Envs | object | No | The environment variables. | |
| | string | No | An environment variable. Envs is a set of key-value pairs. | ENABLE_DEBUG_MODE |
| JobMaxRunningTimeMinutes | long | No | The maximum running duration of the job. Unit: minutes. | 1024 |
| WorkspaceId | string | No | The workspace ID. | ws-20210126170216-xxxxxxx |
| ResourceId | string | No | The resource group ID. This parameter is optional. | rs-xxx |
| Priority | integer | No | The job priority. Valid values: 1 to 9. Default value: 1. | 8 |
| Settings | JobSettings | No | The additional parameter settings of the job. | |
| ElasticSpec | JobElasticSpec | No | This parameter is not supported. | |
| DebuggerConfigContent | string | No | This parameter is not supported. | "" |
| Options | string | No | The additional configuration of the job. You can use this parameter to adjust the behavior of attached data sources. For example, if an attached data source of the job is of the OSS type, you can use this parameter to override the default JindoFS parameters. | key1=value1,key2=value2 |
| SuccessPolicy | string | No | The policy that is used to determine whether a distributed multi-node job is successful. Only TensorFlow distributed multi-node jobs are supported. | AllWorkers |
| CredentialConfig | CredentialConfig | No | The access credential configuration. | |
| Accessibility | string | No | The job visibility. Valid values: PRIVATE (visible only to you and the workspace administrators) and PUBLIC (visible to all members of the workspace). | PRIVATE |
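Putting the table together, a complete request body for a small TensorFlow job might look like the sketch below. The top-level keys come from the table above; the field names inside the JobSpecs entry (Type, Image, PodCount, EcsSpec) are assumptions for illustration, since the JobSpec schema is documented separately.

```python
import json

# Illustrative CreateJob request body. Top-level keys follow the table above;
# the keys inside the JobSpecs entry (Type, Image, PodCount, EcsSpec) are
# assumptions for illustration -- consult the JobSpec reference for the
# authoritative schema.
job_request = {
    "DisplayName": "tf-mnist-test",
    "JobType": "TFJob",
    "WorkspaceId": "ws-20210126170216-xxxxxxx",
    "UserCommand": "python /root/code/mnist.py",
    "JobSpecs": [
        {
            "Type": "Worker",                         # assumed field: node type
            "Image": "tensorflow/tensorflow:1.15.0",  # assumed field: container image
            "PodCount": 2,                            # assumed field: number of replicas
            "EcsSpec": "ecs.c6.large",                # assumed field: instance type
        }
    ],
    "DataSources": [
        {"DataSourceId": "d-cn9dl*******", "MountPath": "/root/data"}
    ],
    "Envs": {"ENABLE_DEBUG_MODE": "true"},
    "JobMaxRunningTimeMinutes": 1024,
    "Priority": 1,
}

print(json.dumps(job_request, indent=2))
```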
Response parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| JobId | string | The job ID. | dlc7******* |
| RequestId | string | The request ID. | 473469C7-AA6F-4DC5-B3DB-xxxxxxx |
Examples
Sample success responses
JSON format
{
"JobId": "dlc7*******",
"RequestId": "473469C7-AA6F-4DC5-B3DB-xxxxxxx"
}
Error codes
For a list of error codes, see Service error codes.
Change history
| Change time | Summary of changes | Operation |
|---|---|---|
| 2024-12-18 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2024-08-09 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2024-07-05 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2023-12-08 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
| 2023-09-11 | The internal configuration of the API is changed, but the call is not affected | View Change Details |
