Submits a serverless job to an Elastic High Performance Computing (E-HPC) cluster.
Authorization information
The following table lists the authorization information for this API operation. You can use this information in the Action element of a policy to grant a RAM user or RAM role the permissions to call this operation. The columns are described as follows:
- Operation: the value that you can use in the Action element to specify the operation on a resource.
- Access level: the access level of each operation. The levels are read, write, and list.
- Resource type: the type of the resource on which you can authorize the RAM user or the RAM role to perform the operation. Take note of the following items:
  - Required resource types are marked with an asterisk (*) prefix.
  - If the permissions cannot be granted at the resource level, All Resources is used in the Resource type column of the operation.
- Condition Key: the condition key that is defined by the cloud service.
- Associated operation: other operations that the RAM user or the RAM role must also have permissions to perform in order to complete this operation.
| Operation | Access level | Resource type | Condition key | Associated operation |
|---|---|---|---|---|
| ehpc:SubmitServerlessJob | | *All Resources | none | |
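As an illustration of how the operation value above might be used, the following minimal sketch builds a policy document that allows `ehpc:SubmitServerlessJob` on all resources. The overall layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follows general RAM policy syntax and is an assumption here, not something this page specifies. The helper name `build_submit_job_policy` is hypothetical.

```python
import json


def build_submit_job_policy() -> dict:
    """Return a policy document that allows calling SubmitServerlessJob.

    Assumption: the document layout follows general RAM policy syntax;
    only the Action value comes from the table above.
    """
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                # Operation value from the authorization table.
                "Action": "ehpc:SubmitServerlessJob",
                # The table lists All Resources, so the policy is not
                # scoped to a specific resource.
                "Resource": "*",
            }
        ],
    }


policy = build_submit_job_policy()
print(json.dumps(policy, indent=2))
```

The resulting JSON could then be attached to a RAM user or RAM role as a custom policy.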
Request parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| ClusterId | string | Yes | The cluster ID. You can call the ListClusters operation to query the cluster ID. | ehpc-hz-FYUr32**** |
| JobName | string | Yes | The name of the serverless job. Note: The name can contain lowercase letters, digits, and hyphens (-). It cannot start or end with a hyphen. | testjob |
| ArrayProperties | object | No | The configuration of the array job. Note: The array job index is passed to the serverless job container through the environment variable EHPC_ARRAY_TASK_ID. Your application can read this variable inside the container. | |
| IndexStart | long | No | The starting value of the array job index. Valid values: 0 to 4999. | 1 |
| IndexEnd | long | No | The end value of the array job index. Valid values: 0 to 4999. The value must be greater than or equal to the value of IndexStart. | 5 |
| IndexStep | long | No | The interval of the array job index. Note: If the array job properties are IndexStart=1, IndexEnd=5, and IndexStep=2, the array job contains three subtasks. The subtask indexes are 1, 3, and 5. | 2 |
| JobPriority | long | No | The scheduling priority of the serverless job. Valid values: 0 to 999. A greater value indicates a higher priority. | 10 |
| EphemeralStorage | integer | No | The size of the temporary storage space added to the serverless job container. Unit: GiB. Note: By default, 30 GiB of storage space is provided free of charge. If you require more space, specify the required size in this parameter. | 200 |
| Timeout | long | No | The validity period of the serverless job. After the validity period expires, the job is forcibly terminated. Unit: seconds. | 3600 |
| VSwitchId | array | No | The IDs of the vSwitches to which the serverless job container belongs. | |
| | string | No | The vSwitch ID. Note: E-HPC supports only virtual private cloud (VPC) networks. You can call the DescribeVSwitches operation to query the created vSwitches. | vsw-bp1gb5gf5546rn**** |
| InstanceType | array | No | The Elastic Compute Service (ECS) instance types used by the serverless job container. | |
| | string | No | The ECS instance type. | ecs.g7.8xlarge |
| Cpu | float | No | The vCPU size of the serverless job container. Unit: cores. | 2 |
| Memory | float | No | The memory size of the serverless job container. Unit: GiB. | 4 |
| SpotStrategy | string | No | The bidding policy of the ECS instances. Default value: NoSpot. | SpotWithPriceLimit |
| SpotPriceLimit | float | No | The maximum hourly price of the preemptible elastic container instance. The value can be accurate to three decimal places. If you set SpotStrategy to SpotWithPriceLimit, you must specify the SpotPriceLimit parameter. | 0.062 |
| RamRoleName | string | No | The Resource Access Management (RAM) role that is associated with the serverless job container. | testRamRoleName |
| Container | object | Yes | The properties of the serverless job container. | |
| EnvironmentVar | array<object> | No | The environment variables of the container. | |
| | object | No | An environment variable of the container. | |
| Key | string | No | The name of the environment variable for the container. The name can be 1 to 128 characters in length and can contain letters, digits, and underscores (_). The name cannot start with a digit. Specify the name in the [0-9a-zA-Z] format. | PATH |
| Value | string | No | The value of the environment variable for the container. The value must be 0 to 256 bits in length. | /usr/local/bin |
| WorkingDir | string | No | The working directory of the container. | /usr/local/ |
| Image | string | Yes | The container image. | registry-vpc.cn-hangzhou.aliyuncs.com/ehpc/hpl:latest |
| Command | array | No | The container startup commands. | |
| | string | No | The container startup command. | python3 |
| Arg | array | No | The arguments of the container startup command. You can specify up to 10 arguments. | |
| | string | No | The startup argument. | hello.py |
| Gpu | integer | No | The number of GPUs used by the container. | 1 |
| VolumeMount | array<object> | No | The data volumes that are mounted to the container. | |
| | object | No | A data volume mounted to the container. | |
| MountPath | string | No | The directory to which the volume is mounted. Note: The data stored in this directory is overwritten by the data on the volume. Exercise caution when you specify this parameter. | /data |
| FlexVolumeDriver | string | No | The driver type when you use the FlexVolume plug-in to mount a volume. | alicloud/oss |
| FlexVolumeOptions | string | No | The options of the FlexVolume object. Each option is a key-value pair in a JSON string. | {"bucket":"hpctest","url":"oss-cn-hangzhou-internal.aliyuncs.com","path":"/data","ramRole":"AliyunECSInstanceForEHPCRole"} |
| DependsOn | array<object> | No | The dependencies of the serverless job. | |
| | object | No | A dependency of the serverless job. | |
| JobId | string | No | The ID of the dependent job. | 10 |
| Type | string | No | The dependency type. Default value: AfterSucceeded. | AfterAny |
| RetryStrategy | object | No | The retry policy of the serverless job. | |
| Attempts | integer | No | The number of retries for the serverless job. Valid values: 1 to 10. | 5 |
| EvaluateOnExit | array<object> | No | The retry rules for the serverless job. You can specify up to 10 rules. | |
| | object | No | | |
| Action | string | No | The job action. | Retry |
| OnExitCode | string | No | The job exit code, which is used together with Action to form a job retry rule. Valid values: 0 to 255. | 10 |
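Several of the constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not an official client. The function names `array_task_indexes` and `check_request` are hypothetical. The nesting of `Image` and `Arg` under `Container` and the requirement that `IndexStep` be positive are assumptions, because the flattened table does not state them explicitly.

```python
import re
from typing import List

# JobName rule from the table: lowercase letters, digits, and hyphens,
# with no hyphen at the start or end.
JOB_NAME_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?")


def array_task_indexes(index_start: int, index_end: int, index_step: int) -> List[int]:
    """Expand IndexStart/IndexEnd/IndexStep into the subtask index values."""
    for name, value in (("IndexStart", index_start), ("IndexEnd", index_end)):
        if not 0 <= value <= 4999:
            raise ValueError(f"{name} must be in the range 0 to 4999")
    if index_end < index_start:
        raise ValueError("IndexEnd must be greater than or equal to IndexStart")
    if index_step < 1:
        # Assumption: the table gives no range for IndexStep.
        raise ValueError("IndexStep must be at least 1")
    return list(range(index_start, index_end + 1, index_step))


def check_request(params: dict) -> List[str]:
    """Return a list of problems found in a SubmitServerlessJob request dict."""
    problems = []
    for name in ("ClusterId", "JobName", "Container"):
        if name not in params:
            problems.append(f"{name} is required")
    job_name = params.get("JobName")
    if job_name is not None and not JOB_NAME_RE.fullmatch(job_name):
        problems.append("JobName has invalid characters or a leading/trailing hyphen")
    # Assumption: Image and Arg are nested under Container.
    container = params.get("Container") or {}
    if "Container" in params and "Image" not in container:
        problems.append("Container.Image is required")
    if len(container.get("Arg", [])) > 10:
        problems.append("Container.Arg accepts at most 10 arguments")
    priority = params.get("JobPriority")
    if priority is not None and not 0 <= priority <= 999:
        problems.append("JobPriority must be in the range 0 to 999")
    if params.get("SpotStrategy") == "SpotWithPriceLimit" and "SpotPriceLimit" not in params:
        problems.append("SpotPriceLimit is required when SpotStrategy is SpotWithPriceLimit")
    return problems


# Example request assembled from the Example column of the table.
request = {
    "ClusterId": "ehpc-hz-FYUr32****",
    "JobName": "testjob",
    "ArrayProperties": {"IndexStart": 1, "IndexEnd": 5, "IndexStep": 2},
    "Container": {
        "Image": "registry-vpc.cn-hangzhou.aliyuncs.com/ehpc/hpl:latest",
        "Command": ["python3"],
        "Arg": ["hello.py"],
    },
}
print(check_request(request))       # prints []
print(array_task_indexes(1, 5, 2))  # prints [1, 3, 5]
```

Inside the container, each subtask would read its own index from the `EHPC_ARRAY_TASK_ID` environment variable, for example with `os.environ["EHPC_ARRAY_TASK_ID"]`.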
Response parameters
Examples
Sample success responses
JSON format
{
"JobId": 10,
"RequestId": "04F0F334-1335-436C-A1D7-6C044FE73368"
}
Error codes
| HTTP status code | Error code | Error message | Description |
|---|---|---|---|
| 400 | InvalidParams | The specified parameter %s is invalid. | The specified parameter %s is invalid. |
| 400 | NotEnabled | You have not enabled this service | You have not enabled this service |
| 400 | InDebt | Your account has overdue payments. | Your account has overdue payments. |
| 403 | InvalidClusterStatus | The operation failed due to invalid cluster status. | The cluster status does not support the operation. |
| 403 | ConflictOpt | A conflicting operation is running. | A conflicting operation is running. Please try again later. |
| 404 | ClusterNotFound | The specified cluster does not exist. | The specified instance does not exist. |
| 406 | EcsError | An error occurred while calling the ECS API operation. | ECS API call error. %s |
| 406 | DbError | A database service error occurred. | Database request failed. |
| 406 | AliyunError | An Alibaba Cloud product error occurred. | Alibaba Cloud product error. %s |
| 406 | AgentError | The agent service request failed: %s | Operation unsuccessful: %s |
| 406 | ServiceAPIError | Failed to call the operation. Cause:%s | An error occurred while calling the API. %s |
| 407 | NotAuthorized | You are not authorized by RAM for this request. | The request is not authorized by RAM. |
| 409 | PartFailure | Part of the batch operation failed. | Part of the batch operation failed. |
| 500 | UnknownError | An unknown error occurred. | An unknown error occurred. |
| 503 | ServiceUnavailable | The request has failed due to a temporary failure of the server | The request has failed due to a temporary failure of the server. |
For a complete list of error codes, see Service error codes.
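A caller may want to treat the error codes in the table above differently depending on whether a later attempt could succeed. The following minimal sketch copies the code-to-status mapping from the table and marks `ConflictOpt` and `ServiceUnavailable` as worth retrying, because their descriptions mention trying again later or a temporary failure. That grouping is a judgment call, not documented behavior, and the names `ERROR_STATUS`, `RETRY_LATER`, and `classify_error` are hypothetical.

```python
# HTTP status code for each error code, copied from the table above.
ERROR_STATUS = {
    "InvalidParams": 400,
    "NotEnabled": 400,
    "InDebt": 400,
    "InvalidClusterStatus": 403,
    "ConflictOpt": 403,
    "ClusterNotFound": 404,
    "EcsError": 406,
    "DbError": 406,
    "AliyunError": 406,
    "AgentError": 406,
    "ServiceAPIError": 406,
    "NotAuthorized": 407,
    "PartFailure": 409,
    "UnknownError": 500,
    "ServiceUnavailable": 503,
}

# Assumption: only these two codes describe a transient condition.
RETRY_LATER = frozenset({"ConflictOpt", "ServiceUnavailable"})


def classify_error(error_code: str) -> str:
    """Return 'retry', 'fail', or 'unknown' for an error code."""
    if error_code not in ERROR_STATUS:
        return "unknown"
    if error_code in RETRY_LATER:
        return "retry"
    return "fail"


for code in ("ConflictOpt", "InvalidParams", "SomethingElse"):
    print(code, classify_error(code))
```

Codes classified as `fail`, such as `InvalidParams` or `NotAuthorized`, generally call for a change to the request or to the account's permissions rather than a retry.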
Change history
| Change time | Summary of changes | Operation |
|---|---|---|
| 2023-09-06 | The error codes changed. The request parameters changed. | View Change Details |
| 2023-07-25 | The error codes changed. | View Change Details |
| 2023-07-21 | The error codes changed. | View Change Details |
