Submits a job in a cluster.

Description

Before you submit a job in a cluster, you must upload a job data file, for example, job.sh, to the cluster. For more information, see CreateJobFile.

Debugging

OpenAPI Explorer automatically calculates the signature value. For your convenience, we recommend that you call this operation in OpenAPI Explorer. OpenAPI Explorer dynamically generates the sample code of the operation for different SDKs.

Request parameters

Parameter Type Required Example Description
Action String Yes SubmitJob

The operation that you want to perform. Set the value to SubmitJob.

ClusterId String Yes ehpc-hz-FYUr32****

The ID of the cluster.

You can call the ListClusters operation to query the cluster ID.

CommandLine String Yes ./LammpsTest/lammps.pbs

The command that is used to run the job.

RunasUser String Yes root

The name of the user that runs the job.

You can call the ListUsers operation to query the users of the cluster.

RunasUserPassword String Yes 12****

The password of the user.

Name String No job1

The name of the job. The name must be 6 to 30 characters in length and start with a letter. It can contain letters, digits, and periods (.).
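The naming rule above can be checked locally before the request is sent. The following is a minimal sketch based only on the rule as stated; the helper name `is_valid_job_name` is hypothetical and is not part of any SDK. Note that the example value job1 is shorter than six characters, so the service may be more lenient than this check.

```python
import re

# Pattern derived only from the stated rule: starts with a letter, followed by
# letters, digits, or periods, for a total length of 6 to 30 characters.
_JOB_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9.]{5,29}")


def is_valid_job_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return _JOB_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_job_name("lammps.job1")` passes, while names that start with a digit or contain an underscore do not.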

Priority Integer No 0

The priority of the job. Valid values: 0 to 9. A large value indicates a high priority.

Default value: 0

PackagePath String No ./Tem

The path that is used to run the job.

StdoutRedirectPath String No ./LammpsTest

The output file path of stdout.

StderrRedirectPath String No ./LammpsTest

The output file path of stderr.

ReRunable Boolean No false

Specifies whether the job can be rerun. Valid values:

  • true: The job can be rerun.
  • false: The job cannot be rerun.

ArrayRequest String No 1-10:2

The job array.

Format: X-Y:Z. X is the minimum index value. Y is the maximum index value. Z is the step size. For example, 2-7:2 indicates that three jobs need to be run and their index values are 2, 4, and 6.
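The X-Y:Z format can be expanded into the individual index values as follows. This is an illustrative sketch; the function name `expand_array_request` is hypothetical, and the scheduler performs the actual expansion.

```python
import re
from typing import List


def expand_array_request(spec: str) -> List[int]:
    """Expand an X-Y:Z job array specification into its index values."""
    match = re.fullmatch(r"(\d+)-(\d+):(\d+)", spec)
    if match is None:
        raise ValueError(f"expected X-Y:Z, got {spec!r}")
    start, stop, step = (int(group) for group in match.groups())
    if step < 1 or start > stop:
        raise ValueError(f"invalid range or step in {spec!r}")
    # Y is inclusive, so the range must extend one past it.
    return list(range(start, stop + 1, step))
```

With the example from the description, `expand_array_request("2-7:2")` yields the indexes 2, 4, and 6.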

Variables String No [{Name:,Value:},{Name:,Value:}]

The runtime variables passed to the job. They can be accessed by using environment variables in the executable file.

InputFileUrl String No https://ehpc-hangzhou.oss-cn-hangzhou.aliyuncs.com/test-u4****/testlist_ehpc.sh

The URL of the job files that are uploaded to an Object Storage Service (OSS) bucket.

UnzipCmd String No tar xzf

The command that is used to decompress the job files downloaded from an OSS bucket.

PostCmdLine String No example.sh

The command that is used to perform subsequent operations on the job after the job is submitted.

ContainerId String No ehpc-container-uerfrfffff****

The ID of the container application. If you want to use a container application, you must specify its ID.

You can call the ListContainerApps operation to query the container application ID.

JobQueue String No workq

The name of the queue in which the job is run.

You can call the ListQueues operation to query the queue name.

Node Integer No 2

The number of compute nodes required to run the job.

Note: If this parameter is not specified, the Task, Thread, Mem, and Gpu parameters do not take effect.

Task Integer No 2

The number of tasks required by a single compute node.

Thread Integer No 1

The number of threads required by a single compute node.

Mem String No 1GB

The maximum memory usage required by a single compute node. Unit: GB, MB, or KB. The unit is case-insensitive.
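Because the unit is case-insensitive, a client may want to normalize the value before it is sent. The sketch below assumes that the amount is a whole number written directly before the unit, as in the example 1GB; whether the service accepts decimals or spaces is not stated here. The helper name `normalize_mem` is hypothetical.

```python
import re

# Assumption: an integer amount immediately followed by GB, MB, or KB.
_MEM_RE = re.compile(r"(\d+)(GB|MB|KB)", re.IGNORECASE)


def normalize_mem(value: str) -> str:
    """Validate a Mem value and return it with an uppercase unit."""
    match = _MEM_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"expected <number><GB|MB|KB>, got {value!r}")
    amount, unit = match.groups()
    return f"{int(amount)}{unit.upper()}"
```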

Gpu Integer No 1

The maximum GPU usage required by a single compute node.

This parameter takes effect only when the cluster uses PBS and the compute node is a GPU-accelerated instance.
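The dependency of Task, Thread, Mem, and Gpu on Node can be checked on the client side before a request is sent. The following sketch assumes that the request parameters are held in a plain dictionary keyed by parameter name; the helper `ignored_resource_params` is hypothetical.

```python
RESOURCE_KEYS = ("Task", "Thread", "Mem", "Gpu")


def ignored_resource_params(params: dict) -> list:
    """Return the per-node resource parameters that have no effect
    because Node is not specified."""
    if params.get("Node") is not None:
        return []
    return [key for key in RESOURCE_KEYS if params.get(key) is not None]
```

A non-empty result indicates that Node should be added or that the listed parameters can be removed.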

ClockTime String No 12:00:00

The maximum running time of the job. Format:

  • hh:mm:ss
  • mm:ss
  • ss

We recommend that you use the hh:mm:ss format. For example, if the maximum running time is 12 hours, set the value to 12:00:00.
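A value in any of the three accepted formats can be converted to the recommended hh:mm:ss form. The sketch below is illustrative only. It assumes that fields larger than 59 carry over into the next unit, which this reference does not specify, and the function name `to_hhmmss` is hypothetical.

```python
def to_hhmmss(value: str) -> str:
    """Convert an hh:mm:ss, mm:ss, or ss value to hh:mm:ss."""
    parts = value.split(":")
    if not 1 <= len(parts) <= 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected hh:mm:ss, mm:ss, or ss, got {value!r}")
    # Pad on the left so that the last field is always seconds.
    hours, minutes, seconds = [0] * (3 - len(parts)) + [int(p) for p in parts]
    # Assumption: oversized fields (for example, 90 minutes) carry over.
    total = hours * 3600 + minutes * 60 + seconds
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"
```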

Response parameters

Parameter Type Example Description
JobId String 1.manager

The ID of the job.

RequestId String 04F0F334-1335-436C-A1D7-6C044FE7****

The ID of the request.

Examples

Sample requests

https://ehpc.cn-hangzhou.aliyuncs.com/?Action=SubmitJob
&ClusterId=ehpc-hz-FYUr32****
&CommandLine=./LammpsTest/lammps.pbs
&RunasUser=root
&RunasUserPassword=12****
&<Common request parameters>

Sample success responses

XML format

<SubmitJobResponse>
      <RequestId>04F0F334-1335-436C-A1D7-6C044FE7****</RequestId>
      <JobId>1.manager</JobId>
</SubmitJobResponse>

JSON format

{
    "RequestId": "04F0F334-1335-436C-A1D7-6C044FE7****",
    "JobId": "1.manager"
}
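The sample request and the JSON response above can be assembled and read with the Python standard library. This is a minimal sketch, not an SDK call: it does not add the common request parameters or the signature, so the URL it produces is not a complete, callable request. The function names are hypothetical, and sending Boolean values as lowercase true or false is an assumption based on the ReRunable example.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://ehpc.cn-hangzhou.aliyuncs.com/"


def build_submit_job_url(cluster_id, command_line, runas_user,
                         runas_user_password, **optional):
    """Return the unsigned SubmitJob URL (common parameters are NOT added)."""
    params = {
        "Action": "SubmitJob",
        "ClusterId": cluster_id,
        "CommandLine": command_line,
        "RunasUser": runas_user,
        "RunasUserPassword": runas_user_password,
    }
    for key, value in optional.items():
        if value is None:
            continue  # Skip optional parameters that were not set.
        # Assumption: Booleans are sent as lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[key] = value
    return ENDPOINT + "?" + urlencode(params)


def parse_submit_job_response(body):
    """Return (JobId, RequestId) from a JSON success response."""
    data = json.loads(body)
    return data["JobId"], data["RequestId"]
```

Optional parameters such as JobQueue or Node are passed as keyword arguments, for example `build_submit_job_url(..., JobQueue="workq", Node=2)`.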

Error codes

HttpCode | Error code | Error message | Description
400 | InvalidParams | The specified parameter %s is invalid. | The error message returned because the following parameter is invalid: %s.
400 | NotEnabled | You have not enabled this service | The error message returned because the service has not been activated for your account.
400 | InDebt | Your account has overdue payments. | The error message returned because your account has overdue payments.
403 | InvalidClusterStatus | The operation failed due to invalid cluster status. | The error message returned because the operation is not supported while the cluster is in the current state.
403 | ConflictOpt | A conflicting operation is running. | The error message returned because an operation that conflicts with the current operation is in progress. Try again later.
403 | UsernameExist | The username already exists. | The error message returned because the username already exists.
403 | IncorrectCredential | The username or password is incorrect. | The error message returned because the username or password is invalid.
404 | ClusterNotFound | The specified cluster does not exist. | The error message returned because the specified cluster does not exist.
404 | ContainerNotFound | The specified container does not exist. | The error message returned because the specified container application does not exist.
407 | NotAuthorized | You are not authorized by RAM for this request. | The error message returned because you are not authorized by RAM for this request.
406 | AgentError | The agent service request failed. | The error message returned because the proxy request has failed.
406 | AgentError.Job.SubmitFailure | Failed to submit jobs: %s | The error message returned because the following jobs have failed to be submitted: %s.
406 | AgentError.Job.InvalidContainerType | Unsupported container type: %s. | The error message returned because the type of the specified container application is invalid: %s.
409 | PartFailure | Part of the batch operation failed. | The error message returned because the batch operation has failed.
500 | UnknownError | An unknown error occurred. | The error message returned because an unknown error has occurred.
404 | ManagerNotFound | The manager nodes do not exist or their status is abnormal. | The error message returned because the management node does not exist or is not running as expected.
403 | AgentError.Account.ValidateCredentialFailure | Username or password verification failed. | The error message returned because the username or password has failed to be verified.
406 | AliyunError | An Alibaba Cloud product error occurred. | The error message returned because the operation has failed to call another Alibaba Cloud service.
503 | ServiceUnavailable | The request has failed due to a temporary failure of the server | The error message returned because the request has failed. The service is temporarily unavailable.

For a list of error codes, visit the API Error Center.
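Two of the error codes listed for this operation describe transient conditions: ConflictOpt (try again later) and ServiceUnavailable (temporary server failure). A client might treat only these as retryable. The sketch below reflects that reading of the descriptions and is not a documented retry policy; the names and the exponential delay schedule are assumptions.

```python
# Assumption: only these codes describe conditions that may clear on their own.
RETRYABLE_CODES = frozenset({"ConflictOpt", "ServiceUnavailable"})


def is_retryable(error_code: str) -> bool:
    """Return True if the error code describes a transient condition."""
    return error_code in RETRYABLE_CODES


def backoff_delays(attempts: int, base: float = 1.0) -> list:
    """Return exponentially growing wait times, in seconds, for each retry."""
    return [base * 2 ** i for i in range(attempts)]
```

Errors such as IncorrectCredential or InvalidParams indicate a problem with the request itself, so resubmitting the same request is unlikely to succeed.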