Platform For AI: Manage training jobs

Last Updated: Jan 08, 2024

After you create a training job, you can stop, clone, share, or delete the job, or generate a script for it.

Stop a training job

If a job has an invalid configuration, conflicts with other jobs, or has been running for a long period of time, you can stop the job. To stop a job, go to the Distributed Training Jobs page and click Stop in the Actions column of the job.

Clone a training job

If you want to create a new job that uses the same configurations as an existing job, you can clone the existing job to save time. To clone a job, click Clone in the Actions column of the job.

Share a training job

To share a job, click Share on the Details page and send the job link to other workspace members so that they can view the progress, logs, and results of the training job.

Generate a script for a training job

If you want to submit the same training job by using the Deep Learning Containers (DLC) client, click Generate Script on the Details page to generate a command-line script for the job. You can then use the commands in the script to submit the job. For more information about how to use the DLC client, see The DLC client.
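
If you save the generated script to a file, you may want to review its commands before you run them. The following Python sketch shows one way to do that. It assumes that the script was saved locally under a file name of your choice (submit_job.sh is a hypothetical name) and that the script contains plain commands, optionally split across lines with a trailing backslash. The helper names load_commands and run_commands are illustrative and are not part of the DLC client.

```python
import shlex
import subprocess
from pathlib import Path


def load_commands(script_path):
    """Return the commands in a saved script, skipping blank and comment lines."""
    text = Path(script_path).read_text()
    # Join lines that were split with a trailing backslash.
    text = text.replace("\\\n", " ")
    commands = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            commands.append(stripped)
    return commands


def run_commands(commands, dry_run=True):
    """Print each command; run it only when dry_run is False.

    Returns the captured standard output of every command that was run.
    """
    outputs = []
    for command in commands:
        print(f"$ {command}")
        if dry_run:
            continue
        # check=True raises CalledProcessError if the command exits with a non-zero code.
        completed = subprocess.run(
            shlex.split(command), check=True, capture_output=True, text=True
        )
        outputs.append(completed.stdout)
    return outputs
```

Calling run_commands(load_commands("submit_job.sh")) only prints the commands. Pass dry_run=False after you have reviewed them and confirmed that the DLC client is installed and configured on the machine.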

Delete a training job

To free up storage space and resources, you can delete training jobs that you no longer need. To delete a job, click Delete in the Actions column of the job.

Warning

DLC jobs that are deleted cannot be restored. Perform the delete operation with caution.