
Submit job in a Docker container

Last Updated: Dec 26, 2022

Contents:

  • Before you start

  • Prepare a job

    • Upload data file to OSS

    • Prepare task programs

  • Submit job

    • Compile job configuration

    • Run command to submit job

  • Check job running status

  • Check job execution result

1. Before you start

In Batch Compute, the procedure for submitting a job in a Docker container is similar to that in an environment without a Docker container, except for the following differences:

  • The image specified by ImageId must support Docker.

    You only need to set the ImageId field in the task description to the ID of the Batch Compute public image that supports Docker (img-ubuntu), or specify the ID of a cluster created from that image.

  • The following two environment variables (EnvVars) are added to the task description:

    Parameter                                 Description                                          Required
    BATCH_COMPUTE_DOCKER_IMAGE                Docker image name                                    No
    BATCH_COMPUTE_DOCKER_REGISTRY_OSS_PATH    Storage path of the Docker image in OSS-Registry     No

    • If the task description does not contain the BATCH_COMPUTE_DOCKER_IMAGE parameter, no Docker container is used. In this case, the BATCH_COMPUTE_DOCKER_REGISTRY_OSS_PATH parameter is ignored.

    • If the task description contains BATCH_COMPUTE_DOCKER_IMAGE, a Docker container is used (see the sketch after this list).
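A minimal sketch of how these fields could appear in a task description, written as a Python dictionary. The ImageId value and the two EnvVars keys come from this section; the surrounding structure and the example image name and OSS path are illustrative assumptions, not the exact Batch Compute API schema.

# Sketch only: ImageId and the two EnvVars keys are from this section;
# the surrounding structure and example values are assumptions for illustration.
task_description = {
    "ImageId": "img-ubuntu",  # Batch Compute public image that supports Docker
    "EnvVars": {
        # Name of the Docker image to run the task in (optional)
        "BATCH_COMPUTE_DOCKER_IMAGE": "localhost:5000/myubuntu",
        # Storage path of the Docker image in OSS-Registry (optional)
        "BATCH_COMPUTE_DOCKER_REGISTRY_OSS_PATH": "oss://your-bucket/dockers/",
    },
}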

2. Prepare a job

The example job, written in Python, counts the number of times INFO, WARN, ERROR, and DEBUG appear in a log file.

This job contains the following tasks:

  • The split task is used to divide the log file into three parts.

  • The count task is used to count the number of times INFO, WARN, ERROR, and DEBUG appear in each part of the log file. In the count task, InstanceCount must be set to 3, indicating that three count task instances run concurrently, one per part.

  • The merge task is used to merge all the count results.

DAG of the log-count job (split -> count -> merge)

2.1. Upload data file to OSS

Download the data file used in this example: log-count-data.txt.

Upload the log-count-data.txt file to oss://your-bucket/log-count/log-count-data.txt.

  • your-bucket indicates the bucket you created. In this example, the region is cn-shenzhen.

  • To upload the file to OSS, see Upload files to OSS, or use the SDK sketch below.
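If you prefer to script the upload, the following is a minimal sketch using the oss2 Python SDK. The SDK choice and the credential and endpoint placeholders are assumptions; any upload method works.

# Minimal upload sketch using the oss2 SDK; replace the placeholders with your own values.
import oss2

auth = oss2.Auth("<your-access-key-id>", "<your-access-key-secret>")
# cn-shenzhen endpoint, matching the region used in this example
bucket = oss2.Bucket(auth, "https://oss-cn-shenzhen.aliyuncs.com", "your-bucket")

# Upload the local data file to oss://your-bucket/log-count/log-count-data.txt
bucket.put_object_from_file("log-count/log-count-data.txt", "log-count-data.txt")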

2.2. Prepare task programs

The job program used in this example is written in Python. Download the program: log-count.tar.gz.

Decompress the program package into the following directory:

mkdir log-count && tar -xvf log-count.tar.gz -C log-count

After decompression, the log-count/ directory structure is as follows:

log-count
   |-- conf.py     # Configuration
   |-- split.py    # split task program
   |-- count.py    # count task program
   |-- merge.py    # merge task program

Note: Do not change the task programs.
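For orientation only, the counting that the count task performs could look like the sketch below. This is not the packaged count.py (its input paths and conf.py settings are defined in the archive); keep using the downloaded programs as-is.

# Illustrative sketch of counting log levels; NOT the packaged count.py.
# The input path is a placeholder; the real program uses the paths from conf.py.
from collections import Counter

LEVELS = ("INFO", "WARN", "ERROR", "DEBUG")

def count_levels(path):
    counts = Counter()
    with open(path) as f:
        for line in f:
            for level in LEVELS:
                if level in line:
                    counts[level] += 1
    return counts

if __name__ == "__main__":
    print(dict(count_levels("log-count-data.txt")))  # placeholder input file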

3. Submit job

You can submit the job by using the Python SDK, the Java SDK, or the console. In this example, the job is submitted with the command line tool.

3.1. Compile job configuration

In the parent directory of log-count (that is, next to the log-count directory), create a file named job.cfg with the following content:

[DEFAULT]
job_name=log-count
description=demo
pack=./log-count/
deps=split->count;count->merge

[split]
cmd=python split.py

[count]
cmd=python count.py
nodes=3

[merge]
cmd=python merge.py

The file describes a multi-task job, with tasks executed in the following sequence: split->count->merge.
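At this point, the layout on disk should look like this (only the placement of job.cfg next to the log-count directory matters):

.
|-- job.cfg        # Job configuration created in this step
|-- log-count
   |-- conf.py
   |-- split.py
   |-- count.py
   |-- merge.py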

3.2. Run command to submit job

bcs sub --file job.cfg -r oss://your-bucket/log-count/:/home/input -w oss://your-bucket/log-count/:/home/output --docker localhost:5000/myubuntu@oss://your-bucket/dockers/

  • In the command, -r and -w indicate read-only directory attaching and writable directory mapping respectively. For more information, see OSS directory attaching.

  • The same OSS path can be attached to different local directories, but different OSS paths cannot be attached to the same local directory.

  • --docker means that a Docker image is used, with the value in the following format: image_name@storage_oss_path. The command line tool passes the Docker image name and the OSS-Registry storage path to the job through the environment variables described in section 1, as shown below.
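For the command above, this corresponds to the following values in the task description (derived from the image_name@storage_oss_path format; the exact plumbing is an assumption about the tool's behavior):

BATCH_COMPUTE_DOCKER_IMAGE=localhost:5000/myubuntu
BATCH_COMPUTE_DOCKER_REGISTRY_OSS_PATH=oss://your-bucket/dockers/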

Note: The region specified for BCS must be the same as the region where the Docker image is stored in OSS.

4. Check job running status

bcs j     # List jobs. The list is cached; the first job in the cache is usually the one you just submitted.
bcs ch 1  # Check the status of the first job in the cache.
bcs log 1 # Check the log of the first job in the cache.

5. Check job execution result

After the job is executed, run the following command to check the result on OSS.

bcs oss cat oss://your-bucket/log-count/merge_result.txt

The expected result is as follows:

{"INFO": 2460, "WARN": 2448, "DEBUG": 2509, "ERROR": 2583}