Batch Compute: Quick start for CLI 2

Last Updated: Jun 13, 2024

This section describes how to use the Batch Compute-cli tool to submit a job that counts the occurrences of “INFO”, “WARN”, “ERROR”, and “DEBUG” in a log file.

Preparation

Note: Make sure that you have signed up for the Batch Compute service in advance.

Contents:

  • Install and configure the Batch Compute-cli tool

  • Prepare a job

    • Upload the data file to the OSS

    • Prepare task programs

  • Submit the job

  • Check the job running status

  • Check the job execution result

To install and configure the Batch Compute-cli tool, see Preparation.

2. Prepare a job

The job aims to count the occurrences of “INFO”, “WARN”, “ERROR”, and “DEBUG” in a log file.

This job contains the following tasks:

  • The split task is used to divide the log file into three parts.

  • The count task counts how many times “INFO”, “WARN”, “ERROR”, and “DEBUG” appear in each part of the log file. InstanceCount must be set to 3 for the count task so that three instances run concurrently, one for each part.

  • The merge task merges all the results of the count task.
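The three stages above can be sketched locally in a few lines of Python. This is only an illustration of the split, count, and merge pattern; the function names and the way the input is chunked are assumptions, not the contents of the split.py, count.py, and merge.py programs shipped with this example.

```python
# Illustrative sketch only: NOT the task programs from log-count.tar.gz.
LEVELS = ("INFO", "WARN", "ERROR", "DEBUG")


def split(lines, parts=3):
    """Divide the lines into `parts` consecutive, nearly equal chunks."""
    size, extra = divmod(len(lines), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(lines[start:end])
        start = end
    return chunks


def count(chunk):
    """Count the occurrences of each keyword in one chunk."""
    totals = {level: 0 for level in LEVELS}
    for line in chunk:
        for level in LEVELS:
            totals[level] += line.count(level)
    return totals


def merge(partials):
    """Add the per-chunk counts together into a single result."""
    merged = {level: 0 for level in LEVELS}
    for partial in partials:
        for level, n in partial.items():
            merged[level] += n
    return merged


if __name__ == "__main__":
    sample = ["INFO start", "WARN disk", "ERROR fail", "DEBUG x", "INFO done"]
    print(merge(count(chunk) for chunk in split(sample)))
    # prints {'INFO': 2, 'WARN': 1, 'ERROR': 1, 'DEBUG': 1}
```

In the real job, each call to count would correspond to one of the three concurrent instances of the count task.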

Figure: DAG of the job (split -> count -> merge)

2.1. Upload the data file to the OSS

Download the data file used in this example: log-count-data.txt

Upload the log-count-data.txt file to:

oss://your-bucket/log-count/log-count-data.txt
  • your-bucket is the name of the bucket you created. This example uses the cn-shenzhen region.

bcs oss upload ./log-count-data.txt oss://your-bucket/log-count/log-count-data.txt

bcs oss cat oss://your-bucket/log-count/log-count-data.txt  # Check whether the file is uploaded successfully
  • The bcs oss command performs common operations on OSS. Run bcs oss -h to view its help information. We recommend this command only for testing with small amounts of data: it does not yet support multithreading, so uploading or downloading large amounts of data takes a long time. For more information about how to upload data to OSS, see OSS tools.

2.2. Prepare task programs

The task programs used in this example are written in Python. Download the program package: log-count.tar.gz.

Extract the package into a log-count directory:

mkdir log-count && tar -xvf log-count.tar.gz -C log-count

After decompression, the log-count/ directory structure is as follows:

log-count
  |-- conf.py     # Configuration
  |-- split.py    # split task program
  |-- count.py    # count task program
  |-- merge.py    # merge task program

Note: Do not change the task programs.

3. Submit the job

3.1. Write the job configuration

In the parent directory of log-count, create a file named job.cfg with the following content:

[DEFAULT]
job_name=log-count
description=demo
pack=./log-count/
deps=split->count;count->merge

[split]
cmd=python split.py

[count]
cmd=python count.py
nodes=3

[merge]
cmd=python merge.py

The file describes a multi-task job, with tasks executed in the following sequence: split->count->merge.

  • For more information about task description in a .cfg file, see Multiple tasks.
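To see how the deps line determines the execution order, the following sketch reads the job.cfg content above with Python's standard configparser module and sorts the tasks with graphlib (Python 3.9 or later). It is not how the bcs tool itself parses the file. The helper name task_order and the fallback of one node for tasks without a nodes entry are assumptions made for illustration.

```python
import configparser
from graphlib import TopologicalSorter

# Same content as the job.cfg shown above.
JOB_CFG = """\
[DEFAULT]
job_name=log-count
description=demo
pack=./log-count/
deps=split->count;count->merge

[split]
cmd=python split.py

[count]
cmd=python count.py
nodes=3

[merge]
cmd=python merge.py
"""


def task_order(cfg_text):
    """Return (task, cmd, nodes) tuples in dependency order (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Every non-DEFAULT section is treated as one task.
    sorter = TopologicalSorter({name: set() for name in parser.sections()})
    for edge in parser["DEFAULT"]["deps"].split(";"):
        if not edge.strip():
            continue
        before, after = (part.strip() for part in edge.split("->"))
        sorter.add(after, before)  # `after` runs only once `before` is done
    return [
        # Assumption: a task without a nodes entry uses a single node.
        (name, parser[name]["cmd"], parser[name].getint("nodes", fallback=1))
        for name in sorter.static_order()
    ]


if __name__ == "__main__":
    for name, cmd, nodes in task_order(JOB_CFG):
        print(f"{name}: {cmd} (nodes={nodes})")
```

The sketch only handles pairwise edges of the form a->b separated by semicolons, which is the form used in this example.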

3.2. Submit the job

bcs sub --file job.cfg -r oss://your-bucket/log-count/:/home/input -w oss://your-bucket/log-count/:/home/output
  • In the command, -r attaches an OSS directory to a local directory in read-only mode, and -w maps a writable local directory to an OSS directory. Both options take the form oss-path:local-directory. For more information, see Access data on OSS.

  • The same OSS path can be attached to different local directories, but different OSS paths cannot be attached to the same local directory.
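The attachment rule above can be checked before submitting a job. The following sketch is a local pre-check under stated assumptions: parse_mount and find_mount_conflicts are hypothetical helpers, not part of the bcs tool, and the only format assumed is the oss-path:local-directory form used in the command above.

```python
def parse_mount(spec):
    """Split 'oss://bucket/path/:/local/dir' into (oss_path, local_dir)."""
    # The OSS path itself contains ':' (in 'oss://'), so split on the last one.
    oss_path, _, local_dir = spec.rpartition(":")
    return oss_path, local_dir


def find_mount_conflicts(mounts):
    """Return the local directories that more than one OSS path is attached to."""
    by_local = {}
    for oss_path, local_dir in mounts:
        by_local.setdefault(local_dir.rstrip("/"), set()).add(oss_path)
    return {
        local: sorted(paths)
        for local, paths in by_local.items()
        if len(paths) > 1
    }


if __name__ == "__main__":
    specs = [
        "oss://your-bucket/log-count/:/home/input",
        "oss://your-bucket/log-count/:/home/output",
    ]
    # Same OSS path, two different local directories: no conflict is reported.
    print(find_mount_conflicts(parse_mount(s) for s in specs))
```

An empty dictionary means that no local directory is shared by different OSS paths.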

4. Check the job running status

bcs j   # Obtain the job list. The list obtained each time is cached. Generally, the first job in the cache is the one you just submitted.
bcs ch 1  # Check the status of the first job in the cache.
bcs log 1 # Check the log of the first job in the cache.

5. Check the job execution result

After the job finishes, run the following command to check the result on OSS.

bcs oss cat oss://your-bucket/log-count/merge_result.json

The expected result is as follows:

{"INFO": 2460, "WARN": 2448, "DEBUG": 2509, "ERROR": 2583}
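If you save the command output to a string or file, a short Python check can compare it with the expected result without relying on key order. This is a sketch that assumes the output contains only the JSON document shown above; check_result is a hypothetical helper name.

```python
import json

# Expected counts, copied from the result shown above.
EXPECTED = {"INFO": 2460, "WARN": 2448, "DEBUG": 2509, "ERROR": 2583}


def check_result(text, expected=EXPECTED):
    """Return {key: (expected, actual)} for every key whose count differs."""
    actual = json.loads(text)
    keys = sorted(set(expected) | set(actual))
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }


if __name__ == "__main__":
    output = '{"INFO": 2460, "WARN": 2448, "DEBUG": 2509, "ERROR": 2583}'
    differences = check_result(output)
    print("OK" if not differences else differences)
```

An empty dictionary means the job produced the expected counts.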