
Batch Compute: Quick start for console

Last Updated: Feb 20, 2024

This section describes how to submit a job from the console. The job counts how many times INFO, WARN, ERROR, and DEBUG appear in a log file.

Note

Make sure that you have signed up for the Batch Compute service in advance.

Contents:

  1. Prepare a job.

    1.1. Upload the data file to OSS.

    1.2. Upload the task program to OSS.

  2. Use the console to submit the job.

  3. Check the job status.

  4. Check the result.

1. Prepare a job

As described above, the job counts how many times INFO, WARN, ERROR, and DEBUG appear in a log file.

This job contains the following tasks:

  • The split task divides the log file into three parts.

  • The count task counts the number of times INFO, WARN, ERROR, and DEBUG appear in each part of the log file. For this task, InstanceCount must be set to 3, meaning that three count instances run concurrently. (A sketch of the counting logic follows this list.)

  • The merge task merges all the count results.
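
This is not the actual count.py shipped in log-count.tar.gz; it is only a minimal sketch of what the count step does, assuming each log line contains one of the four level keywords:

import json
from collections import Counter

LEVELS = ("INFO", "WARN", "ERROR", "DEBUG")

def count_levels(path):
    """Count how often each log level appears in one chunk of the log."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            for level in LEVELS:
                if level in line:
                    counts[level] += 1
    return dict(counts)

if __name__ == "__main__":
    # "chunk.txt" is a placeholder; the real task reads its split from
    # the InputMapping directory (/home/input/).
    print(json.dumps(count_levels("chunk.txt")))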

The DAG of the job is: split → count → merge.
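
The same structure appears in the Dependencies field of the job JSON later in this section: each task maps to the tasks that run after it. As a quick illustration (not part of the sample package), here is that mapping and a derived execution order in Python:

# Mirrors the "Dependencies" field of the job JSON below.
deps = {"split": ["count"], "count": ["merge"], "merge": []}

def topo_order(graph):
    """Return tasks in an order where every task precedes its successors."""
    indegree = {t: 0 for t in graph}
    for successors in graph.values():
        for s in successors:
            indegree[s] += 1
    order = []
    ready = [t for t, d in indegree.items() if d == 0]
    while ready:
        t = ready.pop()
        order.append(t)
        for s in graph[t]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return order

print(topo_order(deps))  # ['split', 'count', 'merge']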

1.1 Upload data file to OSS

Download the data file used in this example: log-count-data.txt

Upload the log-count-data.txt file to:

oss://your-bucket/log-count/log-count-data.txt

  • your-bucket is a bucket that you created yourself. In this example, the bucket is assumed to be in the cn-shenzhen region.

  • For more information about how to upload data to OSS, see Upload files to OSS and Common OSS tools. A scripted alternative is sketched after this list.
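
If you prefer to script the upload instead of using the console or an OSS tool, a minimal sketch with the OSS Python SDK (oss2) could look like the following. The credentials, endpoint, and bucket name are placeholders that you must replace:

import oss2

# Placeholders: replace with your own credentials and bucket.
auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
bucket = oss2.Bucket(auth, "https://oss-cn-shenzhen.aliyuncs.com", "your-bucket")

# Upload the data file to oss://your-bucket/log-count/log-count-data.txt.
bucket.put_object_from_file("log-count/log-count-data.txt", "log-count-data.txt")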

1.2 Upload task program to OSS

The job program used in this example is written in Python. Download the program: log-count.tar.gz.

You do not need to modify the sample code for this example. Upload log-count.tar.gz directly to OSS, for example to:

oss://your-bucket/log-count/log-count.tar.gz

The upload method is the same as described in the previous step.

Note

  • Batch Compute supports only compressed packages with the .tar.gz extension. Make sure that you package them with gzip as shown below; otherwise, the package cannot be parsed.

  • If you need to modify the code, decompress the package, modify the code, and then repack it with the following commands:

$ cd log-count                  # Switch to the source directory.
$ tar -czf log-count.tar.gz *   # Pack all files in this directory into log-count.tar.gz.

You can run the following command to check the content of the compressed package:

$ tar -tf log-count.tar.gz

The following list is displayed:

conf.py
count.py
merge.py
split.py
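
Equivalently, you can build and inspect the archive from Python with the standard tarfile module (a sketch, not part of the sample package; run it inside the log-count directory):

import tarfile

# Build log-count.tar.gz with gzip compression, matching tar -czf above.
with tarfile.open("log-count.tar.gz", "w:gz") as tar:
    for name in ("conf.py", "count.py", "merge.py", "split.py"):
        tar.add(name)  # files are added at the archive root, as required

# List the contents, as tar -tf does.
with tarfile.open("log-count.tar.gz", "r:gz") as tar:
    print(tar.getnames())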

2. Use console to submit job

  1. Log on to the Batch Compute console.

  2. Choose Job List > Submit Job. Select an appropriate region; it must be the same as the region of your bucket.

    Here, AutoCluster is used to submit a job. For AutoCluster, you must configure at least two parameters, including:

    • Available image ID. You can use an image provided by the system or a custom image. For more information about how to create a custom image, see Use an image.

    • InstanceType. For more information about the instance type, see Currently supported instance types.

    To run this example, you also need to set PackagePath (the OSS path to which the packed job program was uploaded; oss://your-bucket/log-count/log-count.tar.gz in this example), as well as StdoutRedirectPath and StderrRedirectPath (the OSS paths for task output and error logs; oss://your-bucket/log-count/logs/ in this example).

    The following shows the JSON template of the job. For more information about the parameters, see the job parameter description.

     {
       "DAG": {
         "Dependencies": {
           "split": [
             "count"
           ],
           "count": [
             "merge"
           ],
           "merge": []
         },
         "Tasks": {
           "split": {
             "InstanceCount": 1,
             "LogMapping": {},
             "AutoCluster": {
               "Configs": {
                 "Networks": {
                   "VPC": {
                     "CidrBlock": "192.168.0.0/16"
                   }
                 }
               },
               "ResourceType": "OnDemand",
               "InstanceType": "ecs.sn1ne.large",
               "ImageId": "img-ubuntu-vpc"
             },
             "Parameters": {
               "Command": {
                 "EnvVars": {},
                 "CommandLine": "python split.py",
                 "PackagePath": "oss://your-bucket/log-count/log-count.tar.gz"
               },
               "InputMappingConfig": {
                 "Lock": true
               },
               "StdoutRedirectPath": "oss://your-bucket/log-count/logs/",
               "StderrRedirectPath": "oss://your-bucket/log-count/logs/"
             },
             "InputMapping": {
               "oss://your-bucket/log-count/": "/home/input/"
             },
             "OutputMapping": {
               "/home/output/": "oss://your-bucket/log-count/"
             },
             "MaxRetryCount": 0,
             "Timeout": 21600,
             "ClusterId": ""
           },
           "merge": {
             "InstanceCount": 1,
             "LogMapping": {},
             "AutoCluster": {
               "Configs": {
                 "Networks": {
                   "VPC": {
                     "CidrBlock": "192.168.0.0/16"
                   }
                 }
               },
               "ResourceType": "OnDemand",
               "InstanceType": "ecs.sn1ne.large",
               "ImageId": "img-ubuntu-vpc"
             },
             "Parameters": {
               "Command": {
                 "EnvVars": {},
                 "CommandLine": "python merge.py",
                 "PackagePath": "oss://your-bucket/log-count/log-count.tar.gz"
               },
               "InputMappingConfig": {
                 "Lock": true
               },
               "StdoutRedirectPath": "oss://your-bucket/log-count/logs/",
               "StderrRedirectPath": "oss://your-bucket/log-count/logs/"
             },
             "InputMapping": {
               "oss://your-bucket/log-count/": "/home/input/"
             },
             "OutputMapping": {
               "/home/output/": "oss://your-bucket/log-count/"
             },
             "MaxRetryCount": 0,
             "Timeout": 21600,
             "ClusterId": ""
           },
           "count": {
             "InstanceCount": 3,
             "LogMapping": {},
             "AutoCluster": {
               "Configs": {
                 "Networks": {
                   "VPC": {
                     "CidrBlock": "192.168.0.0/16"
                   }
                 }
               },
               "ResourceType": "OnDemand",
               "InstanceType": "ecs.sn1ne.large",
               "ImageId": "img-ubuntu-vpc"
             },
             "Parameters": {
               "Command": {
                 "EnvVars": {},
                 "CommandLine": "python count.py",
                 "PackagePath": "oss://your-bucket/log-count/log-count.tar.gz"
               },
               "InputMappingConfig": {
                 "Lock": true
               },
               "StdoutRedirectPath": "oss://your-bucket/log-count/logs/",
               "StderrRedirectPath": "oss://your-bucket/log-count/logs/"
             },
             "InputMapping": {
               "oss://your-bucket/log-count/": "/home/input/"
             },
             "OutputMapping": {
               "/home/output/": "oss://your-bucket/log-count/"
             },
             "MaxRetryCount": 0,
             "Timeout": 21600,
             "ClusterId": ""
           }
         }
       },
       "Description": "batchcompute job",
       "Priority": 0,
       "JobFailOnInstanceFail": true,
       "Type": "DAG",
       "Name": "log-count"
     }
    • Check that all parameters and directories are correct, click Submit Job in the lower left corner, and then click OK.
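
Before you submit, you can sanity-check a locally saved copy of the template. The following is a minimal sketch that fills in your bucket name and confirms that every task referenced in Dependencies is defined (job.json is a hypothetical local file name):

import json

BUCKET = "your-bucket"  # placeholder: your actual bucket name

with open("job.json") as f:
    job = json.loads(f.read().replace("your-bucket", BUCKET))

tasks = job["DAG"]["Tasks"]
deps = job["DAG"]["Dependencies"]

# Every task named in Dependencies must be defined in Tasks.
for task, successors in deps.items():
    assert task in tasks, "undefined task: " + task
    for s in successors:
        assert s in tasks, "undefined successor: " + s

# Each task's PackagePath should point into the bucket you uploaded to.
for name, t in tasks.items():
    path = t["Parameters"]["Command"]["PackagePath"]
    assert path.startswith("oss://" + BUCKET + "/"), name + ": bad PackagePath"

print("template OK:", sorted(tasks))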

3. Check job status

  • Click the newly submitted job log-count in the job list to view the details of this job.

    Job details

  • Click the task name split to view the details of this task.

    Task details

  • Click the green block to view the instance log.

    View the log
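
You can also poll the job state outside the console. The following sketch assumes the Batch Compute Python SDK (batchcompute) is installed and uses a placeholder job ID taken from the console job list; treat the exact method names as per the SDK documentation:

import time
from batchcompute import Client, CN_SHENZHEN

# Placeholders: replace with your own credentials.
client = Client(CN_SHENZHEN, "<access_key_id>", "<access_key_secret>")

job_id = "job-00000000"  # placeholder: the ID of the submitted log-count job

# Poll until the job leaves the active states.
while True:
    state = client.get_job(job_id).State
    print("job state:", state)
    if state in ("Finished", "Failed", "Stopped"):
        break
    time.sleep(10)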

4. Check job execution result

You can log on to the OSS console and check the following file under your bucket: /log-count/merge_result.json.

The expected result is as follows:

{"INFO": 2460, "WARN": 2448, "DEBUG": 2509, "ERROR": 2583}