This topic describes how to use the MaxCompute client to run and submit a MapReduce job.

The MaxCompute client provides a jar command to run MapReduce jobs. The syntax of the command is as follows:

    jar [<GENERIC_OPTIONS>] <mainClass> [args];
        -conf <configuration_file>         Specify an application configuration file
        -resources <resource_name_list>    file\table resources used in mapper or reducer, separate by comma
        -classpath <local_file_list>       classpaths used to run mainClass
        -D <name>=<value>                  Property value pair, which will be used to run mainClass
        -l                                 Run job in local mode

The following is an example:

    jar -conf \home\admin\myconf -resources a.txt,example.jar -classpath ..\lib\example.jar:.\other_lib.jar -D java.library.path=.\native;
<GENERIC_OPTIONS> includes the following optional parameters:
  • -conf <configuration file>: specifies the JobConf configuration file.
  • -resources <resource_name_list>: specifies the resource used to run a MapReduce job. Typically, you must specify the name of the resource that contains the Map or Reduce function in the resource_name_list.

    If the Map or Reduce function reads from other MaxCompute resources, you must add the names of these resources to resource_name_list.

    Separate resources with commas (,). If you use cross-project resources, insert PROJECT/resources/ before the resource name, for example, -resources otherproject/resources/resfile.

    For information about how to use the Map or Reduce function to read resources, see Resource samples.

  • -classpath <local_file_list>: specifies the classpath used to run a MapReduce job locally. This parameter takes the local path (relative or absolute) of the JAR package that contains the main function.

    Separate package names with the system's default path delimiter. Generally, the Windows operating system uses a semicolon (;) as the delimiter, and Linux uses a colon (:). If you run a MapReduce job on a cloud server, separate package names with commas (,).

    Note You may prefer to compile the main function into the same package as the Map or Reduce function. For more information, see WordCount samples. When you execute the sample program, mapreduce-examples.jar is included in both the -resources and -classpath options. The two options differ in the following way: -resources references the JAR that contains the Map and Reduce functions, which run in the distributed environment, whereas -classpath references the JAR that contains the main function, which runs locally, so the specified path points to a local directory.
  • -D <prop_name>=<prop_value>: specifies a Java property that is passed to <mainClass> when the job is run locally. You can specify multiple properties.
  • -l: indicates that the MapReduce job is executed locally. This option is mainly used for program debugging.
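The platform-dependent classpath delimiter described above can be checked from Java itself. The following standalone sketch (the class name and JAR paths are illustrative, not taken from this topic) prints the delimiter that the current operating system expects for -classpath:

```java
import java.io.File;

public class PathSepDemo {
    public static void main(String[] args) {
        // File.pathSeparator is ";" on Windows and ":" on Linux and macOS.
        System.out.println("classpath delimiter: " + File.pathSeparator);

        // Joining two hypothetical JAR paths the way -classpath expects them:
        String classpath = String.join(File.pathSeparator,
                "data/work.jar", "otherlib.jar");
        System.out.println(classpath);
    }
}
```

On Linux this produces a colon-separated classpath; on Windows, a semicolon-separated one.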

You can use the -conf option to specify the JobConf configuration file, which affects the JobConf settings in the SDK.

The following is an example of the JobConf configuration file.
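A minimal configuration file that defines a single variable (a reconstructed sketch, assuming the standard property-list format used by JobConf configuration files) looks like this:

```xml
<configuration>
  <property>
    <name>import.filename</name>
    <value>resource.txt</value>
  </property>
</configuration>
```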

In the preceding example, the JobConf file defines a variable named import.filename. The value of this variable is resource.txt.

In your MapReduce program, you can obtain the value of this variable through the JobConf interface of the SDK. For more information, see Resource samples.
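The lookup pattern can be sketched in plain Java. In the snippet below, java.util.Properties stands in for the SDK's JobConf object (an assumption made for illustration; the real calls belong to the MaxCompute SDK and are described in Resource samples):

```java
import java.util.Properties;

// Sketch: a JobConf behaves like a key-value configuration store.
// Properties stands in for it here; the real type comes from the MaxCompute SDK.
public class JobConfLookupSketch {
    public static void main(String[] args) {
        Properties conf = new Properties();

        // Loading the configuration file shown earlier effectively registers this pair:
        conf.setProperty("import.filename", "resource.txt");

        // A mapper or reducer would perform the equivalent lookup through JobConf:
        String filename = conf.getProperty("import.filename");
        System.out.println(filename);
    }
}
```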

The following examples show how to submit MapReduce jobs on the MaxCompute client.

Run a job whose JAR package contains both the main function and the Map and Reduce functions:

    add jar data\mapreduce-examples.jar;
    jar -resources mapreduce-examples.jar -classpath data\mapreduce-examples.jar wc_in wc_out;

Run a job that also reads a file resource:

    add file data\src.txt;
    add jar data\mapreduce-examples.jar;
    jar -resources src.txt,mapreduce-examples.jar -classpath data\mapreduce-examples.jar wc_in wc_out;

Run a job with a JobConf configuration file and file, table, and JAR resources:

    add file data\a.txt;
    add table wc_in as test_table;
    add jar data\work.jar;
    jar -conf odps-mapred.xml -resources a.txt,test_table,work.jar
        -classpath data\work.jar:otherlib.jar
        -D import.filename=resource.txt args;