This topic describes how to use MaxCompute Studio to develop MapReduce programs. This includes how to write, debug, package, upload, and run MapReduce programs.
Write MapReduce programs
- In the Project tool window, expand your MaxCompute Java module and choose . Then, right-click java and choose .
- Create a Driver class. Specify Name and Kind, and click OK.
- Name: the name of the MaxCompute Java class. If no package is created, enter packagename.classname. The system automatically creates a package.
- Kind: the category of the MaxCompute Java class. Select Driver. Supported categories include custom functions (UDF, UDAF, and UDTF), MapReduce (Driver, Mapper, and Reducer), and non-structural development frameworks (including StorageHandler and Extractor).
Note: If you create a Mapper or Reducer class, set Kind to Mapper or Reducer.
- After you create a Driver class, develop a MapReduce Java program in the editor.
The Java template is automatically populated with framework code. You only need to set the input table, the output table, and the Mapper and Reducer classes.
Note: For more information about how to develop a MapReduce program, see (Optional) Use MapReduce.
- Use the same method to create a Mapper and a Reducer.
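To make the division of work between the three classes concrete, the following is a minimal plain-Java sketch of a word count. It does not use the MaxCompute SDK or the generated template: the class name WordCountSketch and the methods map, reduce, and run are hypothetical names chosen for illustration, and in-memory lists stand in for the input and output tables. The map method plays the role of the Mapper, the reduce method plays the role of the Reducer, and the run method plays the role of the Driver that wires them together.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of the word-count flow. It does not use the MaxCompute SDK. */
final class WordCountSketch {

    /** Mapper role: split one input record into (word, 1) pairs. */
    static List<Map.Entry<String, Long>> map(String record) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String word : record.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1L));
            }
        }
        return pairs;
    }

    /** Reducer role: sum all counts that were emitted for one key. */
    static long reduce(List<Long> counts) {
        long sum = 0;
        for (long count : counts) {
            sum += count;
        }
        return sum;
    }

    /** Driver role: feed records to map, group the pairs by key, then reduce each group. */
    static Map<String, Long> run(List<String> inputRecords) {
        // A sorted map stands in for the shuffle step that groups values by key.
        Map<String, List<Long>> grouped = new TreeMap<>();
        for (String record : inputRecords) {
            for (Map.Entry<String, Long> pair : map(record)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Long> output = new TreeMap<>();
        for (Map.Entry<String, List<Long>> group : grouped.entrySet()) {
            output.put(group.getKey(), reduce(group.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        // Prints {hello=2, maxcompute=1, world=1}
        System.out.println(run(Arrays.asList("hello world", "hello maxcompute")));
    }
}
```

In a real MaxCompute MapReduce program, the framework performs the grouping step and the Driver only declares the input table, the output table, and the Mapper and Reducer classes, as described above.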
Perform local run to debug MapReduce programs
Run your MapReduce program locally to test it and check whether the results meet your expectations.
- Right-click the MapReduce Java file and select Run.
- In the Run/Debug Configurations dialog box, select the name of the MaxCompute project where the MapReduce program runs.
- Click OK to run the MapReduce program.
- If the input table data is not yet in the warehouse directory, the system first downloads it from the specified MaxCompute project. If the data is already downloaded, the system skips this step.
- The system then reads the table data in the warehouse directory as the input for the local run. You can view the log output in the console.
- For more information about warehouse, see warehouse directory.
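The download-once behavior described above can be pictured with a short plain-Java sketch. This is not how MaxCompute Studio is implemented: the class WarehouseCacheSketch, the method loadTable, the downloader parameter, and the project/table/data directory layout are all assumptions made for illustration. The sketch only shows the idea that table data is fetched when the local copy is missing and reused afterwards.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative download-once cache. The directory layout is hypothetical. */
final class WarehouseCacheSketch {

    /**
     * Returns the rows of a table from a local warehouse directory.
     * The downloader is called only when no local copy exists yet.
     */
    static List<String> loadTable(Path warehouseDir, String project, String table,
                                  Supplier<List<String>> downloader) {
        // Hypothetical layout: <warehouse>/<project>/<table>/data
        Path dataFile = warehouseDir.resolve(project).resolve(table).resolve("data");
        try {
            if (!Files.exists(dataFile)) {
                // First local run: fetch the rows and store them locally.
                Files.createDirectories(dataFile.getParent());
                Files.write(dataFile, downloader.get());
            }
            // Later local runs read the stored copy without downloading again.
            return Files.readAllLines(dataFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path warehouse = Files.createTempDirectory("warehouse");
        // Stand-in for a real download; the rows are made-up sample data.
        Supplier<List<String>> fakeDownload = () -> Arrays.asList("hello world", "hello maxcompute");
        System.out.println(loadTable(warehouse, "my_project", "wc_in", fakeDownload));
        // The second call finds the local file and does not invoke the downloader.
        System.out.println(loadTable(warehouse, "my_project", "wc_in", fakeDownload));
    }
}
```

One consequence of this pattern is that a local run keeps using the stored copy even if the table changes on the server, so a stale local copy has to be deleted to force a fresh download. Whether MaxCompute Studio behaves the same way is covered in the warehouse directory topic.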
Perform unit testing to debug MapReduce programs
Package and upload MapReduce programs
After you debug a MapReduce program, package the program into a JAR file and upload the file to the MaxCompute server as a resource. For more information, see Package, upload, and register.
Run MapReduce programs
Use the MaxCompute client to run MapReduce programs.
- In the left-side navigation pane, click Project Explorer.
- Right-click your project name and select Open in Console.
- In the Console tool window, run the following command to start your MapReduce program. For more commands, see Job submission.
jar -libjars wordcount.jar -classpath D:\odps\clt\wordcount.jar com.aliyun.odps.examples.mr.WordCount wc_in wc_out;