This section assumes that the MaxCompute console is installed and uses the MapReduce WordCount example. Maven users can search for “odps-sdk-mapred” in the Maven repository to obtain the required SDK (several versions are available). The configuration is as follows:
Create the input and output tables and upload the data. For the syntax of the table creation statement, see CREATE TABLE:
CREATE TABLE wc_in (key STRING, value STRING);
CREATE TABLE wc_out (key STRING, cnt BIGINT);
-- Create input table and output table
Use Tunnel Commands to upload data:
tunnel u kv.txt wc_in
-- Upload example data
The content of kv.txt is as follows:
You can also insert data directly using the INSERT statement as follows:
insert into table wc_in select '238',' val_238' from (select count(*) from wc_in) a;
Write the MapReduce program and compile it.
MaxCompute provides an Eclipse development plug-in that helps you develop MapReduce programs quickly and debug them locally.
You must first create a MaxCompute project in Eclipse, and then write the MapReduce program. After the program runs successfully in local debugging, you can upload the compiled program to MaxCompute. For more information, see MapReduce Eclipse Plug-in.
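Before writing the actual job, it can help to see what WordCount is expected to compute. The following is a minimal plain-Java sketch that simulates the map and reduce phases in memory. It does not use the MaxCompute SDK: the class name LocalWordCount, the methods map and reduce, and the sample rows are hypothetical and chosen only for illustration. It assumes that every column of an input row is split on whitespace and that each resulting word is counted, producing (key, cnt) pairs shaped like the rows of wc_out.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local, in-memory illustration of WordCount. This is NOT MaxCompute SDK code;
// all names here are hypothetical.
public class LocalWordCount {

    // Map phase: emit one (word, 1) pair for every whitespace-separated
    // token found in every column of every input row.
    static List<Map.Entry<String, Long>> map(List<String[]> rows) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String[] row : rows) {
            for (String column : row) {
                if (column == null) {
                    continue; // skip NULL column values
                }
                for (String word : column.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        pairs.add(new SimpleEntry<>(word, 1L));
                    }
                }
            }
        }
        return pairs;
    }

    // Reduce phase: sum the counts of all pairs that share the same word.
    // A TreeMap keeps the result sorted by word for readable output.
    static Map<String, Long> reduce(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> counts = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample rows shaped like wc_in (key STRING, value STRING).
        List<String[]> wcIn = Arrays.asList(
            new String[] {"238", "val_238"},
            new String[] {"238", "val_238"},
            new String[] {"86", "val_86"});

        // Print rows shaped like wc_out (key STRING, cnt BIGINT).
        reduce(map(wcIn)).forEach((key, cnt) -> System.out.println(key + "\t" + cnt));
    }
}
```

In a real job, the map and reduce logic would live in the mapper and reducer classes of the MapReduce SDK, and the input and output would be the wc_in and wc_out tables rather than in-memory lists.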
Add the JAR package to the project as a resource (in this example, the JAR package is named “word-count-1.0.jar”):
add jar word-count-1.0.jar;
Run the jar command on the MaxCompute console:
jar -resources word-count-1.0.jar -classpath /home/resources/word-count-1.0.jar
com.taobao.jingfan.WordCount wc_in wc_out;
Check the result on the MaxCompute console:
select * from wc_out;
If the Java program uses other resources, you must specify them with the ‘-resources’ parameter. For more information about JAR commands, see Jar Commands.