MaxCompute: Quick Start

Last Updated: Mar 27, 2026

Write a MapReduce program in MaxCompute Studio, package it as a JAR file, and run it on the MaxCompute client. This topic walks through a WordCount example that counts word occurrences in a text file.

How it works

MaxCompute MapReduce processes data through three stages:

(input) <key, value> → map → <key, value> → combine → <key, value> → reduce → <key, value> (output)

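For the WordCount example, the map stage splits each line into words and emits a <word, 1> pair for each word. The combine stage is optional: it sums the counts locally on each map task to reduce the data passed to the reduce stage. The reduce stage sums the counts per word and writes the result to the output table.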
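The three stages can be illustrated with a minimal plain-Java simulation. This is a sketch only: it does not use the MaxCompute SDK, the class and method names are hypothetical, and it assumes that words are separated by commas or whitespace. Each inner list stands in for the input of one map task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of map -> combine -> reduce for WordCount.
// It does not use the MaxCompute SDK; all names here are hypothetical.
class WordCountStagesSketch {

    // Map stage: split one input line into words and emit a <word, 1> pair per word.
    static List<Map.Entry<String, Long>> map(String line) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (String word : line.split("[,\\s]+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1L));
            }
        }
        return pairs;
    }

    // Combine and reduce share the same logic for WordCount: sum the counts per key.
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> sums = new TreeMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return sums;
    }

    // Runs the pipeline; each inner list models the input of one map task.
    static Map<String, Long> run(List<List<String>> splits) {
        List<Map.Entry<String, Long>> combined = new ArrayList<>();
        for (List<String> split : splits) {
            List<Map.Entry<String, Long>> mapped = new ArrayList<>();
            for (String line : split) {
                mapped.addAll(map(line));
            }
            // Combine: pre-aggregate the output of this map task locally.
            combined.addAll(sumByKey(mapped).entrySet());
        }
        // Reduce: merge the partial sums from all map tasks.
        return sumByKey(combined);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = run(List.of(
                List.of("hello,odps", "hello world"),
                List.of("odps odps")));
        // Prints hello 2, odps 3, world 1 (tab-separated, one word per line).
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Because summation is associative, the combine step changes only how much intermediate data reaches the reduce stage, not the final counts.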

Prerequisites

Before you begin, ensure that you have:

  • The MaxCompute client (odpscmd) installed and configured. For details, see MaxCompute client (odpscmd).

  • MaxCompute Studio installed and connected to your MaxCompute project. For details, see Install MaxCompute Studio and Manage project connections.

  • A source data file saved to your local machine. This topic uses a file named data.txt with the content hello,odps. Save it to the bin directory of the MaxCompute client.

Maven SDK dependencies

To develop a MapReduce program with Maven, search for odps-sdk-mapred, odps-sdk-commons, and odps-sdk-core in the Maven Central Repository to find the available versions of the SDK for Java. This example uses version 0.36.4-public. Add the following dependencies to your pom.xml:

<dependency>
    <groupId>com.aliyun.odps</groupId>
    <artifactId>odps-sdk-mapred</artifactId>
    <version>0.36.4-public</version>
</dependency>
<dependency>
    <groupId>com.aliyun.odps</groupId>
    <artifactId>odps-sdk-commons</artifactId>
    <version>0.36.4-public</version>
</dependency>
<dependency>
    <groupId>com.aliyun.odps</groupId>
    <artifactId>odps-sdk-core</artifactId>
    <version>0.36.4-public</version>
</dependency>

Step 1: Develop a MapReduce program

  1. Create a MaxCompute Java module in IntelliJ IDEA.

    1. In the top navigation bar, choose File > New > Module.

    2. In the New Module dialog box, select MaxCompute Java in the left-side navigation pane.

    3. Configure Module SDK and click Next.

    4. Enter a module name in the Module name field — for example, mapreduce — and click Finish.

  2. Create and write the WordCount MapReduce program.

    1. In the Project pane, expand your MaxCompute Java module and navigate to src > main > java. Right-click java and choose New > MaxCompute Java.

    2. In the Create new MaxCompute java class dialog box, click Driver, enter a class name in the Name field (for example, WordCount), and press Enter.

    3. In the code editor for WordCount.java, write the WordCount MapReduce logic to count word occurrences. For the complete sample code, see Sample code.

  3. Run and debug the program.

    1. In the Project pane, right-click WordCount.java and select Run.

    2. In the Run/Debug Configurations dialog box, set MaxCompute project to your target project.

    3. Click OK to run and debug the program, and verify that it executes as expected.

Step 2: Generate and upload a MapReduce JAR file

  1. In the Project pane, right-click WordCount.java and select Deploy to server.

  2. In the Package a jar and submit resource dialog box, configure the parameters and click OK to package the program and upload the JAR file. For parameter details, see Procedure.

    Note

    If you developed the MapReduce program with Maven, manually upload the JAR file from the MaxCompute client after packaging. For details, see Add resources. Sample command:

    add jar mapreduce-1.0-SNAPSHOT.jar;


Step 3: Run a MapReduce job

  1. Log on to the MaxCompute client, or start it from within MaxCompute Studio. For details on the integrated client, see Integrate the MaxCompute client.

  2. Create input and output tables. The input table holds the source data; the output table receives the processing results.

    -- Create an input table named wc_in.
    create table wc_in (key STRING, value STRING);
    -- Create an output table named wc_out.
    create table wc_out (key STRING, cnt BIGINT);

    For table creation syntax, see Create a table.

  3. Upload the source data file to wc_in. Confirm the file content before uploading. The data.txt file used in this example contains:

    hello,odps

    Run the Tunnel Upload command:

    tunnel upload data.txt wc_in;

    For Tunnel command reference, see Tunnel commands.

  4. Run the JAR command to execute the MapReduce job.

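    +-----------------------------------------------+------------------------------------------------------------------+
    | Parameter                                     | Description                                                      |
    +-----------------------------------------------+------------------------------------------------------------------+
    | -resources mapreduce-1.0-SNAPSHOT.jar         | The resource that the job calls: the JAR file uploaded in Step 2 |
    | -classpath mapreduce-1.0-SNAPSHOT.jar         | The path of the JAR file that contains MainClass                 |
    | com.aliyun.odps.mapred.open.example.WordCount | MainClass defined in the MapReduce program                       |
    | wc_in wc_out                                  | The input table and output table                                 |
    +-----------------------------------------------+------------------------------------------------------------------+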

    jar -resources mapreduce-1.0-SNAPSHOT.jar -classpath mapreduce-1.0-SNAPSHOT.jar com.aliyun.odps.mapred.open.example.WordCount wc_in wc_out;

    For JAR command syntax, see Syntax.

  5. Verify the results. Run the following command to query the output table:

    select * from wc_out;

    Expected output:

    +------------+------------+
    | key        | cnt        |
    +------------+------------+
    | hello      | 1          |
    | odps       | 1          |
    +------------+------------+
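
The expected output follows from how the sample data moves through the job. The sketch below traces that flow in plain Java, without the MaxCompute SDK. It rests on two assumptions that this topic does not state: that Tunnel Upload splits each line on a comma as its default column delimiter, so data.txt yields one record with key = hello and value = odps, and that the mapper emits every column value of a record as a word. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Traces the sample data from data.txt to wc_out without the MaxCompute SDK.
// Class and method names are hypothetical.
class WordCountTraceSketch {

    // Models Tunnel Upload: split a line on the comma delimiter (assumed default)
    // and assign the fields to the wc_in columns key and value.
    static Map<String, String> toRecord(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                    "wc_in has 2 columns, but the line has " + fields.length + " fields");
        }
        Map<String, String> record = new LinkedHashMap<>();
        record.put("key", fields[0]);
        record.put("value", fields[1]);
        return record;
    }

    // Models the job: count every column value as one word occurrence (assumption).
    static Map<String, Long> countWords(Iterable<Map<String, String>> records) {
        Map<String, Long> counts = new TreeMap<>();
        for (Map<String, String> record : records) {
            for (String word : record.values()) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Long> wcOut = countWords(List.of(toRecord("hello,odps")));
        // Prints "hello | 1" and "odps | 1", matching the rows of wc_out.
        wcOut.forEach((key, cnt) -> System.out.println(key + " | " + cnt));
    }
}
```

The sketch sorts the words for readability; a select statement without an order by clause does not guarantee the row order.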

What's next