This topic describes how to use SQLTask and Tunnel to export a large amount of data.

SQLTask

SQLTask is an SQL class of the MaxCompute Java SDK. By using SQLTask, you can easily run SQL statements and obtain the returned results. For more information about SQLTask, see Java SDK.

The SQLTask.getResult(i) method returns a list. When running an SQL statement, you can use this method repeatedly to obtain the complete SQL computation result. However, this method has limits. Currently, the SELECT statement can return a maximum of 10,000 data entries to MaxCompute Console. For more information, see the valid values of READ_TABLE_MAX_ROW in Project operations. That is, if you run the SELECT statement in MaxCompute Console or by using SQLTask, you limit the number of data entries to return in each query result to 10,000. If you use the CREATE TABLE XX AS SELECT or INSERT INTO/OVERWRITE TABLE statement to insert the query result into a table, the number of data entries to return is not limited.

Tunnel

If the query result to be exported is all data in a table or a specific partition, you can use the Tunnel command-line tool to export all data at a time. MaxCompute provides the Tunnel command-line tool and Tunnel SDK. For more information, see Tunnel commands and Tunnel overview.

For example, use the Tunnel command-line tool to export data as follows:
>tunnel d wc_out c:\wc_out.dat;
2017-12-16 19:32:08  -  new session: 201712161932082dxxxxx     total lines: 3
2017-12-16 19:32:08  -  file [0]: [0, 3), c:\wc_out.dat
downloading 3 records into 1 file
2017-12-16 19:32:08  -  file [0] start
2017-12-16 19:32:08  -  file [0] OK. total: 21 bytes
download OK

Export more than 10,000 data entries by using SQLTask and Tunnel

SQLTask cannot process more than 10,000 data entries, but Tunnel can. Therefore, you can use SQLTask and Tunnel together to export a large amount of data. Use the following sample code:
private static final String accessId = "userAccessId";
private static final String accessKey = "userAccessKey";
private static final String endPoint = "http://service.cn-shanghai.maxcompute.aliyun.com/api";
private static final String project = "userProject";
private static final String sql = "userSQL";
private static final String table = "Tmp_" + UUID.randomUUID().toString().replace("-", "_");// Use a random string as the name of a temporary table for storing the exported data.
private static final Odps odps = getOdps();
public static void main(String[] args) {
    System.out.println(table);
    runSql();
    tunnel();
}
/*
     * Download the query result returned by using SQLTask.
     * */
private static void tunnel() {
    TableTunnel tunnel = new TableTunnel(odps);
    try {
        DownloadSession downloadSession = tunnel.createDownloadSession(project, table);
        System.out.println("Session Status is : "+ downloadSession.getStatus().toString());
        long count = downloadSession.getRecordCount();
        System.out.println("RecordCount is: " + count);
        RecordReader recordReader = downloadSession.openRecordReader(0, count);
        Record record;
        while ((record = recordReader.read()) ! = null) {
            consumeRecord(record, downloadSession.getSchema());
        }
        recordReader.close();
    } catch (TunnelException e) {
        e.printStackTrace();
    } catch (IOException e1) {
        e1.printStackTrace();
    }
}
/*
     * Save the data.
     * If only a small amount of data is returned, you can display the query result and copy the data. In actual scenarios, you can also use the java.io package to write data to a local file or a file on the remote server to save the query result.
     * */
private static void consumeRecord(Record record, TableSchema schema) {
    System.out.println(record.getString("username")+","+record.getBigint("cnt"));
}
/*
     * Run the SQL statement and save the query result as a temporary table so that you can use Tunnel to download data.
     * Set the lifecycle of the saved data to only one day. If the data fails to be deleted, it does not occupy too much storage space.
     * */
private static void runSql() {
    Instance i;
    StringBuilder sb = new StringBuilder("Create Table ").append(table)
        .append(" lifecycle 1 as ").append(sql);
    try {
        System.out.println(sb.toString());
        i = SQLTask.run(getOdps(), sb.toString());
        i.waitForSuccess();
    } catch (OdpsException e) {
        e.printStackTrace();
    }
}
/*
     * Initialize the project connection of MaxCompute.
     * */
private static Odps getOdps() {
    Account account = new AliyunAccount(accessId, accessKey);
    Odps odps = new Odps(account);
    odps.setEndpoint(endPoint);
    odps.setDefaultProject(project);
    return odps;
}