×
Community Blog Obtaining SQL Query Result in Pages Using MaxCompute

Obtaining SQL Query Result in Pages Using MaxCompute

This article describes several methods to obtain query results in pages or batches on MaxCompute by using the paged query logic of databases.

By Sheng Yuan

Unlike databases, MaxCompute SQL does not have any built-in paged query logic such as select * from table limit x offset y. However, many users hope to obtain query results in pages or batches in some scenarios by using the paged query logic of databases. This article describes several methods to implement this need.

1.  Use the row_number() function as a unique and incrementing ID for filter-based query.

select * from (select row_number() over() as row_id,* from orders_delta)t where row_id between 10 and 20;

1

Sort the data and attach a unique ID to each data record by using the row_number() function. Then specify the pagination range for each query based on this ID.

2.  Download the query result in batches by using the InstanceTunnel of Java SDK.

MaxCompute JavaSDK supports directly exporting the SELECT query result by using SQLTask and InstanceTunnel in combination. For more information, see Tunnel Commands.

In this case, you can download the results of your custom SELECT query by using InstanceTunnel.

   Odps odps = OdpsUtils.newDefaultOdps(); // Initialize the MaxCompute object
    Instance i = SQLTask.run(odps, "select * from wc_in;");
    i.waitForSuccess();
    
    // Create InstanceTunnel
    InstanceTunnel tunnel = new InstanceTunnel(odps);
    // Create DownloadSession based on the instance ID
    InstanceTunnel.DownloadSession session = tunnel.createDownloadSession(odps.getDefaultProject(), i.getId());
 
    long count = session.getRecordCount();
     // The count of outputs results
    System.out.println(count);
 
    // The code for obtaining the data is the same as in Table Tunnel.
TunnelRecordReader reader = session. openRecordReader (0, Count );
    Record record;
    While(Record = reader.Read())! = null) {
      For(IntCol =0; Col <session.GetSchema().GetColumns().Size(); ++ Col ){
        // The fields in the wc_in table are all strings, and the output is printed directly here.
System. out.Println(Record.Get(Col ));
      }
    }
Reader.Close();

Here, a custom select query is submitted through SQLTask, and the query result is directly downloaded using InstanceTunnel.DownloadSession. Within this, the openRecordReader method supports specifying the start position and number of reads for this read record, and the parameter settings of openRecordReader (start, long) can be used to implement the logic for batch download.

2

For example, you can change openRecordReader (0, count) in the preceding example to get the start position and number of records you wish to obtain:

TunnelRecordReader reader = downloadSession. openRecordReader (100,20);

References: MaxCompute SDK Java Doc InstanceTunnel. DownloadSession class openRecordReader method.

Because the data volume of tables processed by MaxCompute is usually very large, the preceding method is generally not recommended for use in interactive paging query scenarios for reports. For users who need to query while interacting, consider using MaxCompute's interactive analysis service (Lightning) to make such queries.

3.  Use the limit/offset syntax of MaxCompute's interactive analysis tool (Lightning) to implement paging

MaxCompute SQL does not support limit/offset syntax, but developers can draw on MaxCompute'sinteractive analysis tool (Lightning) to use limit/offset.

MaxCompute's interactive analysis tool can quickly query MaxCompute data using the PostgreSQL protocol and syntax under the same permission system. PostgreSQL's limit offset syntax can achieve the same paging effect as database queries.

3

0 0 0
Share on

Alibaba Cloud MaxCompute

51 posts | 6 followers

You may also like

Comments