This topic describes how to execute common SQL statements and export data on the MaxCompute client.

MaxCompute allows you to execute SQL statements on the MaxCompute client or DataWorks:

Background information

MaxCompute allows you to perform the following operations by using SQL statements:
  • Use various types of operators.
  • Use data definition language (DDL) statements to manage tables, partitions, and views.
  • Use the SELECT clause to query records in tables and use the WHERE statement to filter records in tables.
  • Use the INSERT statement to insert or update data.
  • Use an equi-join to join two tables and use MAPJOIN to join multiple small tables.
  • Use built-in and user-defined functions (UDFs) to perform data computations.
  • Use regular expressions.
  • MaxCompute SQL does not support transactions, indexes, UPDATE, or DELETE operations. The SQL syntax of MaxCompute is different from that of Oracle and MySQL. Therefore, you cannot seamlessly migrate SQL statements from other databases to MaxCompute.
  • After you submit a MaxCompute job, it is scheduled within dozens of seconds or several minutes. Therefore, MaxCompute is suitable for batch jobs that process large amounts of data. MaxCompute is not suitable for frontend business systems that process thousands or tens of thousands of transactions per second. For more information about job optimization, see Optimize SQL statements.
  • For more information about the detailed examples of SQL operations, see SQL and functions.
  • For more information about the limits of MaxCompute SQL, see MaxCompute SQL limits.

Extract and analyze data

Assume that you want to query the number of single persons with different academic qualifications who have housing loans and save the query results to the result_table table.

  1. Execute the following statements to query the number of single persons with different academic qualifications who have housing loans in the bank_data table and save the query results to the result_table table.
    INSERT OVERWRITE TABLE result_table  
    SELECT education,COUNT(marital) AS num
    FROM bank_data
    WHERE housing = 'yes'
        AND marital = 'single'
    GROUP BY education;
  2. Execute the following statements to query the data in the result_table table.
    SELECT * FROM result_table;
    The following figure shows the results.
This example shows simple data processing. In practice, you may need to use multiple SQL statements to process data in multiple tables. We recommend that you use DataWorks to complete complex data processing.

Export data

Execute the following statement to export the data from the result_table table to disk D on your on-premises machine and save the data as a file named result.txt.
tunnel download result_table D:\result.txt;
result_table is the table that needs to be exported. D:\result.txt is the directory where the exported data is saved. result.txt is the name of the file that contains the exported data. For more information about Tunnel commands, see Tunnel commands.

After the export succeeds, download OK is displayed, as shown in the following figure.