SQL

Last Updated: Dec 26, 2016

Since most users are already familiar with SQL syntax, some special notices will be mentioned here.

It is worth to mention that MaxCompute SQL does not support transactions, index and Update/Delete operations. MaxCompute SQL syntax differs from Oracle and MySQL, so the user cannot migrate SQL statements of other databases into MaxCompute seamlessly. In addition, MaxCompute SQL can not complete the query in minutes even seconds and unable to return the result in millisecond.

Select Statements

  • The key of “group by” statement can be the column name of input table and also can be the expression consisted of input table columns, but it can not be output column of Select statements.
  1. select substr(col2, 2) from tbl group by substr(col2, 2); -- Yes, the key of group by can be the expression consisted of input table column;
  2. select col2 from tbl group by substr(col2, 2); -- No, the key of group by is not in the column of Select statement;
  3. select substr(col2, 2) as c from tbl group by c; -- No, the key of group by can not be the column alias, i.e., the output column of Select statement;

The reason for such a restriction: for usual SQL parsing, “group by” operations are conducted before “select” operations, therefore, “group by” can only use the column or expression of input table as the key.

  • “Order by” statement must be used in combination with “limit”.
  • “Distribute by” statement must be added in front of “sort by”.
  • The key of “order by/sort by/distribute by” must be the output column of “select” statement, i.e., the column alias.
  1. select col2 as c from tbl order by col2 limit 100 -- No, the key of order by is not the output column (column alias) of Select statement.
  2. select col2 from tbl order by col2 limit 100; -- Yes, use column name as the alases if the output column of Select statement has no alias.

The reason for such a restriction: for usual SQL parsing, order by / sort by / distribute by operations are conducted after “select” operations; therefore, they can only use the output column of select statements as the key.

Insert Statement

  • To insert data into a specified partition, the partition column is not allowed in Select list:
  1. insert overwrite table sale_detail_insert partition (sale_date='2013', region='china')
  2. select shop_name, customer_id, total_price, sale_date, region from sale_detail;
  3. -- Return error; sale_date and region are partition columns, which are not allowed in Select statement in static partition.
  • To insert a dynamic partition, the dynamic partition column must be in Select list:
  1. insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
  2. select shop_name,customer_id,total_price from sale_detail;
  3. -- Failed, to insert the dynamic partition, the dynamic partition column must be in Select list.

Join

  • MaxCompute SQL supports the following Join operaton types: {LEFT OUTER|RIGHT OUTER|FULL OUTER|INNER} JOIN.
  • At present, MaxCompute SQL supports up to 16 concurrent Join operations.
  • Support the mapjoin up to six small tables.

Others

  • MaxCompute SQL currently supports up to 128 concurrent union operations;
  • Support up to 128 concurrent insert overwrite/into operations.
Thank you! We've received your feedback.