You can execute SELECT statements to retrieve data from tables.

Syntax

SELECT [ DISTINCT ]
{ * | projectItem [, projectItem ]* }
FROM tableExpression;

Test data

a (VARCHAR) b (INT) c (DATE)
a1 211 1990-02-20
b1 120 2018-05-12
c1 89 2010-06-14
a1 46 2016-04-05

Simple queries

  • Test statement
    SELECT * FROM <Table name>;
  • Test result
    a (VARCHAR) b (INT) c (DATE)
    a1 211 1990-02-20
    b1 120 2018-05-12
    c1 89 2010-06-14
    a1 46 2016-04-05

Rename objects

  • Test statement
    SELECT a, c AS d FROM <Table name>;
  • Test result
    a (VARCHAR) d (DATE)
    a1 1990-02-20
    b1 2018-05-12
    c1 2010-06-14
    a1 2016-04-05

Deduplication queries

  • Test statement
    SELECT DISTINCT a FROM Table name;
  • Test result
    a (VARCHAR)
    a1
    b1
    c1

Subqueries

In most cases, SELECT statements read data from tables, for example, SELECT column_1, column_2 … FROM table_name. SELECT statements can also read data from the results of other SELECT statements. This is known as subqueries.
Note You must specify aliases in subqueries.
  • Test statement
    INSERT INTO result_table
    SELECT * FROM
                   (SELECT   t.a,
                             sum(t.b) AS sum_b
                    FROM     t1 t
                    GROUP BY t.a
                   ) t1 
    WHERE  t1.sum_b > 100;
  • Test result
    a (VARCHAR) b (INT)
    a1 211
    b1 120
    a1 257
    Note The preceding test result is a debugging result. In the result, you can view the computing process. If your job is published and the result table is stored in DataHub, Alibaba Cloud Message Queue for Apache Kafka, or Alibaba Cloud Message Queue, the computing process is displayed. If your job is published and the result table is stored in a relational database such as ApsaraDB for RDS, the records that have the same primary key values are combined into one record.