This topic describes how to execute the SELECT statement to query data from one or more tables.

Syntax

SELECT
    [ALL | DISTINCT]
    select_expr [, select_expr ...]
    [FROM table_references
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}
    [HAVING where_condition]
    [ORDER BY {col_name | expr | position}
      [ASC | DESC], ...]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [FOR UPDATE]
Descriptions of the clauses in the SELECT statement:
  • select_expr specifies the column that you want to query. A SELECT statement must contain at least one select_expr expression.
  • table_references specifies the tables from which data is retrieved.
  • The WHERE clause specifies query criteria. The rows that meet the condition specified in where_condition are returned. If this clause is not specified, all rows are returned.
  • The GROUP BY clause supports the references to column names, expressions, and positions in output columns.
  • The HAVING clause is similar to the WHERE clause. The difference is that you can use aggregate functions in the HAVING clause.
  • The ORDER BY clause specifies the order in which data is sorted. This clause supports the references to column names, expressions, and positions in output columns. You can also specify the sort direction, such as ASC and DESC. ASC specifies the ascending order. DESC specifies the descending order.
  • The OFFSET clause specifies the offset of an output result set. The LIMIT clause specifies the size of an output result set. The LIMIT clause allows you to specify one or two parameters of the numeric data type. The parameters must be integer constants. If you specify two parameters, the first parameter specifies the offset of the first row that you want to return and the second parameter specifies the maximum number of rows that you want to return. The initial offset of the first row is 0 instead of 1. To be compatible with PostgreSQL, MySQL also supports LIMIT # OFFSET #.
  • The FOR UPDATE clause provides an exclusive lock on each row of the query results. This prevents other transactions from concurrently updating the rows. This also prevents other transactions from concurrently reading the rows for which some transaction isolation levels are specified.

Note

  • The expressions that are used in a WHERE clause cannot be used in a HAVING clause. For example, the following SQL statement 1 must be rewritten as SQL statement 2.
    SQL statement 1:
    SELECT col_name FROM tbl_name HAVING col_name > 0;
    SQL statement 2:
    SELECT col_name FROM tbl_name WHERE col_name > 0;
  • You can use aggregate functions in the HAVING clause. You cannot use aggregate functions in the WHERE clause.
    SELECT user, MAX(salary) FROM users
      GROUP BY user HAVING MAX(salary) > 10; 
  • If the LIMIT clause contains two parameters, the first parameter specifies the offset of the first row that is returned and the second parameter specifies the number of rows that are returned. If the LIMIT clause contains only one parameter, this parameter specifies the number of rows that are returned, and the default offset is 0.
  • The GROUP BY clause does not support ASC or DESC.
  • If both GROUP BY and ORDER BY are used, the expressions specified after ORDER BY must be specified in a SELECT clause or a GROUP BY clause. For example, the following SQL statement is not supported:
    SELECT user FROM users GROUP BY age ORDER BY salary;     
  • Aggregate functions and expressions that contain aggregate functions cannot be used in the ORDER BY clause. If you want to use such an expression, define the expression as select_expr, assign an alias to the expression, and then reference the alias in the ORDER BY clause.
  • Empty strings cannot be used as aliases.

JOIN

PolarDB-X supports the following JOIN syntax in table_references of the SELECT statement:

table_references:
    escaped_table_reference [, escaped_table_reference] ...
escaped_table_reference:
    table_reference
  | { OJ table_reference }
table_reference:
    table_factor
  | join_table
table_factor:
    [schema_name.]tbl_name [[AS] alias] [index_hint_list]
  | table_subquery [AS] alias
  | ( table_references )
join_table:
    table_reference [INNER | CROSS] JOIN table_factor [join_condition]
  | table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_condition
join_condition:
    ON conditional_expr
  | USING (column_list)
index_hint_list:
    index_hint [, index_hint] ...
index_hint:
    USE {INDEX|KEY}
      [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
  | IGNORE {INDEX|KEY}
      [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
  | FORCE {INDEX|KEY}
      [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
index_list:
    index_name [, index_name] ...

Before you execute the JOIN statements, take note of the following factors:

  • JOIN, CROSS JOIN, and INNER JOIN are syntactic equivalents in PolarDB-X and MySQL.
  • An INNER JOIN statement without an ON clause is equivalent to the comma operator (,). The INNER JOIN statement and the comma operator (,) each indicate a CROSS JOIN. For example, the following SQL statements are equivalent:
    SELECT * FROM t1 INNER JOIN t2 WHERE t1.id > 10
    SELECT * FROM t1, t2 WHERE t1.id > 10      
  • USING(column_list) specifies the names of columns that exist in both tables. PolarDB-X specifies equivalent conditions based on these columns. For example, the following SQL statements are equivalent:
    a LEFT JOIN b USING(c1, c2)
    a LEFT JOIN b ON a.c1 = b.c1 AND a.c2 = b.c2          
  • The JOIN operator has higher precedence than the comma operator (,). The JOIN expression t1, t2 JOIN t3 is used as (t1, (t2 JOIN t3)) instead of ((t1, t2) JOIN t3).
  • LEFT JOIN and RIGHT JOIN must contain ON clauses.
  • index_hint specifies the index used by MySQL. PolarDB-X pushes the hint down to MySQL.
  • STRAIGHT_JOIN and NATURAL JOIN are not supported.

UNION

supports the following UNION syntax:

SELECT ...
UNION [ALL | DISTINCT] SELECT ...
[UNION [ALL | DISTINCT] SELECT ...]        
Note For each SELECT clause in UNION, PolarDB-X does not support multiple columns whose names are the same. The following SQL statement is not supported because the column names in the SELECT clause are duplicates.
SELECT id, id, name FROM t1 UNION SELECT pk, pk, name FROM t2;          

References