全部产品
Search
文档中心

子查询

更新时间: 2020-08-12

与MySQL子查询的差异

在MySQL支持的子查询基础上,DRDS增加了如下限制需要注意:

  • 支持WHERE条件中的子查询以及列子查询,但不支持HAVING子句和JOIN ON条件中的子查询;
  • 等号操作行符的标量子查询(The Subquery as Scalar Operand)不支持ROW语法。示例如下:
    1. 支持:select * from tb1 where id in (select id from tb2)
    2. 支持:select * from tb1 where (id, name) in (select id, name from tb2)
    3. 支持:select * from tb1 where row(id, name) in (select id, name from tb2)
    4. 支持:select * from tb1 where row(id, name) not in (select id, name from tb2)
    5. 不支持:select * from tb1 where row(id, name) = (select id, name from tb2)
  • 目前仅支持SELECT语句中的子查询,但不支持DELETE语句中的子查询。示例如下:

    1. DELETE FROM t1 WHERE ROW(c1,c2) IN (
    2. SELECT c1, c2 FROM t2
    3. );
  • 也不支持INSERT、UPDATE或SET中的子查询。

    效率

    DRDS子查询部分只能转化为比较低效的APPLY 执行器执行,在实际使用中请尽量避免以下类型的子查询。

  • WHERE条件中OR与子查询共存时,执行效率会依外表数据情况大幅降低。示例如下:

    1. 高效:select * from tb1 where id in (select id from tb2)
    2. 高效:select * from tb1 where id in (select id from tb2) and id>3
    3. 低效:select * from tb1 where id in (select id from tb2) or id>3
  • 关联子查询(Correlated Subqueries)的关联项中带函数或非等号运算符。示例如下:

    1. 高效:select * from tb1 a where id in
    2. (select id from tb2 b where a.name=b.name)
    3. 低效:select * from tb1 a where id in
    4. (select id from tb2 b where UPPER(a.name)=b.name)
    5. 低效:select * from tb1 a where id in
    6. (select id from tb2 b where a.decimal_test=abs(b.decimal_test))
    7. 低效:select * from tb1 a where id in
    8. (select id from tb2 b where a.name!=b.name)
    9. 低效:select * from tb1 a where id in
    10. (select id from tb2 b where a.name>=b.name)
  • 关联子查询(Correlated Subqueries)关联项与其它条件的逻辑运算符为OR。示例如下:

    1. 高效:select * from tb1 a where id in
    2. (select id from tb2 b where a.name=b.name
    3. and b.date_test<'2015-12-02')
    4. 低效:select * from tb1 a where id in
    5. (select id from tb2 b where a.name=b.name
    6. or b.date_test<'2015-12-02')
    7. 低效:select * from tb1 a where id in
    8. (select id from tb2 b where a.name=b.name
    9. or b.date_test=a.date_test)
  • 标量子查询(The Subquery as Scalar Operand)带关联项。示例如下:

    1. 高效:select * from tb1 a where id >
    2. (select id from tb2 b where b.date_test<'2015-12-02')
    3. 低效:select * from tb1 a where id >
    4. (select id from tb2 b where a.name=b.name
    5. and b.date_test<'2015-12-02')
  • 跨嵌套层的子查询关联项。示例如下:

    • SQL多层嵌套,但每层子查询关联项仅与次级上层关联,此类高效。

      1. 高效:select * from tb1 a where id in(select id from tb2 b
      2. where a.name=b.name and
      3. exists (select name from tb3 c where b.address=c.address))
    • SQL多层嵌套,但是表c的子查询关联项中与表a的列进行了关联,此类低效。

      1. 低效:select * from tb1 a where id in(select id from tb2 b
      2. where a.name=b.name and
      3. exists (select name from tb3 c where a.address=c.address))
  • 子查询中包含GROUP BY,请确保GROUP BY的分组列包含关联项。示例如下:

    • SQL子查询中包含聚合函数和关联项,关联项b.pk被包含中分组列(pk)中,此类SQL````高效。

      1. 高效:select * from tb1 a where exists
      2. (select pk from tb2 b
      3. where a.pk=b.pk and b.date_test='2003-04-05'
      4. group by pk);
    • SQL子查询中包含聚合函数和关联项,关联项b.date _test不被包含中分组列(pk)中,此类SQL低效。

      1. 低效:select * from tb1 a where exists
      2. (select pk from tb2 b
      3. where a.date_test=b.date_test and b.date_test='2003-04-05'
      4. group by pk);

目前支持的子查询类别

Comparisons Using Subqueries

最常见的子查询类别,语法如下:

  1. non_subquery_operand comparison_operator (subquery)
  2. comparison_operator = > < >= <= <> != <=> like

示例:

  1. ... WHERE 'a' = (SELECT column1 FROM t1)

目前仅支持子查询在比较运算符的右边。

Subqueries with ANY, ALL, IN/NOT IN, EXISTS/NOT EXISTS

语法:

  1. operand comparison_operator ANY (subquery)
  2. operand comparison_operator ALL (subquery)
  3. operand IN (subquery)
  4. operand NOT IN (subquery)
  5. operand EXISTS (subquery)
  6. operand NOT EXISTS (subquery)
  7. comparison_operator:= > < >= <= <> !=

ANY : 如果子查询返回的任意一行满足 ANY 前的表达式,返回 TRUE,否则返回 FALSE。ALL:如果子查询返回所有行都满足 ALL 前的表达式,返回 TRUE,否则返回 FALSE。IN : IN 在子查询前使用时,与 =ANY 是等价的。例如:

  1. SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
  2. SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);

NOT IN:NOT IN 在子查询前使用时,与<>ALL是等价的。例如:

  1. SELECT s1 FROM t1 WHERE s1 <> ALL (SELECT s1 FROM t2);
  2. SELECT s1 FROM t1 WHERE s1 NOT IN (SELECT s1 FROM t2);

EXISTS/NOT EXISTS:如果子查询返回任意行,EXISTS子查询结果为TRUE,NOT EXISTS子查询结果为FALSE。如果返回为空,则结果反过来。例如:

  1. SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);

如果子查询中包含任意行,即使只包含NULL的行值,WHERE条件也会返回TRUE。

Row Subqueries

Row Subqueries支持以下表达式:

  1. = > < >= <= <> != <=>

例如:

  1. SELECT * FROM t1
  2. WHERE (col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);
  3. SELECT * FROM t1
  4. WHERE ROW(col1,col2) = (SELECT col3, col4 FROM t2 WHERE id = 10);

以上两个SQL是等价的。只有满足以下条件时,t1表的数据行才会返回。

  • 子查询SELECT col3, col4 FROM t2 WHERE id = 10返回的仅有一行记录。返回多行会报错。
  • 子查询返回的col3和col4与主表的col1和col2相等。

Correlated Subqueries

Correlated Subqueries是指子查询中包含对外层查询表的引用。例如:

  1. SELECT * FROM t1
  2. WHERE column1 = ANY (SELECT column1 FROM t2
  3. WHERE t2.column2 = t1.column2);

示例SQL子查询中并没有包含t1表及其列名column2,此时会向上一层寻找t1表的引用。

将Correlated Subqueries改写为JOIN有可能会提高其性能。

Derived Tables (Subqueries in the FROM Clause)

Derived Tables是指在FROM子句中的子查询。

  1. SELECT ... FROM (subquery) [AS] tbl_name ...

例如:

  1. CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);
  2. INSERT INTO t1 VALUES (1,'1',1.0);
  3. INSERT INTO t1 VALUES (2,'2',2.0);
  4. SELECT sb1,sb2,sb3
  5. FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
  6. WHERE sb1 > 1;

查询的结果为:2, '2', 4

从场景举例:假如现在需要分组数据SUM后的平均值,直接使用以下SQL无法得到想要的结果。

  1. SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;

此时可使用Derived Tables拿到需要的信息:

  1. SELECT AVG(sum_column1)
  2. FROM (SELECT SUM(column1) AS sum_column1
  3. FROM t1 GROUP BY column1) AS t1;
  • Derived Tables 必须拥有一个别名。
  • Derived Tables 可以返回一个标量,列,行或表。
  • Derived Tables 不可以成为 Correlated Subqueries,即不能包含子查询外部表的引用。