All Products
Search
Document Center

Overview

Last Updated: Sep 29, 2021

A query (also known as an SQL statement) is a means used to obtain data in a database. It can be used with various clauses, such as conditional clauses (for example, WHERE) and order clauses (for example, ORDER BY) to get the query results. A subquery is a query nested in an upper-layer query. An upper-layer query is generally referred to as a parent query or outer query. The subquery result is returned to the parent query or outer query as input. The parent query takes the subquery result into the calculation to determine the final output. The SQL language supports multi-level nested queries, which means that one subquery can be nested with other subqueries. Meanwhile, subqueries can be included in various clauses of SQL statements, such as SELECT, FROM, and WHERE.

Subquery

In the database, subqueries can be divided into dependent subqueries and independent subqueries. A dependent subquery means that the execution of this subquery depends on variables of outer queries. Therefore, dependent queries are usually computed multiple times. An independent subquery means that the execution of this subquery does not depend on variables of outer queries. Independent subqueries are typically computed only once. The following example shows an independent subquery and a dependent subquery.

obclient> create table t1(a int primary key, b int, c int);
Query OK, 0 rows affected (0.09 sec)

obclient> create table t2(a int primary key, b int, c int);
Query OK, 0 rows affected (0.06 sec)

-- An independent subquery
obclient> select * from t1 where t1.a in (select t2.a from t2);
Empty set (0.01 sec)

-- A dependent subquery that uses outer query variable t1.b
obclient> select * from t1 where t1.a in (select t2.a from t2 where t2.b = t1.b);
Empty set (0.01 sec)

Subquery unnesting

Subquery unnesting is a database optimization strategy. It places some subqueries in the outer parent query to convert some subqueries into equivalent multi-table join operations. One obvious benefit of using this strategy is that some access paths, connection methods, and connection sequences may be effectively used to reduce the layers of query statements. In the following subquery unnesting example, the subquery is rewritten as a join statement.

obclient> create table t1(a int primary key, b int, c int);
Query OK, 0 rows affected (0.09 sec)

obclient> create table t2(a int primary key, b int, c int);
Query OK, 0 rows affected (0.09 sec)

--- A dependent subquery that is unnested and rewritten as a join operation.
obclient> explain select * from t1 where t1.a in (select t2.b from t2 where t2.c = t1.c);
| ============================================
|ID|OPERATOR        |NAME |EST. ROWS|COST  |
--------------------------------------------
|0 |MERGE JOIN      |     |9703     |215436|
|1 | TABLE SCAN     |t1   |100000   |64066 |
|2 | SORT           |     |10001    |129621|
|3 |  SUBPLAN SCAN  |VIEW1|10001    |111242|
|4 |   HASH DISTINCT|     |10001    |109862|
|5 |    TABLE SCAN  |t2   |100000   |64066 |
============================================

Outputs & filters:
-------------------------------------
  0 - output([t1.a], [t1.b], [t1.c]), filter(nil),
      equal_conds([t1.a = VIEW1.t2.b], [VIEW1.t2.c = t1.c]), other_conds(nil)
  1 - output([t1.c], [t1.a], [t1.b]), filter(nil),
      access([t1.c], [t1.a], [t1.b]), partitions(p0)
  2 - output([VIEW1.t2.b], [VIEW1.t2.c]), filter(nil), sort_keys([VIEW1.t2.b, ASC], [VIEW1.t2.c, ASC])
  3 - output([VIEW1.t2.b], [VIEW1.t2.c]), filter(nil),
      access([VIEW1.t2.b], [VIEW1.t2.c])
  4 - output([t2.b], [t2.c]), filter(nil),
      distinct([t2.b], [t2.c])
  5 - output([t2.c], [t2.b]), filter(nil),
      access([t2.c], [t2.b]), partitions(p0)