All Products
Search
Document Center

Hierarchical queries

Last Updated: Jan 10, 2020

Hierarchical queries are queries that you can use to display hierarchical data of different levels.

Hierarchical data is a set of data items that are related based on hierarchical relationships. Hierarchical relationships are very common in real life, for example:

  • Relationships between leaders and members in an organization

  • Relationships between upper-level and lower-level departments in an enterprise

  • Relationships between web pages that are connected by hyperlinks.

Syntax for CONNECT BY

The CONNECT BY statements can query the hierarchy of data. The query contains the following clauses that are related to hierarchical queries:

  • START WITH: specifies the root row(s) in the hierarchical query.

  • CONNECT BY: specifies how to determine the parent-child relationship. Equivalent expressions are usually used here, but other expressions are also supported.

  • PRIOR: indicates that the column of the parameter comes from the parent row. PRIOR is a unary operator that has the same precedence as the unary plus (+) and minus (-) operators.

  • NOCYCLE: When this keyword is specified, even if the result contains a cycle, the result can still be returned. The CONNECT_BY_ISCYCLE pseudocolumn can be used to indicate where the loop occurs. If NOCYCLE is not specified, an error is returned if the result contains a cycle.

  • ORDER SIBLINGS BY: specifies the sort order between rows of the same level.

SELECT select_list
FROM table_expression
[WHERE … ]
[START WITH start_expression]
CONNECT BY [NOCYCLE] { PRIOR child_expr = parent_expr | parent_expr = PRIOR child_expr }
[ORDER SIBLINGS BY … ]
[GROUP BY … ]
[HAVING … ]
[ORDER BY …]

Execution process

The key to using hierarchical queries is to understand the execution process. If a query is a hierarchical query, the execution process can be described as follows:

  1. Performs the SCAN or JOIN operations after the FROM keyword

  2. Generates hierarchical relationship results based on the START WITH and CONNECT BY clauses

  3. Follows the usual query execution process to execute the remaining clauses, such as WHERE, GROUP and ORDER BY.

The process of generating hierarchical relationships in step 2 can be described as follows:

  1. Obtains the root rows based on the expression in the START WITH clause.

  2. Selects the child rows of each root row based on the expression in CONNECT BY.

  3. The child rows generated are used as new root rows to further generate child rows. Repeat this process until no new row is generated.

Example

The following example demonstrates the use of hierarchical queries. The job titles are based by geographical locations:

  1. OceanBase (root@oceanbase)> select * from emp;
  2. +--------+--------------------+--------+
  3. | emp_id | position | mgr_id |
  4. +--------+--------------------+--------+
  5. | 1 | Global Manager | NULL |
  6. | 2 | Europe Regional Manager | 1 |
  7. | 3 | Asia Pacific Regional Manager | 1 |
  8. | 4 | Americas Regional Manager | 1 |
  9. | 5 | Italy Regional Manager | 2 |
  10. | 6 | France Regional Manager | 2 |
  11. | 7 | China Regional Manager | 3 |
  12. | 8 | Korea Regional Manager | 3 |
  13. | 9 | Japan Regional Manager | 3 |
  14. | 10 | US Regional Manager | 4 |
  15. | 11 | Canada Regional Manager | 4 |
  16. | 12 | China Regional Manager | 7 |
  17. +--------+--------------------+--------+
  18. 12 rows in set (0.00 sec)

From the preceding table, you can see that the job positions have a clear hierarchical relationship. The relationship can be better represented by a tree structure.

1

The result of the hierarchical query is displayed according to the hierarchical structure.

  1. OceanBase (root@oceanbase)> select emp_id, mgr_id, position, level from emp start with mgr_id is NULL connect by prior emp_id = mgr_id;
  2. +--------+--------+--------------------+-------+
  3. | emp_id | mgr_id | position | level |
  4. +--------+--------+--------------------+-------+
  5. | 1 | NULL | Global Manager | 1 |
  6. | 4 | 1 | Americas Regional Manager | 2 |
  7. | 11 | 4 | Canada Regional Manager | 3 |
  8. | 10 | 4 | US Regional Manager | 3 |
  9. | 3 | 1 | Asia Pacific Regional Manager | 2 |
  10. | 9 | 3 | Japan Regional Manager | 3 |
  11. | 8 | 3 | Korea Regional Manager | 3 |
  12. | 7 | 3 | China Regional Manager | 3 |
  13. | 12 | 7 | Beijing Regional Manager | 4 |
  14. | 2 | 1 | Europe Regional Manager | 2 |
  15. | 6 | 2 | France Regional Manager | 3 |
  16. | 5 | 2 | Italy Regional Manager | 3 |
  17. +--------+--------+--------------------+-------+

To query only the hierarchy relationship of positions in Asia Pacific, you can use the following query.

  1. OceanBase (root@oceanbase)> select emp_id, mgr_id, position, level from emp start with position = "Asia Pacific Regional Manager" connect by prior emp_id = mgr_id;
  2. +--------+--------+-----------------+-------+
  3. | emp_id | mgr_id | position | level |
  4. +--------+--------+-----------------+-------+
  5. | 3 | 1 | Asia Pacific Regional Manager | 1 |
  6. | 9 | 3 | Japan Regional Manager | 2 |
  7. | 8 | 3 | Korea Regional Manager | 2 |
  8. | 7 | 3 | China Regional Manager | 2 |
  9. | 12 | 7 | Beijing Regional Manager | 3 |
  10. +--------+--------+-----------------+-------+