All Products
Search
Document Center

Schedule distributed execution plans

Last Updated: Jun 18, 2021

A simple model for scheduling distributed execution plans:

At the final stage of execution plan generation, an execution plan is divided into several subplans by taking EXCHANGE nodes as the boundaries. Each subplan is encapsulated as a Data Flow Object (DFO). When the degree of parallelism (DOP) is greater 1, two DFOs are scheduled at a time to traverse the DFO tree. When the DOP equals 1, each DFO saves the generated data to the intermediate result manager, and then traverses the DFO tree through postorder traversal.

Single-DFO scheduling

Example: Single-DFO scheduling for a query plan of which the degree of parallelism is 1.

======================================================================================
|ID|OPERATOR                             |NAME                 |EST. ROWS |COST      |
--------------------------------------------------------------------------------------
|0 |LIMIT                                |                     |10        |6956829987|
|1 | PX COORDINATOR MERGE SORT           |                     |10        |6956829985|
|2 |  EXCHANGE OUT DISTR                 |:EX10002             |10        |6956829976|
|3 |   LIMIT                             |                     |10        |6956829976|
|4 |    TOP-N SORT                       |                     |10        |6956829975|
|5 |     HASH GROUP BY                   |                     |454381562 |5815592885|
|6 |      HASH JOIN                      |                     |500918979 |5299414557|
|7 |       EXCHANGE IN DISTR             |                     |225943610 |2081426759|
|8 |        EXCHANGE OUT DISTR (PKEY)    |:EX10001             |225943610 |1958446695|
|9 |         MATERIAL                    |                     |225943610 |1958446695|
|10|          HASH JOIN                  |                     |225943610 |1480989849|
|11|           JOIN FILTER CREATE        |                     |30142669  |122441311 |
|12|            PX PARTITION ITERATOR    |                     |30142669  |122441311 |
|13|             TABLE SCAN              |CUSTOMER             |30142669  |122441311 |
|14|           EXCHANGE IN DISTR         |                     |731011898 |900388059 |
|15|            EXCHANGE OUT DISTR (PKEY)|:EX10000             |731011898 |614947815 |
|16|             JOIN FILTER USE         |                     |731011898 |614947815 |
|17|              PX BLOCK ITERATOR      |                     |731011898 |614947815 |
|18|               TABLE SCAN            |ORDERS               |731011898 |614947815 |
|19|       PX PARTITION ITERATOR         |                     |3243094528|1040696710|
|20|        TABLE SCAN                   |LINEITEM(I_L_Q06_001)|3243094528|1040696710|
======================================================================================

As shown in the following figure, the DFO tree, except for the root DFO, is divided into DFOs 0, 1, and 2. The scheduling order of postorder traversal is 0 > 1 > 2, by which the iteration of the entire plan tree is implemented.

调度1

Dual-DFO scheduling

Example: Dual-DFO scheduling for a query plan of which the degree of parallelism is greater than 1.

Query Plan
=============================================================================
|ID|OPERATOR                                   |NAME    |EST. ROWS|COST     |
-----------------------------------------------------------------------------
|0 |PX COORDINATOR MERGE SORT                  |        |9873917  |692436562|
|1 | EXCHANGE OUT DISTR                        |:EX10002|9873917  |689632565|
|2 |  SORT                                     |        |9873917  |689632565|
|3 |   SUBPLAN SCAN                            |VIEW5   |9873917  |636493382|
|4 |    WINDOW FUNCTION                        |        |29621749 |629924873|
|5 |     HASH GROUP BY                         |        |29621749 |624266752|
|6 |      HASH JOIN                            |        |31521003 |591048941|
|7 |       JOIN FILTER CREATE                  |        |407573   |7476793  |
|8 |        EXCHANGE IN DISTR                  |        |407573   |7476793  |
|9 |         EXCHANGE OUT DISTR (BROADCAST)    |:EX10001|407573   |7303180  |
|10|          HASH JOIN                        |        |407573   |7303180  |
|11|           JOIN FILTER CREATE              |        |1        |53       |
|12|            EXCHANGE IN DISTR              |        |1        |53       |
|13|             EXCHANGE OUT DISTR (BROADCAST)|:EX10000|1        |53       |
|14|              PX BLOCK ITERATOR            |        |1        |53       |
|15|               TABLE SCAN                  |NATION  |1        |53       |
|16|           JOIN FILTER USE                 |        |10189312 |3417602  |
|17|            PX BLOCK ITERATOR              |        |10189312 |3417602  |
|18|             TABLE SCAN                    |SUPPLIER|10189312 |3417602  |
|19|       JOIN FILTER USE                     |        |803481600|276540086|
|20|        PX PARTITION ITERATOR              |        |803481600|276540086|
|21|         TABLE SCAN                        |PARTSUPP|803481600|276540086|
=============================================================================

In the following figure, the DFO tree, except for the root DFO, is divided into 3 DFOs: 0, 1, and 2. The system prioritizes DFO 0 during scheduling. DFOs 1 and 2 are sequentially scheduled and executed after the execution of DFO 0.

调度2