A guide to query operators and their properties - AnalyticDB for MySQL

AnalyticDB for MySQL operators represent individual steps in an execution plan. Each operator handles a specific data processing task — scanning a table, joining datasets, aggregating results, or moving data between nodes. In a distributed cluster, most operators run in parallel across compute nodes.

Use the SQL diagnostics feature to inspect operator properties and view operator-level diagnosis results. To understand how operators compose into stages and plans, see the execution plan hierarchy chart at the operator layer.

Note

Only specific operators have properties.

Data access operators

TableScan

Reads data from the underlying data source. When filter conditions can be pushed down to the storage layer (predicate pushdown), TableScan applies them directly using data indexes — no separate Filter operator is needed.

Property	Description
TableName	The table being scanned.
DataBase	The database the table belongs to.
SelectFields	The fields read by this TableScan node.
FilterPushDown	Whether filter conditions are pushed down to a storage node. Displayed only when the current stage includes pushed-down filter conditions.
PushedDownFilter	The filter conditions pushed down to a storage node. Displayed only when the current stage includes pushed-down filter conditions.

Data processing operators

Aggregation

Performs regular aggregation or grouping-based aggregation using functions such as SUM(), COUNT(), and AVG(). AnalyticDB for MySQL runs aggregate operations in parallel across compute nodes. For more information, see Grouping and aggregation query optimization.

Property	Description
GroupByKeys	The fields used for grouping.
AggregationFunctions	The aggregate functions applied, such as `SUM()`, `COUNT()`, and `AVG()`. Note When no SELECT fields appear in the GROUP BY clause and no other aggregate functions are used, the system invokes `ARBITRARY()` to return a value from the group at random. See ARBITRARY.
Step	The aggregation stage. Valid values: `PARTIAL` (partial aggregation), `FINAL` (final aggregation), `SINGLE` (single-step aggregation that performs final aggregation directly).

DistinctLimit

Equivalent to DISTINCT LIMIT in SQL statements.

Filter

Filters data at the computing layer when predicate pushdown is not available. By default, AnalyticDB for MySQL creates indexes for all fields and pushes filter conditions down to the storage layer (visible as FilterPushDown properties on a TableScan operator). The Filter operator appears when pushdown is blocked.

Pushdown is blocked in the following cases:

The no_index_columns or filter_not_pushdown_columns hint is used in a query, or adb_config filter_not_pushdown_columns is configured for the cluster.
Filter conditions use functions such as CAST.
The filtered columns have no indexes — for example, the table was created with the no_index keyword, or indexes were dropped afterward using no_index.

Property	Description
Filter	The filter condition applied by this operator.

Join

Equivalent to JOIN in SQL statements. The join algorithm depends on the data and available indexes.

When creating a distributed table in AnalyticDB for MySQL, specify distribution keys using the DISTRIBUTED BY clause. The type of data redistribution is determined by whether the join key is used as a distribution key. For more information about data redistribution, see the RemoteExchange section.

Property	Description
Criterias	The join conditions. Example: `tb_1.col_a=tb_2.col_a`.
Type	The join type: `INNER JOIN`, `LEFT JOIN`, `RIGHT JOIN`, or `FULL JOIN`.
Method	The join algorithm: `Hash` (caches the smaller table in memory and matches rows using a hash table) or `Index` (uses join key indexes based on AnalyticDB for MySQL's full-column indexing feature).
JoinFilter	Additional filter conditions applied during the join.

Limit

Equivalent to LIMIT in SQL statements.

MarkDistinct

Handles COUNT(DISTINCT) when a query contains two or more COUNT(DISTINCT) operations. For a single COUNT(DISTINCT), AnalyticDB for MySQL uses the Aggregation operator instead.

Project

Applies projection expressions to specific fields — for example, CASE WHEN THEN expressions or the CONCAT() function.

Property	Description
ProjectExpression	The projection expression applied to the output fields.

Sort

Equivalent to ORDER BY in SQL statements. Sorts fields in the ORDER BY clause.

Property	Description
OrderBy	The field used to sort data.
Orderings	The sort direction and NULL placement. Valid values: `ASC_NULLS_FIRST`, `ASC_NULLS_LAST`, `DESC_NULLS_FIRST`, `DESC_NULLS_LAST`.

SortMerge

During distributed sorting, merges and re-sorts partially sorted data received from multiple upstream nodes.

TopN

Equivalent to ORDER BY LIMIT m,n in SQL statements. Combines sorting and limiting in a single operator for efficiency.

Property	Description
OrderBy	The field used to sort data.
Count	Equivalent to the ORDER BY LIMIT operations in SQL statements.
Orderings	The sort direction and NULL placement. Valid values: `ASC_NULLS_FIRST`, `ASC_NULLS_LAST`, `DESC_NULLS_FIRST`, `DESC_NULLS_LAST`.
Step	The execution step. Valid values: `PARTIAL` (sorts data locally on each node), `SORTMERGE` (merges and sorts partially sorted data), `SINGLE` (performs final merge and sort directly).
Offset	The starting row offset (the OFFSET value).

Union

Equivalent to UNION in SQL statements.

Window

Equivalent to window functions in SQL statements. For supported functions, see Window functions.

TopnRowNumber

对应窗口函数中的ORDER BY LIMIT m,n查询。

Data movement operators

RemoteExchange

Transfers data between compute nodes — from upstream stages to downstream stages. The method depends on the query structure and distribution key alignment.

Method	Behavior
Broadcast	Copies data from each upstream node to all downstream nodes. Every downstream node ends up with a full copy.
Repartition	Partitions data from each upstream node by a specified rule and sends each partition to the target downstream node.
Gather	Concentrates data from all upstream nodes onto a single downstream node.

For more information about how stages exchange data, see the execution plan hierarchy chart at the stage layer.

RemoteSource

Marks the point where data enters the current stage from remote nodes over the network.

Property	Description
OutputColumns	The fields passed into the current stage by this operator.

StageOutput

Transfers data processed in the current stage to downstream stage nodes over the network.

Write operators

TableWriter

Writes query results to a table. Appears in queries that perform extract, transform, and load (ETL) operations, such as INSERT INTO and REPLACE INTO.