This topic describes the common operators supported by AnalyticDB for MySQL and their corresponding properties.

Background information

Operators in AnalyticDB for MySQL complete basic data processing logic. A set of operators complete a set of processing rules for data based on an execution plan. AnalyticDB for MySQL uses a distributed system. Most operators can complete computing tasks in parallel on multiple nodes to improve processing efficiency.

You can use the SQL diagnostics feature of AnalyticDB for MySQL to view operators that have properties and operator-level diagnosis results. For more information, see Execution plan hierarchy chart at the operator layer and Operator-level diagnosis results.

Note Only some operators have properties.

Aggregation

AnalyticDB for MySQL is a distributed database service that allows parallel aggregation operations on multiple nodes. For more information about regular aggregation and grouping-based aggregation, see Grouping and aggregation query optimization. The Aggregation operator uses functions such as sum(), count(), and avg() to perform regular aggregation or grouping-based aggregation on data.

The following table describes the properties of the Aggregation operator.
Property Description
GroupByKeys The fields that are used for grouping.
AggregationFunctions The aggregate functions that are used, such as sum(), count(), and avg().
Note If no SELECT fields are included in the GROUP BY clause and no other aggregate functions are used, the system invokes the arbitrary() function to randomly select a value from a group and return the value. For more information, see ARBITRARY.
Step The current stage of aggregation. Valid values:
  • PARTIAL: partial aggregation.
  • FINAL: final aggregation.
  • SINGLE: single-step aggregation, which indicates that final aggregation is directly performed.

DistinctLimit

Equivalent to the DISTINCT LIMIT operations in SQL statements.

Filter

AnalyticDB for MySQL supports two types of data filtering:
  • Data source filtering: Data indexes at the storage layer are used to filter data.
    Note No independent operators are used to represent the data source filtering process. Filter conditions in this process are reflected in the TableScan operator. For more information about the TableScan operator, see TableScan.
  • Non-data source filtering: Data at the storage layer is not indexed. You must use the Filter operator to filter data at the computing layer.
By default, AnalyticDB for MySQL creates indexes for all fields. However, the filter conditions are not pushed down in the following scenarios:
  • If the no_index_columns or filter_not_pushdown_columns hint is used in query statements, or the adb_config filter_not_pushdown_columns configuration is used in clusters, the filter condition pushdown feature is disabled.
  • Functions such as CAST are used in filter conditions.
  • Related columns in filter conditions do not have indexes. For example, the no_index keyword is used when you create a table, or the no_index statement is executed to delete indexes after a table is created.
In this case, AnalyticDB for MySQL filters data by using the Filter operator.

The following table describes the property of the Filter operator.

Property Description
Filter The filter condition of the Filter operator.

Join

Equivalent to the JOIN operations in SQL statements. When you create a distributed table in AnalyticDB for MySQL, you must specify distribution fields by using the DISTRIBUTED BY clause. The type of data redistribution is determined by whether the join key is used as a distribution field. For more information about data redistribution, see RemoteExchange.

The following table describes the properties of the Join operator.

Property Description
Criterias The join conditions. Example: tb_1.col_a=tb_2.col_a.
Type The join type. Four types of joins are available in AnalyticDB for MySQL: INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN.
Method The join algorithm. Two types of join algorithms are available in AnalyticDB for MySQL:
  • Hash: The hash join algorithm caches a small table to the memory and joins the small table with another table by using the hash table of the small table to find matches.
  • Index: The index join algorithm uses indexes of join keys based on the full-column indexing feature of AnalyticDB for MySQL.
JoinFilter The filter conditions in a join.

Limit

Equivalent to the LIMIT operations in SQL statements.

MarkDistinct

Equivalent to the COUNT(DISTINCT) operations in SQL statements.
Note
  • The MarkDistinct operator is used only when an SQL statement contains two or more COUNT(DISTINCT) operations.
  • If an SQL statement contains only a single COUNT(DISTINCT) operation, AnalyticDB for MySQL automatically uses the Aggregation operator. For more information, see Aggregation.

Project

Equivalent to the projection operations on specific fields in SQL statements, such as the CASE WHEN THEN control flow statement and the concat() function.

The following table describes the property of the Project operator.
Property Description
ProjectExpression The projection expression.

RemoteExchange

This operator indicates the method used to transfer data from upstream stages to downstream stages. The following methods are available:
  • Broadcast: The data of each compute node in an upstream stage is copied to all compute nodes in a downstream stage.
  • Repartition: The data of each compute node in an upstream stage is partitioned based on specific rules and then distributed to specified compute nodes of a downstream stage.
  • Gather: The data of each compute node in an upstream stage is concentrated on a specific compute node in a downstream stage.

For more information, see Execution plan hierarchy chart at the stage layer.

RemoteSource

This operator indicates that the input data of the current stage is transferred from remote nodes over networks.

The following table describes the property of the RemoteSource operator.
Property Description
OutputColumns The fields that are exported by the RemoteSource operator.

StageOutput

This operator transfers the data processed in the current stage to nodes of a downstream stage over networks.

Sort

Equivalent to the ORDER BY operations in SQL statements. This operator sorts the fields in the ORDER BY clause.

The following table describes the properties of the Sort operator.
Property Description
OrderBy The field that is used to sort data.
Orderings The order in which data is sorted. Valid values:
  • ASC_NULLS_FIRST: Data is sorted in ascending order in which NULL values are returned before non-NULL values.
  • ASC_NULLS_LAST: Data is sorted in ascending order in which NULL values are returned after non-NULL values.
  • DESC_NULLS_FIRST: Data is sorted in descending order in which NULL values are returned before non-NULL values.
  • DESC_NULLS_LAST: Data is sorted in descending order in which NULL values are returned after non-NULL values.

SortMerge

During distributed sorting, this operator merges and sorts data transferred from multiple nodes in an upstream stage.

TableScan

This operator reads data from data sources. The underlying data sources can use indexes to filter data.

The following table describes the properties of the TableScan operator.
Property Description
TableName The name of the table to which the scanned data belongs.
SelectFields The fields that are scanned by a TableScan node.
DataBase The name of the database to which the scanned data belongs.
FilterPushDown Specifies whether filter conditions are pushed down to a storage node.
Note This property is displayed only when the current stage contains filter conditions that are pushed down.
PushedDownFilter The filter conditions that are pushed down to a storage node.
Note This property is displayed only when the current stage contains filter conditions that are pushed down.

TableWriter

After data is queried by using SQL statements for extract, transform, load (ETL) such as INSERT INTO or REPLACE INTO, this operator writes data to tables.

TopN

Equivalent to the ORDER BY LIMIT m,n operations in SQL statements.

The following table describes the properties of the TopN operator.
Property Description
OrderBy The field that is used to sort data.
Count Equivalent to the ORDER BY LIMIT operations in SQL statements.
Orderings The order in which data is sorted. Valid values:
  • ASC_NULLS_FIRST: Data is sorted in ascending order in which NULL values are returned before non-NULL values.
  • ASC_NULLS_LAST: Data is sorted in ascending order in which NULL values are returned after non-NULL values.
  • DESC_NULLS_FIRST: Data is sorted in descending order in which NULL values are returned before non-NULL values.
  • DESC_NULLS_LAST: Data is sorted in descending order in which NULL values are returned after non-NULL values.
Step The execution step of the TopN operator. Valid values:
  • PARTIAL: During distributed sorting, data on each node is partially sorted.
  • SORTMERGE: Final merging and sorting are performed on data that has been partially sorted.
  • SINGLE: Final merging and sorting are directly performed on data.
Offset Equivalent to the ORDER BY OFFSET operations in SQL statements.

Union

Equivalent to the UNION operations in SQL statements.

Window

Equivalent to the window functions in SQL statements. For more information, see Window functions.