Operators in AnalyticDB for MySQL complete basic data processing logic. A set of operators complete a set of processing rules for data based on an execution plan. AnalyticDB for MySQL uses a distributed system. Most operators can be completed in parallel on multiple nodes to improve processing efficiency.

Sort
Executes ORDER BY clauses in SQL statements and sorts the fields in the ORDER BY clauses.
Aggregate
Aggregates or groups and aggregates data by using functions such as SUM, COUNT, and AVG. AnalyticDB for MySQL uses a distributed system where multiple nodes can aggregate data in parallel. The aggregation process in AnalyticDB for MySQL consists of the Partial and Final phases. In the Partial phase, partial aggregation is performed on each compute node. Partial aggregation results are redistributed between networks based on group fields. Redistributed data is finally aggregated in the Final phase.
Tablescan
Reads data from data sources. VisualPlan allows you to view the database name, table name, and filter conditions optimized by predicate pushdown. The data filtering process displayed by this operator is completed by the underlying data sources by using indexes. For more information about VisualPlan, see VisualPlan.
Filter
Filters data that is not imported from data sources. By default, AnalyticDB for MySQL creates indexes for all fields. However, some filter conditions cannot be used with indexes to filter data. For example, filter conditions contain functions on fields, or a user manually deletes the indexes for the filter conditions. In this case, the Filter operator is used to filter data.
Window
Executes window functions.
Limit
Performs the LIMIT operations.
TopnRowNumber
Equivalent to the ORDER BY LIMIT M,N query in window functions.
Distinctlimit
Equivalent to the GROUP BY DISTINCT LIMIT query in SQL statements.
TopN
Equivalent to the ORDER BY LIMIT M,N query in SQL statements.
Tablewriter
Writes data into tables after extract-transform-load (ETL) SQL statements, such as INSERT INTO or REPLACE INTO, are executed to query data.
Join
Equivalent to the JOIN operations in SQL statements. The join types in AnalyticDB for MySQL include inner join, left outer join, right outer join, full outer join, and semi or anti join. When AnalyticDB for MySQL creates a distributed table, the Distributed By field must be specified. Data redistribution types vary based on whether the Distributed By field is set to the join key. For more information about the data redistribution types, see the "RemoteExchange" section of this topic. AnalyticDB for MySQL uses hash joins or index joins based on different table schemas and data characteristics. Hash joins cache small tables to the memory and join tables by using hash tables to find their matching rows. Index joins use indexes on the join keys based on the full index feature of AnalyticDB for MySQL. Join conditions in the Join operator use criteria in VisualPlan.
Union
Equivalent to the UNION operations in SQL statements.
RemoteExchange
Indicates that the data of the current stage comes from other stages. This operator has no mappings in SQL statements. This operator pulls data from the upstream stages to the current stage. The following methods are used:
  • Repartition indicates that a compute node of the current stage receives data only from specific partitions of compute nodes in upstream stages. When a large amount of data is involved, this method is used to aggregate values that have the same key.
  • Replicated indicates that the data of each compute node in a downstream stage is broadcast and replicated in all compute nodes of the downstream stage. When the amount of data is small, this method is used to prevent data of large tables from being transferred between networks during JOIN operations, or to prevent data skew by replicating small tables during JOIN operations.
  • Gather indicates that the data of each compute node of downstream stages is not replicated or broadcast, but is aggregated to a node of the current stage.
MarkDistinct
Performs the DISTINCT operations.