By Zhuweng, OceanBase R&D Director
The source code is the most fundamental part of OceanBase. This series of articles focuses on the interpretation of the source code to help users understand the nature of the database. The first article of this series (An Interpretation of OceanBase Database Source Code-Module Structure) was about the overall architecture of OceanBase. It was intended to help readers quickly understand the modules of OceanBase and their functions.
The second article of the series focuses on the main path of an execution process of SQL in OceanBase, including the process of receiving, processing, and feedback to the client. It aims at discussing the module of the SQL engine of OceanBase with developers.
It can be seen that src/observer contains three subdirectories. Among them, mt in omt represents multi-tenant, which internally implements an abstract worker of the observer thread model. Each tenant creates a thread pool on its tenant node to process SQL requests. The implementation of each__all_virtual virtual table of sys tenant is under virtual_table. A virtual table is a view, which abstracts some memory data structures into table interfaces and exposes them for diagnosis and debugging. The MySQL directory is the MySQL protocol layer, which implements the MySQL 5.6-compatible message processing protocol.
In addition to establishing and disconnecting connections, most MySQL protocols are simple request response models. Each request is similar to a COM_XXX command, and the processing function of each command corresponds to a corresponding class in this directory. Please see the following figure:
For example, the most commonly used COM _QUERY represents an SQL request. The processing class is located in obmp_query.h/cpp. The typical interaction process is to connect, query, query... query, and quit.
Note: All types of SQL statements, including DML, DDL, and multi-statement, are processed using a query.
The process of establishing a connection is in obmp_connect, which performs user authentication. If the authentication is completed, the system creates an ObSQLSession object (located in src/sql/session) to represent the connection to only one database. All other command processing will access the session object.
The preceding figure shows the processing class ObMPQuery of query, which is the beginning of an SQL. The process method shows the entry to the processing class.
Mpquery initializes the class TraceId at the entry of the processing class. TraceId is a tool class and placed in thread local variables. All modules involved in the subsequent processing of SQL statements can be accessed globally. It is the only identifier of SQL in processing. If threads are switched or RPC is executed during execution, TraceId is included. All debug logs printed in oblog contain a hexadecimal string beginning with Y (guess why it starts with Y). This is TraceId, which can string logs printed in different positions. Developers for OceanBase are used to grep all logs related to TraceId when checking problems.
If the format of an SQL string is stmt;stmt; or multi-statement, it is a special optimization of the MySQL protocol. After the optimization, multiple statements can be sent to be executed at a time, and multiple result sets can be returned (if any). Depending on whether the SQL is multi-statement, the observer will process it differently. If it is a DML single statement with autocommit = 1, the system cannot respond to the client until it has executed transactions to commit and write logs. The observer suspends relevant contexts and executes a callback after the log is committed to release thread resources as soon as possible. Some special code logic here is dealing with this optimization.
ObParser first extracts each statement from them through a quick parsing entry split_multiple_stmt and then performs process_single_stmt for each statement for multi-statement processing. The transaction control logic can be temporarily ignored. Finally, through do_process, the observer enters the SQL module.
In general, SQL modules are clearly divided. The total entry is the ObSql class of sql/ob_sql.h/cpp. Do_process calls the stmt_query method of this class: enter an SQL statement string and output a ResultSet that contains physical execution plans and metadata. The outer layer opens and iterates the result set and then sends each row of results to the client. Therefore, the protocol and execution plan processing is streaming and does not need to query all the results before returning to the client.
Next are submodules of SQL. The parser module performs syntax analysis to parse an SQL string into an abstract syntax tree consisting of a ParseNode. Its interface class is the ObParser class. The parser is the C language code generated by bison and flex. The C language code of OceanBase is located here. The parser has a quick parsing mode. Its goal is not to generate syntax trees but to parameterize SQL strings. Specific principles will be explained in detail in the subsequent entries of this series.
The resolver plays its role after the parsing by the parser. The abstract syntax tree has no SQL semantics, so the resolver analyzes it and gives it SQL semantics based on data dictionary metadata (OceanBase's code is called schema module). Most semantic errors occur at this stage, such as the table already exists and the length of the primary key exceeds the limit. The interface class of the resolver module is ObResolver, and the output of the resolver module is ObStmt. This module is object-oriented, with a Resolver and an Stmt for each statement type. Based on the type of statements, Resolver and Stmt are located in different subdirectories, such as dml, ddl, and tcl.
For statements other than SELECT and DML, most DDL statements can be parsed here and executed. These simple statements are collectively referred to as commands and are directly executed by executors in engine/cmd. DDL is executed through rootservice (RS), so its executor sends RPC. Transaction control statements directly call the transaction layer in the native machine.
An execution plan is needed for SELECT, DML, and DDL statements with data operations. The interface class of the optimizer (sql/optimizer) is ObOptimizer. ObDMLStmt (including SELECT) generated in the preceding step is seen as the input to perform cost-based optimization and generate a logical execution plan (ObLogPlan). A logical execution plan is a tree structure composed of relational operation operator ObLogicalOperator of OceanBase. Rewriting (sql/rewrite) is part of the optimizer and performs equivalent relational operation rewriting to generate potentially better execution plan candidates. This involves a series of rewriting rules. The entry class of rewriting rules is ObTransformerImpl, and their input and output are ObDMLStmt.
Next, the code_generator(cg) module converts a logical execution plan into a physical execution plan that can be efficiently executed. Its interface class is ObCodeGenerator. The more complicated process here is the generation of expressions. The engine directory is a SQL execution engine also known as a physical execution plan (ObPhysicalPlan). It is a tree structure composed of physical operators (ObPhyOperator). Its execution process is a pipeline of volcanic models.
The same physical execution plan can be executed by multiple threads in parallel. The physical execution plan generated by the cg in the preceding step is generally saved to the plan cache (sql/plan_cache). The parser mentioned above has a special quick parsing mode. The quickly parsed SQL tries to fish the available physical execution plan directly from the plan cache. If no proper plan exists, the observer executes the hard parsing process from resolver to cg as described above.
We only briefly mentioned the life of SQL in this topic. The third article of this series will help you understand the storage layer of OceanBase and the life of a partition.
OceanBase - September 9, 2022
OceanBase - September 9, 2022
OceanBase - September 9, 2022
OceanBase - May 30, 2022
OceanBase - September 14, 2022
OceanBase - September 13, 2022
Fully managed and less trouble database servicesLearn More
AnalyticDB for MySQL is a real-time data warehousing service that can process petabytes of data with high concurrency and low latency.Learn More
An on-demand database hosting service for SQL Server with automated monitoring, backup and disaster recovery capabilitiesLearn More
ApsaraDB Dedicated Cluster provided by Alibaba Cloud is a dedicated service for managing databases on the cloud.Learn More
More Posts by OceanBase