Implementation and Acquisition Mechanism of MDL Lock


In order to meet the transaction isolation and consistency requirements of the database under concurrent requests, and to play a role for multiple MySQL plug-in storage engines, MySQL implements the Metadata Locking (MDL) mechanism at the Server layer. The effect achieved is, for example, that when a transaction accesses a certain resource of the database, other concurrent transactions are restricted from deleting the resource. This is a lock in a logical sense. Different from the limited types of mutex provided by the operating system kernel, MDL can flexibly customize the object of the lock, the type of the lock, and the priority of different lock types. Dynamically adjust the compatibility of different lock types, which greatly facilitates the reasonable concurrency control of various query requests by the database.

This article will introduce the commonly used data structures and meanings in the MDL system, then discuss the MDL acquisition mechanism and deadlock detection from the perspective of implementation, and finally share how to monitor the MDL status in practice.

Two basic concepts

1 MDL_key

MDL objects are described in the form of key-value pairs (key-value), and each key value uniquely represents a locked object (value represents a certain resource of the database). The key is represented by MDL_key, which represents the name of the object in the form of a string.

The complete string consists of namespace, the name of each level of the hierarchy, and multiple namespaces can distinguish different types of objects with the same name. Namespaces consist of GLOBAL, SCHEMA, TABLE, FUNCTION, PROCEDURE etc. different object types that can be created in the database.

Depending on the type, the object's name can consist of multiple hierarchies. For example, a table object is uniquely described by the database name and table name; if it is a SCHEMA object, there is only one level of database name. Names are separated by the string terminator . Therefore, the string composed of these parts can be used as a key to uniquely represent some kind of object in the database.

2 enum_mdl_type

For the same database object, different queries have different access modes. For example, the SELECT statement wants to read the content of the object, the INSERT / UPDATE statement wants to modify the content of the object, and the DDL statement wants to modify the object. Structure and Definition. These statements have different impacts on objects and have different requirements for concurrency isolation, so MySQL defines different types of MDLs and their compatibility to control the concurrent access of these statements.

Different Types of MDL Compatibility

MySQL divides lock types into range locks and object locks.

1) range lock

There are fewer types of scope locks (IX, S, X), and they are mainly used for objects in the GLOBAL, COMMIT, TABLESPACE, BACKUP_LOCK, and SCHEMA namespaces. The compatibility of these types is simple, mainly to limit concurrent operations as a whole, such as global read locks to block transaction submissions, and DDL to update meta-information of table objects by requesting SCHEMA-wide intent exclusive locks (IX) to block the SCHEMA level modification operation.

These types of MDL compatibility relationships are defined by two matrices. For the same object, one is the compatibility of the obtained MDL type to the new request type; the other is the compatibility of the waiting MDL request type to the new request type that has not been obtained. Since IS(INTENTION_SHARE) is compatible with other locks in all cases, it can be ignored in MDL systems.

2) Object lock

Object locks contain rich MDL types, which are applied to most of the basic objects in the database. Their compatibility matrix is as follows:

During the MDL acquisition process, through these two compatibility matrices, it can be judged whether there is an MDL in the granted/pending state that is incompatible with the requested MDL, to determine whether the request can be satisfied, and if it cannot be satisfied, it will enter pending waiting state.

The MDL system also judges the strength of the lock type through the compatibility matrix, as follows:

The way the expression is written is a bit convoluted. It can be understood that if the type type is not compatible with an MDL that is compatible with a m_type type, then the type type is stronger; otherwise, the m_type type is the same or stronger. Or weaker types are incompatible MDL types, and stronger MDLs are not compatible.

Three important data structures

1 Relationship Diagram

2 MDL_request
Represents the statement's request to MDL, composed of MDL_key, enum_mdl_type and enum_mdl_duration, MDL_key and enum_mdl_type determine the MDL object and lock type.

There are three types of enum_mdl_duration, which represent the holding period of MDL, including a single statement-level period, a transaction-level period, and an explicit period.

The lifetime of MDL_request is outside the MDL system, controlled by the user, and can be a temporary variable. However, the MDL life cycle obtained through this request is persistent, controlled by the MDL system, and will not be released with the destruction of MDL_request.

3 MDL_lock

For an object in the database, only one lock object MDL_lock corresponding to its name (MDL_key) exists. When a database object is accessed for the first time, a lock-free HASH creates and manages an MDL_lock in its memory; when subsequent accesses come, access to the same object will refer to the same MDL_lock.

In MDL_lock, there are both the m_waiting queue that is currently waiting for the lock object and the m_granted queue that the object has been granted to. The elements in the queue are represented by MDL_ticket.

MDL_lock_strategy composed of static bitmap objects is used to store the compatibility matrix of the range lock and object lock mentioned above, and the compatibility of the lock can be obtained according to the MDL_lock namespace.

4 MDL_ticket

MDL_lock and enum_mdl_type together form MDL_ticket, which represents the current thread's access rights to database objects. MDL_ticket is created when each query requests an MDL lock, memory is allocated by the MDL system, and destroyed at the end of the transaction.

MDL_ticket contains two sets of pointers that respectively connect all the tickets obtained by the thread and the tickets in the waiting state or granted state of the lock object that the ticket participates in.

5 MDL_context
A context for acquiring MDL locks by a thread, one for each connection, including all MDL_tickets acquired by the connection. They are stored in their own linked lists according to different life cycles and managed by MDL_ticket_store.

All locks acquired by a connection can be divided into three types according to the life cycle: statement level, transaction level and explicit lock. Both statement-level and transaction-level locks have an automatic life cycle and scope, and they are accumulated during a transaction. Statement-level locks are automatically released after the outermost statement ends, transaction-level locks are released after COMMIT, ROLLBACK and ROLLBACK TO SAVEPOINT, and they will not be released manually. Tickets with an explicit lifetime are acquired for locks across transactions and checkpoints, including HANDLER SQL locks, LOCK TABLES locks, and user-level locks GET_LOCK()/RELEASE_LOCK(). Statement-level and transaction-level locks will be added to the front of the corresponding linked list in reverse chronological order. When we roll back to a certain checkpoint, the corresponding ticket will be released from the stack from the front of the linked list until the checkpoint The last ticket obtained before creation.

When a thread wants to acquire an MDL lock, it will first check in its own MDL_ticket_store whether it has already acquired a stronger type of MDL_ticket for the same lock object within the transaction. Therefore, MDL_ticket_store will provide an interface to find MDL_tickets according to MDL_request requests, one is to search in MDL_ticket linked lists with different life cycles; if the number of MDL_tickets obtained by the current thread exceeds the threshold (default 256), all MDL_tickets will be maintained in additional

Four MDL acquisition process

Almost all query statements (including the first stage of DML and DDL) are in the parse stage. LEX and YACC initialize the MDL lock request for the table to be accessed according to the type of statement. For example, the SELECT statement is SR, the INSERT statement is SW, ALTER The TABLE statement is SU. This process is in the following call stack:

Before the statement is executed, all the tables that need to be accessed will be opened through the open_tables_for_query function to obtain the TABLE table object. In this process, the MDL lock will be obtained first, and then the table resource will be obtained to prevent concurrent reading and writing of the meta information of the same table. All requests for MDL locks are made by calling MDL_context::acquire_lock from the context MDL_context of the current thread, and the call stack is as follows:

Next, let's focus on the process of MDL_context::try_acquire_lock_impl. This function includes the acquisition of various types of locks (good compatibility, poor compatibility) and lock conflict detection. The incoming parameter is the current MDL_request, and the output parameter is the obtained MDL_ticket.

First, according to the MDL_request, it will search for a ticket with a stronger type and the same or different life cycle in the same object MDL_ticket already held by the current thread. If you already have the same lifecycle, return it directly; if you have a different lifecycle, clone a ticket with the same lifecycle and return it.

We mentioned earlier that according to the compatibility of the lock type, it can be divided into unobtrusive and obtrusive locks. During the lock acquisition process, they also correspond to fast path and slow path respectively, which means that the difficulty of acquisition is different.

For MDL requests of some weak types (unobtrusive, such as SR/SW, etc.), since this part of the requests accounts for the vast majority and has good compatibility, it is not necessary to record which specific MDL_ticket it is after obtaining it, but only how many requests there are obtained. Therefore, the integer atomic variable std::atomic m_fast_path_state is used in MDL_lock to count the number of all unobtrusive lock types granted by the lock. Each unobtrusive lock has a different numerical representation, leaving a fixed bit range to store the cumulative value of this lock type. The final result is equivalent to using a longlong type to count the number of all unobtrusive locks granted, and at the same time, it can be modified without locks through CAS. In addition, in the high bit of m_fast_path_state, there are three status indication bits, which are IS_DESTROYED/HAS_OBTRUSIVE/HAS_SLOW_PATH.

According to the request type of MDL_request, obtain the unobtrusive integer increment value of the corresponding type. If the increment value is 0, it means that it is an obtrusive lock and needs to go through the slow path.

If it is not 0, it means that this type of lock is unobtrusive, and the fast path will be followed, and the corresponding integer value can be directly incremented to MDL_lock::m_fast_path_state through CAS. However, one condition needs to be confirmed, that is, the object is not locked by other threads in an obtrusive manner, because some unobtrusive and obtrusive lock types are mutually exclusive, and only when there is no obtrusive lock exists, other unobtrusive locks are compatible with each other. It can be obtained directly without judging the lock holding status of other threads.

After the CAS is completed, set the state and reference of the relevant data structure, and add the current MDL_ticket to the thread's MDL_ticket_store to return:

For MDL requests of some relatively strong types (obtrusive, such as SU/SRO/X, etc.), the corresponding MDL_ticket will be stored in the m_granted linked list corresponding to MDL_lock. Therefore, it is also necessary to traverse this linked list and other bitmaps to determine whether there is a lock conflict with the MDL_ticket that other threads have acquired or are waiting for.

Before taking the slow path to acquire the lock, the current thread needs to materialize the lock obtained by the current thread through the fast path in MDL_lock::m_fast_path_state, remove it from the bitmap, and add it to MDL_lock::m_granted. Because the bitmap contained in MDL_lock::m_fast_path_state cannot distinguish threads, and the multiple locks acquired by the current thread do not constitute a lock conflict, so before judging through the bitmap, you need to ensure that the tickets of MDL_lock::m_fast_path_state are all belonging to other threads.

After the materialization is completed, the current request can be judged by the ticket type (m_waiting), the granted ticket type (m_granted) and the unobtrusive lock type state (MDL_lock::m_fast_path_state) that the current lock is waiting for, combined with the previous compatibility matrix Whether the lock type can be obtained, this process is mainly in MDL_lock::can_grant_lock.

In m_waiting and m_granted, in addition to having linked lists to connect tickets, bitmaps are also used to collect the types of all tickets in the linked list for direct comparison. After the incompatible type is found in m_granted, it is necessary to traverse the linked list to determine whether the incompatible type of ticket is acquired by the current thread. Only when the incompatible type of ticket is acquired by the non-current thread will there be a lock conflict. If the unobtrusive lock can be acquired, it will be directly added to the MDL_lock::m_granted list.

2 Lock waiting and notification

In the above process, if the MDL_ticket can be successfully obtained, the acquisition of MDL is completed, and the query process can be continued. If it cannot be obtained (whether the unobtrusive lock is forced to take the slow path due to the obtrusive lock, or the obtrusive lock itself cannot be obtained), you need to wait for the lock. The process of waiting for the lock does not distinguish whether it is unobtrusive or obtrusive. , to be processed uniformly.

The MDL_context of each thread contains an MDL_wait member, because lock waiting and deadlock detection are all thread-based objects, and subscription notifications are made by adding the corresponding requested MDL_ticket to the lock waiter queue. There is a set of mutex, condition variable and enumeration state to complete the waiting and notification between threads. There are five waiting states:

WS_EMPTY is the initial state, and the others are the result states of waiting. As can be seen from the command, the waiting results may be:

GRANTED, the thread acquired the waiting MDL lock
VICTIM, the thread that is the victim of a deadlock, requires re-execution of the transaction
TIMEOUT, wait for timeout
KILLED, the thread is killed while waiting
The waiting thread first adds the ticket it wants to get to the m_waiting queue of MDL_lock, and then calls the function of MDL_wait to wait for timeout according to the configured waiting time:

When other threads holding incompatible type locks complete their query or the transaction ends, all the locks held will be released together, and at the same time, the state of MDL_lock::m_fast_path_state and MDL_lock will be restored according to whether it is acquired by fast path or slow path: :m_granted linked list. In addition, if MDL_lock::m_waiting has a waiting ticket, it will call MDL_lock::reschedule_waiters() to wake up the thread that can acquire the lock, and set the waiting state to GRANTED:

If the awakened waiting thread finds that the ticket is in the GRANTED state, it will continue to execute; otherwise, an error will be reported according to different situations.

3 deadlock detection
Before each thread enters the lock waiting, it will perform a deadlock detection to prevent the current thread from falling into deadlock. Before detecting deadlock, first materialize the unobtrusive locks acquired by the current thread, so that these locks will appear in the MDL_lock::m_granted list, and deadlock detection can be detected. And set the waiting lock MDL_context::m_waiting_for of the current thread as the current ticket, each thread that enters the waiting will set the waiting object, and the deadlock can be detected along this waiting chain.

An abstract class representing an edge in the waiting graph, which will be traversed by the deadlock detection algorithm. MDL_ticket is derived from MDL_wait_for_subgraph, by implementing the accept_visitor() function to let the auxiliary detection class look for the waiting ring along the edge.

Auxiliary class for detecting waiting rings in the waiting graph, including state information during the detection process, such as the starting thread m_start_node of deadlock detection; after a deadlock occurs during the search process, the victim thread m_victim selected according to the weight; the search thread Depth, if the thread's waiting chain is too long and exceeds the threshold (default 32), even if no deadlock is detected, it will be considered as a deadlock.

Detection process

The idea of ​​deadlock detection is that first, breadth first, starting from the lock that the current thread is waiting for, traverses the waiting queue and grant queue of MDL_lock to see if there is a lock that is not acquired by the current thread and is incompatible with the waiting lock. If the holding thread is the same as the starting thread of the algorithm traversal, then there is a deadlock in the lock waiting chain; secondly, depth-first, starting from the holding or waiting thread of an incompatible lock, if the thread is also in the waiting state, then recursively repeat the above process, Until the thread waiting for the starting point is found, otherwise it is judged that there is no deadlock. The code logic is as follows:

Victim weight

After a deadlock is detected, when exiting along the thread waiting chain, the victim with the smallest weight will be selected according to the weight of each thread waiting for the ticket, and let it give up waiting and release the held lock. Deadlock_detection_visitor::opt_change_victim_to function.

The weight is still relatively rough, regardless of the stage of the transaction and the content of the executed statement, it only has a preset weight according to the type of lock resource and lock type, in the MDL_ticket::get_deadlock_weight() function .

It can be seen that when a deadlock occurs, it is more inclined to let the DML statement report an error and roll back, and let the DDL statement continue to execute. When the same type of statement forms a deadlock, it is more inclined to let the thread that entered the waiting chain become the victim, and let the thread that has been waiting for a long time continue to wait.

After the current thread sets the state of the victim thread on the deadlock ring to VICTIM and wakes up, the current thread can enter the waiting state.

Five MDL Monitoring
You can clearly monitor the acquisition of the current MDL lock through MySQL performance_schema. performance_schema is a read-only variable, and the setting needs to be restarted. Add to the configuration file:

Six PolarDB optimization on MDL
In the MySQL Community Edition, the access operation (DML) and the partition maintenance operation (DDL) of the partition table data are mutually blocked. The main reason is that DDL needs to obtain the MDL_EXCLUSIVE lock on the partition table. This makes partition maintenance operations only possible during off-peak business hours, and the need to create/delete partitions on partition tables is relatively frequent, which greatly limits the use of partition tables.

In PolarDB, we have introduced partition-level MDL locks, which reduces the lock granularity acquired by DML and DDL to the partition level, improves concurrency, and realizes the "online" partition maintenance function. The data access and partition maintenance of the partition table do not affect each other, and users can perform partition maintenance more freely without affecting the business flow of the partition table, which greatly enhances the flexibility of the partition table use.

Related Articles

Explore More Special Offers

  1. Short Message Service(SMS) & Mail Service

    50,000 email package starts as low as USD 1.99, 120 short messages start at only USD 1.00

phone Contact Us