MySQL Transaction Isolation

Date: Oct 26, 2022

Related Tags:1. ApsaraDB RDS for MySQL
2. Data Lake Analytics

Abstract: When it comes to MySQL affairs, I believe that students who know MySQL can talk a few words. Whether it is job interviews or daily development, MySQL affairs are closely related to us.

The ACID of a transaction (i.e. Atomicity, Consistency, Isolation, Durability) can be said to cover all the knowledge points of a transaction. Therefore, we must not only know what ACID is, but also understand the implementation behind ACID. Only in this way, whether it is in daily development or job interviews, it can be beneficial.

For everyone's better reading experience, the in-depth analysis of ACID will be divided into two parts.

This article is the first article, mainly around the I in ACID, that is, "isolation", from the basic concept, to the realization of isolation, and finally a practical case for integration.

Well, after reading all the content, you can make sense, then talking to the interviewer about isolation for half an hour is no problem.

Basic Concepts of Transaction Isolation

1.1 What is Isolation in ACID, isolation
Isolation, isolation, also known as concurrency control (concurrency control). The isolation of transactions requires that the objects read and written by each transaction are isolated from other transactions, that is, before the transaction is committed, the modified content of this transaction is invisible to other transactions. The isolation of transactions is mainly to solve the mutual read and write influence between different things.

The so-called reading and writing impact attention is divided into three types:

Dirty read: I read changes that other transactions have not committed (commit), others have not committed, I read them.

Non-repeatable read: Changes committed by other transactions are read by the current transaction. Then, the results of multiple selects of this transaction are different, and the content submitted by other transactions is read.

Phantom reading: I also read the content submitted by other transactions, but the difference from the above is that I read records that did not originally exist.

Note that non-repeatable reading is mainly to read the content of other transaction updates. The phantom read is to read the content of other transaction inserts.

1.2 Isolation level of isolation
In order to solve the problem of transaction isolation, the database generally has different isolation levels to solve the corresponding read and write impact.

Read uncommitted: A transaction B has not yet committed, and its changes have been read by another transaction A.

Read committed: After a transaction B commits, its changes are seen by other transactions A.

Repeatable read: Before and after a transaction B is committed, transaction A cannot read the changes of transaction B.

Serialization: For the same row of records, when there is a read-write conflict between different things, it is solved by serialization, and the latter transaction must wait for the previous transaction to complete before execution.

Different isolation levels can solve different isolation problems.

Note that this is the definition of the standard transaction isolation level. In MySQL's innodb engine, at the repeatable read level, the problem of phantom reading is solved through mvcc, and we will talk about the specific implementation later.

At the same time, it should be noted that so far, the reads we have been talking about are all "snapshot reads", which are ordinary selections. Later we will also refer to "current reading", which is different.

Implementation of transaction isolation

To achieve transaction isolation, you need to understand two aspects, one is locks, and the other is multi-version concurrency control (MVCC).

2.1 row locks for transactions
In InnoDB, two standard row-level locks are implemented:

A shared lock (S Lock), also called a read lock, allows a transaction to read a row of data.

Exclusive lock (X Lock), also known as write lock, allows a transaction to delete or update a row of data (note that insertion is not mentioned here, insertion involves phantom reading, you can see the instructions at the end of the article)

Ordinary select statements will not have any locks, so how to obtain shared locks and exclusive locks?

The Select … lock in share mode statement can acquire a shared lock

Select ... for update (special select, it is often used for simple distributed lock implementation with mysql), Update, and delete statements can obtain exclusive locks

When a transaction A has obtained the shared lock of row r, then another transaction B can immediately obtain the shared lock of row r, because the value of r will not be changed, which is called lock compatibility.

If transaction C wants to obtain the exclusive lock of row r at this time, it must wait for transaction A and transaction B to release the shared lock of row r before obtaining the exclusive lock, which is called lock incompatibility.

Ordinary select will not lock the row, but select…lock in share mode will have a shared lock, and select…for update will have an exclusive lock.

For the normal select read mode, it is called "snapshot read", also called "consistent non-locking read".

For select read with lock, or update tb set a = a+1 (reading the current value of a), it is called "current read", also called "consistent lock read".

If you cannot select during update and insert, then the concurrent access performance of the service is too poor. Therefore, our daily queries are all "snapshot reads", which will not be locked. They will only be locked when update"current read". In order to solve the concurrent access problem of "snapshot read", MVCC is introduced.

2.2 Multi-version concurrency control MVCC
If the above row lock is a pessimistic lock, then MVCC is an implementation of optimistic locking, and it is a very common implementation of optimistic locking.

The so-called multi-version means that a row of records stores multiple versions in the database, and each version uses the transaction ID as the version number. Each transaction in InnoDB has a unique transaction ID, which is applied to the InnoDB transaction system at the beginning of the transaction, and is strictly incremented according to the order of application. If a row of records is updated by multiple transactions, then multiple versions of the records will be generated.

Take a row of data as an example:

After two transaction operations, the value has changed from 22 to 19, and at the same time, three transaction ids, 15, 25, and 30, are retained.

On the basis of multiple versions of each record, the "consistency view" needs to be used to judge the visibility of versions.

Here, we have to distinguish two "view" concepts in MySQL:

One is view, which is implemented through the syntax create view ... and mainly creates a virtual table for executing query statements.

One is the consistent read view that InnoDB uses to implement mvcc, a purely logical concept with no physical structure that defines which versions of data you can see during a transaction.

The "views" mentioned in our full text are all of the second type, mainly to support InnoDB's concurrent access problems at the "read committed" and "repeatable read" levels.

At the "read not mentioned" level, there is no consistency view

At the "read committed" level, a consistent view is created at the start of each SQL execution

At the "repeatable read" level, a consistent view is created at the beginning of each transaction

At the "serialization" level, concurrency problems are avoided directly by locking

Below, we briefly introduce the logic of creating a consistent view.

Take the "repeatable read" level as an example.

When a transaction is opened, it will apply to the system for a new transaction id

At this point, there may be multiple ongoing other transactions that have not been committed, so at an instant, there are multiple active uncommitted transaction ids

Form these uncommitted transaction IDs into an array, the smallest transaction ID in the array is recorded as the low water level, and the maximum value of the transaction IDs created by the current system + 1 is recorded as the high water level

This array array and the high water mark form a "consistency view".

With the consistency view, we can judge the multi-version visibility of a row of data. Whether it is "read committed" or "repeatable read", the visibility judgment rules are the same, the difference lies in creating a snapshot (consistency). view) time.

In the current transaction, there are five cases in which the record of another row is read, and the visibility of the version number in it is judged in five cases (it is very important to follow it by yourself):

If the version number is less than the "low water mark", it means that the transaction has been committed, and it must be visible;

If the version number is greater than the "high water mark", it means that the transaction id version of this row of data was generated after the snapshot, and it is definitely not visible;

If the version number is in the transaction array array, it means that the transaction has not been committed, so it is not visible;

If the version number is not in the transaction array array and is lower than the high water level, it means that the transaction has been committed, so it is visible;

Of course, any change in your own transaction id is visible at any time

Take a look at the following example for easier understanding.

Transaction IDs created by the system: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Transaction A starts and takes a snapshot

Uncommitted transaction ids at this time are: 7, 8, 9

Consistency view: array array[7,8,9] + high water mark 16 (15+1)

For the visibility judgment of any row of data:

less than 7, visible

If it is greater than 16, it means that it is generated after the snapshot and is not visible.

10-15, not in the array array, indicating that it has been submitted, visible

7, 8, 9 are in the array, the description is not submitted, not visible

Two important conclusions:

InnoDB takes advantage of the feature that "all data has multiple versions" to achieve the ability to "create snapshots in seconds".

The implementation of MVCC is to create a "consistency view" based on the transaction id of the current transaction, and use the consistency view to determine the visibility of the data version.

Related Articles

Explore More Special Offers

  1. Short Message Service(SMS) & Mail Service

    50,000 email package starts as low as USD 1.99, 120 short messages start at only USD 1.00

phone Contact Us