The underlying principle of ACID

Atomic underlying implementation principle
A (Atomicity), either it's all done, or it doesn't work at all. The underlying implementation is realized through the undo log log. When the transaction modifies the database, InnoDB generates the corresponding undo log. There are multiple versions of the undo log, and the operation opposite to the previous version is stored. It will record this SQL Execution-related information, if SQL execution fails and rollback occurs, innodb will do the opposite work based on the undo log content, for example, if I perform an insert operation, then when rolling back, it will perform an opposite operation, which is delete , corresponding to update, the opposite update is also executed when rolling back. This is the underlying implementation of atomicity.
Consistency Implementation Principle
Once the transaction is completed, regardless of success or failure, the data is in a consistent state, rather than partially completed and partially failed. Before and after the execution of the transaction, the integrity constraints of the database are not violated, and it is a legal data state before and after the execution of the transaction. The AID of the transaction is a characteristic of the database, that is, it depends on the specific implementation of the database. And only this C, in fact, it depends on the application layer, that is, on the developer. Consistency here refers to data jumping from one correct state to another correct state.
Example: Account A transfers 1000 to account B, the amount transferred by A must be less than or equal to its own account balance, that is, when the transaction is submitted, the account balance of A cannot be negative, and the database constraint can be used to ensure that the field value of the account amount is greater than or equal to 0 .
How InnoDB performs consistent non-locking reads (MVCC principle)
Consistent nonlocking read (consistent nonlocking read) means that the InnoDB storage engine reads the data of the row in the database at the current execution time through row multiversioning. If the read row is performing a DELETE or UPDATE operation, the read operation will not wait for the release of the row lock. Instead, the InnoDB storage engine reads a snapshot of the row. It is called a non-locking read because there is no need to wait for the release of the X lock on the row being accessed. The snapshot data refers to the data of the previous version of the row, which is implemented through the undo segment. And undo is used to roll back data in a transaction, so the snapshot data itself has no additional overhead. In addition, reading snapshot data does not require locking, because no transaction needs to modify historical data. Snapshot data is actually the historical version before the current row of data, and each row of records may have multiple versions. As shown in the figure on the right, each row record may have more than one snapshot data, and this technology is generally called row multi-version technology. The resulting concurrency control is called multi-version concurrency control.
For the transaction isolation level of READ COMMITTED, it always reads the latest version of the row, and if the row is locked, reads the latest snapshot (freshsnapshot) of the row version. For the transaction isolation level of REPEATABLE READ, the row data at the beginning of the transaction is always read.
For the transaction isolation level of READ COMMITTED, from the perspective of database theory, it violates the characteristics of I in transaction ACID, that is, isolation.
The undo log version chain means that after a row of data is modified by multiple transactions in sequence, after each transaction is modified, Mysql will retain the data undo rollback log before modification, and use two hidden fields trx_id and roll_pointer to connect these undo logs in series together to form a history version chain.
At the repeatable read isolation level, when the transaction is started, any query sql will generate a consistent view read-view of the current transaction, which will not change until the end of the transaction (if the read-committed isolation level is executed every time It will be regenerated when querying sql), this view is composed of all uncommitted transaction id arrays when executing the query (the smallest id in the array is min_id) and the largest transaction id (max_id) that has been created, any sql query results in the transaction need to be obtained from The latest data in the corresponding version chain is compared with read-view one by one to get the final snapshot result. Version chain comparison rules:
If the row's trx_id falls in the green part ( trx_idIf the trx_id of the row falls in the red part ( trx_id>max_id ), it means that this version is generated by a transaction started in the future and is invisible (if the trx_id of the row is the current own transaction is visible);
If the trx_id of the row falls in the yellow part (min_id <=trx_id<= max_id), it includes two cases
a. If the trx_id of the row is in the view array, it means that this version is generated by a transaction that has not yet been submitted and is not visible (if the trx_id of the row is the current own transaction is visible);
b. If the trx_id of the row is not in the view array, it means that this version is generated by a transaction that has already been committed, and it can be seen.
For the case of deletion, it can be considered as a special case of update. The latest data on the version chain will be copied, and then the trx_id will be modified to the trx_id of the deletion operation. At the same time, the deleted_flag in the record header (record header) of the record ) mark bit to write true to indicate that the current record has been deleted. When querying, find the corresponding record according to the above rules. If the delete_flag mark bit is true, it means that the record has been deleted, and no data will be returned.
Note: The begin/start transaction command is not the starting point of a transaction. After executing them, the first statement to modify and operate the InnoDB table will start the transaction, and then apply for the transaction id to mysql. Mysql strictly follows the transaction Start sequence to assign transaction ids.
Summary: The MVCC mechanism is implemented through the read-view mechanism and the undo version chain comparison mechanism, so that different transactions will read different versions of the same data on the version chain according to the data version chain comparison rules.
BufferPool cache mechanism
Why can't Mysql directly update the data on the disk and set up such a complex mechanism to execute SQL?
Because a request directly performs random read and write on the disk file, and then updates the data in the disk file, the performance may be quite poor.
Because the performance of disk random read and write is very poor, directly updating disk files cannot make the database resistant to high concurrency. The Mysql mechanism looks complicated, but it can ensure that each update request updates the memory BufferPool, and then writes log files sequentially, while also ensuring data consistency under various abnormal conditions. The performance of updating the memory is extremely high, and the performance of sequentially writing log files on the disk is also very high, which is much higher than that of randomly reading and writing disk files. It is through this mechanism that our MySQL database can resist several thousand read and write requests per second on a machine with a higher configuration.
The underlying implementation principle of persistence
Once a transaction completes, its results are unaffected no matter what system error occurs, and the results of the transaction are written to persistent storage. The underlying implementation principle is: Redo log mechanism to achieve, mysql data is stored on this disk, but every time you read data, you need to go through this disk io, the efficiency is very low, using innodb provides a cache buffer, this The buffer contains a mapping of some data pages on the disk. As a buffer for accessing the database, when reading a piece of data from the database, it will be obtained from this buffer first. If there is no data in the buffer, it will be obtained from this disk. After reading Then put it into this buffer buffer. When the database writes data, it will also write data into this buffer first, and periodically refresh the data in the buffer to the disk for a persistent operation. If the data in the buffer has not had time to be synchronized to the disk, and MySQL is down at this time, the data in the buffer will be lost, resulting in data loss, and the persistence cannot be guaranteed. Use redolog to solve this problem. When the data in the database needs to be added or modified, in addition to modifying the data in the buffer, this operation will also be written into the redolog. If msyql is down, you can Use redolog to restore data. Redolog is a pre-written log. It will first write all changes to the log, and then update it to the buffer. This ensures that the data will not be lost and the persistence of the data. Redolog is a record The modification operation is mainly used for submitting or restoring data!
Transaction isolation is achieved by the locks described earlier. Redo log is called redo log, which is used to ensure the durability of transactions. Redo is usually a physical log, which records the physical modification operation of the page. Redo logs are used to achieve transaction persistence, that is, D in transaction ACID. It consists of two parts: n one is the redo log buffer (redo logbuffer) in memory, which is volatile; n second is the redo log file (redologfile), which is persistent. InnoDB is a storage engine for transactions, which implements transaction persistence through the Force Log at Commit mechanism, that is, when a transaction is committed (COMMIT), all logs of the transaction must first be written to the redo log file for persistence. The COMMIT operation is completed to be considered complete. In order to ensure that each log is written to the redo log file, the InnoDB storage engine needs to call an fsync operation after each redo log buffer is written to the redo log file. Since the redo log file is opened without the O_DIRECT option, the redo log buffer is first written to the file system cache. In order to ensure that redo logs are written to disk, an fsync operation must be performed. Since the efficiency of fsync depends on the performance of the disk, the performance of the disk determines the performance of transaction submission, that is, the performance of the database.
Strategy for flushing redo logs to disk
The strategy for flushing redo logs to disk is controlled by the parameter innodb_flush_log_at_trx_commit. n The default value of this parameter is 1, which means that the fsync operation must be called once when the transaction is committed. You can also set the value of this parameter to 0 and 2.
n 0 means that the redo log operation will not be written when the transaction is committed. This operation is only completed in the master thread, and the fsync operation of the redo log file will be performed every 1 second in the master thread. n 2 indicates that the redo log is written into the redo log file when the transaction is committed, but only written into the cache of the file system, and no fsync operation is performed. Under this setting, when the MySQL database goes down and the operating system does not go down, the transaction will not be lost. When the operating system is down, the part of the transaction that has not been flushed from the file system cache to the redo log file will be lost after the database is restarted. Example: Insert 500,000 pieces of data one by one. innodb_flush_log_at_trx_commit = 1: 2 minutes and 13 seconds. 500,000 redo log writes; 500,000 fsync operations. innodb_flush_log_at_trx_commit = 0: 23 seconds. About 23 writes such as redo log; fsync operation about 23 times. innodb_flush_log_at_trx_commit = 2: 35 seconds. 500,000 redo log writes (cached only); 0 fsync operations. Although users can improve the performance of transaction commit by setting the parameter innodb_flush_log_at_trx_commit to 0 or 2, it should be kept in mind that this setting method loses the ACID characteristics of the transaction. For the above stored procedure, in order to improve the commit performance of the transaction, a COMMIT operation should be performed after inserting 500,000 rows into the table, instead of performing a COMMIT operation after each record is inserted. The advantage of this is that it can also make the transaction method roll back to the initial definite state of the transaction when it is rolled back. The correct method: innodb_flush_log_at_trx_commit = 1, distribute and submit 500,000 pieces of data in one transaction or multiple transactions, and reduce the number of fsyncs.
Principle of Isolation Implementation
Multiple transactions process the same data concurrently, so each transaction should be isolated from other transactions to prevent data corruption. The underlying implementation principle: write-write operation: through locking, the principle is the same as the locking mechanism in java. Write-read operation: MVCC multi-version concurrency control. By default, the two operations of reading and writing a row of data do not guarantee isolation through locking and mutual exclusion, avoiding frequent locking and mutual exclusion. After a row of data is modified by multiple transactions in sequence, after each transaction is modified, Mysql will keep the data undo rollback log before modification, and use two hidden fields trx_id and roll_pointer to concatenate these undo logs to form a historical record version chain. At the repeatable read isolation level, when the transaction is started, executing any SQL query will generate the consistent view read-view of the current transaction, that is, the first select generates a version, and the read-view view will not change until the end of the transaction . If it is the read submitted isolation level, the view read-view will be regenerated every time the query sql is executed, that is, a version is generated every time select. When executing the query, start from the latest data in the corresponding version chain to compare with read-view one by one, and compare the current transaction id with the created minimum transaction id and the created maximum transaction id in the readview view array. There are three cases here. In the first case, the id of the current transaction is smaller than the smallest id in the array, indicating that this version is generated by a committed transaction, indicating that the data is visible. Second, the current transaction is larger than the largest transaction that has been created The service id is even larger, indicating that this version has not opened the transaction, indicating that it is not visible. The third type, if it happens to be in this range, and the transaction id being accessed is between the minimum transaction id and the maximum transaction id, there are two cases. The first type is that this version is generated by a transaction that has not yet been committed and is not visible , the second type, indicating that this version is generated by a transaction that has already been committed, visible. Do the comparison to get the final snapshot result. This mechanism ensures isolation.
isolation level
The four isolation levels defined by the SQL standard are:
❑READ UNCOMMITTED (resulting in dirty reads)
❑READCOMMITTED (leading to phantom reading)
❑REPEATABLE READ (used by default to avoid phantom reads and dirty reads)
❑SERIALIZABLE (higher level of isolation, avoid phantom reads, avoid dirty reads)
The default isolation level supported by the InnoDB storage engine is REPEATABLE READ, but unlike standard SQL, the InnoDB storage engine uses the Next-Key Lock algorithm under the REPEATABLE READ transaction isolation level, thus avoiding phantom reads. This is different from other database systems such as Microsoft SQLServer databases. Therefore, the InnoDB storage engine can fully guarantee the isolation requirements of transactions under the default transaction isolation level of REPEATABLE READ, that is, the SERIALIZABLE isolation level of the SQL standard. The lower the isolation level, the fewer locks a transaction requests or holds for a shorter period of time. This is why the default transaction isolation level of most database systems is READCOMMITTED.
At the transaction isolation level of SERIALIABLE, the InnoDB storage engine will automatically add LOCK IN SHARE MODE to each SELECT statement, that is, add a shared lock for each read operation. Therefore, under this transaction isolation level, reads occupy locks, and consistent non-locking reads are no longer supported.
Dirty read/Non-repeatable read/Phantom read
If the isolation of transactions is not considered, several problems will occur:
The first problem is dirty reads, where data in another uncommitted transaction is read during one transaction. For example, the company paid the salary, and the leader sent 40,000 yuan to my account, but the transaction was not submitted, and I happened to check the account, and found that the salary had already arrived, and it was 40,000 yuan. I was very happy. But unfortunately, the leader found out that the amount of salary paid to me was wrong, which was 35,000 yuan, so he quickly revised the amount and submitted the matter. In the end, my actual salary was only 35,000 yuan, so I was happy for nothing.
The second problem is non-repeatable reading. A certain data is queried multiple times within the scope of a transaction but returns different results. In plain English, transaction T1 reads the data, and transaction T2 immediately modifies the data and submits the transaction to the database. , transaction T1 reads this data again and gets a different result, and a non-repeatable read occurs. For example, I used my salary card to make purchases, and the system read that there was indeed 100 yuan in the card. At this time, my girlfriend just used my salary card to transfer money online, and transferred the 100 yuan from my salary card. I switched to another account and submitted the transaction before me. When I deducted the money, the system checked that my salary card had no money, and the deduction failed. Liao Zhiwei was very puzzled because there was money in the card.
The third problem is phantom reading. Transaction T1 modifies the data of a table from "1" to "2". At this time, transaction T2 inserts another piece of data into this table, and the value of this data is still " 1" and submit it to the database, and the user who operates transaction T1 checks the data just modified and finds that there is still one row that has not been modified. For example, when I use my salary card to consume, once the system starts to read the information of the salary card and the transaction starts at this time, it is impossible for my girlfriend to modify the record, that is, my girlfriend cannot be in this Time to transfer. This avoids non-repeatable reads. Suppose my girlfriend works in the banking department, and she often checks my salary card consumption records through the bank's internal system. One day, she was inquiring that the total consumption amount of my credit card for the month (select sum(amount) from transaction where month = this month) was 80 yuan, and I happened to pay at the cash register after eating and drinking outside at this time. 1,000 yuan, that is, a new consumption record of 1,000 yuan (insert transaction ... ) was added, and the transaction was submitted, and then my girlfriend printed the details of my monthly salary card consumption on A4 paper, but found that the total consumption was 1,080 yuan , my girlfriend was very surprised, thinking that there was a hallucination, and the phantom reading just happened.

The underlying principle of ACID

Related Articles

A detailed explanation of Hadoop core architecture HDFS

What Does IOT Mean

6 Optional Technologies for Data Storage

What Is Blockchain Technology

Explore More Special Offers

Short Message Service(SMS) & Mail Service

Sales Support

Technical Support

Connect & Report Abuse