What exactly is MVCC?

Definition: Multi-Version Concurrency Control , literally, the use of multiple versions for reasonable control during concurrency (that’s how I took it literally anyway), obviously this thing is an abstract concept, and it is. It is mainly found in some data management software. Maintaining multiple versions of a data makes the read and write operations conflict-free.

  • Why is this thing there?

We all know that the data management program provides the function of querying and modifying data, but how to solve the conflict problem during reading and writing it, in order to maintain data consistency and maintain high performance, to even when there is a read-write conflict, but also to do without locking, non-blocking concurrent reading, MVCC This concurrency control algorithm has emerged.

MVCC Fundamentals

MVCC is allowing multiple versions of an object to exist simultaneously. That is, it has a “current” version and one or more previous versions. You can use different versions of it as needed to solve the problem you are facing when you get the version. During this run, the “author” can create and publish a new version of the object, which will become the latest version of the object, and the “reader” can still use the previous version as well.

Which versions are available to “readers”? This is related to your real needs, for example, the isolation level in Mysql, different levels of the same concurrent operation after the results see inconsistent, we can according to the needs of the reasonable display data object version, in different isolation levels to achieve different results.

Since we know the abstraction of MVCC, let’s take a look at its implementation. MVCC has been embraced by many data management programs, which undoubtedly proves that it is a good design idea, such as the Innodb engine in Mysql, Etcd storage, PostgreSQL, oracle, etc. Let’s take the Innodb engine as an example and observe how it is implemented MVCC.

Innodb Engine MVCC Exploration

In the Innodb engine, we need to understand some basic concepts first: undolog, version chaining and ReadView.

undolog

We know that transactions are atomic in nature. But when the system fails, or when you roll back manually, how to ensure that all the committed data is restored this time, sometimes it is possible that the transaction is only half executed, we want to ensure that it is the same as the original, the transaction does not seem to do anything, to give an example in life: such as chess, when you play the wrong piece you can apply for a repentance, repentance is a kind of rollback operation, in fact, is the previously executed operation again You insert a record, the log of the rollback operation corresponds to the deletion of the record; you update a record, the rollback operation corresponds to the update of the record to the old value; you delete a record, the corresponding rollback operation is the insertion.

In the real InnoDB, undo logs are not as simple as we described above, and the format of undo logs is different for different types of operations, but let’s put aside these specific details that are easy to make people’s brains confused for a while and pay attention to the focus of our article.

Version chaining

Each time a record is updated, the old value is put into an undo log, forming an old version of the record. As the number of updates increases, all versions are linked into a chain by the roll_pointer property of the data line, which we call the version chain. The head node of the version chain is the latest value of the current record. In addition, each version also contains the transaction ID that generated the version, and the approximate logic is shown below.

mvcc Version chaining

ReadView

In the RC and RR isolation levels (READ UNCOMMITTED, SERIALIZABLE is not used in Read View), it must be guaranteed that the transaction has been read to the already committed modified records, that is, assuming that another record has been modified but not yet committed, is not directly read the latest version of the record, the core problem is. The core problem is that we need to determine which version of the version chain is visible to the current transaction. The concept of Read View is designed to solve this problem, and it contains four elements.

  1. m_ids generates a list of reads and writes transaction IDs that are active on the system at the time of `ReadView.
  2. min_trx_id generates the smallest transaction ID of the active transactions on the system at the time of ReadView, i.e. the smallest value in m_ids.
  3. max_trx_id should be assigned to the next transaction ID in the system at the time ReadView is generated.
  4. creator_trx_id generates the transaction id for ReadView only when changes are made to rows in the table (when executing INSERT, DELETE, UPDATE statements), otherwise the transaction id value defaults to 0 in a read-only transaction.

With this ReadView, when accessing a record, you only need to follow the following steps to determine if a version of the record is visible.

  • If the value of the trx_id attribute of the accessed version is the same as the value of creator_trx_id in the ReadView, it means that the current transaction is accessing its own modified record, so that version can be accessed by the current transaction.
  • If the value of the trx_id attribute of the accessed version is less than the value of min_trx_id in ReadView, it means that the transaction that generated the version was committed before the current transaction generated ReadView, so the version is accessible to the current transaction.
  • If the value of the trx_id attribute of the accessed version is greater than or equal to the value of max_trx_id in ReadView, the transaction that generated the version was opened after the current transaction generated ReadView, so the version is not accessible by the current transaction.
  • If the value of the trx_id property of the accessed version is between min_trx_id and max_trx_id of the ReadView, then you need to determine if the value of the trx_id property is in the m_ids list, and if it is, the transaction that generated the version when the ReadView was created is still active, and the version If it is not, then the transaction that generated the version when the ReadView was created has been committed and the version can be accessed.

If a version of the data is not visible to the current transaction, then follow the version chain to the next version of the data and continue to follow the above steps to determine visibility. And so on, until the last version in the version chain. If the last version is not visible, it means that the record is not visible to the transaction at all, and the query result does not contain the record.

The timing of the ReadView generation also directly affects the result of the query operation. In Mysql, the biggest difference between RC and RR isolation levels is the timing of the ReadView generation.

READ COMMITTED – Generate a ReadView before each data read

Let’s say there are two transactions with ids 100 and 200 executing in the system.

1
2
3
4
5
6
# Transaction 100
BEGIN;

UPDATE hero SET name = 'Guan Yu' WHERE number = 1;

UPDATE hero SET name = 'Zhang Fei' WHERE number = 1;
1
2
3
4
5
# Transaction 200
BEGIN;

# 更新了一些别的表的记录
...

Tip: Once again, the transaction execution process will only be assigned a separate transaction id, which is incremental, when the record is actually modified for the first time (e.g. using INSERT, DELETE, UPDATE statements). That’s why we update some other table’s records in Transaction 200, with the purpose of having it assigned a transaction id.

At this moment, the row with number of 1 in the table hero gets the version chain as shown below.

version chain

Suppose now that a transaction using the READ COMMITTED isolation level starts executing.

1
2
3
4
5
# 使用READ COMMITTED隔离级别的事务
BEGIN;

# SELECT1Transaction 100200未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值为'Liu Bei'

The execution of SELECT1 is as follows.

  • The SELECT statement is executed as a ReadView, and the contents of the m_ids list of the ReadView is [100, 200], min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0.
  • Then pick the visible records from the version chain. From the figure, we can see that the column name of the latest version is ‘Zhang Fei’, and the trx_id value of this version is 100, which is inside the m_ids list, so it does not meet the visibility requirement, and jump to the next version according to roll_pointer.
  • The column name of the next version is ‘Guan Yu’, the value of trx_id of this version is also 100, which is also in the m_ids list, so it also does not meet the requirement, and continues to jump to the next version.
  • The next version of column name is ‘Liu Bei’, and the trx_id value of this version is 80, which is smaller than the min_trx_id value of 100 in ReadView, so this version is compliant, and the last version returned to the user is this record with column name as ‘Liu Bei’.

After that, let’s commit the transaction with transaction id 100, like this.

1
2
3
4
5
6
7
8
# Transaction 100
BEGIN;

UPDATE hero SET name = 'Guan Yu' WHERE number = 1;

UPDATE hero SET name = 'Zhang Fei' WHERE number = 1;

COMMIT;

Then go to the transaction with transaction id 200 and update the record with number 1 in the hero table.

1
2
3
4
5
6
7
8
9
# Transaction 200
BEGIN;

# 更新了一些别的表的记录
...

UPDATE hero SET name = 'Zhao Yun' WHERE number = 1;

UPDATE hero SET name = 'Zhuge Liang' WHERE number = 1;

At this moment, the version chain of the record with number 1 in table hero looks like this.

version chain

Then go to the transaction that just used the READ COMMITTED isolation level and continue to look for the record with number 1 as follows.

1
2
3
4
5
6
7
8
# 使用READ COMMITTED隔离级别的事务
BEGIN;

# SELECT1Transaction 100200均未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值为'Liu Bei'

# SELECT2Transaction 100提交,Transaction 200未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值为'Zhang Fei'

The execution of this SELECT2 is as follows.

  • A separate ReadView is generated when the SELECT statement is executed, and the contents of the m_ids list of that ReadView is [200] (the transaction with transaction id 100 has already been committed, so the snapshot is generated again without it), min_trx_id is 200, max_trx_id is 201, and creator_trx_id is 0. idis 201 andcreator_trx_id` is 0.
  • Then pick the visible records from the version chain. As you can see from the figure, the column name of the latest version is ‘Zhuge Liang’, and the trx_id value of this version is 200, which is inside the m_ids list, so it does not meet the visibility requirement, and jump to the next version according to roll_pointer.
  • The column name of the next version is ‘Zhao Yun’, the value of trx_id of this version is 200, which is also in the m_ids list, so it also does not meet the requirement, and continues to jump to the next version.
  • The next version of column name is ‘Zhang Fei’, and the trx_id value of this version is 100, which is smaller than the min_trx_id value of 200 in ReadView, so this version is compliant, and the last version returned to the user is the record with this column name as ‘Zhang Fei’.

And so on, if after the transaction id 200 records are also submitted, again in the use of READ COMMITTED isolation level transactions in the query table hero in the number value of 1 record, the results are ‘Zhuge Liang’, the specific process we will not analyze. To summarize, a transaction using the READ COMMITTED isolation level generates a separate ReadView at the beginning of each query.

REPEATABLE READ – Generates a ReadView on the first read, ensuring consistent results across multiple queries in a single transaction

For transactions using the REPEATABLE READ isolation level, only one ReadView will be generated on the first execution of the query statement, and it will not be repeated for subsequent queries. Let’s use the example to see how this works.

Let’s say there are now two transactions with transaction ids 100 and 200 in the system executing.

1
2
3
4
5
6
# Transaction 100
BEGIN;

UPDATE hero SET name = 'Guan Yu' WHERE number = 1;

UPDATE hero SET name = 'Zhang Fei' WHERE number = 1;
1
2
3
4
5
# Transaction 200
BEGIN;

# 更新了一些别的表的记录
...

At this moment, the version chain obtained for the record with number 1 in table hero is shown below.

version chain

Suppose now that a transaction using the REPEATABLE READ isolation level starts executing.

1
2
3
4
5
# 使用REPEATABLE READ隔离级别的事务
BEGIN;

# SELECT1Transaction 100200未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值为'Liu Bei'

The execution of SELECT1 is as follows.

  • The SELECT statement is executed as a ReadView, and the contents of the m_ids list of the ReadView is [100, 200], min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0.
  • Then pick the visible records from the version chain. From the figure, we can see that the column name of the latest version is ‘Zhang Fei’, and the trx_id value of this version is 100, which is inside the m_ids list, so it does not meet the visibility requirement, and jump to the next version according to roll_pointer.
  • The column name of the next version is ‘Guan Yu’, the value of trx_id of this version is also 100, which is also in the m_ids list, so it also does not meet the requirement, and continues to jump to the next version.
  • The next version of column name is ‘Liu Bei’, and the trx_id value of this version is 80, which is smaller than the min_trx_id value of 100 in ReadView, so this version is compliant, and the last version returned to the user is this record with column name as ‘Liu Bei’.

After that, we commit the transaction with transaction id 100, like this.

1
2
3
4
5
6
7
8
# Transaction 100
BEGIN;

UPDATE hero SET name = 'Guan Yu' WHERE number = 1;

UPDATE hero SET name = 'Zhang Fei' WHERE number = 1;

COMMIT;

Then go to the transaction with transaction id 200 and update the record with number 1 in the hero table.

1
2
3
4
5
6
7
8
9
# Transaction 200
BEGIN;

# 更新了一些别的表的记录
...

UPDATE hero SET name = 'Zhao Yun' WHERE number = 1;

UPDATE hero SET name = 'Zhuge Liang' WHERE number = 1;

At this moment, the version chain of the record with number 1 in table hero looks like this.

version chain

Then go to the transaction that just used the REPEATABLE READ isolation level and continue to look for the record with number 1, as follows.

1
2
3
4
5
6
7
8
# 使用REPEATABLE READ隔离级别的事务
BEGIN;

# SELECT1Transaction 100200均未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值为'Liu Bei'

# SELECT2Transaction 100提交,Transaction 200未提交
SELECT * FROM hero WHERE number = 1; # 得到的列name的值仍为'Liu Bei'

The execution of SELECT2 is as follows.

  • Because the isolation level of the current transaction is REPEATABLE READ, and the ReadView was already generated during the execution of SELECT1, the previous ReadView is directly reused at this time, and the contents of the m_ids list of the previous ReadView are [100, 200], min_trx_id is 100, max_trx_id is 201, and creator_trx_id is 0. idis 100,max_trx_idis 201, andcreator_trx_id` is 0.
  • Then pick the visible records from the version chain. From the figure, we can see that the column name of the latest version is ‘Zhuge Liang’, and the trx_id value of this version is 200, which is inside the m_ids list, so it does not meet the visibility requirement, and jump to the next version according to roll_pointer.
  • The column name of the next version is ‘Zhao Yun’, the value of trx_id of this version is 200, which is also in the m_ids list, so it also does not meet the requirement, and continues to jump to the next version.
  • The next version of column name is ‘Zhang Fei’, the trx_id value of this version is 100, and the m_ids list contains transaction ids with the value of 100, so this version also does not meet the requirement, and similarly the next version of column name is ‘Guan Yu’. Proceed to the next version.
  • The next version of column name is ‘Liu Bei’, and the trx_id value of this version is 80, which is less than the min_trx_id value of 100 in ReadView, so this version meets the requirement, and the last version returned to the user is the record with column c as ‘Liu Bei’.

That is to say, the results obtained from two SELECT queries are duplicated, and the column c values of the records are all ‘Liu Bei’, which is the meaning of repeatable reads. If we then submit the record with transaction id 200, and then continue to find the record with number of 1 in the transaction using REPEATABLE READ isolation level, the result will still be `Liu Bei’, and you can analyze the specific execution process by yourself.

To summarize

After reading the above undolog, version chain and ReadView, we have some general ideas, ReadView is mainly to make visibility rules and rule verification for the record history, while the record history is stored in undolog using a chain table structure, the combination of the three solves the read and write problem in MVCC. The combination of the three solves the read/write problem in MVCC. The so-called MVCC in mysql refers to the process of accessing the version chain while executing ordinary SELECT operations using RC and RR transactions with two isolation levels, which enables read/write and write-write operations of different transactions to be executed concurrently, thus improving system performance, and their biggest difference lies in the different timing of generating ReadView.