Redis' RDB persistence principle

Redis provides two persistence mechanisms, RDB and AOF. This chapter first describes how the Redis server saves and loads RDB files, focusing on the implementation of the SAVE and BGSAVE commands. It then describes how the server's automatic interval saving is implemented. Next, it walks through the components of an RDB file and explains the structure and meaning of each. Finally, it analyzes an actual RDB file to put this knowledge into practice. Some pseudocode is included to aid understanding. The material is drawn from the book Redis Design and Implementation.

Basic Introduction

RDB persistence can be triggered manually or run periodically according to server configuration options. It saves the database state at a point in time to an RDB file. The generated RDB file is a compressed binary file that can be used to restore the database state as it was when the file was generated.

RDB file creation and loading

Two Redis commands can be used to generate RDB files: SAVE and BGSAVE.

The SAVE command blocks the Redis server process until the RDB file is created, and the server cannot process any command requests while the server process is blocked.
The BGSAVE command spawns a child process, which is then responsible for creating the RDB file, while the server process (the parent process) continues to process the command requests.

The actual work of creating the RDB file is done by the rdb.c/rdbSave function. SAVE and BGSAVE call this function in different ways, and the following pseudocode makes the difference clear:

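def SAVE():
    # The server process itself writes the RDB file, so it blocks
    rdbSave()


def BGSAVE():
    pid = fork()

    if pid == 0:
        # Child process: save the RDB file
        rdbSave()

    elif pid > 0:
        # Parent process: keep handling requests and wait for the
        # child's completion signal
        handle_request()

    else:
        # pid == -1: handle the fork error
        handle_fork_error()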

RDB files are loaded automatically at server startup, and the actual work is done by the rdb.c/rdbLoad function. Redis therefore has no dedicated command for loading RDB files: if the server detects an RDB file at startup, it loads it automatically.

[Figure: server log output printed when the RDB file is loaded successfully]

The server prints the output shown above when the RDB file is loaded successfully. It is also worth mentioning that, because the AOF file is usually updated more frequently than the RDB file:

If the server has AOF persistence enabled, then the server will use the AOF file to restore the database state first.
The server will use the RDB file to restore the database state only when the AOF persistence feature is off.

The following diagram shows the decision flow the server follows when it loads a file.

[Figure: decision flow when the server loads a file]
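The decision described above can be sketched in a few lines of Python. This is only an illustration of the rule stated in this article, not the Redis source; the function name choose_startup_file and its two boolean parameters are hypothetical.

```python
from typing import Optional


def choose_startup_file(aof_enabled: bool, rdb_exists: bool) -> Optional[str]:
    """Return which file a server would restore from, per the rule above.

    Hypothetical helper: the name and parameters are invented for this sketch.
    """
    if aof_enabled:
        # AOF is usually more up to date, so it takes priority
        return "aof"
    if rdb_exists:
        # AOF is off: fall back to the RDB file if one is present
        return "rdb"
    # Nothing to restore from
    return None
```

For example, choose_startup_file(True, True) picks the AOF file even though an RDB file exists, while choose_startup_file(False, True) falls back to the RDB file.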

The different states of the server when the SAVE and BGSAVE commands are executed

SAVE

As mentioned earlier, the Redis server is blocked while the SAVE command is executing, so no command request sent by a client can be processed during that time. Only after the server finishes executing the SAVE command and starts accepting command requests again are the clients' commands processed.

BGSAVE

Because the saving for BGSAVE is done by the child process, the Redis server can continue to process client command requests while the child creates the RDB file. During the execution of BGSAVE, however, the server handles the SAVE, BGSAVE, and BGREWRITEAOF commands differently than usual.

First, while BGSAVE is executing, the server rejects any SAVE command sent by a client. The server forbids SAVE and BGSAVE from executing at the same time so that the parent process (the server process) and the child process never call rdbSave simultaneously, which prevents a race condition.

Second, while BGSAVE is executing, the server also rejects any further BGSAVE command sent by a client, because two BGSAVE commands executing simultaneously would likewise create a race condition.

Finally, the BGREWRITEAOF and BGSAVE commands cannot be executed at the same time:

If the BGSAVE command is executing, then the BGREWRITEAOF command sent by the client is delayed until the BGSAVE command has finished executing.
If the BGREWRITEAOF command is executing, then the BGSAVE command sent by the client will be rejected by the server.

Since the actual work of both BGSAVE and BGREWRITEAOF is performed by child processes, the two commands do not conflict operationally; they are kept from executing simultaneously purely for performance reasons. Spawning two child processes that both perform heavy disk writes at the same time is not a good idea.
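The rules in this section can be summarized as a small decision function. The following Python sketch models only the cases described above; ServerState, its flags, and dispatch are hypothetical names, and the real server tracks this state differently.

```python
class ServerState:
    """Hypothetical flags for this sketch; not the real redisServer fields."""

    def __init__(self):
        self.bgsave_running = False
        self.rewrite_running = False
        self.rewrite_scheduled = False


def dispatch(server: ServerState, command: str) -> str:
    """Return "run", "rejected", or "delayed" for a persistence command."""
    if command == "SAVE":
        # SAVE must not overlap with a BGSAVE child
        return "rejected" if server.bgsave_running else "run"
    if command == "BGSAVE":
        # Rejected during another BGSAVE or during BGREWRITEAOF
        if server.bgsave_running or server.rewrite_running:
            return "rejected"
        server.bgsave_running = True
        return "run"
    if command == "BGREWRITEAOF":
        # Delayed (not rejected) while BGSAVE is running
        if server.bgsave_running:
            server.rewrite_scheduled = True
            return "delayed"
        # A rewrite issued during another rewrite is not covered by the
        # text above, so this sketch does not model that case
        server.rewrite_running = True
        return "run"
    # Every other command is handled as usual
    return "run"
```

Note the asymmetry: a BGREWRITEAOF that arrives during BGSAVE is postponed, while a BGSAVE that arrives during BGREWRITEAOF is refused outright.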

The server also remains blocked while it loads an RDB file, until loading is complete.

Automatic interval saving

Because BGSAVE can run without blocking the server, Redis lets you set conditions under which the server executes BGSAVE automatically. For example, a Redis configuration generally contains the following:

save 900 1
save 300 10
save 60 10000

With this configuration, the server executes BGSAVE as soon as any one of the following conditions is met:

The server has made at least 1 change to the database within 900 seconds
The server has made at least 10 changes to the database within 300 seconds
The server has made at least 10000 changes to the database within 60 seconds
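To make the shape of these conditions concrete, here is a minimal Python sketch that turns save lines into (seconds, changes) pairs. It is not how Redis parses its configuration; SaveParam and parse_save_lines are hypothetical names, and the sketch ignores every line that is not of the form "save <seconds> <changes>".

```python
from typing import List, NamedTuple


class SaveParam(NamedTuple):
    seconds: int  # length of the time window
    changes: int  # minimum number of changes within that window


def parse_save_lines(text: str) -> List[SaveParam]:
    """Collect one SaveParam per 'save <seconds> <changes>' line."""
    params = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "save":
            params.append(SaveParam(int(parts[1]), int(parts[2])))
    return params


config = """
save 900 1
save 300 10
save 60 10000
"""
saveparams = parse_save_lines(config)
```

Each pair is one independent condition: a long window with a low change threshold, down to a short window with a high threshold.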

Autosave pseudocode

struct redisServer {

    // Array of save conditions
    struct saveparam *saveparams;

    // Modification counter
    long long dirty;

    // Time of the last save
    time_t lastsave;

    // ...
};

struct saveparam {

    // Number of seconds
    time_t seconds;

    // Number of changes
    int changes;
};

Roughly, the server state looks like this.

[Figure: the saveparams array stored in the server state]

In addition to the saveparams array, the server state maintains a dirty counter, and a lastsave attribute.

The dirty counter records how many changes (including writes, deletes, updates, etc.) the server has made to the database state (all databases on the server) since the last successful execution of a SAVE command or BGSAVE command.
The lastsave property is a UNIX timestamp that records when the server last successfully executed a SAVE command or a BGSAVE command.

Example.

SET message "hello" # the dirty counter is incremented by 1

SADD database Redis MongoDB MariaDB # the dirty counter is incremented by 3

[Figure: the dirty counter and the lastsave attribute in the server state]

The above figure shows the dirty counter and the lastsave attribute contained in the server state, where:

The dirty counter has a value of 123, indicating that the server has made 123 changes to the database state since the last save.
The lastsave property records the timestamp of the last time the server performed a save operation.
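The bookkeeping described above can be imitated with a toy in-memory model. The Python sketch below is purely illustrative: ToyServer and its methods are hypothetical, and it assumes that SADD counts only members that were not already in the set.

```python
import time


class ToyServer:
    """Toy model of the dirty counter and lastsave; not Redis itself."""

    def __init__(self):
        self.strings = {}
        self.sets = {}
        self.dirty = 0                  # changes since the last save
        self.lastsave = int(time.time())  # UNIX timestamp of the last save

    def set(self, key, value):
        self.strings[key] = value
        self.dirty += 1                 # one key modified

    def sadd(self, key, *members):
        target = self.sets.setdefault(key, set())
        added = len(set(members) - target)
        target.update(members)
        self.dirty += added             # assumption: only new members count
        return added

    def save(self):
        # A successful save resets the counter and records the time
        self.dirty = 0
        self.lastsave = int(time.time())


server = ToyServer()
server.set("message", "hello")
server.sadd("database", "Redis", "MongoDB", "MariaDB")
```

After the two commands from the example, server.dirty is 4 (1 from SET plus 3 from SADD), and calling server.save() brings it back to 0.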

Check if the save condition is met

Redis's periodic server function serverCron runs every 100 milliseconds by default. One of its jobs is to check whether any save condition has been met and, if so, to execute the BGSAVE command.

The following pseudocode shows how the serverCron function checks the save conditions.

def serverCron():
    # Iterate over all save conditions
    for saveparam in server.saveparams:

        # Seconds elapsed since the last save
        save_interval = unixtime_now() - server.lastsave

        # Save if the number of changes has reached the condition's threshold
        # and more time has passed than the condition's interval
        if server.dirty >= saveparam.changes and save_interval > saveparam.seconds:
            BGSAVE()

The above code shows that the program iterates through and checks all the save conditions in the saveparams array, and as soon as any one of them is met, the server executes the BGSAVE command.
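A runnable version of the same check, written as a pure function so it can be tried with concrete numbers, might look like the following Python sketch. The name should_bgsave and its parameters are hypothetical; the comparison operators follow the pseudocode above.

```python
import time
from typing import Iterable, Optional, Tuple


def should_bgsave(saveparams: Iterable[Tuple[int, int]],
                  dirty: int,
                  lastsave: float,
                  now: Optional[float] = None) -> bool:
    """Return True if any (seconds, changes) save condition is satisfied."""
    if now is None:
        now = time.time()
    elapsed = now - lastsave
    return any(dirty >= changes and elapsed > seconds
               for seconds, changes in saveparams)


params = [(900, 1), (300, 10), (60, 10000)]
```

With 123 changes and 301 seconds since the last save, the "save 300 10" condition is satisfied; with the same 123 changes after only 200 seconds, no condition is satisfied yet.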

RDB file structure

The following shows the various parts of a complete RDB file.

[Figure: RDB file layout: REDIS | db_version | databases | EOF | check_sum]

REDIS

At the beginning of the RDB file is the REDIS section, which is 5 bytes long and holds the five characters "REDIS". With these five characters, the program can quickly check whether a file being loaded is an RDB file.

db_version

db_version is 4 bytes long and holds an integer written as a string, recording the RDB version number used by the file. In the example file analyzed later in this article, the version is 0009. Because RDB versions differ in format, the program must choose how to read the file based on this version number.

databases

The databases section contains zero or more databases, along with the key-value pairs in each database:

If the server’s database state is empty (all databases are empty), then this section is also empty and is 0 bytes long
If the server’s database state is non-empty (at least one database is non-empty), then this section is also non-empty, and its length varies depending on the number, type, and content of the key-value pairs stored in the databases.

EOF

The EOF constant is 1 byte long. It marks the end of the body of the RDB file; when the loading program encounters this value, it knows that all key-value pairs of all databases have been loaded.

CheckSum

check_sum is an 8-byte unsigned integer that holds a checksum, which is calculated by the program from the contents of REDIS, db_version, databases, and EOF. When the server loads the RDB file, it will compare the checksum calculated from the loaded data with the checksum recorded by check_sum to check whether there is any error or corruption in the RDB file.

Starting from RDB version 5, if rdbchecksum yes is set in the configuration file, a CRC64 checksum of the whole file content is calculated and stored in the 8 bytes at the end of the RDB file.
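The five parts described above can be separated with simple slicing. The following Python sketch is a hedged illustration, not the Redis loader: split_rdb is a hypothetical name, it assumes a file whose last 8 bytes are the check_sum and whose byte just before them is the EOF constant, and it neither parses the body nor verifies the CRC64.

```python
from typing import Tuple


def split_rdb(data: bytes) -> Tuple[int, bytes, int, bytes]:
    """Split an RDB image into (version, body, eof_byte, checksum_bytes)."""
    # 5 bytes magic + 4 bytes version + 1 byte EOF + 8 bytes checksum
    if len(data) < 5 + 4 + 1 + 8:
        raise ValueError("too short to be an RDB file")
    if data[:5] != b"REDIS":
        raise ValueError("missing REDIS magic, not an RDB file")
    version = int(data[5:9].decode("ascii"))  # e.g. b"0009" -> 9
    body = data[9:-9]         # everything between db_version and EOF
    eof_byte = data[-9]       # 1-byte EOF constant
    checksum = data[-8:]      # 8-byte check_sum, left undecoded here
    return version, body, eof_byte, checksum


# Synthetic sample: header, empty body, a 0xFF end marker, 8 zero bytes
sample = b"REDIS0009" + b"\xff" + bytes(8)
version, body, eof_byte, checksum = split_rdb(sample)
```

Checking the magic first mirrors the purpose of the REDIS section: a file that does not start with those five characters can be refused before any further parsing.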

An example

This is an RDB file from my latest pull of Redis, with an empty data set. Let’s analyze the output of od -c rdb.rdb in turn.

0000000   R   E   D   I   S   0   0   0   9 372  \t   r   e   d   i   s
0000020   -   v   e   r 005   6   .   0   .   9 372  \n   r   e   d   i
0000040   s   -   b   i   t   s 300   @ 372 005   c   t   i   m   e 302
0000060   M 301 264   _ 372  \b   u   s   e   d   -   m   e   m 302   @
0000100 345  \f  \0 372  \f   a   o   f   -   p   r   e   a   m   b   l
0000120   e 300  \0 377   g 311 203 274 200   T 211 376
0000134

These fields make up the file header, which is the part common to AOF and RDB files:

the first 5 bytes are fixed as REDIS
the next 4 bytes (bytes 6 to 9) are the RDB version number, here 0009
next is redis-ver and its value, i.e. the Redis version (6.0.9 here)
then redis-bits and its value, i.e. whether Redis is a 32-bit or 64-bit build; the value is 32 or 64
next is ctime and its value, the RDB file creation time
then used-mem and its value, the amount of memory in use when the file was created
and finally aof-preamble and its value, which is 0 or 1; 1 means the RDB content serves as the preamble of an AOF file

An RDB file header can also carry three more items before aof-preamble (none of them appear in the dump above):

repl-stream-db: the database selected in the server.master client
repl-id: the replication ID of the current instance
repl-offset: the replication offset of the current instance
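To check the reading of the dump, the header fields can be decoded programmatically. The Python sketch below rebuilds the 92 bytes shown by od -c and decodes each field that starts with the 0xFA byte (octal 372 in the dump). It handles only the two string encodings that occur in this dump, short length-prefixed strings and small little-endian integers, based on my understanding of the RDB format; read_string and read_aux_fields are hypothetical names.

```python
import struct

# The 92 bytes from the od -c output above, written out by hand
DUMP = (
    b"REDIS0009"
    b"\xfa\x09" b"redis-ver" b"\x05" b"6.0.9"
    b"\xfa\x0a" b"redis-bits" b"\xc0\x40"
    b"\xfa\x05" b"ctime" b"\xc2\x4d\xc1\xb4\x5f"
    b"\xfa\x08" b"used-mem" b"\xc2\x40\xe5\x0c\x00"
    b"\xfa\x0c" b"aof-preamble" b"\xc0\x00"
    b"\xff"
    b"\x67\xc9\x83\xbc\x80\x54\x89\xfe"
)


def read_string(buf: bytes, pos: int):
    """Decode one string or integer at buf[pos]; return (value, next_pos)."""
    first = buf[pos]
    kind = first >> 6
    if kind == 0:
        # Top bits 00: the low 6 bits are the string length
        length = first & 0x3F
        start = pos + 1
        return buf[start:start + length].decode("ascii"), start + length
    if kind == 3:
        # Top bits 11: integer stored in 1, 2, or 4 little-endian bytes
        fmt = {0: "<b", 1: "<h", 2: "<i"}.get(first & 0x3F)
        if fmt is None:
            raise ValueError("encoding not handled in this sketch")
        (value,) = struct.unpack_from(fmt, buf, pos + 1)
        return value, pos + 1 + struct.calcsize(fmt)
    raise ValueError("length encoding not handled in this sketch")


def read_aux_fields(buf: bytes):
    """Decode the 0xFA-prefixed header fields that follow REDIS + version."""
    pos, fields = 9, {}
    while buf[pos] == 0xFA:
        key, pos = read_string(buf, pos + 1)
        value, pos = read_string(buf, pos)
        fields[key] = value
    return fields, pos


fields, end = read_aux_fields(DUMP)
```

Decoded this way, the dump reports Redis 6.0.9 on a 64-bit build, a ctime of 1605681485, 845120 bytes of used memory, and aof-preamble 0. The byte at position end is 0xFF (octal 377), and the remaining 8 bytes are the check_sum.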

Summary

The RDB file is used to save and restore all key-value pairs of data in all databases of the Redis server.
The SAVE command performs the save operation directly by the server process, so this command blocks the server.
The BGSAVE command performs the save operation by a child process, so the command does not block the server.
All save conditions set with the save option are stored in the server state, and the server will automatically execute the BGSAVE command when any of the save conditions are satisfied.
An RDB file is a compressed binary file consisting of multiple parts.
For different types of key-value pairs, the RDB file will use different ways to save them.
