Memcached is one of the oldest in-memory caching servers. Because it doesn’t support data structures as rich as Redis’s, Memcached is used less and less. But I think Memcached is a great example of the Unix philosophy of doing one thing and doing it well, and for Memcached that one thing is memory caching. This is best demonstrated by the Meta Commands introduced in Memcached 1.6. Today, we’ll introduce Meta Commands.

The structure of a meta command is as follows.

<cm> <key> <datalen*> <flag1> <flag2> <...>\r\n

Here <cm> denotes the command, always two letters. <key> is the key to operate on. <datalen*> is the length of the data portion of the request; some meta commands transfer no additional data, so this field is optional (hence the *). Then comes the essence of meta commands: flags. Each meta command can carry multiple flags that influence Memcached’s behavior and the content of the return value.

The structure of the return value of a meta command is as follows.

<RC> <datalen*> <flag1> <flag2> <...>\r\n

Here <RC> is the return code. Like <cm>, it is two letters. The optional <datalen*> follows, and then any number of flags.

Here are a few examples of meta commands.

Setting a cache entry.

ms foo 2 T90 F1\r\n
hi\r\n

HD\r\n

Meta commands all start with m; ms denotes set and stores a value. In this example, foo is the key and 2 is the length of the data. Two flags follow: T90 sets the TTL to 90 seconds, and F1 sets the client flags (an opaque value for business data) to 1. The data itself, two bytes long, is sent on the next line, and the server replies HD on success.

Querying a cache entry.

mg foo t f v\r\n

VA 2 t78 f1\r\n
hi\r\n

mg denotes get and queries cache information. Three flags are used in this example: t asks for the cached item’s TTL, f for its client flags (the business flags), and v for the cached value itself. Yes, mg does not return the cached value by default. In many scenarios the cached content is large and we only want to see the TTL, the CAS version, or the like, so the v flag has to be added to get the data back.

Deleting a cache entry.

md foo I\r\n

HD\r\n

md denotes delete and removes a cache entry. Flags can also adjust the server’s behavior on delete: the I here tells the server to mark the entry as stale rather than actually delete it. We’ll cover the specific usage below.
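
To make these exchanges concrete, here is a minimal Python sketch that speaks the meta protocol over a raw socket. It assumes a Memcached 1.6+ server on localhost:11211; read_line and meta_command are helpers invented for this illustration, not part of any client library.

import socket

def read_line(sock):
    # Read one CRLF-terminated line from the socket.
    buf = b""
    while not buf.endswith(b"\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
    return buf[:-2]

def meta_command(sock, line, data=None):
    # Send one meta command line (plus an optional data block) and
    # return a (response tokens, value-or-None) pair.
    out = line + b"\r\n"
    if data is not None:
        out += data + b"\r\n"
    sock.sendall(out)
    header = read_line(sock).split()
    value = None
    if header and header[0] == b"VA":  # "VA <size> <flags>*": a value follows
        size = int(header[1])
        value = b""
        while len(value) < size + 2:   # the data plus its trailing \r\n
            value += sock.recv(size + 2 - len(value))
        value = value[:-2]
    return header, value

sock = socket.create_connection(("localhost", 11211))
print(meta_command(sock, b"ms foo 2 T90 F1", b"hi"))  # ([b'HD'], None)
print(meta_command(sock, b"mg foo t f v"))            # e.g. ([b'VA', b'2', b't89', b'f1'], b'hi')
print(meta_command(sock, b"md foo"))                  # ([b'HD'], None)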

With the above as a foundation, we can now describe how meta commands address the common caching problems. There are three of them, as follows.

  • Cache penetration problems
  • Hot Key problems
  • Data transfer and memory usage problems

Cache Penetration Problem

Cache penetration is a performance problem caused by a large number of requests going back to the source after a cache entry has expired or been deleted. There are two ways to deal with it: one is a back-source lock, so that only the request holding the lock goes back to the source; the other is to avoid cache invalidation altogether. Both are easy to implement with meta commands.

Let’s start with the back-source concurrency lock.

We can use the N flag when querying the cache.

mg foo c v N30
VA 0 c2 W

This queries the cache under foo and, if it does not exist, creates a new empty entry. If multiple requests execute this statement at the same time, only one of them gets a W flag in the return value, indicating that it won the back-source lock and should go to the source and update the cache; the other requests receive a Z flag instead. Since this query also returns the CAS version via the c flag, the cache should then be updated with a matching C flag, as follows.

ms foo 2 C2
hi

If another process updates the cache during the back-source trip, the entry’s CAS version changes as well, so this conditional update fails and no dirty data is written.
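
Putting the pieces together, the whole miss/lock/refill flow might look like the sketch below, reusing the meta_command helper from earlier; fetch_from_database is a hypothetical stand-in for whatever backing store you actually query.

def get_with_miss_lock(sock, key, ttl=90):
    # mg with N30: on a miss the server creates an empty item and hands the
    # W (won) flag to exactly one client; the others see Z (lost) instead.
    header, value = meta_command(sock, b"mg " + key + b" c v N30")
    flags = header[2:] if header[0] == b"VA" else header[1:]
    if b"W" in flags:
        cas = next(f[1:] for f in flags if f[:1] == b"c")
        data = fetch_from_database(key)  # hypothetical back-source fetch
        # C<cas> makes the write conditional: if someone else updated the
        # item in the meantime, this set fails instead of writing dirty data.
        meta_command(sock, b"ms %s %d T%d C%s" % (key, len(data), ttl, cas), data)
        return data
    if b"Z" in flags:
        return None  # another client holds the lock: back off and retry
    return value     # plain hit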

The above handles scenarios where the cache has already been invalidated. Invalidation happens in only two ways: the expiration time passes, or the source data changes and the cache is actively deleted. Meta commands provide a back-source control mechanism for each case.

The first is called Early Recache: acquire the concurrency lock when the cache is about to expire and refresh it before it actually expires, avoiding penetration altogether. The specific method is as follows.

mg foo v t R30
VA 2 t29 W
hi

The mg here adds the R30 flag, meaning: if the cache’s remaining TTL is below 30 seconds, try to acquire the concurrency lock. When multiple requests execute simultaneously, only one gets the W flag back, and that request can calmly go back to the source and update the cache. When updating, pass T to set a new TTL, which keeps the entry from ever expiring.
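
The same pattern with early recache might look like this sketch, again with the hypothetical meta_command and fetch_from_database from above.

def get_with_early_recache(sock, key, ttl=90):
    # R30: if the item's remaining TTL has dropped below 30 seconds, the
    # server grants W to a single client so it can refresh before expiry.
    header, value = meta_command(sock, b"mg " + key + b" v t R30")
    flags = header[2:] if header[0] == b"VA" else header[1:]
    if b"W" in flags:
        data = fetch_from_database(key)  # hypothetical back-source fetch
        meta_command(sock, b"ms %s %d T%d" % (key, len(data), ttl), data)
        return data
    return value  # everyone else keeps serving the still-valid value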

The other mechanism is the Stale Cache: when we delete a cache entry, we don’t really delete it, but mark it as stale.

md foo I T30
HD

The I flag here means mark only, do not delete. T30 keeps the stale entry around for at most 30 seconds. Any query that hits a stale entry gets an extra X flag in the response.

mg foo t c v
VA 4 t29 c777 W X
data

An X flag indicates that the current cache entry is stale, but the business code can still decide whether to keep using it. More importantly, only one of all concurrent requests receives the W flag and is asked to go back to the source and refresh the cache. Stale caching thus allows more precise back-source control.
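
On the read side, handling a stale entry is then a small decision over the returned flags. In this sketch, refresh_cache is a hypothetical hook that recomputes and rewrites the value.

def read_with_stale(sock, key):
    # Route a read based on the flags mg returns once an entry is stale.
    header, value = meta_command(sock, b"mg " + key + b" t c v")
    flags = header[2:] if header[0] == b"VA" else header[1:]
    if b"W" in flags:
        return refresh_cache(sock, key)  # hypothetical: we alone recompute
    if b"X" in flags:
        return value  # stale but usable while the winner refreshes it
    return value      # fresh hit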

The Hot Key Problem

Another prominent caching problem is the hot key problem. If one key is accessed especially heavily, it puts a lot of pressure on a single node. The usual remedy is to spread the load across instances by sharding, but that requires knowing in advance which keys are hot, so it can only be described as an after-the-fact fix. Meta commands give us a way to discover hot keys automatically.

In fact, it just takes two more flags.

mg foo v h l c
VA 4 h1 l5 c4
data

The flags here are h and l. In the response, h1 indicates that the key has been accessed at least once since it was created, and l5 indicates that the last access was 5 seconds ago. With these two flags, the business code can decide whether to set up a local cache. For example, we can do something like this.

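// Promotion check: h == 1 means the key has been hit before, l < 5 means
// it was accessed within the last 5 seconds, and random(1000) == 0 samples
// roughly one in a thousand requests so that not every client promotes
// the key to its local cache at once.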
if (h == 1 && l < 5 && random(1000) == 0) {
  add_to_local_cache(it)
}

The random(1000) sampling here can be replaced with other criteria. If a key has been accessed recently and repeatedly, it may be a hot key and can be cached locally for a short period.

Once the data is cached locally, keeping it up to date becomes the problem. Because it is a hot key, the local cache time cannot be set too small. One option is to broadcast data updates over a message queue, but that is too heavyweight. Instead, we can use mg in a local timer process that periodically queries the hot key’s CAS version.

mg foo c
HD c500

Here mg returns only the version number. If the version has changed, the local cache is refreshed. This solves hot-key discovery and updating in a relatively simple way.
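
Such a watcher might look like the following sketch, with meta_command from earlier and local_cache as a plain dict; both are assumptions of this illustration.

import threading
import time

def watch_hot_key(sock, key, local_cache, interval=1.0):
    # Poll only the CAS version; refetch the value only when it changes.
    # (A real watcher would use its own dedicated connection.)
    def loop():
        last_cas = None
        while True:
            header, _ = meta_command(sock, b"mg " + key + b" c")
            cas = next((f[1:] for f in header[1:] if f[:1] == b"c"), None)
            if cas is not None and cas != last_cas:
                _, value = meta_command(sock, b"mg " + key + b" v")
                local_cache[key] = value
                last_cas = cas
            time.sleep(interval)
    threading.Thread(target=loop, daemon=True).start()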

Data transfer and memory usage issues

There are two optimizations for reducing data transfer: passing only the necessary data, and returning multiple values in a single request.

As we said earlier, the mg command does not return the cached value or the key by default. So if you only want to see a key’s TTL, you only need to execute:

mg foo t
HD t94

Here HD means header, and the t94 that follows means the current TTL is 94 seconds. We can also query several attributes with a single command.

mg foo t c
HD t94 c8

The c here is the CAS version, which changes whenever the cached data is updated.

If you need to check multiple caches, the traditional commands have to be written like this.

get bar foooooooooooooooooooooooooooooooo baz
VALUE foooooooooooooooooooooooooooooooo 0 2
hi
END

Only foooooooooooooooooooooooooooooooo has a value here, and END marks the end of the response. mg, however, does not support querying more than one key; the same effect has to be written like this.

mg bar v\r\nmg foooooooooooooooooooooooooooooooo v\r\nmg baz v\r\n
VA 2
hi

Since mg does not return the key by default, it is not clear which key hi corresponds to. We can of course ask the server to return the corresponding key by adding the k flag.

mg bar k v\r\nmg foooooooooooooooooooooooooooooooo k v\r\nmg baz k v\r\n
VA 2 kfoooooooooooooooooooooooooooooooo
hi

But echoing that long key back alongside the two-byte hi is wasteful. To reduce data transfer, mg supports the O flag, which lets us pass an opaque token of our own.

mg bar O1 v\r\nmg foooooooooooooooooooooooooooooooo O2 v\r\nmg baz O3 v\r\n
VA 2 O1
hi

This lets the business code map each response back to its request via the O flag in the return value. And because mg does not return END the way get does, we need to send an extra mn command to explicitly mark the end of the batch; the server answers it directly with an MN response. The complete exchange is as follows.

mg bar O1 v\r\nmg foooooooooooooooooooooooooooooooo O2 v\r\nmg baz O3 v\r\nmn\r\n
VA 2 O1
hi
MN

This achieves the effect of querying multiple keys with minimal data transfer.
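
Wrapped up as a function, the pipelined lookup might look like this sketch, reusing the read_line helper from earlier.

def multi_get(sock, keys):
    # Pipeline one mg per key tagged with an opaque O<index>, then mn as
    # the terminator; match replies to keys via the echoed O token.
    out = b"".join(b"mg %s O%d v\r\n" % (k, i) for i, k in enumerate(keys))
    sock.sendall(out + b"mn\r\n")
    results = {}
    while True:
        header = read_line(sock).split()
        if header[0] == b"MN":  # the end-of-batch marker from mn
            return results
        if header[0] == b"VA":
            size = int(header[1])
            value = b""
            while len(value) < size + 2:  # data plus trailing \r\n
                value += sock.recv(size + 2 - len(value))
            idx = next(int(f[1:]) for f in header[2:] if f[:1] == b"O")
            results[keys[idx]] = value[:-2]
        # anything else (such as EN for a miss) is simply skipped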

Because a meta command can carry multiple flags, we can complete several operations in a single round trip. As an example: if we want to use the cache to control flash-sale inventory with a maximum stock of 100, we can perform the following operation.

ma foo M- N120 J99 v
VA 2
99

Here ma means arithmetic. M- says the value is decremented by one; N120 says that if the entry does not exist, a new one is created with a TTL of 120 seconds; J99 says the newly created entry starts at 99; and v asks for the result to be returned. Memcached does not apply the decrement when it auto-creates the entry, so the starting value is set to 99 rather than 100: that way the creating request also counts as one sale. But once the value reaches 0 it stops decreasing, which means a returned 0 can mean either that the value was 1 (the last unit was just sold) or that it was already 0 (nothing left). Since the two cases cannot be distinguished, the last unambiguous sale is the one that returns 1, and a returned 0 should be treated as out of stock. Without ma, we would need to combine the classic decr and add commands to achieve the same effect.
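
Built on this, the purchase path might look like the following sketch, using the meta_command helper from earlier; try_reserve returns the remaining stock, or None once the stock is (or may be) exhausted.

def try_reserve(sock, key=b"stock"):
    # M-: decrement; N120: auto-create with a 120-second TTL if missing;
    # J99: the auto-created value starts at 99 (the creator takes one of
    # the 100 units); v: return the value after the operation.
    header, value = meta_command(sock, b"ma " + key + b" M- N120 J99 v")
    if header[0] != b"VA":
        return None  # error or not found
    remaining = int(value)
    # 0 is ambiguous (last unit vs. already empty), so treat it as sold out
    return remaining if remaining > 0 else None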

The above covers the main functionality and design of Memcached Meta Commands. There are many more meta command flags, and the meta commands can completely replace the original ones. If you are interested, you can read the official protocol documentation and the official introduction to meta commands. If you think of Memcached as an in-memory database, there is no way it can compete with Redis. But if you think of Memcached purely as an in-memory cache, it is basically the best you can get. Although Memcached is used less than before, AWS ElastiCache supports Memcached 1.6, which means services on AWS can now use meta commands. So Memcached should not be so easily replaced by Redis.