Distributed ID generation algorithms are used to generate globally unique ID identifiers in distributed systems, and one well-known algorithm is the snowflake algorithm proposed by twitter, which generates a 64-bit globally unique integer each time, with a very clever underlying idea.

1
2
3
 0        1010......101     1010101010     101010101010
\_/       \___________/     \________/     \__________/
1           2               3               4
  1. The first bit is not used
  2. 41-bit millisecond timestamp
  3. 10-bit machine ID
  4. 12-bit serial number

Except for the first bit at the beginning, which is not used, the next 41 bits of the timestamp are the number of milliseconds elapsed from the specified start time to the current time. For example, if you set the system start time to 0:00 on March 15, 2022, then at 12:00:00.123 on April 1, 2022, the value of this timestamp should be 1,512,000,123, and the entire timestamp fragment supports a maximum of 69.7 years, which is clearly beyond the lifetime of most IT systems.

The 10-bit machine ID, which corresponds to a distributed cluster of up to 1024 ID generator instances, and the 12-bit sequence number incrementing continuously from 0 to 4095, can support 4096 ID generation requests per millisecond for a single instance, which means that the entire cluster of ID generator instances can theoretically support up to 4194304 ID generation per millisecond, which is very efficient. This is very efficient.

The theoretical basis for the global uniqueness of IDs generated by the snowflake algorithm is the combination of global uniqueness and single-instance uniqueness, where global uniqueness is guaranteed by unique machine IDs, and different machine IDs guarantee that IDs generated by different instances must not be the same, while single-instance uniqueness is guaranteed by the same millisecond combined with different sequence numbers, where the sequence numbers can only reach the theoretical upper limit, i.e., theoretically there will not be more than 4096 requests in a millisecond.

Time redirection problem of snowflake algorithm

The time redirection problem refers to the system time jumping back to a certain time in the past, either actively or passively, possibly due to network time calibration or manual settings, during the operation of the system.

time redirection

Since the snowflake algorithm relies heavily on the current time of the machine, it is possible that if a time rewind occurs, the generated ID may be duplicated with one of the previously generated IDs (provided that the serial number also happens to be the same when the ID is generated in the same millisecond), which is the most frequently discussed problem of the snowflake algorithm - time callback. In the original implementation of the snowflake algorithm, for this problem, the algorithm itself just returns the error, and the application decides the processing logic separately. If it is in a business system with low concurrency or low request volume, the error waiting or retry strategy is not a big problem, but if it is in a highly concurrent system, this strategy seems too brutal.

An idea to solve the time redirection problem of the snowflake algorithm - multiple clocks

There are many ideas and discussions on the Internet for solving the time redirection problem of the snowflake algorithm, I introduce here is an idea based on the extended bit, but for the sake of understanding, I named myself the multi-clock snowflake algorithm.

Algorithm ideas are also relatively simple, since the time dial back the essence of the problem is the time back to the “past”, then even if back in the past, as long as the realization of “this time is not the other time” is not to achieve the time unique? If we think along this line, an intuitive way of thinking is: Since I have found the time rewind, I will consider the original “clock” is no longer available, use a new “clock” can be, and the new current time is still the new clock’s time.

Algorithm description

Similar to the classic snowflake algorithm, the multi-clock based improved snowflake algorithm requires a small number of bits for storing the clock ID, the required number of bits can only be split from the original timestamp, machine ID or sequence number, the specific business implementation needs to be combined with the concurrency of the business, cluster size and other comprehensive considerations, here for the convenience of the discussion, assuming 2 bits each from the machine ID and sequence number, for a 4-bit clock ID.

1
2
3
 0       1010......101    0001     10101010   1010101010
\_/      \___________/    \__/     \______/   \________/
1         2                3        4           5
  1. 1st bit not used
  2. 41-bit millisecond timestamp
  3. 4-bit clock ID
  4. 8-bit machine ID
  5. 10-bit serial number

Thus, the new algorithm theoretically.

  • still supports a maximum runtime of 69+ years.
  • the distributed instance size is reduced to 256.
  • single instance supports up to 1024 requests per millisecond.
  • a single instance supports up to 16 callbacks to the same time range (the time callback problem is theoretically perfectly solved if the time callbacks occur in mutually non-overlapping time periods).

In the specific implementation logic, mainly in each found time back (i.e., the previous last generated ID timestamp is less than or equal to the current timestamp), it will be clock ID plus 1, similar to the sequence number, and so on.

1
2
3
4
5
6
timeNow := 当前系统时间
if last_time >= timeNow:  // Clock back
    clock_id = (clock_id + 1) & (1<<4-1)
last_time = timeNow
seq := 下一个序列号
return encode(timeNow, machineID, seq, clock_id)

Scenario analysis

Still the example in the picture above, if at a certain moment, the system generates ID, the time is 10:15:00, then the system experiences 5 seconds later, at another moment 10:15:05, the system generates ID again, in this time, the system has been using the clock for 1. In the next generation of ID, the system found that lastTime is 10:15:05, and the system query machine current time is 10:15:00, determine the time back, so switch the clock to 2, at this time the generated ID will correspond to 10:15:00 of clock 2, and the previous clock 1 10:15:00 in logic is already two different times, so the generated IDs are naturally different.

Extreme scenario: process restart + time redirection

The above algorithm is able to cope with the time redirection problem during operation, but if by some unfortunate coincidence, the system happens to finish time redirection during the process restart when it happens to encounter a crash, how can we ensure that the same ID will not be generated because the same clock is used? One way to continue to improve the idea is to try to get the clock ID from the local disk file before the restart when the ID generator is initialized, and add 1, meaning that each restart must not use the clock before the process exits, and in the running process, each time the clock is switched, the new clock ID should be written to disk, taking into account the performance-friendly, this operation is done asynchronously as possible.

High concurrency optimization idea: clock ID reuse

This multi-clock-based optimization algorithm, because of the need for additional bits to store the clock ID, and occupy the bits used to control the concurrency of the sequence number, if the system does have a high concurrency, the optimization can be considered here is to reuse the clock ID: each time the current clock, once the current sequence number reaches the upper limit reset, are switched to the next clock, so that, in theory, the same millisecond In this way, the concurrency can theoretically scale up to 2^14, i.e. 16384, in the same millisecond. In most cases, clock rollbacks should be a rare occurrence, and this method of multiplexing clock IDs makes fuller use of the 4 bits of the clock ID.

Some problems with the multi-clock snowflake algorithm

Of course, this algorithm is not perfect, it is based on a number of assumptions, while some of the problems it still cannot avoid need to be carefully considered before using.

  • time redirection does not occur frequently in the same time period (in the example given above, time redirection does not happen more than 16 times repeatedly on the same time period).
  • Incremental problem: when time redirection occurs, ID incrementality is broken, and for scenarios that require strict incrementality, other solutions need to be considered, such as improved algorithms based on “historical time”.
  • Evaluation of the machine ID and sequence number space: how to ensure that a globally unique machine ID is obtained is also a complex issue, in addition to the introduction of clock IDs, which will occupy additional bits, need to consider from which bit fragments to free up these bits to be reserved for clock IDs.
  • multi-clock snowflake algorithm only alleviates the clock redirection problem, but can not completely solve the time redirection problem, so still need to consider an extreme case of fault tolerance program, but since it is an extreme scenario, the use of simple programs such as error retry is sufficient.

Other ideas for solving the snowflake algorithm time redirection problem

  • Use historical time
  • Waiting for clock correction
  • Time catch-up
  • Error Retry

Summary

  • Time uniqueness in time redirection scenarios is achieved by introducing multiple clocks, which can cope with time redirection infinitely in ideal scenarios
  • Multi-clock snowflake algorithm can optimize the efficiency of bit utilization by multiplexing clock IDs
  • Multi-clock snowflake algorithm also needs to be combined with other fault-tolerant processing to cope with time redirection in extreme scenarios