What are the flaws in TCP that we are discussing today? There are four main ones.

  • TCP is difficult to upgrade.
  • Establishing a TCP connection adds latency.
  • TCP suffers from head-of-line blocking.
  • Network migration requires re-establishing the TCP connection.

Each of these four areas is discussed in detail below.

Upgrading TCP is hard work

TCP dates back to 1973 and is still gaining new features today.

However, TCP is implemented in the kernel, so applications can only use it, not modify it.

The difficulty is not that upgrading the kernel is cumbersome in itself. A kernel upgrade also updates the underlying software and libraries, so service programs must be retested for compatibility with the new kernel version. As a result, kernel upgrades on servers are conservative and slow.

Many new TCP features take effect only when both the client and the server support them. For example, TCP Fast Open was proposed in 2013, but many older systems still do not support it, because desktop operating systems are often upgraded very late.

Therefore, even when TCP gains a better feature, it is hard to roll out quickly, and users often wait several years, or even a decade, to benefit from it.

TCP connection establishment delay

Application protocols built on TCP, such as HTTP/1.0, HTTP/1.1, HTTP/2, and HTTPS, must complete the TCP three-way handshake before any data can be transferred.

Nowadays most websites use HTTPS, which means that after the TCP three-way handshake, the four-step TLS handshake must also complete before HTTP data can be sent. This adds further latency.
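To put rough numbers on this, the sketch below counts round trips until the first HTTP response arrives. It assumes the TCP handshake costs 1 RTT and a full TLS 1.2 handshake costs 2 RTT. The function name and the 50 ms RTT are illustrative and not taken from the article.

```python
def time_to_first_response_ms(rtt_ms, tcp_rtts=1, tls_rtts=2, request_rtts=1):
    """Return the delay until the first HTTP response, in milliseconds.

    Assumed costs: 1 RTT for the TCP three-way handshake, 2 RTT for a full
    TLS 1.2 handshake, and 1 RTT for the HTTP request/response itself.
    """
    return rtt_ms * (tcp_rtts + tls_rtts + request_rtts)


# Plain HTTP: TCP handshake + request/response.
plain_http = time_to_first_response_ms(50, tls_rtts=0)
# HTTPS: TCP handshake + TLS handshake + request/response.
https = time_to_first_response_ms(50)
print(plain_http, https)
```

With a 50 ms round trip, the TLS handshake alone doubles the wait in this model.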

Figure: TCP three-way handshake and TLS handshake latency

The delay of the TCP three-way handshake can be reduced with TCP Fast Open, which cuts connection-establishment latency from the second connection onward.

Figure: Regular HTTP requests and Fast Open HTTP requests

The process is as follows.

  • On the first connection, the server generates an encrypted cookie during the second step of the handshake and sends it to the client along with the SYN-ACK packet. The client caches this cookie. The first HTTP GET request therefore still takes 2 RTT.

  • On later connections, the client sends the cookie to the server in the SYN packet together with the request. The server validates the cookie and recovers the TCP-related information it needs from it, so it can process the request without waiting for the three-way handshake to finish. The HTTP GET request then takes only 1 RTT.
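The two steps above can be modeled with a toy simulation. This is only a sketch of the cookie logic, not real socket code: the `TfoServer` class, the HMAC-based cookie, and the `http_get_rtts` helper are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import os


class TfoServer:
    """Toy server that issues and validates Fast Open cookies."""

    def __init__(self):
        self._secret = os.urandom(16)  # hypothetical per-server key

    def make_cookie(self, client_ip):
        # Bind the cookie to the client's IP so it cannot be reused elsewhere.
        mac = hmac.new(self._secret, client_ip.encode(), hashlib.sha256)
        return mac.digest()[:8]

    def accepts_data_in_syn(self, client_ip, cookie):
        # A valid cookie lets the server pass SYN data to the application
        # before the handshake completes.
        if cookie is None:
            return False
        return hmac.compare_digest(cookie, self.make_cookie(client_ip))


def http_get_rtts(server, client_ip, cookie_cache):
    """Return the RTTs until the response arrives; may update cookie_cache."""
    cookie = cookie_cache.get("cookie")
    if server.accepts_data_in_syn(client_ip, cookie):
        return 1  # the request rides in the SYN
    # No valid cookie: full handshake, and the SYN-ACK carries a new cookie.
    cookie_cache["cookie"] = server.make_cookie(client_ip)
    return 2


server = TfoServer()
cache = {}
first = http_get_rtts(server, "192.0.2.10", cache)
second = http_get_rtts(server, "192.0.2.10", cache)
print(first, second)
```

The first call pays for the full handshake and stores the cookie; the second call finds a valid cookie and skips the wait.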

TCP Fast Open is a good feature, but it only works when both the server and the client operating systems support it. Because it was proposed in 2013, many older operating systems still in use do not support it, and upgrading an operating system is troublesome. As a result, TCP Fast Open has been slow to spread.

Another point is that for HTTPS, the TLS handshake is implemented at the application layer while the TCP handshake is implemented in the kernel, so the two handshakes cannot be merged.

Also because TCP is implemented in the kernel, TLS cannot encrypt the TCP header. TCP sequence numbers are therefore transmitted in clear text, which is a security problem.

A typical example is an attacker forging an RST segment to force a TCP connection to close. The attack succeeds if the sequence number in the forged segment falls within the receiver’s sliding window, so that the receiver treats the segment as legitimate.

For this reason, TCP performs the three-way handshake to synchronize the two sides’ sequence numbers, and initial sequence numbers are randomized to make them harder for an attacker to guess. They are not completely random: they increase linearly over time and wrap around at 2^32.

However, this only stops an attacker from blindly guessing a valid RST segment. It does not stop an attacker in the middle of the path who intercepts the client’s packets and then forges a valid RST segment.
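The window condition can be written as a short check. This is a simplified sketch of the classic in-window test; the function and parameter names are illustrative, and real TCP stacks may apply stricter validation than shown here.

```python
SEQ_SPACE = 2 ** 32


def rst_in_window(seq, rcv_nxt, rcv_wnd):
    """Return True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2^32."""
    return (seq - rcv_nxt) % SEQ_SPACE < rcv_wnd


# The receiver expects byte 1000 next and advertises a 500-byte window.
print(rst_in_window(1200, rcv_nxt=1000, rcv_wnd=500))  # inside the window
print(rst_in_window(1600, rcv_nxt=1000, rcv_wnd=500))  # outside the window
# The modular subtraction also handles the wrap-around at 2^32.
print(rst_in_window(10, rcv_nxt=SEQ_SPACE - 100, rcv_wnd=500))
```

The larger the advertised window, the more sequence numbers a forged RST can hit, which is why hiding the sequence number matters.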
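One way to get a sequence number that grows with time yet is hard to guess is to add a clock to a keyed hash of the connection’s addresses, as in the scheme described in RFC 6528. The sketch below follows that idea; the SHA-256 hash, the `SECRET` value, and the function name are assumptions made for illustration, not what any particular kernel does.

```python
import hashlib
import os
import time

SEQ_SPACE = 2 ** 32
SECRET = os.urandom(16)  # hypothetical per-host secret


def initial_sequence_number(src_ip, src_port, dst_ip, dst_port, now_us=None):
    """Return clock + keyed hash of the four-tuple, wrapped at 2^32."""
    if now_us is None:
        now_us = time.time_ns() // 1000  # current time in microseconds
    clock = now_us // 4  # one tick every 4 microseconds
    four_tuple = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(SECRET + four_tuple).digest()
    offset = int.from_bytes(digest[:4], "big")  # per-connection offset
    return (clock + offset) % SEQ_SPACE


isn = initial_sequence_number("192.0.2.10", 50000, "198.51.100.1", 443)
print(isn)
```

For a given connection the value advances linearly with the clock, while the secret offset keeps an outside observer from predicting it for other connections.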

Figure: RST attack

If TCP sequence numbers could be encrypted, the three-way handshake might not be needed at all. Both the client and the server could start their sequence numbers at 0, and there would be nothing to synchronize. Implementing this, however, would mean modifying the whole protocol stack, which is too costly, and even then many older network devices might not be compatible.

TCP has a head-of-line blocking problem

TCP is a byte-stream protocol, and the TCP layer must ensure that the bytes it delivers are complete and in order. If a TCP segment with a lower sequence number is lost in transit, the application layer cannot read the data from the kernel even if segments with higher sequence numbers have already arrived, as shown in the following figure.

Figure: TCP head-of-line blocking

In the figure, the sender sends a series of packets, each with its own number, which you can think of as the TCP sequence number. Packet #3 is lost in the network. Even though packets #4-6 have reached the receiver, the TCP data in the kernel is not contiguous, so the receiver’s application layer cannot read it. Only after packet #3 is retransmitted can the application layer read the data from the kernel.

This is TCP’s head-of-line blocking problem. TCP is not to blame for it, though, because this is the only way it can keep the data in order.

HTTP/2 multiplexes many requests over one TCP connection, so when TCP loses a packet, the whole connection waits for the retransmission and every request on it is blocked. HTTP/2’s head-of-line blocking problem is therefore caused by TCP.
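The scenario in the figure can be reproduced with a toy receive buffer. This is a minimal sketch that numbers whole packets instead of bytes; `readable_packets` is a hypothetical helper, not a kernel API.

```python
def readable_packets(received, next_expected):
    """Return the in-order packets the application can read right now."""
    readable = []
    while next_expected in received:
        readable.append(next_expected)
        next_expected += 1
    return readable


# Packets #1-2 and #4-6 have arrived; packet #3 was lost.
received = {1, 2, 4, 5, 6}
before = readable_packets(received, next_expected=1)
print(before)  # [1, 2]

# Once packet #3 is retransmitted, everything behind it is released.
received.add(3)
after = readable_packets(received, next_expected=1)
print(after)  # [1, 2, 3, 4, 5, 6]
```

Packets #4-6 sit in the buffer the whole time; they only become readable when the gap at #3 is filled.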
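To see why one lost packet stalls every request, the sketch below interleaves frames from three HTTP/2 streams onto a single ordered TCP stream. It assumes one frame per TCP segment, and the stream IDs, payloads, and function name are made up for illustration.

```python
def split_at_loss(frames, lost_index):
    """Split frames into (delivered, waiting) when one segment is lost.

    frames is the ordered list of (stream_id, payload) tuples written to the
    single TCP connection. Everything from the lost segment onward must wait,
    regardless of which stream it belongs to.
    """
    return frames[:lost_index], frames[lost_index:]


frames = [(1, "a1"), (3, "b1"), (5, "c1"), (1, "a2"), (3, "b2"), (5, "c2")]

# The segment carrying stream 3's first frame is lost.
delivered, waiting = split_at_loss(frames, lost_index=1)
blocked_streams = sorted({stream_id for stream_id, _ in waiting})
print(delivered)        # [(1, 'a1')]
print(blocked_streams)  # [1, 3, 5]
```

Streams 1 and 5 lost nothing themselves, yet their later frames are stuck behind stream 3’s missing segment.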


Network migration requires re-establishing TCP connections

HTTP runs over TCP, and a TCP connection is identified by a four-tuple (source IP, source port, destination IP, destination port).

Figure: TCP four-tuple

So when a mobile device switches from 4G to Wi-Fi, its IP address changes, and the TCP connection has to be torn down and re-established.

Re-establishing the connection incurs the latency of the TCP three-way handshake and the four-step TLS handshake, plus the ramp-up of TCP slow start. To the user, the network seems to stall for a moment, so the cost of migrating a connection is high.
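The effect of an address change can be sketched with a connection table keyed by the four-tuple. The table, helper names, and the documentation-range IP addresses below are hypothetical and only illustrate the lookup.

```python
connections = {}


def open_connection(src_ip, src_port, dst_ip, dst_port):
    """Register an established connection under its four-tuple."""
    connections[(src_ip, src_port, dst_ip, dst_port)] = {"state": "ESTABLISHED"}


def lookup(src_ip, src_port, dst_ip, dst_port):
    """Return the connection for this four-tuple, or None if there is none."""
    return connections.get((src_ip, src_port, dst_ip, dst_port))


# Connection opened while the phone is on 4G.
open_connection("198.51.100.7", 50000, "192.0.2.1", 443)
on_4g = lookup("198.51.100.7", 50000, "192.0.2.1", 443)

# After switching to Wi-Fi the source IP differs, so the lookup misses.
on_wifi = lookup("203.0.113.9", 50000, "192.0.2.1", 443)
print(on_4g, on_wifi)
```

Because the key includes the source IP, the peer has no way to match packets from the new address to the old connection.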
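A rough model shows how slow start adds to the reconnection cost. The sketch assumes the congestion window starts at 10 segments and doubles every RTT, and it ignores packet loss, the slow-start threshold, and the receiver window; the function name and the 100-segment transfer are illustrative.

```python
def slow_start_rounds(total_segments, initial_cwnd=10):
    """Count the RTTs needed to send total_segments if cwnd doubles each RTT."""
    cwnd, sent, rounds = initial_cwnd, 0, 0
    while sent < total_segments:
        sent += cwnd   # send a full window this round
        cwnd *= 2      # slow start doubles the window
        rounds += 1
    return rounds


# A fresh connection needs several rounds to move 100 segments.
fresh = slow_start_rounds(100)
# A warmed-up connection whose window already covers 100 segments needs one.
warm = slow_start_rounds(100, initial_cwnd=100)
print(fresh, warm)
```

In this model the fresh connection spends four round trips on data that the old, warmed-up connection could have sent in one, on top of the handshake round trips.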