The TCP/IP protocol suite establishes a conceptual model for communication protocols on the Internet, and its two main protocols are TCP and IP. TCP guarantees the reliability and ordering of data segments; with a reliable transport layer protocol in place, application layer protocols can use TCP to transfer data directly without worrying about the loss or duplication of data segments.

The IP protocol solves the routing and transmission of packets, so the upper-layer TCP protocol no longer needs to concern itself with routing and addressing. The TCP protocol, in turn, solves the reliability and ordering problems of transmission, so the layers above it do not need to care whether the data reaches the target process: as long as the data is written into the TCP protocol's buffer, the protocol stack can almost always guarantee its delivery.

When an application layer protocol transmits data over TCP, the TCP protocol may split the data sent by the application layer into multiple segments and send them sequentially, and a segment received by the receiver may contain several 'application layer packets'. When the application layer encounters such sticky packets while reading data from the TCP buffer, it needs to split the received data itself.
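
To see this merging in practice, here is a minimal, self-contained sketch (my own illustration, not from the original article): the child process performs two small send() calls over a loopback TCP connection, and the parent's single recv() may return both payloads at once. Whether merging actually occurs depends on timing and on the TCP stack.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
	int srv = socket(AF_INET, SOCK_STREAM, 0);
	struct sockaddr_in addr = {0};
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */
	bind(srv, (struct sockaddr *)&addr, sizeof(addr));
	socklen_t len = sizeof(addr);
	getsockname(srv, (struct sockaddr *)&addr, &len);
	listen(srv, 1);

	if (fork() == 0) { /* child: the sender */
		int cli = socket(AF_INET, SOCK_STREAM, 0);
		connect(cli, (struct sockaddr *)&addr, sizeof(addr));
		send(cli, "hello", 5, 0); /* two logical messages ... */
		send(cli, "world", 5, 0); /* ... written separately   */
		close(cli);
		return 0;
	}

	int conn = accept(srv, NULL, NULL);
	sleep(1); /* give both writes time to arrive */
	char buf[64];
	ssize_t n = recv(conn, buf, sizeof(buf), 0);
	printf("read %zd bytes: %.*s\n", n, (int)n, buf); /* often "helloworld" */
	return 0;
}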

Sticky packets are not caused by the TCP protocol itself, but by application layer protocol designers who misunderstand it: they ignore how the TCP protocol is defined and lack experience in designing application layer protocols. In this article, we will look at both the TCP protocol and application layer protocols to analyze how the sticky packet problem we so often refer to actually occurs:

  • The TCP protocol is a byte-stream oriented protocol, which may combine or split the data of application layer protocols.
  • The application layer protocol does not define message boundaries, making it impossible for the receiver of the data to reassemble the messages.

Many people may think that the sticky packet problem is low-level or not even worth discussing, but in the author's opinion the question is still interesting. Not everyone has systematically studied the design of TCP-based application layer protocols, not everyone has a deep understanding of the TCP protocol, and many people learn programming from the bottom up. The author therefore believes this is a question worth answering, and that we should pass on correct knowledge rather than negative and condescending sentiment.

Byte-stream oriented

The TCP protocol is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. Data handed to the TCP protocol by the application layer is not transmitted to the destination host as discrete messages; in some cases it is combined into a single data segment and sent to the destination host.

The Nagle algorithm is an algorithm that improves TCP transmission performance by reducing the number of packets sent. Because network bandwidth is limited, it does not send small blocks of data to the destination host immediately, but waits for more data to accumulate in the local buffer. This strategy of sending data in batches reduces the chance of network congestion and cuts per-packet overhead, although it hurts real-time behavior and increases latency.

In the early days of the Internet, Telnet was a widely used application. However, Telnet generated a large number of packets carrying only a 1-byte payload, while each packet incurred an additional 40 bytes of header overhead, so bandwidth utilization was only ~2.44% (1 payload byte out of 41 bytes on the wire). The Nagle algorithm was designed for this scenario.

When an application layer protocol transmits data over TCP, the data to be sent is actually written into the TCP buffer first. If the Nagle algorithm is enabled, the TCP protocol may not send the written data immediately; instead it waits until the data in the buffer exceeds the maximum segment size (MSS) or until the previous data segment has been ACKed before sending the buffered data.

Network congestion was a real problem decades ago, but today's network bandwidth is not as scarce as it used to be. Although the Linux kernel still enables the Nagle algorithm by default, many latency-sensitive applications now disable it per socket with the following option.

TCP_NODELAY = 1
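
As a concrete sketch (my own example, not from the article), an application can disable the Nagle algorithm on a connected socket with the standard setsockopt call; TCP_NODELAY and IPPROTO_TCP come from <netinet/tcp.h> and <netinet/in.h>.

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable the Nagle algorithm on an already-created TCP socket so that
 * small writes are sent immediately instead of being buffered.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int disable_nagle(int fd) {
	int one = 1;
	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}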

The Linux kernel uses the tcp_nagle_test function, shown below, to decide whether the current TCP data segment should be sent, and interested readers can use this code as an entry point to learn how the Nagle algorithm is implemented today.

static inline bool tcp_nagle_test(const struct tcp_sock *tp, const struct sk_buff *skb,
				  unsigned int cur_mss, int nonagle)
{
	/* An explicit push overrides the Nagle algorithm. */
	if (nonagle & TCP_NAGLE_PUSH)
		return true;

	/* Urgent data and the final FIN are never delayed. */
	if (tcp_urg_mode(tp) || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
		return true;

	/* Otherwise, apply the Nagle check to segments smaller than the MSS. */
	if (!tcp_nagle_check(skb->len < cur_mss, tp, nonagle))
		return true;

	return false;
}

The Nagle algorithm does improve network bandwidth utilization and reduce the overhead of TCP and IP headers when packets are small, but using it may also cause data written by the application layer protocol in multiple writes to be merged or split before being sent. When the receiver reads from the TCP stack and finds unrelated data in the same segment, an application layer protocol without message boundaries has no way to split and reassemble it.

In addition to the Nagle algorithm, the TCP stack offers another option for delaying the sending of data: TCP_CORK. If we turn this option on, then when the data to be sent is smaller than the MSS, the TCP protocol delays sending it until 200ms have passed or the data in the buffer exceeds the MSS.
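
The typical usage pattern is to cork the socket, perform several small writes, and then uncork it to flush the buffer. The sketch below is my own illustration and assumes an already-connected TCP socket fd; note that TCP_CORK is Linux-specific.

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Write a header and a body as one corked unit: while TCP_CORK is set,
 * partial segments are held back; clearing it flushes the buffer. */
void send_corked(int fd, const char *header, const char *body) {
	int on = 1, off = 0;
	setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
	send(fd, header, strlen(header), 0);
	send(fd, body, strlen(body), 0);
	setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off)); /* flush */
}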

Both the Nagle algorithm and TCP_CORK improve bandwidth utilization by delaying the sending of data, and both may split or merge the data written by the application layer protocol. The most important reason these mechanisms and configurations are possible at all is that the TCP protocol is a byte-stream based protocol: it has no concept of packets itself and does not send data packet by packet.

Message Boundaries

If we have systematically studied the TCP protocol and the design of TCP-based application layer protocols, we will have no problem designing an application layer protocol whose packets can be arbitrarily split and reassembled by the TCP stack. Since the TCP protocol is byte-stream based, this really means that the application layer protocol has to draw its own message boundaries.

If we define the boundaries of messages in the application layer protocol, then no matter how the TCP protocol splits and reassembles the application layer protocol's packets, the receiver can recover the corresponding messages according to the rules of the protocol. The two most common solutions in application layer protocols are length-based and delimiter-based (terminator) framing.

There are two ways to implement length-based framing. One is to use a fixed length, where all application layer messages have a uniform size; the other is to use a variable length, where a field indicating the payload length is added to the header of the application layer protocol so that the receiver can separate different messages from the byte stream. The message boundary of the HTTP protocol is implemented based on length, as shown below.

 HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 138
...
Connection: close

<html>
  <head>
    <title>An Example Page</title>
  </head>
  <body>
    <p>Hello World, this is a very simple HTML document.</p>
  </body>
</html>   

In the above HTTP message, the Content-Length header indicates the payload size of the HTTP message. Once the application layer protocol has parsed enough bytes, the complete HTTP message can be separated from the byte stream, and no matter how the sender fragmented the corresponding packets, we can follow this rule to reassemble the HTTP message.
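
The same idea applies to custom binary protocols. Below is a minimal sketch of variable-length framing (my own illustration, not part of HTTP): every message is prefixed with a 4-byte big-endian length field, and the receiver first reads the prefix and then exactly that many payload bytes; recv_all and recv_message are hypothetical helpers.

#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Read exactly n bytes, looping because recv() may return short reads. */
static int recv_all(int fd, void *buf, size_t n) {
	char *p = buf;
	while (n > 0) {
		ssize_t r = recv(fd, p, n, 0);
		if (r <= 0) return -1; /* error or peer closed */
		p += r;
		n -= (size_t)r;
	}
	return 0;
}

/* Receive one length-prefixed message; the caller frees the buffer. */
char *recv_message(int fd, uint32_t *out_len) {
	uint32_t len_be;
	if (recv_all(fd, &len_be, sizeof(len_be)) < 0) return NULL;
	*out_len = ntohl(len_be); /* 4-byte big-endian length prefix */
	char *msg = malloc(*out_len);
	if (msg && recv_all(fd, msg, *out_len) < 0) { free(msg); return NULL; }
	return msg;
}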

When HTTP uses the chunked transfer mechanism, the HTTP headers no longer contain Content-Length; instead, each chunk of the body is prefixed with its size, and a chunk with a size of 0 acts as the terminator that marks the message boundary.
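
An illustrative chunked response might look like the following (a hand-written example, not captured from a real server): each chunk is preceded by its size in hexadecimal, and the zero-sized chunk terminates the message.

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

5
Hello
6
 World
0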

Delimiter-based framing is also possible. For example, when sending JSON data over the TCP protocol, the receiver can determine whether a message has ended based on whether the received bytes can be parsed as legitimate JSON.
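
The delimiter can also be explicit. The sketch below (my own illustration) frames messages with a trailing newline, as newline-delimited JSON does: the receiver scans the bytes it has accumulated and emits a message every time it encounters the delimiter; split_messages is a hypothetical helper.

#include <stdio.h>

/* Scan a receive buffer and print every complete '\n'-terminated
 * message; returns the number of bytes consumed so the caller can
 * keep the trailing partial message for the next read. */
size_t split_messages(const char *buf, size_t len) {
	size_t start = 0;
	for (size_t i = 0; i < len; i++) {
		if (buf[i] == '\n') { /* delimiter found: one complete message */
			printf("message: %.*s\n", (int)(i - start), buf + start);
			start = i + 1;
		}
	}
	return start; /* bytes consumed; [start, len) is an incomplete tail */
}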

Summary

The TCP protocol's sticky packet problem is caused by incorrect design on the part of application layer protocol developers, who ignore the core characteristic of TCP data transmission: it is based on byte streams and has no built-in concept of messages or packets. All data is transmitted as a stream, so the application layer protocol has to design its own message boundaries, i.e. message framing. Let's review the core causes of the sticky packet problem:

  1. The TCP protocol is a transport layer protocol based on byte streams, in which the concepts of messages and packets do not exist.
  2. The application layer protocol does not use length-based or delimiter-based message boundaries, causing multiple messages to stick together.

The process of learning network protocols is very interesting, and constantly thinking about the issues behind them can give us a deeper understanding of their definitions. To close, let's look at some more open and related questions that interested readers can think about carefully.

  • How should an application layer protocol based on the UDP protocol be designed? Are there any sticky packet problems?
  • What application-layer protocols use length-based framing? And which ones use terminator-based framing?