Today, let’s talk about an interesting question: if you unplug the network cable for a few seconds and then plug it back in, does the original TCP connection still exist?
Some people may say that unplugging the cable disconnects the physical layer, so the transport layer above it must be disconnected too, and the original TCP connection no longer exists. It would be just like a landline phone call: if one party’s phone line is unplugged, the call is cut off completely.
Is this really the case?
This logic has a flaw: it wrongly assumes that unplugging the cable affects the transport layer, when in fact it does not.
In fact, a TCP connection is represented in the Linux kernel by a structure called struct socket, which holds information such as the state of the TCP connection. When the cable is unplugged, the OS does not change anything in that structure, so the state of the TCP connection does not change either.
I ran a small experiment on my computer: I connected to my cloud server over ssh, then simulated the disconnection by turning off the Wi-Fi. The status of the TCP connection did not change; it was still in the ESTABLISHED state.
From the results of this experiment, we know that the action of unplugging the cable does not affect the state of the TCP connection.
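On Linux, one way to check the state in an experiment like this is to read /proc/net/tcp, where each connection appears as a line of hexadecimal fields. The sketch below decodes one such line. The sample line is made up for illustration, and the address decoding assumes a little-endian machine such as x86.

```python
import socket
import struct

# Subset of the state codes that appear in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def decode_endpoint(hex_endpoint):
    """Turn a field such as '0100007F:1F90' into ('127.0.0.1', 8080)."""
    hex_addr, hex_port = hex_endpoint.split(":")
    # Assumption: little-endian host, so the IPv4 bytes appear reversed.
    addr = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
    return addr, int(hex_port, 16)

def parse_proc_net_tcp_line(line):
    """Return (local endpoint, remote endpoint, state name) for one line."""
    fields = line.split()
    local = decode_endpoint(fields[1])
    remote = decode_endpoint(fields[2])
    state = TCP_STATES.get(fields[3], fields[3])
    return local, remote, state

# Hypothetical sample line, not taken from the experiment above.
SAMPLE = "0: 0100007F:1F90 0100007F:C350 01 00000000:00000000 00:00000000 00000000 1000 0 12345"
print(parse_proc_net_tcp_line(SAMPLE))
```

A connection whose state field is 01 is still ESTABLISHED, which is what the experiment observed after the link went down.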
The next step is to see what the two parties do after the cable is unplugged.
To answer this, we need to consider two scenarios:
- After unplugging the network cable, there is data transmission.
- After unplugging the network cable, there is no data transmission.
Data transmission after unplugging the network cable
After the client unplugs the network cable, the data messages sent by the server to the client get no response. After waiting for a certain length of time, the server triggers the timeout retransmission mechanism and retransmits the unacknowledged data message.
Suppose the client plugs the network cable back in while the server is still retransmitting. Unplugging the cable did not change the client’s TCP connection state, which is still ESTABLISHED, so the client receives the server’s data message normally and replies with an ACK.
At this point, the TCP connection between the client and the server still exists, and it feels like nothing has happened.
However, if the client does not plug the cable back in while the server is retransmitting, and the number of retransmissions reaches a certain threshold, the server’s kernel decides that the TCP connection has failed. It reports the error to the application through the socket interface, and the TCP connection on the server side is torn down.
After the client plugs the cable back in, if it sends data to the server, the server’s kernel replies with an RST message, because the server no longer has a TCP connection with the same four-tuple as the client’s. The client releases its TCP connection when it receives the RST.
At this point, both the client and server TCP connections are disconnected.
How many times are the TCP data messages retransmitted?
Linux provides a kernel parameter called tcp_retries2, whose default value is 15.
This parameter controls the maximum number of timeout retransmissions once a TCP connection has been established.
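On a Linux machine the current value can be read from /proc/sys/net/ipv4/tcp_retries2. The helper below is a minimal sketch; the function name is made up for illustration, and it returns None when the file is missing (for example, on a non-Linux system).

```python
from pathlib import Path

def read_ipv4_sysctl(name, base="/proc/sys/net/ipv4"):
    """Return the integer value of a net.ipv4 parameter, or None if unavailable."""
    path = Path(base) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing, unreadable, or not a single integer.
        return None

print("tcp_retries2 =", read_ipv4_sysctl("tcp_retries2"))
```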
However, setting tcp_retries2 to 15 does not mean that the TCP connection is terminated after exactly 15 retransmissions; the kernel also takes a “maximum timeout” into account.
The timeout doubles with each round: for example, if the first timeout retransmission is triggered after 2s, the second is triggered after 4s, the third after 8s, and so on.
The kernel will calculate a maximum timeout based on the value set by tcp_retries2.
If a message is retransmitted and still gets no response from the other party, retransmission stops and the TCP connection is dropped as soon as either condition is met, the “maximum number of retransmissions” or the “maximum timeout”, whichever comes first.
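To get a feel for the size of this maximum timeout, the sketch below adds up the doubling timeouts. It assumes an initial retransmission timeout of 200 ms and a cap of 120 seconds per timeout, which are the usual Linux defaults rather than values from this article (the 2s, 4s, 8s sequence above is only an illustration). Under those assumptions the result for 15 matches the 924.6 seconds quoted in the Linux kernel documentation for tcp_retries2. The function name is made up.

```python
def hypothetical_timeout_ms(retries, rto_base_ms=200, rto_max_ms=120_000):
    """Total time covered by `retries` retransmissions with doubling timeouts.

    Assumptions: the first timeout is rto_base_ms, each later timeout
    doubles, and no single timeout exceeds rto_max_ms.
    """
    # Number of doublings that still fit under the cap (integer log2).
    thresh = (rto_max_ms // rto_base_ms).bit_length() - 1
    if retries <= thresh:
        # Pure doubling: base * (1 + 2 + 4 + ... + 2**retries).
        return ((2 << retries) - 1) * rto_base_ms
    # Doubling up to the cap, then a fixed rto_max_ms per extra round.
    return ((2 << thresh) - 1) * rto_base_ms + (retries - thresh) * rto_max_ms

print(hypothetical_timeout_ms(15) / 1000)  # prints 924.6
```

So with the default of 15, an unresponsive peer is given roughly 15 minutes before the connection is dropped, not 15 short retries.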
No data transfer after disconnecting the network cable
For the scenario where there is no data transfer after the cable is unplugged, the outcome depends on whether the TCP keepalive mechanism is enabled.
If TCP keepalive is not enabled, then after the client’s cable is unplugged and with no data sent by either side, the TCP connection between the client and the server will persist indefinitely.
If TCP keepalive is enabled, then after the client’s cable is unplugged, even if neither side transmits data, TCP sends probe messages once the connection has been idle for a period of time:
- If the peer is working properly, it responds to the keepalive probe normally, so the keepalive timer is reset and TCP waits for the next keepalive period to elapse.
- If the peer host has crashed, or the peer is unreachable for some other reason, the keepalive probes get no response. Once the configured number of consecutive probes has gone unanswered, TCP reports that the connection is dead.
Therefore, when there is no data exchange between the two parties, the TCP keepalive mechanism can use probe messages to determine whether the peer’s TCP connection is still alive.
What exactly does the TCP keepalive mechanism look like?
The mechanism works like this.
A time period is defined. If there is no connection-related activity during that period, the TCP keepalive mechanism kicks in and sends a probe message at a fixed interval; each probe contains very little data. If several consecutive probes get no response, the current TCP connection is considered dead and the kernel reports the error to the application.
The Linux kernel has parameters for the keepalive time, the number of keepalive probes, and the interval between probes. Their default values are as follows.
- tcp_keepalive_time=7200: the keepalive time is 7200 seconds (2 hours); if there is no connection-related activity for 2 hours, the keepalive mechanism is activated.
- tcp_keepalive_intvl=75: the interval between probes is 75 seconds.
- tcp_keepalive_probes=9: if 9 consecutive probes get no response, the other side is considered unreachable and the connection is dropped.
In other words, on a Linux system it takes at least 7200 + 75 × 9 = 7875 seconds, that is, 2 hours, 11 minutes and 15 seconds, to detect a “dead” connection.
Note that an application that wants to use the TCP keepalive mechanism must set the SO_KEEPALIVE option through the socket interface for it to take effect. If the option is not set, the TCP keepalive mechanism is not used.
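As a minimal sketch of what setting the option looks like, the function below enables SO_KEEPALIVE on a socket. On Linux it also overrides the three timing values for that one socket through the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options. The function names and the values 60, 10 and 5 are illustrative assumptions, not recommendations from this article.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on TCP keepalive for one socket and, where supported, tune it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket options exist on Linux; skip any the platform lacks.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

def detection_seconds(idle, interval, probes):
    """Idle time plus all unanswered probes, as in the calculation above."""
    return idle + interval * probes

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
print(detection_seconds(60, 10, 5))  # prints 110
sock.close()
```

With these illustrative values a dead peer would be noticed after about 110 seconds of silence instead of more than 2 hours.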
The TCP keepalive mechanism takes too long to detect, right?
Yes, it is a bit long.
TCP keepalive is implemented at the TCP layer (in the kernel) and serves as a catch-all fallback for every protocol built on top of TCP.
In fact, our application layer can implement its own detection mechanism that can detect whether the other side is alive or not in a shorter period of time.
For example, web server software generally provides a keepalive_timeout parameter to specify the timeout for HTTP persistent connections. If that timeout is set to 60 seconds, the web server starts a timer; if the client does not send a new request within 60 seconds of its previous HTTP request, the timer fires a callback that releases the connection.
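The idea behind such an application-level timeout can be sketched in a few lines. This is a simplified illustration and not how any particular web server implements it; the class name and the injectable clock are assumptions made to keep the example testable.

```python
import time

class IdleTimeout:
    """Track the last activity on a connection and report when it has idled too long."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Call whenever a new request arrives on the connection."""
        self._last_activity = self._clock()

    def expired(self):
        """True once no request has arrived for timeout_seconds."""
        return self._clock() - self._last_activity >= self.timeout_seconds

timer = IdleTimeout(60)
print(timer.expired())  # prints False
```

A server loop would call touch() on every request and close the connection once expired() returns True.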
When the client unplugs the network cable, it does not directly affect the TCP connection status. Therefore, whether the TCP connection still exists after the cable is unplugged depends on whether there is data transfer after the cable is unplugged.
With data transfer:
- If the server sends a data message after the client unplugs the cable, and the client plugs the cable back in before the server’s retransmissions reach the maximum, the original TCP connection between the two sides carries on as if nothing had happened.
- If the server sends a data message after the client unplugs the cable, and the client does not plug the cable back in before the server’s retransmissions reach the maximum, the server drops its TCP connection. When the client later plugs the cable back in and sends data, the server replies with an RST message because it no longer has a TCP connection with the same four-tuple as the client’s, and the client drops its TCP connection on receiving the RST. At that point, the TCP connection is gone on both sides.
Without data transfer:
- If neither side has enabled the TCP keepalive mechanism, then after the client unplugs the network cable, the TCP connection on both the client and the server persists indefinitely, even if the client never plugs the cable back in.
- If both sides have enabled the TCP keepalive mechanism and the client does not plug the cable back in, the keepalive probes detect that the peer is no longer reachable and the TCP connection is dropped. If the client plugs the cable back in while the probes are still being sent, the original TCP connection between the two parties carries on normally.
Besides the client unplugging the network cable, there are two more scenarios to consider: the client host goes down, and the client process is killed.
In the first scenario, the server cannot perceive that the client host is down, just as it cannot perceive an unplugged cable. So if there is no data transfer and the TCP keepalive mechanism is not enabled, the TCP connection on the server remains in the ESTABLISHED state until the server process is restarted.
So there is a lesson here. When the TCP keepalive mechanism is not used and neither side is transferring data, the fact that one side’s TCP connection is in the ESTABLISHED state does not mean that the other side’s TCP connection is still healthy.
In the second scenario, after the client’s process is killed, the client’s kernel sends a FIN message to the server and carries out the four-way close handshake on the client’s behalf.
So, even if TCP keepalive is not enabled and there is no data exchange between the two sides, if the process on one side crashes, its operating system notices and sends a FIN message to the other side, then completes the four-way close handshake with it.
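This behavior can be observed on a single machine over the loopback interface. In the sketch below, closing the client socket stands in for the kernel cleaning up after a killed process: the kernel sends a FIN, and the server side sees it as an empty read. The function name is made up for illustration.

```python
import socket

def peer_sees_eof_after_close():
    """Return True if the server side reads end-of-stream after the client closes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        client = socket.create_connection(listener.getsockname())
        server_side, _ = listener.accept()
        try:
            server_side.settimeout(5)
            # Closing the socket makes the kernel send a FIN, which is also
            # what happens to a process's sockets when the process is killed.
            client.close()
            # recv() returns an empty bytes object once the FIN has arrived.
            return server_side.recv(1024) == b""
        finally:
            server_side.close()
    finally:
        listener.close()

print(peer_sees_eof_after_close())  # prints True
```

No keepalive and no application data are involved here, yet the server learns that the connection has ended, which is the difference from the host-down and unplugged-cable cases.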