Let’s start with an interview question: We know that there are some common shortcuts in the terminal,
Ctrl+E to move to the end of a line,
Ctrl+W to delete a word,
Ctrl+B to move back one character, and pressing the up key to bring up the last shell command used. Among these 4 shortcuts, there is one that is implemented differently from the others. Which one is it?
The answer is
Ctrl+W is provided by something called TTY, and the other three are provided by the shell. Okay, I admit that I might get beaten up for asking someone such a question, but here it is just to catch the reader’s interest.
Let’s look at another interesting question: If you are on
host1 and logged into
host2 using the
ssh command, and then executed the
sleep 9999 command. What happens when you press
Ctrl+C at this time?
(1) ssh on host1 will be stopped
(2) sleep on host2 will be stopped and the ssh session will remain
Anyone who has used the
ssh command knows that the answer is (2): we can press
Ctrl+C inside the shell provided by ssh without any effect on ssh itself.
So how does this work?
We know that
Ctrl+C sends a signal with an int value of 2, called SIGINT. So we can guess: is it possible that the ssh process received the SIGINT and forwarded it to the ssh remote program, but won’t handle the signal itself?
We can verify this conjecture using the killsnoop program, which prints out the signals between processes.
First we start the killsnoop program.
Then open a new shell, press
Ctrl+C and you will see that the shell (pid=1549) received signal=2, i.e. SIGINT.
Then we ssh to the local machine and press
Ctrl+C inside ssh :
If our guess is correct, the shell (pid=1549) should still be receiving SIGINT and forwarding it to the ssh process.
But killsnoop shows that only the shell that ssh opened received SIGINT, the ssh process itself and the original shell with pid=1549 did not receive any.
Obviously, our conjecture is not valid. So how is it possible that
Ctrl+C does not affect ssh itself but affects the programs inside ssh? I believe that you will have an answer after reading this article.
Hopefully that is enough to spark your interest in TTY, so let’s start the archaeology now.
TTY is a product of history
The first thing to be clear about is that TTY is a historical artifact, just like the many
/bin directories on Unix systems today. They exist because older programs need them to run, and newer programs stay compatible with them by default. If you designed a terminal or a directory layout from scratch, without regard to history and compatibility, you wouldn’t need so many
/bins and you wouldn’t need TTYs.
Here’s a brief history of the time when TTY was needed and why it was indispensable in that case, along with the various subcomponents.
The full name of TTY is Teletype, what is Teletype?
This, then, is Teletype.
This video shows how it works.
Simply put, a long time ago, many people used one computer together (you’ve heard of Unix as a multi-user, multi-tasking operating system, right?) . Everyone had a “terminal” (Terminal, TTY, in this context). Here you type down the command you want to run, send it to the system for execution, get the result from the system, and print the result on paper.
So, at the time, TTY was a piece of hardware, and as a piece of hardware, how was it connected to the computer?
First there is a wire, but this wire is not connected directly to the computer. It is connected to a piece of hardware called a Universal Asynchronous Receiver and Transmitter (UART). The UART driver reads data from the hardware and passes it to the TTY driver, and the TTY in turn passes it on to the program. (In fact, UARTs are still in use today, so if you’ve played with Arduino or Raspberry Pi, you may have come across them.)
Something like this.
Up to this point, it’s actually relatively straightforward for us “modern people”. The input from the hardware is copied through the Driver layer by layer to the application.
Wait, there is something called “Line discipline” on top. What the hell is that?
As its name says, it is used to “discipline” the line. A command is actually held in the TTY after it is typed and before the
Enter key is pressed. A line held in the TTY can be “disciplined” by Line discipline. For example, it provides the ability to delete with
Ctrl+U: after you press
Ctrl+U, the TTY does not send any character to the program behind it, but deletes the whole line in the current buffer. Similarly,
Ctrl+W, which deletes a word, is a feature provided by Line discipline. (Wow! Now you pass my interview!) I’ll prove later that this is a TTY feature.
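As a rough illustration of this buffering, the sketch below uses a pseudo-terminal pair (covered later in this article) as a stand-in for a real terminal. It assumes Linux and Python’s standard pty and termios modules; the helper name read_line_through_ldisc is made up for this example. Bytes written to one side play the role of keystrokes, and the other side shows what a program would finally read.

```python
import os
import pty
import termios


def read_line_through_ldisc(keystrokes: bytes) -> bytes:
    """Feed keystrokes in on the terminal side, return what a program reads.

    The keystrokes must end with a newline, otherwise the read blocks,
    because in cooked mode nothing is delivered until the line is complete.
    """
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        # attrs[3] is the local-mode flag word, attrs[6] the control characters.
        attrs[3] |= termios.ICANON | termios.IEXTEN  # cooked mode, line editing on
        attrs[3] &= ~termios.ECHO                    # no echo, keeps the demo quiet
        attrs[6][termios.VWERASE] = b"\x17"          # Ctrl+W: erase a word
        attrs[6][termios.VKILL] = b"\x15"            # Ctrl+U: erase the line
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

        os.write(master_fd, keystrokes)              # "typing" on the terminal
        return os.read(slave_fd, 1024)               # what the program receives
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    # \x17 is Ctrl+W, \x15 is Ctrl+U.
    print(read_line_through_ldisc(b"echo helo\x17hello\n"))
    print(read_line_through_ldisc(b"some typo\x15ls\n"))
```

No shell is involved here at all, so whatever editing happens to the bytes is done by the kernel’s line discipline alone.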
This function is simply too boring for us “modern” people. Can’t we just leave it to bash? Is it necessary to handle such things as a subsystem of the Kernel?
Whenever you want to criticize someone, remember that not all people in this world have the same advantages you have.
Yes, back in the days of Unix, there was no such condition.
A long time ago, it was too expensive for computers to read in every character and send it immediately to the program behind the terminal. If 20 people were typing at 60 words per minute, it would take about 100 context switches and disk swaps per second, so the computer would spend 100% of its time processing these people’s keystrokes and would have no time to do anything else. (PS: I actually learned this from a comment on dev.to, and it’s really wonderful. I had read a lot of articles before I saw this comment, but I didn’t understand why Line discipline was needed.)
The biggest use of Line discipline is actually as a programmable middleman. It can buffer the input of all 20 TTYs and only hand a line to the back-end program when someone presses Enter. If each user needs about 30s to enter a command, a program has to be woken up roughly once every 1.5s across all 20 users, instead of about 100 times per second. That’s on the order of 100 times less work.
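The back-of-the-envelope numbers can be checked with a few lines of arithmetic. This is only a sketch: the figure of 5 characters per word is an assumption (a common typing-speed convention) and is not stated in the comment being paraphrased.

```python
USERS = 20
WORDS_PER_MINUTE = 60
CHARS_PER_WORD = 5           # assumption: common typing-speed convention
SECONDS_PER_COMMAND = 30     # time one user needs to type a full command

# Without buffering: every keystroke is delivered to a program.
chars_per_second = USERS * WORDS_PER_MINUTE * CHARS_PER_WORD / 60

# With line buffering: only completed lines are delivered.
lines_per_second = USERS / SECONDS_PER_COMMAND
seconds_between_lines = 1 / lines_per_second

reduction = chars_per_second / lines_per_second

if __name__ == "__main__":
    print(f"keystroke deliveries per second: {chars_per_second:.0f}")
    print(f"one completed line every {seconds_between_lines:.1f}s")
    print(f"roughly {reduction:.0f}x fewer deliveries")
```

Under these assumptions the reduction comes out somewhat above 100 times, which matches the “almost 100 times” order of magnitude.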
Line discipline works a bit like Emacs, with a function table of size=127, and each key has a bound function. For example: enter buffer; send command out, etc.
You can set TTY to raw mode, so that Line discipline will not interpret the characters it receives but will send them directly to the program behind it (the foreground process group of the session, to be exact). In fact, this is the reason why ssh does not receive SIGINT but the program inside ssh does (I’ll show you later). Nowadays, many programs put the TTY in raw mode, such as ssh and Vim. But a long time ago, Vim ran in cooked mode (i.e., with Line discipline active). When you typed some text in the middle of a line, like
asdffwefs, the screen would go haywire and the text would overwrite what came after it until you pressed
Esc to exit editing.
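To see what raw mode means in practice, here is a minimal sketch that again uses a pseudo-terminal pair (covered later in this article) in place of a real terminal. It assumes Linux and Python’s standard pty and tty modules; read_raw is a hypothetical helper name. In raw mode the control characters are not acted upon, they simply arrive as bytes.

```python
import os
import pty
import select
import tty


def read_raw(keystrokes: bytes, timeout: float = 2.0) -> bytes:
    """Put the terminal in raw mode and return the bytes a program would read."""
    master_fd, slave_fd = pty.openpty()
    try:
        tty.setraw(slave_fd)                 # no line editing, no signals, no echo
        os.write(master_fd, keystrokes)      # "typing" on the terminal
        received = b""
        # Raw mode delivers bytes as they arrive, so collect until complete.
        while len(received) < len(keystrokes):
            ready, _, _ = select.select([slave_fd], [], [], timeout)
            if not ready:
                break                        # give up instead of blocking forever
            received += os.read(slave_fd, 1024)
        return received
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    # \x17 is Ctrl+W, \x03 is Ctrl+C; both come through untouched.
    print(read_raw(b"ab\x17\x03\r"))
```

A program that receives these raw bytes, such as Vim or ssh, then decides for itself what each key should do.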
Today’s computers have become a million times more powerful than the hardware of that time, so Line discipline has little meaning. But at that time, if one wanted to delete and edit the currently typed command, where was the most appropriate place to implement this function? Obviously the buffer!
The performance issues here are history, but TTY and Line discipline are here to stay because (I’m guessing) many programs are written with TTY by default, such as bash, and TTY continues to retain Line discipline without the user feeling anything about it.
So what exactly is a TTY today? Essentially, it is no longer a piece of hardware, but just a piece of software (kernel subsystem). At the user level of the system, it is - a file. Of course, what is not a file in Unix?
The tty command shows which TTY the current shell is using.
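The same lookup can be done from a program. The sketch below assumes a Unix-like system and uses os.isatty and os.ttyname from Python’s standard library; the helper name tty_name is made up for this example.

```python
import os


def tty_name(fd: int):
    """Return the device path behind fd if it is a terminal, else None."""
    return os.ttyname(fd) if os.isatty(fd) else None


if __name__ == "__main__":
    # Prints something like /dev/pts/3 in a terminal emulator,
    # or None when stdin is a pipe or a file.
    print(tty_name(0))
```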
As a “file”, you can write directly to it. The content written to the TTY will be read out by the output device. (The diagram below shows the shell writing below and appearing in the shell above)
Of course, it’s possible to read. But when you read from the TTY, you are in competition with the output device, because you are both trying to read from this TTY, which had only one reader, and now has two. I pressed the numbers 1-9 in the shell above, and each time I entered a number I wasn’t sure which side it would be read from.
Once it is read by
cat, the key you pressed will not be displayed in the current shell.
Got a bad, bad idea? Yes, we can use the
w command to see who is logged in to the machine, then go to
cat their TTY and they will surely think their keyboard is broken! (Tip: when a user logs in, the permissions on their TTY file are set so that only they can read and write it, and they are made its owner, so you have to be root for this prank to work!)
Having understood what TTY is, what is it good for today?
We can think about this question in reverse: Can we do without TTY?
The answer is yes.
I can demonstrate that you can use the terminal without TTY.
Imagine a scenario where you break into someone’s machine, such as the server where kawabangga.com is located, and you find a way to execute python code inside it, but you can only inject the code into it and execute it without seeing the output, what do you do?
There is something called a reverse shell. In layman’s terms, with ssh we usually connect to a remote computer and get a shell there to control. A reverse shell, as the name implies, works the other way round: a shell is opened on the remote machine, which then connects back and hands control to you.
For the following demonstration, I opened a TCP port with nc in the terminal below, and then executed the following command in the terminal above.
You can see that this python code actually opens a
sh program and then connects stdin/stdout/stderr all to the tcp socket. For the nc end, the stdin/stdout/stderr of the nc sends into the socket, so my nc becomes a shell that can control the other side!
This way, I can execute commands on the other side’s host at will, very convenient!
It is possible to open reverse shell using other languages.
As you can see from the image above, this is a shell without TTY. what’s wrong with it? Let’s run a TUI program, like
Note the problem in the top left corner: I was actually trying to type
hostname after pressing
q, but sh has lost its mind and can’t even display the characters I typed properly. In addition, this shell without TTY has the following disadvantages:
- you can’t use TUI programs like Vim, htop, etc.
- you can’t use tab completion
- you can’t use the up arrow to see the command history
- there is no job control
(Actually, reverse shell can also have TTY)
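Part of the reason these features disappear is that programs check whether their standard input is a terminal and behave differently when it is not. A minimal sketch of that check, assuming Linux and Python’s standard pty and subprocess modules (child_sees_tty and CHILD_CODE are hypothetical names for this example):

```python
import os
import pty
import subprocess
import sys

# The child only reports whether its stdin looks like a terminal.
CHILD_CODE = "import os; print(os.isatty(0))"


def child_sees_tty(use_pty: bool) -> bool:
    """Start a child with stdin on a pipe or on a pty and ask what it sees."""
    cmd = [sys.executable, "-c", CHILD_CODE]
    if not use_pty:
        # stdin is a pipe, similar to the socket a reverse shell gets.
        result = subprocess.run(cmd, input=b"", stdout=subprocess.PIPE, check=True)
        return result.stdout.strip() == b"True"

    master_fd, slave_fd = pty.openpty()
    try:
        result = subprocess.run(cmd, stdin=slave_fd, stdout=subprocess.PIPE, check=True)
        return result.stdout.strip() == b"True"
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print("stdin on a pipe:", child_sees_tty(False))
    print("stdin on a pty: ", child_sees_tty(True))
```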
So, today, we can run an incomplete shell without TTY, after all, our hardware today has nothing to do with teletyping.
However, TTY still serves an important function as a kernel module. A terminal can tell the TTY to move the cursor, clear the screen, change the window size, and so on.
Eh? Wait a minute, why do the
tty commands we see in the image above start with
/dev/pts/ and not
/dev/tty? What’s the difference?
This is actually a “pretend” TTY, called Pseudo terminal.
I don’t know if you noticed, but one of the important points about TTY we discussed above is that TTY is a module (subsystem, driver) of the kernel. TTY lives in kernel space, not user space, so how can our modern Terminal programs, ssh programs, etc., interact with TTY?
The answer is PTY.
The explanation will be simplified here to make it easier to understand. When a program like iTerm2 needs a TTY, it asks the Kernel to create a PTY pair for it. Note that it is a pair, which means that PTYs always come in pairs. The slave is given to the program that runs inside the terminal (as mentioned earlier, programs like bash assume the existence of a TTY by default and work with it in interactive mode). That program does not know whether it has a PTY slave or a real TTY, it just reads and writes. The PTY master is returned to the program that asked for the pair (usually ssh, a graphical terminal emulator, tmux, etc.), which gets it (actually an fd) and can read and write the master PTY. The kernel is responsible for copying the contents of the master PTY to the slave PTY, and the contents of the slave PTY to the master PTY. pts means pseudo-terminal slave, which means that the login device of these interactive shells is the pseudo-terminal slave.
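The skeleton of what such a program does can be sketched in a few lines. This is a simplified illustration, not how iTerm2 or ssh is actually written; it assumes Linux and Python’s standard pty and subprocess modules, and run_on_pty is a hypothetical helper name. The child gets the slave as its stdin/stdout/stderr, and the parent reads whatever the child prints from the master.

```python
import os
import pty
import subprocess
import sys


def run_on_pty(argv) -> bytes:
    """Run argv with a PTY slave as its terminal, return what the master sees."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    os.close(slave_fd)              # only the child keeps the slave open now

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            break                   # Linux reports EIO once the slave is closed
        if not data:
            break                   # other systems report end-of-file instead
        chunks.append(data)

    proc.wait()
    os.close(master_fd)
    return b"".join(chunks)


if __name__ == "__main__":
    # The child asks which terminal it is on and gets a /dev/pts/N style answer.
    child = [sys.executable, "-c", "import os; print(os.ttyname(0))"]
    print(run_on_pty(child))
```

The output typically ends in \r\n rather than \n, because the TTY layer between the slave and the master translates newlines on output by default.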
So, the programs we see under the GUI, like Xterm/iTerm2 (which actually uses ttyS, so I won’t go into details here), the shell opened in tmux, and the shell opened by ssh, all use PTYs. That is why these GUI terminals, such as konsole and Xterm, are called “terminal emulators”: they are not real terminals, they are emulated.
How do I get to a real TTY? Simple. On an Ubuntu desktop system, pressing
Ctrl+Alt+F1 gives you the graphical interface, but pressing
Ctrl+Alt+F2 (actually any of F2-F6) gives you a terminal. This terminal is a TTY: log in there, run the
tty command, and it will tell you that this is a tty device.
I happen to have a VirtualBox virtual machine with only a command line and no GUI. After logging in, you can see that this is a TTY.
Finally, let’s go back to the second question at the beginning of this article: Why does pressing
Ctrl+C in ssh not stop ssh, but stops the programs inside ssh?
Let’s review what happened when we pressed Ctrl+C:
- the kernel driver receives the
Ctrl+C input, ignoring any unrelated modules in between.
- then it reaches TTY. TTY receives this input and sends a SIGINT signal to the foreground process group of the TTY (in fact, it sends it to whichever session the TTY is currently assigned to). If bash is currently in the foreground, bash will receive this signal, and if sleep is in the foreground, sleep will receive it.
SIGINT is a signal that can be handled by the program itself: bash decides to ignore it after receiving it, and sleep exits after receiving it.
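The last point, that each program decides for itself what SIGINT means, can be sketched with Python’s standard signal module. The helper name sigint_outcome is made up for this example, signal.raise_signal needs Python 3.8 or later, and the function must run in the main thread.

```python
import signal


def sigint_outcome(ignore: bool) -> str:
    """Send SIGINT to this process and report how it was dealt with."""
    received = []

    def handler(signum, frame):
        received.append(signum)      # a program-defined reaction

    # Either ignore the signal or install our own handler for it.
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN if ignore else handler)
    try:
        signal.raise_signal(signal.SIGINT)
    finally:
        signal.signal(signal.SIGINT, previous)   # restore the old behaviour

    return f"handled signal {received[0]}" if received else "ignored"


if __name__ == "__main__":
    print(sigint_outcome(ignore=True))
    print(sigint_outcome(ignore=False))
```

A program that installs neither option keeps the default action, which for SIGINT is to terminate. That is what sleep does.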
The stty program allows us to modify the TTY’s function table.
Ctrl+C is governed by a setting called isig, which the stty manual describes as:
enable interrupt, quit, and suspend special characters
This actually means that if TTY receives an input like
Ctrl+C (the original symbol is
^C; correspondingly, you can use the
stty -a command to check that the default quit is
^\ and the default suspend is
^Z), it does not pass it to the program behind it, but converts it to SIGINT and sends that to the foreground process group of the current TTY. So we can use
stty -isig to turn off this behavior.
Now, if you press
Ctrl+C in the
sleep program, TTY will send the
^C character to the
sleep program as is, and
sleep will not receive any signal. We can’t use
Ctrl+C to end the sleep program.
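Roughly the same switch can be flipped from a program through the termios interface. The sketch below assumes Linux and Python’s standard pty and termios modules; set_isig, isig_enabled and ctrl_c_as_data are hypothetical helper names. It uses a pseudo-terminal pair so that it can be run without touching your real terminal.

```python
import os
import pty
import termios


def set_isig(fd: int, enabled: bool) -> None:
    """Programmatic counterpart of `stty isig` / `stty -isig` for fd."""
    attrs = termios.tcgetattr(fd)
    if enabled:
        attrs[3] |= termios.ISIG     # attrs[3] is the local-mode flag word
    else:
        attrs[3] &= ~termios.ISIG
    termios.tcsetattr(fd, termios.TCSANOW, attrs)


def isig_enabled(fd: int) -> bool:
    return bool(termios.tcgetattr(fd)[3] & termios.ISIG)


def ctrl_c_as_data() -> bytes:
    """With isig off, Ctrl+C (byte 0x03) reaches the program as plain data."""
    master_fd, slave_fd = pty.openpty()
    try:
        set_isig(slave_fd, False)
        os.write(master_fd, b"\x03\n")   # press Ctrl+C, then Enter
        return os.read(slave_fd, 1024)
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    print(ctrl_c_as_data())
```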
Back to the ssh problem, our reasonable guess is that when ssh gets the remote shell, it first disables
isig on the TTY it is running in, so that
Ctrl+C is sent to ssh as a character. The ssh client sends this character to the remote ssh server, the ssh server writes it to its own TTY (which is actually a PTY master), and finally the remote TTY sends a SIGINT signal to the current remote foreground process.
How can we verify our suspicions?
We can use
stty to check the TTY settings of the shell, and then use this shell to log in via ssh and check the TTY settings again.
In this diagram, we use the shell above to view the shell TTY configuration below. You can see that the first view is before the ssh login and
isig is on. The second view is after the ssh login,
isig becomes off. If ssh logs out,
isig becomes on again.
Conversely, if we force the TTY where ssh is located to turn
isig back on after the ssh login, then pressing
Ctrl+C will end the ssh process itself, not the program running inside ssh.
Since I’m using ssh to log in locally, I’ve changed the command line prompt of the local shell to distinguish between the current local shell and ssh.
This image was taken after logging in with ssh and running
stty --file /dev/pts/0 isig in another shell to turn
isig on for the TTY where ssh is located. Then press
Ctrl+C in ssh (the current foreground program is
sleep 9999). At this point ssh exits directly, and we are back in the local shell, rather than ending sleep in ssh.
We can use the
strace program directly to trace the ssh system calls.
You can see that when ssh starts, there is a line that changes the TTY settings to
-isig, along with some other settings.
Then, when ssh exits, there is a line that changes the settings back.
In fact, if you use Terminal enough, you must have encountered this situation: after running some TUI program, it exits abnormally (for example, it gets stuck, crashes, or gets
SIGKILL), and then you go to Terminal and find that Terminal is all messed up, carriage return does not work,
Ctrl+W does not work, and so on. This is probably because the program did not execute the reset tty code that should have been executed at the time of exit. Use the
reset command to reset the current Terminal and bring it back to its senses.
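The “reset tty code” a well-behaved program runs on exit is usually just save-and-restore around its raw-mode section. A minimal sketch, assuming a Unix-like system and Python’s standard termios and tty modules (raw_mode is a hypothetical helper name, not something ssh or Vim provides):

```python
import contextlib
import termios
import tty


@contextlib.contextmanager
def raw_mode(fd: int):
    """Switch fd to raw mode and restore the saved settings afterwards."""
    saved = termios.tcgetattr(fd)        # remember the current settings
    try:
        tty.setraw(fd)
        yield
    finally:
        # Runs on normal exit and on exceptions, but not after SIGKILL,
        # which is why a killed TUI program can leave the terminal broken.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

Typical use would be `with raw_mode(sys.stdin.fileno()): ...` around the interactive part of a program.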
So back to the first question, how do you prove which shortcuts are provided by TTY and which are provided by the shell?
This is even easier. In fact,
stty -a already prints out all the stty configurations.
In raw mode, even the Enter key is just a newline, and it will not move the cursor back to the beginning of the line for you.
If you unbind
Ctrl+W, this function is naturally gone. Typing
Ctrl+W then really just enters a literal control character.
What about those shell shortcuts (like
Ctrl+E)? We can use the
sh program to verify that they are functions provided by the shell, not by TTY.
sh is a very simple program and does not interpret
Ctrl+A or the arrow keys. Pressing the left arrow prints
^[[D and pressing
Ctrl+A prints
^A (many people have probably seen these characters before: when the shell is stuck, pressing the arrow keys puts these raw characters on the screen). However, under a normal TTY (a cooked TTY; you can use the reset command to restore the TTY we played with before), the
Ctrl+W function is still available under