I recently wrote a few simple scripts in Python to process some data, and since it’s just a simple function, I just use print to print the log.
The task occasionally throws some exceptions when running.
Because I have print logs in different places, the place where the error is reported is not quite the same every time, which leads to very strange results; sometimes this piece of code is not running, and the next time it may be another piece of code that is not triggered.
Although I noticed the key exception
Broken pipe, I didn’t pay special attention to it, because there were some places in the code where
http requests were sent, and I always thought it was a network
IO problem, so I didn’t think about
It was only when this problem recurred that I took a good look at the exception and took a good look at
IO operation, is it really a problem with the print function that comes with it?
But in the local, test environment I ran countless times also failed to find anomalies; so I looked for operations and maintenance to get the way to run on line.
It turns out that in order to facilitate the maintenance of the script tasks submitted up by everyone, operations and maintenance themselves have to maintain a unified script, in which the script uses.
to trigger the task, which is the only difference from my local, development environment.
For this reason I simulated the exception in the development environment:
Wait 20s and the exception will definitely recur.
Why does this anomaly occur?
First, you need to understand how the function
os.popen(command[, mode[, bufsize]] works.
according to the official documentation, this function executes
fork a child process to execute the command
command, while piping the standard output of the child process to the parent process.
This is also the file descriptor returned by the method.
Here’s a diagram to better understand the principle.
SIGPIPE signal is also mentioned there.
Since we know the cause of the problem, it is relatively simple to solve, mainly the following programs.
- Use the
read()function to read the data in the pipe, read it all, and then close it.
- The standard output of
commandcan also be redirected to
/dev/nullif the output from the child process is not needed.
- It can also be run using the
Here is a demonstration using the first option
task.py, no more exceptions are thrown, and the output of
command is printed out.
I did not use this solution for the online fix, but used the standard logging framework to output the logs to
es for easy viewing in
Since the logging framework does not use pipelines, there is no such problem.
Although the problem is solved, which still involves some of our usual less attention to the knowledge points, this time we will review together.
The first is the content of the parent-child process, which is more common in
c/c++/python, and more common in
Java/golang, which directly uses multithreading and co-processing.
Python creates a child process, and since it is a child process, it must need to communicate with the parent process to achieve the purpose of working together.
It’s easy to think that the parent and child processes can communicate with each other through the pipes (anonymous pipes) mentioned above.
Python program as an example, when
task.py is run, two processes are created.
Go to the
/proc/pid/fd directory of each of these two programs to see the file descriptors opened by these two processes.
You can see that the standard output of the child process is associated with the parent process, which is the file descriptor returned by
stdin/stdout/stderrof a process, respectively.
It is also important to note that when we open a file descriptor in the parent process, the child process will inherit it as well.
For example, add a new code to
Later, when you look at the file descriptor, you will see that both the parent and child processes will have this file.
But on the contrary, the parent process will not have the files opened in the child process, which should be easy to understand.
Some basics are especially important when troubleshooting some weird problems, such as the parent-child process pipeline communication involved in this case, and finally to summarize.
os.popen()function is executed asynchronously, if you need to get the output of the child process, you need to call the
read()function by yourself.
- The parent and child processes communicate through anonymous pipes. When the read side is closed, the write side will receive a
SIGPIPEsignal when the output reaches the maximum cache of the pipe, thus throwing a
- The child process will inherit the file descriptors of the parent process.