A question about stack allocation

A while ago I was writing a new project, and to keep it fast I avoided large libraries like Qt. While implementing process management, I ran into a strange crash.

Since I seldom write this kind of code, I assumed a few problems were normal, but even after a long troubleshooting session I couldn’t find the cause.

After revisiting how the OS manages processes, I found the problem.

Let’s start with a simple example.

#include <iostream>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>   // wait()

int child()
{
    int pid = fork();
    switch (pid) {
        case 0:
          std::cout << "[child] I'm child." << std::endl;
          sleep(5);
          std::cout << "[child] I'm quit." << std::endl;
          break;
        case -1:
          std::cout << "fork() failed." << std::endl;
          break;
        default:
          std::cout << "[parent] I'm meself." << std::endl;
          std::cout << "[parent] I will wait child." << std::endl;
          wait(nullptr);
          std::cout << "[parent] I'm quit." << std::endl;
          break;
    }

    return pid;
}

int main(int argc, char *argv[])
{
    child();
    return 0;
}

Compile and run this code to see the output of both processes.

$ g++ child.cpp

$ ./a.out
[parent] I'm meself.
[parent] I will wait child.
[child] I'm child.
[child] I'm quit.
[parent] I'm quit.

The above is a very simple and basic usage of the fork() system call, and so far it has worked fine here.

In addition to the fork() system call, there is also the clone() system call. Their roles are as follows:

fork creates a complete copy of the parent process, copying all the resources of the parent process.
clone can also create a new process, but it can control the resources shared with the child process in a more fine-grained way than fork, so the arguments are a bit more complex, and we can usually use it to implement threads.

In my case, I needed the child process to run in a new PID namespace, so I chose the clone() system call, which lets me control the namespace the child process belongs to.

The approximate code is as follows.

#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK 8192

int count = 0;

int child_run(void *arg)
{
    /* Without CLONE_VM the child increments its own copy of count. */
    printf("count in child: %d\n", ++count);
    return 0;
}

int main(int argc, char *argv[])
{
    int   pid;
    int   status;
    void *child_stack = malloc(CHILD_STACK);
    if (!child_stack) {
        fprintf(stderr, "failed to allocate child stack\n");
        exit(1);
    }

    printf("count before clone: %d\n", count);
    /* Run the child in a new PID namespace (needs CAP_SYS_ADMIN, usually
       root). The stack grows down, so pass the top of the block.
       SIGCHLD lets the parent reap the child with a plain waitpid(). */
    pid = clone(child_run, (void *) ((char *) child_stack + CHILD_STACK),
                CLONE_NEWPID | SIGCHLD, 0);

    if (pid == -1) {
        perror("clone failed");
        exit(2);
    }
    else {
        waitpid(pid, &status, 0);
        printf("count after clone: %d\n", count);
    }
    free(child_stack);
    return 0;
}

This is a very common use of clone(), and as an example it worked fine, until the child had to run a lot more functions and started crashing.

gdb traced the crash to a call into the standard library. That seemed strange, since I hadn’t written anything unusual. I started stripping the code down and bisecting it, and found that the crash went away when one particular function was not called. I stepped into that function and found nothing odd inside, just some std code.

After a round of debugging, I finally suspected that the child’s stack space was not large enough and that the process was being killed by the operating system when it overran the stack. I increased the stack size and everything worked, which confirmed that this was the problem.

Then I went back and reviewed how Linux lays out process memory.

The topmost segment of the process address space is the stack, which most programming languages use to store function arguments and local variables. Calling a method or function pushes a new stack frame onto the stack, and the frame is cleaned up when the function returns. Since the data in the stack follows a strict LIFO (last in, first out) order, there is no need for complex data structures to keep track of its contents; a simple pointer to the top of the stack is all that is needed, which makes pushing and popping very fast and accurate. Each thread in a process has its own stack.

If a program keeps pushing data onto the stack, the memory area currently mapped for the stack is eventually exhausted. This triggers a page fault that is handled by Linux’s expand_stack(), which calls acct_stack_growth() to check whether there is still room for the stack to grow. If the stack size is below RLIMIT_STACK (usually 8 MB), the stack is typically extended and the program continues to execute without noticing anything. This is the regular mechanism for growing the stack to the required size. However, if the maximum stack size has been reached, the stack overflows and the program receives a segmentation fault.

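Dynamic stack growth is the only case where access to an unmapped memory area is allowed; any other access to an unmapped area triggers a page fault that results in a segmentation fault. Some mapped areas are read-only, so attempts to write to them also result in a segmentation fault. This automatic growth applies only to the stack the kernel set up for the main thread. A block obtained from malloc() and handed to clone() has a fixed size and never grows, which is why my child process crashed once it needed more than the 8192 bytes I had given it.

In the end I didn’t keep this clone()-based approach in the project, so I never had to settle on a final fix, but the problem gave me a better understanding of the memory layout of Linux processes.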