Supervisord

You have probably managed processes on Linux hosts with a variety of tools, and you are likely familiar with Supervisor, the Python-based process manager. It monitors the state of the processes it manages and can restart them automatically. This article introduces Supervisord, an open source reimplementation written in Go. The author's reason for choosing Go is simple: Go is cross-platform, so a single codebase compiles to a binary that runs on any supported platform, and administrators no longer have to maintain a Python environment.

Usage Scenario

It's 2021, and many tools and services are now managed through Docker, so it may seem that Supervisor is no longer needed. There are still cases where it is necessary, though. Our team has two of them:

  1. Managing multiple processes in a container
  2. Environments without Docker

In the first case, managing processes inside a container with the Python version means installing a Python environment first, which makes the image much larger. That is something we have to take into account. In the second case, our team has an environment with no network access at all where Docker is also prohibited: Docker makes it hard for IT to control what other colleagues can do on the host and leads to permission problems.

When developers have to manage multiple services, a Go version of Supervisor is a great help to the SRE team, because they no longer need to deal with the Python version.

Installation method

You can download the binary for your OS from the Release page, or compile it yourself if you are a Go developer.

go generate
GOOS=linux go build -tags release -a -ldflags "-linkmode external -extldflags -static" -o supervisord

Put supervisord in the /usr/local/bin directory. By default, the executable reads the supervisord.conf configuration file. You can give its path with -c; otherwise it is looked up automatically in the following locations:

  1. $CWD/supervisord.conf
  2. $CWD/etc/supervisord.conf
  3. /etc/supervisord.conf
  4. /etc/supervisor/supervisord.conf (since Supervisor 3.3.0)
  5. ../etc/supervisord.conf (Relative to the executable)
  6. ../supervisord.conf (Relative to the executable)
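
The lookup can be sketched in Go as follows. This is a minimal illustration, not Supervisord's actual implementation: it assumes the numbered list above is also the search order, and the function names candidatePaths and firstExisting are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths returns the configuration file locations from the list
// above. cwd is the working directory and exeDir is the directory that
// contains the supervisord executable. (Hypothetical helper.)
func candidatePaths(cwd, exeDir string) []string {
	return []string{
		filepath.Join(cwd, "supervisord.conf"),
		filepath.Join(cwd, "etc", "supervisord.conf"),
		"/etc/supervisord.conf",
		"/etc/supervisor/supervisord.conf",
		filepath.Join(exeDir, "..", "etc", "supervisord.conf"),
		filepath.Join(exeDir, "..", "supervisord.conf"),
	}
}

// firstExisting returns the first path that exists and is not a directory.
func firstExisting(paths []string) (string, bool) {
	for _, p := range paths {
		if info, err := os.Stat(p); err == nil && !info.IsDir() {
			return p, true
		}
	}
	return "", false
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine working directory:", err)
		os.Exit(1)
	}
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine executable path:", err)
		os.Exit(1)
	}
	if path, ok := firstExisting(candidatePaths(cwd, filepath.Dir(exe))); ok {
		fmt.Println("would load:", path)
	} else {
		fmt.Println("no configuration file found; pass one with -c")
	}
}
```

Running a check like this on a host is a quick way to see which file would win when several candidates exist.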

Usage

Create or open the supervisord.conf file:

[supervisord]
logfile=%(here)s/supervisord.log
logfileMaxbytes=50MB
logfileBackups=10
loglevel=debug
pidfile=%(here)s/supervisord.pid

[inet_http_server]
port = :9001
username=
password=

This configuration includes inet_http_server, which enables a simple management interface showing the status of the configured processes. The web interface is very basic, but useful.

web interface

Processes can be started or stopped through the web interface, and the page can be protected with a username and password. Next, let's see how a process is configured:

[program:tip-agent]
directory = /xxxxxxxx/tip/agent
command = ./bin/agent
process_name = tip-agent
stdout_logfile = test.log, /dev/stdout
stderr_logfile = test.log, /dev/stderr
restart_when_binary_changed = true
autostart=true
startsecs=3
startretries=3
autorestart=true
exitcodes=0,2
stopsignal=TERM
stopwaitsecs=10
stopasgroup=true
killasgroup=true

One thing to note here is the stopsignal setting. If you leave it unset and the process implements a graceful shutdown, the program will not terminate the way you expect, so be sure to add this option. The related code can be found here. By default, syscall.SIGKILL is used to force the program to shut down.

p.Signal(syscall.SIGKILL, killasgroup)

if atomic.LoadInt32(&stopped) == 0 {
  log.WithFields(log.Fields{"program": p.GetName()}).Info("force to kill the program")
  p.Signal(syscall.SIGKILL, killasgroup)
  killEndTime := time.Now().Add(killwaitsecs)
  for killEndTime.After(time.Now()) {
    //if it exits
    if p.state != Starting && p.state != Running && p.state != Stopping {
      atomic.StoreInt32(&stopped, 1)
      break
    }
    time.Sleep(10 * time.Millisecond)
  }
  atomic.StoreInt32(&stopped, 1)
}

In addition to the web interface, you can also use the CLI to see the status of all processes.

supervisord ctl status

Supervisord also integrates with Prometheus, and SREs can fetch the monitoring data shown below from http://localhost:9001/metrics.

# HELP node_supervisord_exit_status Process Exit Status
# TYPE node_supervisord_exit_status gauge
node_supervisord_exit_status{group="tip-agent",name="tip-agent"} 0
node_supervisord_exit_status{group="tip-backend",name="tip-backend"} 0
# HELP node_supervisord_start_time_seconds Process start time
# TYPE node_supervisord_start_time_seconds counter
node_supervisord_start_time_seconds{group="tip-agent",name="tip-agent"} 1.632135574e+09
node_supervisord_start_time_seconds{group="tip-backend",name="tip-backend"} 1.632135593e+09
# HELP node_supervisord_state Process State
# TYPE node_supervisord_state gauge
node_supervisord_state{group="tip-agent",name="tip-agent"} 20
node_supervisord_state{group="tip-backend",name="tip-backend"} 20
# HELP node_supervisord_up Process Up
# TYPE node_supervisord_up gauge
node_supervisord_up{group="tip-agent",name="tip-agent"} 1
node_supervisord_up{group="tip-backend",name="tip-backend"} 1
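
As an illustration of how this output could be consumed without a full Prometheus setup, the following sketch parses the node_supervisord_state samples and lists processes that are not running. It assumes that the value 20 means RUNNING, as in Python Supervisor's state codes. The helper names are hypothetical, and the label parsing is deliberately simple (no escaped quotes or commas inside label values). In practice the text would come from an HTTP GET on the /metrics endpoint; here a sample copied from the output above is used.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

const stateMetric = "node_supervisord_state"

// runningState is assumed to mean RUNNING (as in Python Supervisor).
const runningState = 20

// parseStates extracts process name -> state from Prometheus text output.
func parseStates(metrics string) (map[string]int, error) {
	states := make(map[string]int)
	scanner := bufio.NewScanner(strings.NewReader(metrics))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, stateMetric+"{") {
			continue // comments, blank lines, and other metrics
		}
		end := strings.Index(line, "}")
		if end < 0 {
			return nil, fmt.Errorf("malformed sample: %q", line)
		}
		fields := strings.Fields(line[end+1:])
		if len(fields) == 0 {
			return nil, fmt.Errorf("missing value in %q", line)
		}
		value, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", line, err)
		}
		name := labelValue(line[len(stateMetric)+1:end], "name")
		states[name] = int(value)
	}
	return states, scanner.Err()
}

// labelValue returns one label's value from a `k="v",k2="v2"` list.
func labelValue(labels, key string) string {
	for _, pair := range strings.Split(labels, ",") {
		k, v, ok := strings.Cut(pair, "=")
		if ok && strings.TrimSpace(k) == key {
			return strings.Trim(strings.TrimSpace(v), `"`)
		}
	}
	return ""
}

// notRunning returns the sorted names whose state differs from runningState.
func notRunning(states map[string]int) []string {
	var names []string
	for name, state := range states {
		if state != runningState {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	sample := `# TYPE node_supervisord_state gauge
node_supervisord_state{group="tip-agent",name="tip-agent"} 20
node_supervisord_state{group="tip-backend",name="tip-backend"} 20
`
	states, err := parseStates(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if bad := notRunning(states); len(bad) > 0 {
		fmt.Println("not running:", strings.Join(bad, ", "))
	} else {
		fmt.Println("all processes running")
	}
}
```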

Finally, if you also want to use Supervisord inside Docker, you can copy the binary directly from the official image with COPY.

FROM debian:latest
COPY --from=ochinchina/supervisord:latest /usr/local/bin/supervisord /usr/local/bin/supervisord
CMD ["/usr/local/bin/supervisord"]

Summary

Although the whole ecosystem is moving toward containers, there are still many situations that call for this older style of process management. The Go version greatly reduces the time a team spends dealing with the environment, and I recommend it both for managing processes inside a container and for environments without Docker.