
Note that this article is really talking about the general case, but in practice a lot of techniques can work if you have narrower requirements or if you have more control over what you run.

For example, in 1.1.4 the author talks about why containers are not a solution, giving three distinct reasons. But if we change our perspective a little bit, none of the three reasons are blocking. The first is that it's not easy; but `docker run` or `podman run` is easy. Even systemd units start with separate control groups to allow you to terminate everything at once. The second reason was about gdb; when was the last time you used gdb in production? If you are using gdb, someone is interactively using the computer and can be relied upon to clean up processes manually. The third reason is that containers are more heavyweight, but there's no need to make every subprocess a separate container: if multiple processes should be managed as a single unit (including the case when we'd want to terminate a whole group of processes) they should run in the same container.
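To make that concrete, here is a rough sketch (Python; it assumes a local Docker daemon, and the alpine image and sleep commands are only placeholders) of the "run the whole group in one container" idea: everything the job spawns stays inside the container, so killing the container terminates the entire tree at once.

    import subprocess
    import uuid

    # Hypothetical job name; the image and the commands are placeholders.
    name = f"job-{uuid.uuid4().hex[:8]}"

    # Start the whole unit of work as one detached container.
    subprocess.run(["docker", "run", "-d", "--name", name,
                    "alpine", "sh", "-c", "sleep 1000 & sleep 2000"],
                   check=True)

    # ... later, terminate everything that was spawned inside it at once.
    subprocess.run(["docker", "kill", name], check=True)
    subprocess.run(["docker", "rm", name], check=True)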

So with a slight change of perspective we find the problem easily solved. It has trade-offs, but it works well enough in practice that only a few purists have a problem with it. Not to diss the author—I think this type of perfectionist thinking is illuminating in terms of API design—but pragmatically it's a solved problem.



I respect the attempt to shift perspective very much, but you only engaged at a surface level with the reasons I listed.

>The first is that it's not easy; but `docker run` or `podman run` is easy.

I was referring to easy use from a full-fledged programming language. When you start a subprocess in your programming language of choice, do you always run it in a container? I seriously doubt it, and the reason for that is that it's hard.

>The second reason was about gdb

No, the second reason was about user namespaces, which break many things including ptrace, which in turn breaks gdb, as just one example. There's lots of useful tracing and monitoring software which makes occasional use of ptrace.

>if multiple processes should be managed as a single unit (including the case when we'd want to terminate a whole group of processes) they should run in the same container

Yes, that's true. And indeed, scripting use cases often have that characteristic, where everything can be terminated at once at the end. You can compare this to missile-style garbage collection: Just never free your memory/processes. Unfortunately, long-lived applications both need to free their memory over time and need to clean up their processes over time.
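To make the contrast concrete, here is a minimal sketch (Python, Unix-only, not from the article or this thread) of the "clean up over time" side: a long-lived parent reaps children as they exit instead of relying on its own exit to clean everything up.

    import os
    import signal

    def reap_children(signum, frame):
        # Reap every child that has already exited, without blocking.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return          # no children at all
            if pid == 0:
                return          # children exist, but none have exited yet
            # drop any bookkeeping kept for `pid` here

    signal.signal(signal.SIGCHLD, reap_children)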


> I was referring to easy use from a full-fledged programming language. When you start a subprocess in your programming language of choice, do you always run it in a container? I seriously doubt it, and the reason for that is that it's hard.

I use a (standard) library to start subprocesses in my favourite programming languages.

For running a container, I'd also use a library.

Seems about equally hard, no?
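For what it's worth, a tiny sketch of that comparison (Python; the container side assumes the Docker SDK for Python, `pip install docker`, and a running daemon; alpine/echo are placeholders): each is roughly one call.

    import subprocess
    import docker  # Docker SDK for Python, not part of the standard library

    # Plain subprocess via the standard library:
    subprocess.run(["echo", "hello"], check=True)

    # The same command inside a container, via a library:
    client = docker.from_env()
    print(client.containers.run("alpine", ["echo", "hello"], remove=True))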


Do user namespaces still break ptrace and gdb? I know gdb versions from 8 onward are much better at handling containerised processes, but I don't know if the reason why is related to what you're describing.


The same is true for the process group/session + controlling terminal solution: the solution doesn't work recursively (you can't do process management downstream), and it also requires child processes to abstain from changing the SIGHUP handler or mask, but in the vast majority of cases none of those limitations are a problem. Combined with POSIX fcntl locks[1] on a PID file, this is my go-to generic solution for Unix-portable[2], multiprocess daemons. The amount of code required in the supervisor component is quite trivial (a sketch follows the footnotes), yet covers almost all of your bases.

[1] fcntl locks permit querying the PID of the lock holder, so you don't need to write the PID to the file, providing a solution to the PID file race and loaded gun dilemmas. (There's still a race, but the same race exists with Linux containers, and both can be resolved in a similar manner--query the PID, send SIGSTOP, verify the PID association, then send SIGKILL or SIGCONT.)

[2] One of the crucial behaviors, that the kernel atomically sends SIGHUP to all processes in the group if the controlling process terminates, isn't guaranteed by POSIX, but it's the behavior on every Unix I've tried--AIX, FreeBSD, macOS, Linux, NetBSD, OpenBSD, and Solaris.
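Here is a minimal sketch of one arrangement of that session + controlling-terminal trick (Python, Unix-only; "sleep 1000" is just a stand-in child, and as footnote [2] says the hangup behavior is common to the major Unixes but not guaranteed by POSIX): pty.fork() puts the child in its own session with the pty slave as its controlling terminal, so closing the master end (which also happens automatically if the supervisor dies) makes the kernel deliver SIGHUP to the child's foreground process group.

    import os
    import pty
    import time

    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: session leader with the pty slave as controlling terminal.
        # Anything it spawns stays in this session unless it calls setsid().
        os.execvp("sleep", ["sleep", "1000"])

    # Supervisor: tearing everything down is just closing the master end.
    time.sleep(1)
    os.close(master_fd)   # the kernel HUPs the pty's foreground process group
    os.waitpid(pid, 0)    # reap the direct child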


> Even systemd units start with separate control groups to allow you to terminate everything at once

My experience has been that _literally every_ advanced cloud architecture can be simplified to:

- Host w/1 administration IP on a vlan w/ admin services on that IP, then N customer/service IPs w/ customer+service apps bound to them (11.12.13.14:5432 for customer A postgres, 25.26.27.28:5432 for customer B postgres, whatever)

- For each IP+service combo, configure the service to use its own directory of storage. Back each storage according to what it needs; the DB is backed by block RAM > SSD partitions > HDD partitions > SMB nonsense.

- For each Host+customer+service triple, write a python (complex)/shell (simple) deployment.py/.sh and health.py/.sh script to handle 95% of monthly deployment+maintenance needs (a rough health-check sketch follows below).

Done. Scale to MxN HostsxServices across M IPs by N services over N customers.
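As a rough illustration (entirely hypothetical; the addresses are just the ones from the comment above and the service list is invented), a health script in this style can be little more than a loop over (IP, port) pairs:

    import socket
    import sys

    # One (ip, port, name) triple per customer+service binding (made up).
    SERVICES = [
        ("11.12.13.14", 5432, "customer-a-postgres"),
        ("25.26.27.28", 5432, "customer-b-postgres"),
    ]

    failed = []
    for ip, port, name in SERVICES:
        try:
            with socket.create_connection((ip, port), timeout=2):
                pass
        except OSError:
            failed.append(name)

    if failed:
        print("unhealthy:", ", ".join(failed))
        sys.exit(1)
    print("all services healthy")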


Now make a slight change of perspective and imagine you are a program that runs another program, and you want to make sure you stay in control of anything that program could spawn.



