Learned a neat trick to replace shells today!
The Problem
I ran into a situation today where I was copying files into a folder using a script, and that folder was mapped to a Docker volume in a running container. When I ran the script as my user (which happens to have user id 1000) it copied the files in and gave user id 1000 ownership. The user inside the container also had user id 1000 (but a different username), so this caused no issues in the container.
However, I have another container that runs as the root user, and that script can be run from inside that container through some clever trickery. In that scenario the script runs as root, and so the files copied into the volume are owned by root. This did cause an issue with file ownership, because the user in the container that needed to edit those files didn't have ownership permissions over them.
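The reason the matching user id mattered more than the username is that file ownership is stored as a numeric id, and the host and the container each look up the name on their own. Here is a minimal sketch for checking the numeric owner of a file, assuming a POSIX shell with mktemp, ls, awk, and id available. The scratch file is purely illustrative and is not part of the original setup.

```sh
#!/bin/sh
# Create a scratch file and compare its numeric owner with the current uid.
f=$(mktemp)

# ls -n prints numeric uid/gid instead of names; field 3 is the owner uid.
owner_uid=$(ls -ln "$f" | awk '{print $3}')
my_uid=$(id -u)

echo "file owner uid: $owner_uid, my uid: $my_uid"
rm -f "$f"
```

Running the same ls -ln against the mapped folder from inside each container should show the same numbers, whatever usernames they map to there.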
The Solution
I was using AI to refactor a container entrypoint script to deal with this permissions issue. Initially the script ran as user id 1000. When I ran the copy script manually from the host, everything was fine, because the entrypoint had permission to operate on the files. But when the files had been copied in as root, the entrypoint script would fail and the container wouldn't start. One of the first things the AI suggested was splitting the script internally into 2 parts:
- The part that needs to be run as root, which mainly involves changing file ownership from the root user to the user with id 1000.
- The part that needs to be run as user id 1000, which does things like run the npm install command to install supporting packages. The general recommendation is to not run this command as root, but rather as the user that will be using those packages, in this case the user with id 1000.
Makes sense, easy enough, but I noticed a slight oddity when it refactored the code. One of the first things it did was add a check for whether the script was running as root. All of the code that needed to run as root went inside that if..fi block, and the block finished by rerunning the script as our user id 1000 user. After the block came all of the code that needed to run as user id 1000, and that didn't feel correct to me. Logically, it seemed like the script would start as root, do the work inside the if block, and spawn a child process to run the script as the user id 1000 user. When that child finished and exited, the original process would continue on with the rest of the script as root, essentially rerunning as root the commands that I didn't want run as root. So, I questioned the AI.
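That worry is accurate for an ordinary command invocation. Here is a minimal sketch of it, assuming a POSIX shell with mktemp. CHILD is a made-up guard variable standing in for the root check, and the script calls itself with plain sh rather than switching users.

```sh
#!/bin/sh
# Hypothetical demo: CHILD is an invented guard, not part of the real entrypoint.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
if [ -z "$CHILD" ]; then
    echo "root-only part"
    # Plain invocation: starts a child process and waits for it to finish
    CHILD=1 sh "$0" "$@"
fi
echo "rest of script"
EOF

output=$(sh "$demo")
echo "$output"
rm -f "$demo"
```

Here "rest of script" is printed twice: once by the child, and once more by the original process after the child returns.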
I've noticed that when you push back against AI it normally caves immediately and does things the way you suggest. Not this time. The AI helpfully explained that the command it had used to rerun the script as user id 1000 doesn't run as a child that the script waits on, but instead replaces the current shell process with the new command running as the expected user. When the script runs again, the check for root fails, so it skips the file ownership code that needs root and only runs the remainder of the file as the user I want. Once that finishes the process exits, and we're done. Since the original shell was replaced, there is no root shell to return to and continue execution. Awesome!
So, what's the command that does this? The script looks something like the following.
#!/bin/sh
set -e

# check for root
if [ "$(id -u)" = "0" ]; then
    # do rooty things
    chown -R user:user /path/to/files
    # This is the key line!!
    exec su -p user "$0" "$@"
fi

# Do the rest of this as user (not root)
cd /path/to/files
npm install

# in this case we want to kick off the entrypoint file from the base container we built from
exec .entrypoint.sh "$@"
The key line in the script is: exec su -p user "$0" "$@"
Let's break that down.

exec replaces the current shell process with the command that follows it. So, once su is executed the original script's process is gone and the su command takes over. There's no return to the original script after the exec call.

I always thought of su as "super user," because I typically don't have to switch between multiple users on my system (there's just me as a user and me as a user with root privileges), but it actually aligns closer with "substitute user" or "switch user." You can use it as root to run a command with the privileges of a non-root user. Total mind shift for me!!

The -p flag stands for --preserve-environment, which keeps the current environment variables you're using (any DEBUG or NODE_ENV variables, or whatever you've already set). Without that flag, su resets variables such as HOME and SHELL for the user being switched to, so this allows you to just keep on after you've already done your setup.

user can be replaced with the name of your user: dave, dan, alice, bob, eve, whatever.

"$0" expands to the name of the script being executed, essentially rerunning the same script but as the user you defined.

"$@" expands to all the command-line arguments passed to the current script, each kept as its own word, so you're not losing any context when rerunning the script.
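The re-exec pattern can be tried without root or su. This is a minimal sketch, assuming a POSIX shell with mktemp. REEXECED is a made-up guard variable standing in for the root check, and plain sh stands in for su -p user.

```sh
#!/bin/sh
# Hypothetical demo: REEXECED is an invented guard, and `sh` replaces `su -p user`.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
if [ -z "$REEXECED" ]; then
    echo "first pass: pid=$$"
    REEXECED=1
    export REEXECED
    # Replace this process with a fresh run of the same script and arguments
    exec sh "$0" "$@"
    echo "never printed"
fi
echo "second pass: pid=$$ argc=$#"
EOF

# One of the arguments contains a space on purpose.
output=$(sh "$demo" "hello world" two)
echo "$output"
rm -f "$demo"
```

Both passes should report the same pid, because exec reuses the process rather than starting a new one. The "never printed" line does not appear, and argc stays at 2 because "$@" keeps "hello world" as a single argument.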
Very neat trick. Hope this helps!