44.4 Creating a Container Manually with unshare and chroot
Right, so you want to build a container. Not pull one from a registry, not write a Dockerfile and let a daemon do the heavy lifting. You want to get your hands dirty and build one from scratch. Excellent. This is where you stop waving at the ship and start learning how the engine room works. It’s messy, it’s manual, and it’s the single best way to understand what the hell is actually happening when you run docker run.
We’re going to use two of Linux’s most powerful, and frankly, slightly cantankerous, tools: unshare and chroot. This is the “roll your own” method, and it’s gloriously illuminating.
The Core Idea: Unsharing and Pivoting
Think of your current Linux system as a big party. Every process is mingling, sharing the same drinks (filesystems), talking in the same room (network), and can see everyone else (process tree). A container is just a way to cordon off a little group of processes into their own private party room.
unshare is the bouncer that creates that new, private room. It tells the kernel, “For this process and all its future children, I want a new set of namespaces. Get them their own guest list (PID), their own room (mount), their own chat circle (network), etc.”
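chroot is the guy who then lies to them about the layout of the building. It changes the apparent root directory (/) for a process. So while the real root of the system might have /home, /usr, /etc, your process now thinks its root is /home/you/my_container. This is its whole world. Ordinary path lookups can’t reach anything outside of that directory. It’s a filesystem jail, though not an escape-proof one: a process with root privileges can break out of a plain chroot, which is why real container runtimes use pivot_root inside a mount namespace instead.
Combined, these two tools give you the core of what we think of as a container. Production runtimes add more on top (cgroups for resource limits, for one), but the isolation trick is the same.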
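Before building anything, it can help to see that namespaces are ordinary kernel objects your shell already belongs to. The sketch below is a minimal illustration, assuming a Linux host with /proc mounted; the function name show_namespaces is made up for this example.

```shell
# show_namespaces: print the namespace IDs of the current process.
# Each entry under /proc/self/ns is a symlink such as "pid:[4026531836]";
# two processes share a namespace exactly when these IDs match.
show_namespaces() {
  for ns in pid mnt uts net user; do
    printf '%s -> %s\n' "$ns" "$(readlink "/proc/self/ns/$ns")"
  done
}

show_namespaces
```

Run it once on the host and once inside the container you build below: every namespace you unshared should show a different ID.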
Step 1: Building a Prison Cell (The Filesystem)
You can’t chroot into an empty directory. The process needs something to run! It needs /bin, /lib, /lib64, basic utilities. If you try to chroot into a bare directory and run bash, chroot will fail with “No such file or directory” because there is no /bin/bash in there, and even once the binary exists it will die a sad, lonely death until its libraries are in place too.
So, let’s build a minimal one. The easiest way is to just debootstrap a basic Debian or Ubuntu root, or unpack Alpine’s mini root filesystem tarball. But since this is a manual guide, let’s do it the painfully educational way. We’ll copy the absolute essentials for a single binary.
First, create your container root and copy a shell into it. We’ll use bash because it’s common, but honestly, it’s a bit of a bloated tenant for this minimalist cell.
# Create the container root directory
mkdir -p ~/my_container/{bin,lib,lib64}
# Copy the bash binary
cp /bin/bash ~/my_container/bin/
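Now, find out what libraries bash depends on and copy those too. This is the part everyone forgets and then gets utterly confused by the “No such file or directory” error on a binary that is plainly sitting right there. That message is the kernel’s unhelpful way of saying “I can’t find the dynamic loader this binary asks for.” Once the loader is in place, any library that is still missing produces a clearer “error while loading shared libraries” message instead.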
# Use ldd to find bash's dependencies
ldd /bin/bash
You’ll see output like:
linux-vdso.so.1 (0x00007ffc45bf0000)
libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f5a0a104000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f5a0a0fe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5a09f3c000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5a0a14c000)
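Now, copy each of these libraries to the corresponding location in your container. Skip linux-vdso.so.1: the kernel injects it into every process and there is no file to copy. Notice the paths; you need to preserve them! Your ldd output may differ from the listing above, so copy what your system reports.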
# Create the library directory first; cp will not create it for you
mkdir -p ~/my_container/lib/x86_64-linux-gnu
# Copy the libraries, preserving their paths
cp /lib/x86_64-linux-gnu/libtinfo.so.6 ~/my_container/lib/x86_64-linux-gnu/
cp /lib/x86_64-linux-gnu/libdl.so.2 ~/my_container/lib/x86_64-linux-gnu/
cp /lib/x86_64-linux-gnu/libc.so.6 ~/my_container/lib/x86_64-linux-gnu/
cp /lib64/ld-linux-x86-64.so.2 ~/my_container/lib64/
Yes, this is tedious. This is why package managers and Dockerfiles exist. Appreciate them more now.
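If the copy-by-hand routine gets old, the same steps can be scripted. The helper below is a rough sketch rather than a polished tool: copy_with_libs is a name invented here, it expects an absolute path to the binary, and it simply trusts whatever ldd prints.

```shell
# copy_with_libs ROOT BINARY
# Copy BINARY (an absolute path) into ROOT at the same path, then copy every
# shared library that ldd reports, preserving each library's directory.
# The body runs in a subshell, so its variables do not leak into the caller.
copy_with_libs() (
  root=$1
  bin=$2
  mkdir -p "$root$(dirname "$bin")"
  cp "$bin" "$root$bin"
  # Keep only the absolute paths from ldd's output; linux-vdso has no path
  # on disk and is skipped automatically.
  ldd "$bin" 2>/dev/null | grep -o '/[^ ]*' | while read -r lib; do
    mkdir -p "$root$(dirname "$lib")"
    cp -L "$lib" "$root$lib"   # -L copies the real file behind any symlink
  done
)

# Example: populate a throwaway root with a shell and list what landed in it.
demo_root=$(mktemp -d)
copy_with_libs "$demo_root" /bin/sh
find "$demo_root" -type f
```

For the container in this section you would call copy_with_libs ~/my_container /bin/bash, and then again for every other tool you want available inside.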
Step 2: The Grand Unsharing
Now for the main event. We’ll use unshare to create a new set of namespaces. The flags are key:
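--pid: New PID namespace. This is crucial; it makes the first process in our container the “init” process (PID 1) inside its little world.
--mount: New mount namespace. This lets us mount things like /proc without affecting the host.
--uts: New UTS namespace. This allows us to set a custom hostname for the container.
--fork: Fork before running the command. Required for the PID namespace to work correctly, because only children of the unshare process land in the new namespace.
--user: New user namespace. This is the real magic, as it can let you be root inside the container while being an unprivileged user on the host. A stunningly good security feature.
--map-root-user: Map the current user to root inside the new user namespace. Without it you show up as nobody and lose your capabilities the moment chroot is executed, so the chroot fails with “Operation not permitted”.
# This is the one-liner that does it all
sudo unshare --pid --mount --uts --user --map-root-user --fork chroot ~/my_container /bin/bash
Wait, why sudo? Because many distributions restrict unprivileged user namespaces, and the kernel gets fussy. Under sudo, --map-root-user simply maps host root to container root. If your host allows unprivileged user namespaces, drop the sudo and the same command maps your regular user to root instead.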
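You should now be in a chroot environment, running as root. One catch: so far the container only contains bash, so hostname, ps, and mount below will fail with “command not found” unless you copy them in (along with their ldd dependencies) the same way you copied bash. With those in place, let’s see what our world looks like.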
# Set a hostname to prove our UTS namespace is isolated
hostname my-awesome-container
hostname # Output: my-awesome-container
# Check the process list. It's... not great.
ps aux
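It won’t work. ps reads everything from /proc, and inside the chroot there is no /proc at all, so it will complain that /proc isn’t mounted. The tempting shortcut is to bind-mount the host’s /proc into the container, but then ps lists every process on the host, new PID namespace or not, because that /proc still describes the host’s PID namespace. Our container would be lying to us. This is a classic pitfall.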
Step 3: Mounting /proc Correctly
We need a fresh, clean /proc that only knows about the processes in our new PID namespace. This is a two-step dance.
First, make sure you have a directory to mount it on inside the container root. We should have done this earlier.
# From outside the container, create the directory
mkdir ~/my_container/proc
Now, inside the container, mount a new proc filesystem:
# Mount a new proc filesystem that respects our PID namespace
mount -t proc proc /proc
Now run ps aux again. Beautiful, right? You should see maybe two processes: bash (as PID 1) and ps. You have successfully fooled the ps command. This is the heart of the illusion. You are now in a very basic, but very real, container. Type exit to leave and return to the host; the mount namespace disappears with its last process, and the /proc mount goes with it.
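If you want to confirm the PID 1 behaviour without the hand-built root, util-linux’s unshare can also mount a fresh /proc by itself via --mount-proc. This is a hedged sketch: it assumes unprivileged user namespaces are allowed on your host and prints a fallback message otherwise, and pidns_demo is a name invented for this example.

```shell
# pidns_demo: start a shell in new user, PID and mount namespaces with a
# fresh /proc, and report the PID that shell sees for itself.
pidns_demo() {
  if pid=$(unshare --user --map-root-user --pid --fork --mount-proc \
             sh -c 'echo $$' 2>/dev/null); then
    echo "shell PID inside the new namespace: $pid"
  else
    echo "could not create the namespaces here (unshare missing or not permitted)"
  fi
}

pidns_demo
```

When the namespaces can be created, the forked shell is the first process in its PID namespace, which is exactly the role bash played in the manual walkthrough.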
Why This is Both Brilliant and Terrible
This manual process is brilliant because it exposes the raw mechanics. There’s no mystery. You see exactly how namespaces and chroot combine to create isolation.
It’s terrible as a production strategy because it’s incredibly fragile. You manually copied binaries and libraries. What about ls? cat? A text editor? You’d have to copy each one and all their dependencies, by hand. This is the problem that package managers and Docker images solve. They are essentially automated systems for building a complete and consistent chroot root filesystem for you.
This is also why the user namespace (--user) is so critical. In our example, we used sudo, but the goal is to map your regular user ID to root inside the container. That way, a process that gains “root” inside the container and then escapes it is still just your unprivileged user on the host. It’s a fantastic mitigation layer, and the subordinate ID ranges configured in /etc/subuid and /etc/subgid are what rootless Docker and Podman use to run containers without root privileges on the host. It’s a feature worth understanding deeply, because it turns a major design weakness (containers running as root) into a manageable security control.
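To see that ID mapping for yourself, the sketch below enters a new user namespace as whoever you currently are and prints the UID seen inside, followed by the mapping the kernel recorded. It assumes unprivileged user namespaces are enabled and falls back to a message when they are not; userns_demo is a hypothetical helper name.

```shell
# userns_demo: run id and show /proc/self/uid_map inside a new user namespace.
# Each uid_map line reads "<first uid inside> <first uid outside> <range length>".
userns_demo() {
  if out=$(unshare --user --map-root-user \
             sh -c 'echo "uid inside: $(id -u)"; cat /proc/self/uid_map' 2>/dev/null); then
    echo "$out"
  else
    echo "unprivileged user namespaces are not available here"
  fi
}

userns_demo
```

With --map-root-user only a single UID is mapped. Mapping a whole range of IDs is the job of the newuidmap and newgidmap helpers, which consult the /etc/subuid and /etc/subgid entries mentioned above.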