Transparent CPU Emulation

WALT offers powerful tools to customize an OS image, as demonstrated in the previous blog post. Upon reflection, the curious user might wonder how the WALT server, which has a classical x86 CPU, can customize OS images that often target a different CPU architecture (e.g., an ARM CPU architecture for a Raspberry Pi image). How can a command such as walt image shell handle this scenario?

Introduction

Let us consider the command walt image shell specifically.¹

A very common task is to modify an OS image to install one more package. Assuming the OS image is based on Debian, the package manager is apt:

$ walt image shell <existing-image>
image-shell:~$ apt update  # download fresh Debian repo metadata
[...]
image-shell:~$ apt install <new-package>
[...]
image-shell:~$ exit
Save as: <new-image-name>
Done.
$

The walt command line tool delegates its task to the WALT server, so it is the WALT server that is responsible for managing the virtual environment called image-shell above. But if <existing-image> is a Raspberry Pi OS image, then this means apt is a binary built for an ARM CPU. That’s also the case for all other processes started in the shell session. For instance, the shell process itself relies on <image-root>:/bin/bash, an ARM binary too, and the same applies to all sub-processes started by apt.

Actually, this works because the WALT server is configured with a layer of transparent CPU emulation, made of the components described next.

For the sake of the demonstration, we will first need an ARM binary. For this purpose, I have chosen busybox. It is an interesting piece of software that provides many Unix utilities in a single executable file. For instance, busybox ps is a trimmed-down version of the classical ps command. It also provides busybox top, busybox ls, etc. (By the way, busybox has an important role in WALT.²)

Qemu user mode

Let us download an ARM binary of busybox, and try to run it on my laptop:

etienne@formose:~$ curl -s -o busybox-arm https://busybox.net/downloads/binaries/1.21.1/busybox-armv7l
etienne@formose:~$ chmod +x ./busybox-arm
etienne@formose:~$ ./busybox-arm ls
bash: ./busybox-arm: cannot execute binary file: Exec format error
etienne@formose:~$

Unsurprisingly, it fails: the kernel does not know how to run a binary built for another CPU architecture. Let us try the same on a WALT server.

etienne@walt-server:~$ curl -s -o busybox-arm https://busybox.net/downloads/binaries/1.21.1/busybox-armv7l
etienne@walt-server:~$ chmod +x ./busybox-arm
etienne@walt-server:~$ ./busybox-arm ls
busybox-arm   notes.txt
etienne@walt-server:~$

It works there!
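
Before looking at why, one can confirm that the binary really targets ARM by reading its ELF header. The following is a minimal sketch, not something WALT ships: the helper name elf_machine is hypothetical, and it assumes a little-endian ELF file, where the target architecture is stored in the two-byte e_machine field at offset 18.

```shell
# Print the e_machine field of an ELF file as a decimal number.
# Hypothetical helper; assumes a little-endian ELF file.
elf_machine() {
    local file=$1 magic bytes
    # Bytes 0-3 of any ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || { echo "not an ELF file" >&2; return 1; }
    # e_machine is stored in bytes 18-19; read them as unsigned decimals.
    bytes=$(od -An -tu1 -j18 -N2 "$file")
    set -- $bytes
    [ $# -eq 2 ] || { echo "truncated ELF header" >&2; return 1; }
    # Little-endian: the low byte comes first.
    echo $(( $1 + $2 * 256 ))
}
```

In the ELF specification, 40 stands for ARM (EM_ARM) and 62 for x86-64 (EM_X86_64), so elf_machine ./busybox-arm should print 40, while a native binary on the server should give 62.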

Clearly, a component installed on WALT servers provides binary translation. It is called qemu. Here is a list of the packages related to qemu on this machine:

etienne@walt-server:~$ dpkg-query -W | grep qemu
qemu-block-extra     1:7.2+dfsg-7+deb12u12
qemu-system-common   1:7.2+dfsg-7+deb12u12
qemu-system-data     1:7.2+dfsg-7+deb12u12
qemu-system-x86      1:7.2+dfsg-7+deb12u12
qemu-user-static     1:7.2+dfsg-7+deb12u12
qemu-utils           1:7.2+dfsg-7+deb12u12
etienne@walt-server:~$

The two main packages here are qemu-system-x86 and qemu-user-static (others are dependencies of those two):

qemu-system-x86 is useful for starting virtual machines with an x86 CPU. This technique is called “qemu system emulation”. Behind the scenes, WALT virtual nodes are created this way.
qemu-user-static is what we are looking for. Instead of running a full virtual machine, it runs just a single process from a binary built for a different CPU architecture. This technique is called “qemu user mode”.

Let us check which files this package provides:

etienne@walt-server:~$ dpkg -L qemu-user-static
/.
/usr
/usr/bin
/usr/bin/qemu-aarch64-static
/usr/bin/qemu-aarch64_be-static
/usr/bin/qemu-alpha-static
/usr/bin/qemu-arm-static
/usr/bin/qemu-armeb-static
[...]
etienne@walt-server:~$

We have a list of emulators for various architectures. Let us try to use the one for ARM for running the busybox-arm binary:

etienne@walt-server:~$ qemu-arm-static ./busybox-arm ls
busybox-arm   notes.txt
etienne@walt-server:~$

Yes, this works!

But one mystery remains: on this WALT server, there is no need to prefix the command with the emulator; the kernel seems to figure it out by itself! As a reminder, here is the test we ran earlier:

etienne@walt-server:~$ ./busybox-arm ls
busybox-arm   notes.txt
etienne@walt-server:~$

The next section will shine a light on this.

The Binfmt-misc subsystem

If you are familiar with Linux-based systems, you probably know that /proc is a “virtual file system”. The files and sub-directories we find there are all virtual. For instance, we can find a sub-directory /proc/<pid> for each and every process currently running on the system.

The directory at /proc/sys/fs/binfmt_misc is interesting for us:

etienne@walt-server:~$ ls /proc/sys/fs/binfmt_misc/
llvm-14-runtime.binfmt  qemu-loongarch64  qemu-ppc      qemu-sparc32plus
python3.11              qemu-m68k         qemu-ppc64    qemu-sparc64
qemu-aarch64            qemu-microblaze   qemu-ppc64le  qemu-xtensa
qemu-alpha              qemu-mips         qemu-riscv32  qemu-xtensaeb
qemu-arm                qemu-mips64       qemu-riscv64  register
qemu-armeb              qemu-mips64el     qemu-s390x    status
qemu-cris               qemu-mipsel       qemu-sh4
qemu-hexagon            qemu-mipsn32      qemu-sh4eb
qemu-hppa               qemu-mipsn32el    qemu-sparc
etienne@walt-server:~$

As you might guess, each file found there (except register and status) associates a specific file format with the interpreter needed to run it.

Let us look at /proc/sys/fs/binfmt_misc/qemu-arm:

etienne@walt-server:~$ cat /proc/sys/fs/binfmt_misc/qemu-arm
enabled
interpreter /usr/libexec/qemu-binfmt/arm-binfmt-P
flags: POCF
offset 0
magic 7f454c4601010100000000000000000002002800
mask ffffffffffffff00fffffffffffffffffeffffff
etienne@walt-server:~$
etienne@walt-server:~$ readlink /usr/libexec/qemu-binfmt/arm-binfmt-P
../../bin/qemu-arm-static
etienne@walt-server:~$

The virtual file qemu-arm indicates that if a file starts with the specified magic number (considering only the bits of the mask value), then the Linux kernel must automatically use the specified interpreter to run it. And obviously, the specified interpreter links to qemu-arm-static.
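
To make the magic and mask more concrete, here is a rough sketch of the comparison in shell. It is only an illustration of the rule described above, not the kernel's code, and the helper name matches_binfmt is hypothetical. Read as an ELF header, the magic says: ELF signature (bytes 0-3), 32-bit (byte 4), little-endian (byte 5), and machine type 0x28, i.e. ARM (bytes 18-19). The mask ignores byte 7 (the OS ABI) and the lowest bit of byte 16, so that file types 2 (executable) and 3 (shared object) both match.

```shell
# Succeed (exit status 0) if the first bytes of FILE match MAGIC under MASK.
# MAGIC and MASK are hex strings, as printed in the binfmt_misc entry.
matches_binfmt() {
    local file=$1 magic=$2 mask=$3 header h m k
    # Read as many bytes as the magic is long, as one hex string.
    header=$(od -An -tx1 -N"$(( ${#magic} / 2 ))" "$file" | tr -d ' \n')
    # A file shorter than the magic cannot match.
    [ "${#header}" -eq "${#magic}" ] || return 1
    while [ -n "$magic" ]; do
        # Take the first two hex digits (one byte) of each string.
        h=${header%"${header#??}"}
        m=${magic%"${magic#??}"}
        k=${mask%"${mask#??}"}
        # Only the bits set in the mask take part in the comparison.
        [ $(( 0x$h & 0x$k )) -eq $(( 0x$m & 0x$k )) ] || return 1
        # Drop the byte just compared from each string.
        header=${header#??}; magic=${magic#??}; mask=${mask#??}
    done
    return 0
}

# Values copied from the qemu-arm entry shown above.
ARM_MAGIC=7f454c4601010100000000000000000002002800
ARM_MASK=ffffffffffffff00fffffffffffffffffeffffff
```

For example, matches_binfmt ./busybox-arm "$ARM_MAGIC" "$ARM_MASK" && echo "handled by qemu-arm" should print the message for the busybox binary downloaded earlier, and nothing for a native x86 binary.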

Among the four flags activated, flag F is the most interesting in our context. According to Wikipedia, flag F makes the kernel open the interpreter binary at configuration time instead of lazily at startup time, so that it is available inside other mount namespaces and chroots as well. So the fact that walt image shell runs ARM binaries in a virtual environment (actually, a Linux container) is not a problem. When we started to design WALT, that flag did not exist, which meant we had to copy qemu-arm-static inside all our WALT images!
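
A quick way to see which entries carry this flag is to scan the "flags:" line of each one. The sketch below is only illustrative: the function name entries_with_fix_binary is hypothetical, and it relies on the entry layout shown in the cat output above (first line "enabled" or "disabled", then a "flags:" line).

```shell
# List the binfmt_misc entries that are enabled and carry the F flag.
# The directory can be overridden, which is handy for trying the function
# on a copy of the entries instead of the live /proc files.
entries_with_fix_binary() {
    local dir=${1:-/proc/sys/fs/binfmt_misc} f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # register and status are control files, not entries.
        case ${f##*/} in register|status) continue ;; esac
        if [ "$(head -n 1 "$f")" = "enabled" ] && grep -q '^flags:.*F' "$f"; then
            echo "${f##*/}"
        fi
    done
}
```

On the WALT server used in this post, qemu-arm should be part of the output, given the "flags: POCF" line displayed earlier.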

Enabling transparent CPU emulation

What makes it even more “magical” is that installing the qemu-user-static package on a Debian system automatically registers the /proc/sys/fs/binfmt_misc/qemu-* entries.

So installing qemu-user-static is enough to enable transparent CPU emulation. When installing a WALT server, walt-server-setup installs this package, among others. I wish all WALT features were this easy to implement!
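
For the curious, whatever tooling the package relies on, an entry is ultimately created by writing a single line of the form :name:type:offset:magic:mask:interpreter:flags to the register file. The sketch below merely builds such a line from the values displayed by an existing entry; the function name binfmt_register_line is hypothetical, and this is not a description of what the Debian package scripts actually do.

```shell
# Build the line that would register a magic-based binfmt_misc entry.
# Arguments: name, magic (hex), mask (hex), interpreter path, flags.
binfmt_register_line() {
    local name=$1 magic mask interpreter=$4 flags=$5
    # The kernel expects binary values written as \xNN escapes.
    magic=$(printf '%s' "$2" | sed 's/../\\x&/g')
    mask=$(printf '%s' "$3" | sed 's/../\\x&/g')
    # Type M means "match on magic"; an empty offset defaults to 0.
    printf ':%s:M::%s:%s:%s:%s\n' "$name" "$magic" "$mask" "$interpreter" "$flags"
}
```

Calling it with the name, magic, mask, interpreter and flags of the qemu-arm entry reproduces the kind of line that ends up in /proc/sys/fs/binfmt_misc/register. Writing to that file requires root privileges, and there is no need to do so on a WALT server since the entries are already registered.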

Last words

I hope you found this blog post about WALT internals interesting.

Thanks to Jérémie Finiel for proofreading. If you have questions, we will be happy to answer them on the mailing list.

¹ Note that what we describe here also applies to walt image build, more precisely to the RUN lines of the Dockerfile.
² busybox is very useful in WALT for offering a common set of features to all nodes (e.g., handling reboot requests from the server), regardless of the OS image they booted. For this reason, busybox is one of the requirements for building a WALT image from scratch.
