Better CoreOS in a VM experience
(This was all tested with CoreOS beta
991.2.0, the stable
fails to mount
#!/bin/sh
exec ./coreos_production_qemu.sh \
    -user-data cloud-config.yaml \
    -nographic \
    "$@"
The cloud-config.yaml looks like this:
#cloud-config
hostname: mytest
users:
  - name: jdoe
    groups:
      - sudo
      - rkt
    ssh_authorized_keys:
      - "ssh-ed25519 blahblah firstname.lastname@example.org"
I used that for many experiments, but felt it was less than ideal.
I wanted to move beyond -net user. That just means calling QEMU directly, instead of using coreos_production_qemu.sh, no big deal.
But it meant I would be writing my own QEMU runner, no matter what.
I also wanted more efficient usage of my resources -- after all, the whole point of me running several virtual machines on one physical machine is to make these test setups more economical.
The provided QEMU images waste disk and bandwidth. Every VM stores two copies of the CoreOS /usr image, just like a physical machine would. Copy-on-write trickery on the initial image will not help beyond the first auto-update, as each VM independently downloads and applies the updates. This means if you run a small test cluster of say 5 VMs, you'll end up with 10 copies of CoreOS, and 5x the bandwidth.
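The arithmetic behind that estimate can be sketched in a few lines of shell. The function name cluster_cost is made up for illustration, and the 180 MB default is only a rough stand-in for the size of one /usr image copy (it is the in-RAM figure quoted for the PXE image elsewhere in this post), so treat the megabyte numbers as ballpark values.

```sh
#!/bin/sh
# Rough cost of N VMs that each keep and update their own /usr images.
# Assumption: one /usr image copy is about 180 MB (illustrative figure only).
cluster_cost() {
    vms=$1
    image_mb=${2:-180}
    copies=$((vms * 2))                # two /usr copies per VM
    disk_mb=$((copies * image_mb))     # disk held by all copies together
    download_mb=$((vms * image_mb))    # every VM fetches each update itself
    echo "copies=$copies disk_mb=$disk_mb download_mb_per_update=$download_mb"
}

cluster_cost 5
```

With a single shared read-only image, both the copy count and the per-update download would stay constant no matter how many VMs are running.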
Imitating physical computers with virtual machines is great if you're trying to learn how the CoreOS update mechanism works, but once you're to the point of wanting to just run services, it's simply not needed.
CoreOS does have a supported mode where it does not use the
starting a computer by requesting the software over the network. I could even skip the virtual networking and use this with QEMU by launching the kernel and initrd directly, no need for PXE itself. However, this is wasteful in another way: it holds the complete /usr partition contents in RAM, using about 180MB, once for each VM. There is also an annoying delay of 15+ seconds in VM startup, presumably related to the large initrd image, and later the kernel spends 1.2 seconds uncompressing it into a tmpfs (measured on an i5-5300U laptop).
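A minimal sketch of such a direct boot, assuming the PXE kernel and initrd have already been downloaded. The boot_pxe_direct name and the QEMU override variable are illustrative, not part of any CoreOS tooling, and the memory size and kernel arguments are assumptions borrowed from the other commands in this post.

```sh
#!/bin/sh
# Sketch: boot the PXE kernel and initrd directly with QEMU, no PXE server.
# QEMU can be pointed at a stand-in command to preview the arguments.
QEMU=${QEMU:-qemu-system-x86_64}

boot_pxe_direct() {
    kernel=$1
    initrd=$2
    "$QEMU" \
        -nographic \
        -machine accel=kvm -cpu host \
        -m 1024 \
        -kernel "$kernel" \
        -initrd "$initrd" \
        -append 'console=ttyS0 coreos.autologin'
}

# Usage, once both files are present:
#   boot_pxe_direct coreos_production_pxe.vmlinuz coreos_production_pxe_image.cpio.gz
```

Everything in the initrd, including the /usr contents, ends up in guest RAM with this approach, which is the waste described above.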
Digging into the PXE image, I find that it actually stores the contents as a squashfs -- which is a real filesystem that can be stored on block devices, as opposed to just unpacking a cpio to a tmpfs. The PXE image does what's called a "loopback mount", where a file is treated like a block device. In the PXE scenario, the file is held in RAM in a tmpfs; I can just put those bytes on a block device, and boot that!
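To see what a loopback mount looks like in practice, here is a small sketch that mounts a squashfs file read-only on a Linux host. It needs root, and the loop_mount_squashfs name, the MOUNT override variable and the /mnt/usr path are hypothetical choices made for this example.

```sh
#!/bin/sh
# Sketch: loop-mount a squashfs image read-only to browse it (needs root).
# MOUNT can be pointed at a stand-in command to preview the call.
MOUNT=${MOUNT:-mount}

loop_mount_squashfs() {
    image=$1
    mountpoint=$2
    mkdir -p "$mountpoint" &&
        "$MOUNT" -t squashfs -o loop,ro "$image" "$mountpoint"
}

# Usage (as root), with an extracted image at hand:
#   loop_mount_squashfs usr.squashfs /mnt/usr && ls /mnt/usr
```

The kernel reads blocks from the backing file only when they are accessed, which is the property that makes serving the same bytes from a real block device attractive.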
(The live CD seems to also hold /usr contents in tmpfs just like the PXE variant, even though it could fetch them on demand from the ISO. The squashfs image is random-access, unlike the usual initramfs contents. In later versions, CoreOS could switch their ISO images to use the trick I'll explain below -- at the cost of physical machines needing to spin up a CD more often than once per boot. The live CD has another downside that made me avoid it: to pass kernel parameters, I'd have to resort to kludges like creating a boot floppy image with syslinux and the right parameters on it.)
So, I set about fixing the wasted disk and bandwidth problem. Here's a story of an afternoon project.
/usr image directly
Instead of holding an extra copy of the /usr image data in RAM, we can make it available as a block device, and load blocks on demand. For that, we need the squashfs image as a standalone file, not inside the cpio. It's not available as a separate download, but we can extract it from the PXE image:
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe.vmlinuz.sig
gpg --verify coreos_production_pxe.vmlinuz.sig
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz
wget http://beta.release.core-os.net/amd64-usr/current/coreos_production_pxe_image.cpio.gz.sig
gpg --verify coreos_production_pxe_image.cpio.gz.sig
zcat coreos_production_pxe_image.cpio.gz \
    | cpio -i --quiet --sparse --to-stdout usr.squashfs \
    >usr.squashfs
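A quick sanity check on the extracted file can save a confusing boot failure later. Squashfs images begin with the four magic bytes "hsqs", so a sketch like the following (the is_squashfs helper is a name invented here) can confirm that the cpio extraction produced a filesystem image rather than an empty or truncated file.

```sh
#!/bin/sh
# Sketch: check that a file starts with the squashfs magic bytes "hsqs".
is_squashfs() {
    # Read only the first four bytes; a missing file yields an empty string.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null)
    [ "$magic" = "hsqs" ]
}

if [ -f usr.squashfs ]; then
    if is_squashfs usr.squashfs; then
        echo "usr.squashfs looks like a squashfs image"
    else
        echo "usr.squashfs does not start with the squashfs magic" >&2
    fi
fi
```

This only inspects the header; it says nothing about whether the rest of the image is intact, which is what the gpg verification of the enclosing cpio archive is for.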
Prepare a root filesystem
We also need to prepare a disk image that will be used for storing the root filesystem. CoreOS won't boot right with a fully blank disk. If it did, I would have used qcow2 as the format, but now I need to provide some sort of structure for the root filesystem, so let's go with a raw disk image.
I might have been able to set up the right GPT partition UUIDs for the
mkfs things for me, but that seemed too complicated, and I
doubted it'd support my "just the root" scenario as well as their
To keep it simple, we won't bother to use partitions; the whole block device is just one filesystem.
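# create an empty file
>rootfs.img
# disable copy-on-write (matters on btrfs; must be set while the file is empty)
chattr +C rootfs.img
# grow it to 4G as a sparse file
truncate -s 4G rootfs.img
mkfs.ext4 rootfs.img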
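Because truncate creates a sparse file, the image should occupy far less disk than its nominal 4G until the VM actually writes data. A sketch for checking this, using a helper named image_sizes that is invented for this example:

```sh
#!/bin/sh
# Sketch: compare a disk image's apparent size with the space it occupies.
image_sizes() {
    # Apparent size in bytes (strip padding that some wc versions add).
    apparent_bytes=$(wc -c <"$1" | tr -d ' ')
    # Space actually allocated on disk, in KiB.
    allocated_kib=$(du -k "$1" | cut -f1)
    echo "apparent_bytes=$apparent_bytes allocated_kib=$allocated_kib"
}

if [ -f rootfs.img ]; then
    image_sizes rootfs.img
fi
```

On a filesystem with sparse file support, the allocated figure right after mkfs.ext4 should cover little more than the filesystem metadata.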
This was previously done inside coreos_production_qemu.sh with a temp dir, but we'll just pass a directory as virtfs following the "config drive" convention. Let's move our previous file into the right place:
mkdir -p config/openstack/latest
mv cloud-config.yaml config/openstack/latest/user_data
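Since a misplaced or malformed user_data file tends to fail silently at boot, it can be worth checking the directory first. The sketch below (check_config_drive is a hypothetical helper) verifies that the file sits at the config drive path used above and starts with the #cloud-config header line that cloud-config files require.

```sh
#!/bin/sh
# Sketch: sanity-check a config drive directory before booting.
check_config_drive() {
    user_data="$1/openstack/latest/user_data"
    if [ ! -f "$user_data" ]; then
        echo "missing $user_data" >&2
        return 1
    fi
    # Read the first line; read may report EOF yet still set the variable.
    IFS= read -r first_line <"$user_data" || true
    if [ "$first_line" != "#cloud-config" ]; then
        echo "$user_data does not start with #cloud-config" >&2
        return 1
    fi
}

if [ -d config ]; then
    check_config_drive config && echo "config drive looks OK"
fi
```

It does not validate the YAML itself, only the location and the header.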
Finally, run QEMU
qemu-system-x86_64 \
    -name mycoreosvm \
    -nographic \
    -machine accel=kvm -cpu host -smp 4 \
    -m 1024 \
    \
    -net nic,vlan=0,model=virtio \
    -net user,vlan=0,hostfwd=tcp::2222-:22,hostname=mycoreosvm \
    \
    -fsdev local,id=config,security_model=none,readonly,path=config \
    -device virtio-9p-pci,fsdev=config,mount_tag=config-2 \
    \
    -drive if=virtio,file=usr.squashfs,format=raw,serial=usr.readonly \
    -drive if=virtio,file=rootfs.img,format=raw,discard=on,serial=rootfs \
    \
    -kernel coreos_production_pxe.vmlinuz \
    -append 'mount.usr=/dev/disk/by-id/virtio-usr.readonly mount.usrflags=ro root=/dev/disk/by-id/virtio-rootfs rootflags=rw console=tty0 console=ttyS0 coreos.autologin'
You'll be greeted with the Linux bootstrap messages and finally:
This is mycoreosvm (Linux x86_64 4.4.6-coreos) 06:14:10
SSH host key: SHA256:t+WkofIWxkARu1hezwPnS/vgTJXUcPidA3UxKr+1uGA (DSA)
SSH host key: SHA256:cT32H33EVCHSnrCRsB+I9GG7AgXQWfyjk7JFuEzAqFU (ECDSA)
SSH host key: SHA256:NFgc7BLbeyS3SslpscSSNHNzc7lXzx6vKqBmUp+5T7Q (ED25519)
SSH host key: SHA256:pK8Dknoib61FnIwMQ6u4F4FxeSMIRq9zYsrJd0N3MPY (RSA)
eth0: 10.0.2.15 fe80::5054:ff:fe12:3456

mycoreosvm login: core (automatic login)

CoreOS stable (991.2.0)
Last login: Fri Apr 1 06:02:25 +0000 2016 on /dev/tty1.
Update Strategy: No Reboots
core@mycoreosvm ~ $
As usual with QEMU, press C-a x to exit.
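The hostfwd=tcp::2222-:22 option in the QEMU command forwards port 2222 on the host to the guest's SSH port, so the VM is also reachable without the serial console. A minimal sketch, assuming the jdoe user from the cloud-config was created and its SSH key is available locally; the vm_ssh name and the SSH override variable are made up for this example.

```sh
#!/bin/sh
# Sketch: reach the VM's sshd through the hostfwd rule (host port 2222).
# SSH can be pointed at a stand-in command to preview the call.
SSH=${SSH:-ssh}

vm_ssh() {
    user=$1
    shift
    "$SSH" -p 2222 "$user@localhost" "$@"
}

# Usage, while the VM is running:
#   vm_ssh jdoe uname -r
```

Expect host key warnings if the root filesystem image is recreated, since each fresh VM generates new SSH host keys.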
Stay tuned for part 2, where we will make the VM even leaner.