Automation ain't easy; but it pays back¹ - Pitfalls in automating the build of images

push the
button{width=”640” height=”443”}

More than two months ago I wrote about the oVirt Engine Virtual Appliance which is now beeing build for oVirt 3.5.

Back then the first initial builds happened and I thought I was done - as in: Fine; the images are now build, nothing can happen.

Unstable

Well, it wasn’t that easy - again … The builds weren’t as stable as expected and I had to adress several issues.

Let’s give some background: We are using Jenkins within oVirt for continous intgeration and other tasks. I’m also building oVirt Node, which is also some kind of image.

Compared to the Node image, the appliance image is different, because it can be build using livemedia-creator. Node in contrast needs livecd-tools.

The difference beteween livecd-creator (lmc) and livecd-tools is a rather big one. livecd-tools is using a chroot to build the image, lmc otoh is using a complete VM (through libvirt) to isolate the image creation from the host operating system. This is good for several reasons:

  • SELinux - If the SELinux policy between host and guest difffer to much, then obscur problems can appear
  • If the build fails, mount points must be cleared
  • If the build fails, loop devides must be cleared (!)

To name just a few.

livecd-tools main drawback is it’s use of loop back devices, which ain’t bad per se, but it turns out that - still - situations can occur where loop devices can not be removed again. I mean, they can not be removed at all, without rebooting the system. On a CI system, this will slowly eat up all the loop devices, until a builder is unable to build an image because of this execssive use of loop devices. That is a big problem when building images using livecd-tools.

Now, to address this I wanted to use lmc. lmc is not using loop devices, but it’s using libvirt. Which is actually quite nice. But it seems that there are also some problems which make the use of libvirt in jenkins environments hard.

Let me say that those kind of problems might be in general not to bad, you set SELinux to permissive, and run lmc as root. But that is the difference between your lcoal environemnt and a shared Jenkins instance: You don’t have the full controll over your environment. You can’t always tell ahead of time what exact

I’m actually a big fan of using existing code. Because often others have already thought about and solved a problem. But in this case I went with using qemu directly. This turned out to be very well suited for Jenkins, because a qemu instance is independent and does need loop mounts (wohoo) to create a disk image.

Qemu was in place, and was booting anaconda nicely. But it was slow and failed.

It was slow, because the builder does not have a great bandwidth into the internet. So to improve To address this, I setup a caching mechanism, which is working fairly well, but could be further improved by mirroring the release trees of the required distribution.

¹: In the best case

::: {#footer} [ September 25th, 2014 9:05am ]{#timestamp} [fedora]{.tag} [ovirt]{.tag} [node]{.tag} [appliance]{.tag} [livemedia-creator]{.tag} [livecd-tools]{.tag} [jenkins]{.tag} [qemu]{.tag} [libvirt]{.tag} :::