Cloud-Init and VMs

The following article discusses cloud-init, the problems it has, and how open source doesn’t necessarily mean freedom. If you just want to use cloud-init to configure cloud images, scroll down to section 3.

1. What Does It Do?

Ever wondered how VPS providers configure your VMs, add your SSH keys, create users and install packages every time you spin up a new VM in the ‘cloud’? Well, the answer for most vendors is cloud-init. Most operating systems and distributions ship virtual disk images with the OS already installed. The installation is very minimal and can serve as a template for the root filesystem of the OS. The maintainers are also kind enough to provide the virtual disk image in various formats, from raw disk images to qcow2 and even vmdk, vdi and vhd.

The image also has one extra package pre-installed, and that is cloud-init. It is the job of cloud-init to initialize the VM (typically within a cloud hosting service like DigitalOcean, AWS or Azure): it talks to the hosting provider’s datasource and gets the configuration information, which it then uses to configure the VM.

The configuration information can include user-data like SSH keys, the hostname of the instance, users and passwords, along with any other arbitrary commands that the user wants to run.

2. The Problem with Cloud-Init

Cloud-init is a great tool if you are a cloud user. If you are spinning up VMs or containers and your cloud provider is kind enough to ask you for a cloud-config, it is great! With a cloud-config file, aka your user-data, you can add users, run arbitrary commands and install packages right as the VM is being created. The process can be repeated over and over without typing the same tedious commands again and again. Soon you have a fleet of VMs, all with identical configuration.
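
As a rough illustration of the kind of cloud-config described above, the sketch below writes a small user-data file from the shell. The user name admin, the package nginx and the command under runcmd are placeholders chosen for this example, not something this article prescribes; users, packages and runcmd are standard cloud-config keys.

```shell
# Work in a scratch directory so nothing important gets overwritten.
workdir=$(mktemp -d)

# Quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > "$workdir/user-data" <<'EOF'
#cloud-config
users:
  - name: admin          # hypothetical user name
packages:
  - nginx                # hypothetical package to install on first boot
runcmd:
  - echo "provisioned by cloud-init" > /etc/motd
EOF

echo "wrote $workdir/user-data"
```

The same file can be handed to any number of VMs, which is where the identical fleet comes from.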

However, if you dig a little deeper and see how the sausage is made, you will start to question some aspects of cloud-init. For example, by default, the datasource is like a REST endpoint, and these endpoints are essentially hardcoded into the cloud-init package itself. Sure, you can set up a datasource all by yourself, but the process is clunky and time-intensive. The documentation for doing this is all but non-existent.
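
To make the "REST endpoint" comparison concrete: on EC2-style clouds the datasource is an HTTP metadata service at the link-local address 169.254.169.254, which serves paths such as latest/meta-data/instance-id and latest/user-data. The helper name metadata_url below is made up for this sketch, and other providers use different addresses and paths.

```shell
# Base URL of the EC2-style metadata service (other clouds differ).
METADATA_BASE="http://169.254.169.254/latest"

# metadata_url is a hypothetical helper: it only builds the URL for a path.
metadata_url() {
    printf '%s/%s\n' "$METADATA_BASE" "$1"
}

metadata_url meta-data/instance-id
metadata_url user-data

# From inside an instance on such a cloud, the data could be fetched with:
#   curl -s "$(metadata_url user-data)"
```

Outside such a cloud nothing answers at that address, which is why a home lab needs a different way of feeding cloud-init its data.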

The official documentation is nothing but a user manual for end users relying on preexisting cloud services. It doesn’t tell you how you can set up your own cloud-init datasource, in case you are an upcoming vendor. Even the end-user documentation is poor, and I would recommend that people use DigitalOcean’s excellent tutorial instead.

To make matters worse, users with home virtualization labs and small VPS startups find it difficult to benefit from those lightweight cloud images. You can’t really start a VM from those templates without a cloud-init datasource or some hackery which is difficult to automate and scale. In other words, you can’t even choose to ignore cloud-init unless you want to craft your own templates.

In classic systemd fashion, it is breaking free from its predefined role and is starting to mess with networking and other parts of the OS, which throws users off. It is bundled with the Ubuntu 18.04 server ISO, which makes absolutely no sense (at least not to me).

3. Workaround For Home Labs

All the ranting aside, I still have to deal with cloud-init in my everyday use. I have a very minimal Debian 9 installation on x86_64 hardware, which I use as a KVM hypervisor. I really wanted to use the qcow2 disk images that are shipped by Ubuntu and CentOS. These disk images have the OS preinstalled in them, and to use them you simply need to:

  1. Copy them as your VM’s virtual hard disk image.
  2. Resize the root filesystem’s virtual size to your desired size (at least 10GB is recommended). This will not increase the physical size of your VM but the disk image can grow over time as the VM adds more data to it.
  3. Configure the VM using cloud-init. The bare minimum requirement is to set the root user’s password or SSH keys, but you can do pretty much everything that cloud-init is capable of.

The steps are as follows:

  1. Download the cloud image of your favourite OS and save it in the /var/lib/libvirt/boot directory:
$ cd /var/lib/libvirt/boot
$ curl -O https://cloud-images.ubuntu.com/xenial/current/xenial-server-cloudimg-amd64-disk1.img
$ cd /var/lib/libvirt/images
  2. Create an empty virtual hard disk of your desired size and expand the downloaded qcow2 image into it. I like to store the VM hard disks in the /var/lib/libvirt/images/ directory, but you can pick a different one. Whatever you choose, run the commands below in that same directory:
$ qemu-img create -f qcow2 myVM.qcow2 8G ## Create a hard disk with a virtual size of 8GB
$ virt-resize --expand /dev/sda1 /var/lib/libvirt/boot/xenial-server-cloudimg-amd64-disk1.img ./myVM.qcow2
  3. Create cloud-init files. These are user-data and meta-data files:
$ vim meta-data
instance-id: myVM
local-hostname: myVM

$ vim user-data
#cloud-config
users:
  - name: root
chpasswd:
  list: |
    root:myPassword
  expire: False

The only user I have here is the root user. If you don’t mention any user, then the default user with the name ubuntu gets created. The default username differs from one OS to another, which is why I recommend specifying a user, even if it is just root. The next part of the user-data file tells cloud-init to set the password for each user listed. Again, I am setting the password only for the root user, and it is myPassword. Make sure that there’s no space between the colon and the password string.
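
If you would rather not edit the files by hand, the same two files can be written from a script. This is only a sketch: the function name write_seed_files and its arguments are invented for this example, and it reuses the keys shown above. Keep in mind that the password ends up in plain text on disk.

```shell
# write_seed_files DIR VM_NAME PASSWORD  (hypothetical helper)
# Writes meta-data and user-data for one VM into DIR.
write_seed_files() {
    dir=$1
    vm=$2
    password=$3
    mkdir -p "$dir" || return 1

    # Unquoted EOF: the shell substitutes $vm and $password below.
    cat > "$dir/meta-data" <<EOF
instance-id: $vm
local-hostname: $vm
EOF

    cat > "$dir/user-data" <<EOF
#cloud-config
users:
  - name: root
chpasswd:
  list: |
    root:$password
  expire: False
EOF
}

seed_dir=$(mktemp -d)
write_seed_files "$seed_dir" myVM myPassword
cat "$seed_dir/meta-data"
```

Calling the function once per VM name gives each machine its own instance-id and hostname while the rest of the configuration stays the same.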

Better yet, you can use SSH keys instead of having hardcoded passwords lying around:

$ vim user-data
#cloud-config
users:
  - name: root
    ssh_authorized_keys:
      - ssh-rsa <Your public ssh keys here>
ssh_pwauth: True
  4. Embed the user-data and meta-data files into an ISO:
$ genisoimage -output cidata-myVM.iso -volid cidata -joliet -rock user-data meta-data

Make sure that the file cidata-myVM.iso is located in /var/lib/libvirt/images/.
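
The ISO step can be wrapped in a small function that refuses to run when one of the two files is missing, which is an easy mistake to make when scripting this. The function name build_seed_iso is hypothetical; the genisoimage options are the ones used above.

```shell
# build_seed_iso DIR OUTPUT  (hypothetical helper)
# Packs DIR/user-data and DIR/meta-data into an ISO with volume id "cidata".
build_seed_iso() {
    dir=$1
    out=$2
    for f in user-data meta-data; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing $dir/$f" >&2
            return 1
        fi
    done
    # Run in a subshell so the caller's working directory is unchanged;
    # the files are passed by bare name, exactly as in the command above.
    ( cd "$dir" && genisoimage -output "$out" -volid cidata -joliet -rock user-data meta-data )
}

# Example call (OUTPUT should be an absolute path, since the command runs inside DIR):
#   build_seed_iso /var/lib/libvirt/images /var/lib/libvirt/images/cidata-myVM.iso
```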

 

  5. Go to the /var/lib/libvirt/images directory and initialize the VM with the virt-install command:
    $ virt-install --import --name myVM --memory 2048 --vcpus 2 --cpu host \
     --disk myVM.qcow2,format=qcow2,bus=virtio --disk cidata-myVM.iso,device=cdrom \
     --network bridge=virbr0,model=virtio --os-type=linux \
     --os-variant=ubuntu16.04 --noautoconsole

    You can now try logging into the VM by running virsh console myVM and using the root username and its corresponding password. To exit the console, press Ctrl+].
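
Since every step above is a plain command, the last one is easy to parameterize when you want more than one VM. The sketch below only prints the virt-install command for a given VM name instead of running it, so it can be inspected first. The function name virt_install_cmd and the VM names web1 and web2 are made up, and the sketch assumes each VM has its own <name>.qcow2 disk and cidata-<name>.iso in the current directory.

```shell
# virt_install_cmd NAME  (hypothetical helper)
# Prints, but does not run, the virt-install command for one VM.
virt_install_cmd() {
    name=$1
    printf '%s ' \
        virt-install --import --name "$name" --memory 2048 --vcpus 2 --cpu host \
        --disk "$name.qcow2,format=qcow2,bus=virtio" \
        --disk "cidata-$name.iso,device=cdrom" \
        --network bridge=virbr0,model=virtio \
        --os-type=linux --os-variant=ubuntu16.04 --noautoconsole
    printf '\n'
}

for vm in web1 web2; do
    virt_install_cmd "$vm"
done
```

Once the printed line looks right, it can be pasted into a shell in the images directory.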

Conclusion

The cloud images that most vendors ship are really efficient in terms of resource utilization and they also feel really fast and responsive. The fact that we need to deal with the awkward cloud-init configuration as a starting point only hinders the community’s adoption of KVM and related technologies.

The community can learn a lot from the way Docker builds and ships its images. They are really easy to manage, both as running containers and as templates that are easy to distribute and use.

About the author

Ranvir Singh

I am a tech and science writer with quite a diverse range of interests. A strong believer in the Unix philosophy. A few of the things I am passionate about include system administration, computer hardware and physics.