One way to represent and handle Virtual Machines in Kubernetes.

Cardstock model
1{width=”640” height=”414”}

Extending Kubernetes to understand handling Virtual Machines. But how?

In the previous post it became obvious that VMs are sometimes used in Kubernetse, but they can not be fine tuned, because they are often used in a way which is transparent to the user, thus a user does not gain any direct access to the VM. To gain direct access to all VM properties we thus need to explicitly represent VMs inside Kubernetes.

So, what can be done? We need to define a VM type in Kubernetes. Once the type is there, we have the ability to fine define all relevant properties. But how can this be done? Kubernetes has support for so called 3rd party resources (TPR). But let’s take a step back to have a broader context to understand what they are.

In general, Kubernetes works in a declarative and reactive way. A user creates objects of a specific kind through the Kubernetes REST API. There are controllers inside the cluster which are responsible for each and every type supported by Kubernetes. Once a controller sees a new instance of a specific type, he reacts and performs the necessary steps to bring such an object to life. For example, if a user posts a pod specification to the API server, a controller will see this new specification and get the pod scheduled on a host, where the kubelet is then instanciating this pod.

Thus: For every type which is known to Kubernetes there is a controller responsible for handling it.

TPRs are now a way to declare additional types in the Kubernetes API. After you declared such a new type, a user can post objects of this type to the Kuebrnetes API, Kubernetes will then store them as any other object. They can actually be manipulated like any other object. (In reality there are afew bugs and limitations).

Thus we could easily use a TPR to declare a VM type within Kubernetes. And the Kubernetes REST API can be used to modify objects of this type.

The issue is that Kubernetes does store this type, but there is no controller in the cluster or daemon on a node which knows how to handle these objects.

So the second thing we need to do, is to come up with controllers and daemons to provide the cluster and node wide virtualzation logic to Kubernetes. They can ideally be shipped as containers - which will allow us to directly leverage existing Kubernetes functionality like DaemonSets or ReplicaSets.

That’s the high-level picture. In a picture:

A long time ago in a Kubernetes far, far away …

            User
              |
              v
+-----------------------------------------+
| API Server                              |
+ - - - - - - - - - - - - - - - - - - - - +
| [RC Foo]                 [VM Bar]       |
+----A-----------------------A------------+
     |                       |
     | watching for RCs      | watching for VMs
     |                       |
+---------------+        +-----------------+
| rc-controller |        | virt-controller | 🠘 NEW
+---------------+        +-----------------+

But what do we gain? Up to now gain a little. For example we would inherit the deployment features of the cluster. Instead of having our own oVirt logic to turn hosts into cluster nodes, we inherit Kuebrnetes functionality on that front. By shipping the logic for controllers and daemons in containers, we gain functionality for delivery for free (DaemonSets will ensure that a daemon is always running on all hosts in the cluster). Also some kind of failover (ReplicaSets). We also get a communication channel between them for free (network). Also the datastore for VM specifications (API Server). Even more challenging, but by putting our stuff into containers, we isolate ourselfs from the hosts

  • to some degree. The ramining host specific bits can then hopefully pushed into dedicated places - to gain more OS independence - so it does not matter if we are running on CentOS, Fedora, Atomic, Alpine, or Ubuntu.

This all looks to good, yes. There are drawbacks, i.e. we need to adopt our software to play well with Kubernetes. And we will need to to adopt the declarative and reactive patterns.

And it’s also tricky in the detail. We would accept that the kubelet is the designated node level resource manager. This is tricky in virtualization, as there might be conflicts between what the kubelet is planning and virtualization side’s needs. However, these problems - that there might be conflicts between what a workload wants and what the kubelet wants - are not virtulization specific. Therefor I’m optimistic that there will be ways of cooperating with the kubelet to solve these kind of conflicts.

Another thing is that containers are usually - well - contained. And our daemons will need access to physical hardware, for example to do device passthrough.

A few steps forward, but also a few back …

::: {#footer} [ January 15th, 2017 9:45pm ]{#timestamp} [kubernetes]{.tag} [kubevirt]{.tag} [declarative]{.tag} [reactive]{.tag} [virtualization]{.tag} [fedora]{.tag} [centos]{.tag} :::