Friday, May 09, 2008

The truth about KVM and Xen

When I saw this article in my inbox, I knew I shouldn't bother reading it. I really couldn't help myself though. I'm weak for gossip and my flight was delayed so boredom got the best of me.

I can't blame the tech media for the wild reporting though. The situation surrounding KVM, Xen, and Linux virtualization is pretty confused right now. I'll do my best to clear things up. I'll add the extra disclaimer that these are purely my own opinions and do not represent any official position of my employer.

I think we can finally admit that we, the Linux community, made a very big mistake with Xen. Xen should never have been included in a Linux distribution. There, I've said it. We've all been thinking it, have whispered it in closed rooms, and have done our best to avoid it.

I say this, not because Xen isn't useful technology and certainly not because people shouldn't use it. Xen is a very useful project and can really make a huge impact in an enterprise environment. Quite simply, Xen is not, and will never be, a part of Linux. Therefore, including it in a Linux distribution has only led to massive user confusion about the relationship between Linux and Xen.

Xen is a hypervisor that is based on the Nemesis microkernel. Linux distributions ship Xen today, install a Linux guest (known as domain-0) by default, and do their best to hide the fact that Xen is not a part of Linux. They've done a good job; most users won't even notice that they are running an entirely different operating system. The whole situation is somewhat absurd though. It's as if the distributions automatically shipped a NetBSD kernel and switched to it whenever you wanted to run a LAMP stack. We don't ship a plethora of purpose-built kernels in a distribution. We ship one kernel and make sure that it works well for all users. That's what makes a Linux distribution Linux. When you take away the Linux kernel, it's not Linux any more.
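To make that concrete, here's a quick sketch of my own (nothing the distros actually ship, just an illustration) that a curious user could compile to see whose kernel is really at the bottom of the stack. The interesting bits are the CPUID hypervisor signature and the /proc/xen check; note that paravirtual Xen guests, including domain-0, generally won't show up via CPUID since PV doesn't trap it.

    /*
     * Rough sketch: is this kernel running on bare metal, or on top of
     * another kernel entirely?  (Illustration only, assumptions noted.)
     */
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;
        char sig[13];
        struct stat st;

        /* Paravirtual Xen (including domain-0) doesn't intercept CPUID,
         * but it does expose /proc/xen to its guests. */
        if (stat("/proc/xen", &st) == 0)
            printf("/proc/xen exists: this kernel is running on Xen\n");

        /* Fully virtualized guests can ask the CPU directly: hypervisors
         * set bit 31 of ECX in CPUID leaf 1 and put a vendor signature
         * (e.g. "XenVMMXenVMM" or "KVMKVMKVM") in leaf 0x40000000. */
        __cpuid(1, eax, ebx, ecx, edx);
        if (ecx & (1u << 31)) {
            __cpuid(0x40000000, eax, ebx, ecx, edx);
            memcpy(sig + 0, &ebx, 4);
            memcpy(sig + 4, &ecx, 4);
            memcpy(sig + 8, &edx, 4);
            sig[12] = '\0';
            printf("hypervisor signature: %s\n", sig);
        } else {
            printf("no hypervisor bit set; looks like bare metal\n");
        }
        return 0;
    }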

There is no shortage of purpose-built kernels out there. NetBSD is a purpose-built kernel for networking workloads. QNX is a purpose-built kernel for embedded environments. VxWorks is a purpose-built kernel for real-time environments. Being purpose-built doesn't imply superiority, and Linux is currently very competitive in all of these areas.

When the distros first shipped Xen, it was done mostly out of desperation. Virtualization was, and still is, the "hot" thing. Linux did not provide any native hypervisor capability. Most Linux developers didn't even really know that much about virtualization. Xen was an easy-to-use, purpose-built kernel with a pretty good community. So we made the hasty decision to ship Xen instead of investing in making Linux a proper hypervisor.

This decision has come back to haunt us now in the form of massive confusion. When people talk about Xen not being merged into Linux, I don't think they realize that Xen will *never* be merged into Linux. Xen will always be a separate, purpose-built kernel. There are patches to Linux that enable it to run well as a guest under Xen. These patches are likely to be merged in the future, but Xen will never be a part of the Linux kernel.

As a Linux developer, it's hard for me to be all that interested in Xen, for the same reasons I have no interest in NetBSD, QNX, or VxWorks. The same is true for the vast majority of Linux developers. When you think about it, it is really quite silly. We advocate Linux for everything from embedded systems, to systems requiring real-time performance, to high-end mainframes. I trust Linux to run on my DVD player, on my laptop, and on the servers that manage my 401k. Is virtualization so much harder than every other problem in the industry that Linux is somehow incapable of doing it well on its own? Of course not. Virtualization is actually quite simple compared to things like real-time.

This does not mean that Xen is dead or that we should never have encouraged people to use it in the first place. At the time, it was the best solution available. At this moment, it's still unclear whether Linux as a hypervisor is better than Xen in every scenario. I won't say that all users should switch en masse from Xen to Linux for their virtualization needs. All of the projects I've referenced here are viable projects with large user bases.

I'm a Linux developer though, and just like other Linux hackers who are trying to make Linux run well on everything from mainframes to DVD players, I will continue to work to make Linux work well as a hypervisor. The Linux community will work toward making Linux the best hypervisor out there. The Linux distros will stop shipping a purpose-built kernel for virtualization and instead rely on Linux for it.
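For the curious, here's roughly what "Linux as hypervisor" already looks like from userspace. This is only a bare-bones sketch of the /dev/kvm interface, not a working VMM (a real one still has to set up guest memory and drive a run loop), but it shows the point: creating a virtual machine under KVM is a handful of ioctl()s against file descriptors handed out by the stock kernel.

    /*
     * Minimal sketch: ask plain Linux, via /dev/kvm, to create a VM and
     * a virtual CPU.  No separate purpose-built kernel involved.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int main(void)
    {
        int kvm = open("/dev/kvm", O_RDWR);
        if (kvm < 0) {
            perror("open /dev/kvm");
            return 1;
        }

        /* Sanity-check which API revision the running kernel speaks. */
        int version = ioctl(kvm, KVM_GET_API_VERSION, 0);
        printf("KVM API version: %d\n", version);

        /* A VM is just another file descriptor handed out by Linux. */
        int vm = ioctl(kvm, KVM_CREATE_VM, 0);
        if (vm < 0) {
            perror("KVM_CREATE_VM");
            return 1;
        }

        /* And so is each virtual CPU inside it. */
        int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
        if (vcpu < 0) {
            perror("KVM_CREATE_VCPU");
            return 1;
        }

        printf("created a VM (fd %d) and a vcpu (fd %d) on stock Linux\n",
               vm, vcpu);
        close(vcpu);
        close(vm);
        close(kvm);
        return 0;
    }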

Looking at the rest of the industry, I'm surprised that other kernels haven't gone in the direction of Linux in terms of adding hypervisor support directly to the kernel.

Why is Windows not good enough to act as a hypervisor, such that Microsoft had to write a new kernel from scratch (Hyper-V)?

Why is Solaris not good enough to act as a hypervisor, requiring Sun to ship Xen in xVM? Solaris is good enough to run enterprise workloads but not good enough to run a Windows VM? Really? Maybe :-)

Forget about all of the "true hypervisor" FUD you may read. The real question to ask yourself is what is so wrong with these other kernels that they aren't capable of running virtual machines well and instead have to rely on a relatively young and untested microkernel to do their heavy lifting?

Update: modified some of the text for clarity. Flight delayed more so another round of editing :-)