Thursday, April 23, 2009

Boosting server utilization 100% by accelerated VM migration (MemoryMotion™)

Recently I guest-wrote an article about a technology I've been researching for some time, called MemoryMotion™. By providing a distributed memory-disambiguation fabric, it greatly accelerates VM live migration. Besides the obvious benefits, ultra-quick migration unlocks a healthy chunk of additional server utilization that is currently inaccessible due to the slow migration times of current hypervisors.

It's worth stepping back a moment and asking the question, "why can we only run a physical server at, say, a nominal 40% (non-VDI) / 60% (VDI) maximum load?" Based on intuition, one might answer that it's due to workload spikes, which cause us to provision conservatively enough to absorb spikes to an extent (the micro level), and to fall back on distributed scheduling across servers to balance out at the macro level. But that describes how we deal with the problem, rather than identifying the problem. The truth is that a good chunk of that nominal under-utilization comes from the slowness of live VM migration. It's easy to demonstrate this with a question: what could the nominal maximum load factor be if we had infinitely fast VM migrations (0.0 seconds)? Probably 90% or more. Statistically, some physical servers are likely to be under-utilized at any one time. If we can quickly move VMs to those servers, we can adapt more quickly to load spikes, and thus achieve higher holistic utilization rates. The more servers one has, the more this is true.
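To make that intuition concrete, here's a toy back-of-envelope model (my own illustrative numbers, not from the article): while a migration is in flight, a host must absorb any load spike on its own, so its safe nominal load is capacity minus the worst spike that can build up during one migration.

```python
# Toy headroom model (illustrative parameters, not measurements):
# a host can't shed VMs faster than one migration takes, so it must
# reserve headroom for the worst-case load growth over that window.

def safe_nominal_load(migration_secs, capacity=1.0, spike_rate=0.02):
    """spike_rate: assumed worst-case load growth per second,
    as a fraction of server capacity (a made-up figure)."""
    headroom = min(capacity, spike_rate * migration_secs)
    return max(0.0, capacity - headroom)

for secs in (30, 5, 1, 0):
    print(f"{secs:>2}s migration -> safe nominal load {safe_nominal_load(secs):.0%}")
```

With these (hypothetical) parameters, a 30-second migration caps safe load around 40%, while sub-second migration pushes it toward 100% — the same shape as the argument above.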

Now, if we're using distributed VM scheduling for the purpose of power management, we want to pack as many VMs onto as few powered-on physical servers as possible. In that case, we often have powered-off servers that can be spun up quickly, and thus even more headroom available. With infinitely fast migration and server power-ups, one could push physical server utilization near 100% without losing quality of service. There's not much reason not to. This would yield a huge gain in power efficiency (and thus power costs).
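The packing step itself is classic bin packing. A first-fit-decreasing sketch (my own illustration, not the actual scheduler) shows how VMs get consolidated so surplus hosts can be powered off:

```python
def consolidate(vm_loads, capacity=1.0):
    """First-fit-decreasing bin packing: place each VM (largest load
    first) on the first server with room; open a new server otherwise.
    Servers beyond the returned list can stay powered off."""
    servers = []
    for load in sorted(vm_loads, reverse=True):
        for s in servers:
            if sum(s) + load <= capacity:
                s.append(load)
                break
        else:
            servers.append([load])
    return servers

packed = consolidate([0.5, 0.3, 0.4, 0.2, 0.35, 0.25])
print(len(packed), "servers:", [round(sum(s), 2) for s in packed])
```

The faster migrations are, the more often this packing can be recomputed and acted on as loads drift, which is exactly where the extra utilization comes from.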

The article highlights MemoryMotion™, a technology which, by providing much faster VM migrations, offers a mechanism to greatly boost server utilization. Using high-speed network fabrics, we can finally treat distributed VM scheduling like an Operating System scheduler, because the scheduling time-frame drops into the sub-1-second range. This is how we can tap the next 100% (relative) of ROI/power-consumption improvements, as highlighted in the following slide.

As far as I'm aware, this could be the biggest killer virtualization feature since the advent of live VM migration. Feel free to contact me if you'd like to know more. Note that this is patent-pending technology.

Disclosure: no positions


bnjammin said...

This is a very cool software solution. It seems like it would work great with Cisco's Data Center 3.0 initiative and Nexus line of switches, which I've heard give VM management software direct backplane access and direct access to the switches' impressive RAM.

Tim said...

I love the idea of speeding up VM migration and agree that it could lead to some pretty cool new scheduling mechanisms. However, if you are paranoid about malicious VMs, then the idea of exploiting shared pages to reduce migration time could be a problem:

How are you going to tell which pages are truly identical and don't need to be copied over the network? Normally VMware uses short hashes to detect candidate identical pages, but there is still a chance of collision, so before doing the actual sharing it does a full byte-by-byte check of the two pages. In your case, you can't do the byte-by-byte check unless you copy the page over the network (but that defeats the whole point!)

I think in practice this generally wouldn't be a problem (assuming VMware uses 64-bit hashes; 32-bit ones collide pretty frequently). Unfortunately, if a malicious user managed to craft a bad page that collided with one of your core kernel memory pages, you'd be in pretty bad shape!
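To put rough numbers on this, a quick birthday-bound estimate (illustrative hash widths and memory size; I don't know VMware's actual figures):

```python
import math

def collision_prob(n_pages, hash_bits):
    """Birthday approximation: P(at least one accidental collision)
    among n_pages uniformly random hash values."""
    space = 2.0 ** hash_bits
    return 1.0 - math.exp(-n_pages * (n_pages - 1) / (2.0 * space))

n = (16 * 1024**3) // 4096  # ~4.2M pages in 16 GB of 4 KB pages
print(f"32-bit hash: {collision_prob(n, 32):.4f}")   # essentially certain
print(f"64-bit hash: {collision_prob(n, 64):.2e}")   # vanishingly rare
```

Note the bound only covers honest, random pages — a malicious guest deliberately constructing a colliding page is a different (and much harder) threat, which is the scenario I'm worried about above.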

I wrote a short blog post about this here.

Kevin Lawton said...


Benjamin, thanks for the kudos and love your line of thinking. Feel free to pass on the thoughts. :)

Tim, enjoyed your research paper. I share your thoughts on hash-collisions, and there are good solutions there -- I need to wait to disclose more.

Doing lazy copies (I read your blog post) is an approach, and it's actually complementary. It does have downsides, though: it breaks the single-point-of-failure/atomicity properties of VM migrations, and it doesn't work across long-distance migrations (where cloud computing is going). It also gets really complicated if multiple migrations occur in quick succession, such as when hardware starts throwing errors and you need to migrate now.

Keep up the great commentary.