Boosting server utilization 100% by accelerated VM migration (MemoryMotion™)

Recently I guest-wrote an article on about a technology I've been researching for some time, called MemoryMotion™. By providing a distributed memory disambiguation fabric, VM live migration time is greatly accelerated. Besides the obvious benefits, ultra-quick migration unlocks a healthy chunk of additional server utilization, currently inaccessible due to slow migration times in current hypervisors.

It's worth stepping back a moment and asking the question, "why can we only run a physical server at say a nominal 40% (non-VDI) / 60% (VDI) maximum load?" Based on intuition, one might answer that it's due to workload spikes, which cause us to provision conservatively enough as to absorb spikes to an extent (the micro level), and fall back to distributed scheduling across servers to balance out at the macro level. But that's really how we deal with the problem, rather than an identification of the problem. The truth is that a good chunk of the 40% of nominal under-utilization comes from the slowness of live VM migration. It's easy to demonstrate this with a question. What could the nominal maximum load factor be, if we had infinitely fast VM migrations (0.0 seconds)? Probably 90%+ or so. Statistically, there are likely to be physical servers that are under-utilized at any one time. If we can quickly move VMs to those servers, we can adapt more quickly to load spikes, and thus achieve higher holistic utilization rates. The more servers one has, the more this is true.

Now, if we're using distributed VM scheduling for the purpose of power management, we want to pack as many VMs on as few powered-on physical servers as possible. In that case, we often have powered-off servers, which can be spun up quickly. And thus, we have even more head-room available. With infinitely fast migration and server power-ups, one could push physical server utilization near 100%, without losing quality of service. There's not much reason not to. This would yield a huge gain in power efficiency (and thus power costs).

The article highlights MemoryMotion™, a technology which by way of providing much faster VM migrations, provides a mechanism to greatly boost server utilization. Using high speed network fabrics, we can finally look at distributed VM scheduling like an Operating System scheduler, because the scheduling time-frame decreases down to the sub-1-second range. This is how we can tap the next 100%(relative) of ROI/power-consumption improvements, as highlighted in the following slide.

As far as I'm aware of, this could be the next killer virtualization feature, since the advent of live VM migration. Feel free to contact me if you'd like to know more. Note that it is patent pending technology.

Disclosure: no positions