Friday, May 8, 2009

50% of Cloud Compute Power & Carbon Waste Solved by Software Technique

Ever wonder why virtualized servers are usually run at nominal loads of 30 to 40%? It's because the speed at which VMs can be live-migrated is much slower than the speed at which load spikes can arise. Thus, a large "head room" is built into the target loads at which IT is generally comfortable running its virtualized servers.
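
To make the head-room argument concrete, here is a minimal Python sketch. It assumes a deliberately simple model that is not from any measurement: load climbs linearly during a spike, and relief arrives only once a migration completes. The function name and the sample numbers are hypothetical.

```python
def max_safe_target_load(spike_rate_pct_per_s, migration_time_s, capacity_pct=100.0):
    """Highest steady-state load (in %) that still leaves room to absorb a spike
    for as long as it takes to migrate VMs away (assumed linear spike model)."""
    headroom_pct = spike_rate_pct_per_s * migration_time_s
    return max(0.0, capacity_pct - headroom_pct)


# Hypothetical numbers: load can climb 1 percentage point per second.
slow_migration = max_safe_target_load(1.0, 60.0)  # 60 s to move VMs off the host
fast_migration = max_safe_target_load(1.0, 20.0)  # 20 s to move VMs off the host
print(slow_migration, fast_migration)
```

Under these assumed numbers, cutting migration time from 60 to 20 seconds lifts the comfortable target load from 40% to 80%. Real spikes are not linear, so treat this as an illustration of the relationship rather than a sizing tool.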

Faster networking would seem to save the day, but per-server VM density is increasing exponentially, along with the size of VMs (see the latest VMworld talks). Thus there are many more, and increasingly larger, VMs to migrate when load spikes occur.

Besides the drastic under-utilization of capital equipment, it's a shame that the load "head room" actually sits in the most efficient sweet spot of the power-versus-load curve. The first VMs placed on a server are the most power-costly, because they carry the server's fixed idle draw, and the last ones are nearly free (except that we only use that band for spike management).
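
A rough way to see why the last VMs are nearly free is sketched below. It assumes a linear power model (a fixed idle draw plus a load-proportional part), which is a common approximation and not something measured for this post. The wattage figures and function names are hypothetical.

```python
def server_power_watts(utilization, idle_w=200.0, peak_w=300.0):
    """Assumed linear model: fixed idle draw plus a part proportional to load."""
    return idle_w + (peak_w - idle_w) * utilization


def watts_per_unit_of_load(utilization, idle_w=200.0, peak_w=300.0):
    """Power paid per unit of useful load; utilization must be in (0, 1]."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return server_power_watts(utilization, idle_w, peak_w) / utilization


cost_at_40 = watts_per_unit_of_load(0.4)  # idle draw dominates the bill
cost_at_80 = watts_per_unit_of_load(0.8)  # idle draw is spread over twice the work
print(round(cost_at_40), round(cost_at_80))
```

With these assumed figures, the power paid per unit of work drops from roughly 600 W to roughly 350 W when utilization doubles, because the idle draw is amortized over more work.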

If target loads could comfortably be raised from 40% to 80%, half (or more) of the wasted compute-related power consumption, the resultant carbon footprint, and the excess capital equipment could be saved. That's exactly what my R&D showed can be done, with a software-only solution. As an added benefit, the same technique accelerates VM migrations across data centers, which is where cloud computing is going.
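
The "half" figure follows from simple arithmetic, sketched here in Python. The fleet size and the notion of a "load unit" are hypothetical: each server is assumed to have capacity for 100 load units and to be filled only to the target percentage.

```python
def servers_needed(total_load_units, target_load_pct):
    """Servers required when each server holds 100 load units at full capacity
    but is filled only to target_load_pct percent (so it carries that many units)."""
    if target_load_pct <= 0 or target_load_pct > 100:
        raise ValueError("target_load_pct must be in (0, 100]")
    return -(-total_load_units // target_load_pct)  # integer ceiling division


fleet_at_40 = servers_needed(40_000, 40)
fleet_at_80 = servers_needed(40_000, 80)
fraction_saved = 1 - fleet_at_80 / fleet_at_40
print(fleet_at_40, fleet_at_80, fraction_saved)
```

For the same assumed workload, doubling the target load halves the number of servers that must be bought, powered, and cooled.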

Essentially, we're blowing an extraordinary amount of money and trashing the atmosphere because of outmoded virtualization software. None of the virtualization solutions from VMware, Citrix, or KVM has this new technique yet, and it's not available via Amazon's EC2. Here's to hoping the future is greener.

A high-end networking infrastructure throughout will be essential as well. Time is money, and time to migrate VMs translates to a lot of money! For the next cloud computing decade, "networking is power". Consider going long on cloud providers and virtualization vendors that support accelerated VM migration, and on the networking vendors that supply the goods. It almost goes without saying that Cisco has entered the server market...

Please help spread the word, and feel free to bug your providers.

Disclosure: no positions, related patents pending

1 comment:

Todd said...

I agree, and this is an interesting new way to attack the problem.

Arguably, underutilization due to excess "head-room" is a huge problem for physical servers as well as for VMs. Since the latency for load balancing physical servers (e.g., bringing up a physical server from standby) is at least as bad as the latency for VM migrations now (e.g., using "virtualization 1.0" technology), your proposed methods will become more advantageous over time, not less. In reality, migrating or re-provisioning physical servers is an order of magnitude slower than it is for VMs.

We have been researching the power efficiency gains of different storage topologies and architectures, and agree that this sort of intelligent storage sync (if you'll forgive the oversimplification) could have huge power efficiency benefits. What is especially interesting is that with physical servers, this approach to reducing head-room only really works for homogeneous servers (e.g., server pools). With large numbers of VMs, the population can be heterogeneous, with lots of different server applications, perhaps sharing only a common underlying OS. For example, we see a case in our labs where the active, unique memory footprint of a RHEL server is often only 12-25 MB. (!)

This is one more driver towards virtualization (away from physical servers), and towards novel VM management technologies.