Thursday, April 23, 2009

Boosting server utilization 100% by accelerated VM migration (MemoryMotion™)

Recently I guest-wrote an article about a technology I've been researching for some time, called MemoryMotion™. By providing a distributed memory disambiguation fabric, it greatly accelerates VM live migration. Beyond the obvious benefits, ultra-quick migration unlocks a healthy chunk of additional server utilization that is currently inaccessible due to the slow migration times of current hypervisors.

It's worth stepping back for a moment and asking, "why can we only run a physical server at, say, a nominal 40% (non-VDI) / 60% (VDI) maximum load?" Intuitively, one might answer that it's due to workload spikes: we provision conservatively enough to absorb spikes at the micro level, and fall back on distributed scheduling across servers to balance load at the macro level. But that describes how we deal with the problem, rather than identifying the problem. The truth is that a good chunk of that under-utilization comes from the slowness of live VM migration. It's easy to demonstrate this with a question: what could the nominal maximum load factor be if we had infinitely fast VM migrations (0.0 seconds)? Probably 90% or more. Statistically, some physical servers are likely to be under-utilized at any one time. If we can move VMs to those servers quickly, we can adapt to load spikes more quickly, and thus achieve higher holistic utilization rates. The more servers one has, the more this is true.
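To make that intuition concrete, here's a toy model of my own (the linear-headroom assumption and the ramp rate are illustrative choices, not measurements): a server must keep enough idle headroom to absorb however far a spike can ramp during the time it takes to migrate a VM away.

```python
# Toy model: safe utilization ceiling as a function of migration time.
# `ramp` is the assumed spike growth rate, in fraction of server
# capacity per second (an illustrative figure, not a measurement).

def safe_utilization(migration_secs, ramp=0.01):
    """Nominal max load factor if spikes ramp at `ramp` capacity/sec."""
    headroom = min(1.0, ramp * migration_secs)  # cover the ramp until a VM can move
    return 1.0 - headroom

# A ~60 s live migration forces roughly 40% utilization; a sub-second
# migration pushes the ceiling above 99%.
print(round(safe_utilization(60), 3))   # 0.4
print(round(safe_utilization(0.5), 3))  # 0.995
```

The exact numbers are fiction, but the shape of the relationship is the point: the utilization ceiling rises as migration time falls.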

Now, if we're using distributed VM scheduling for power management, we want to pack as many VMs onto as few powered-on physical servers as possible. In that case, we often have powered-off servers that can be spun up quickly, giving us even more head-room. With infinitely fast migrations and server power-ups, one could push physical server utilization near 100% without losing quality of service. There's not much reason not to. This would yield a huge gain in power efficiency (and thus a big cut in power costs).
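The packing half of this is classic bin packing. As a sketch (first-fit-decreasing is my own choice of heuristic here; nothing above prescribes one):

```python
# Sketch: consolidate VM loads onto as few powered-on servers as
# possible, so the rest can be powered off. First-fit-decreasing
# is a simple greedy bin-packing heuristic.

def consolidate(vm_loads, capacity=1.0):
    """Pack loads (fractions of one server) onto servers; returns per-server lists."""
    servers = []
    for load in sorted(vm_loads, reverse=True):  # biggest VMs first
        for s in servers:
            if sum(s) + load <= capacity:
                s.append(load)
                break
        else:
            servers.append([load])  # no room anywhere: power on another server
    return servers

# Nine VMs that would idle along on nine hosts fit on three near-full ones.
packed = consolidate([0.5, 0.4, 0.35, 0.3, 0.3, 0.25, 0.2, 0.2, 0.2])
print(len(packed))  # 3
```

A real scheduler would also weigh migration cost and QoS constraints; fast migration is what makes re-packing cheap enough to do continuously.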

The article highlights MemoryMotion™, a technology that greatly boosts server utilization by providing much faster VM migrations. Using high-speed network fabrics, we can finally treat distributed VM scheduling like an Operating System scheduler, because the scheduling time-frame drops into the sub-1-second range. This is how we can tap the next 100% (relative) of ROI/power-consumption improvements, as highlighted in the following slide.

As far as I'm aware, this could be the next killer virtualization feature since the advent of live VM migration. Feel free to contact me if you'd like to know more. Note that it is patent-pending technology.

Disclosure: no positions

Tuesday, April 14, 2009

Inflows and outflows of US cities using a U-Haul barometer

Expanding on the U-Haul metric idea in the article "Fleeing Silicon Valley", I gathered the costs of moving a household via U-Haul across a matrix of US cities. The thesis is that moving-truck rental prices fairly accurately reflect the supply/demand equation, likely on a near real-time basis, giving us a current indicator of where the inflows and outflows are. This is interesting for a whole host of investment (and personal) themes, such as which direction commercial and residential real estate values will trend in various areas.

I picked 10 cities representing top startup locations in the US (a personal bias). To net out differences in travel distance, and to normalize values so that trends are easy to compare, I gathered prices for one-way moves in both directions between every pair of the 10 cities, then divided the source-to-destination price by the destination-to-source price. What remains is a set of ratios showing which way people are likely moving. The results are very enlightening.
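The calculation itself is simple enough to sketch. The prices below are hypothetical placeholders, not the quotes I actually gathered:

```python
# One-way U-Haul quotes, (origin, destination) -> dollars.
# These figures are made up for illustration.
quotes = {
    ("Silicon Valley", "Austin"): 2800,
    ("Austin", "Silicon Valley"): 1100,
    ("LA", "Seattle"): 2100,
    ("Seattle", "LA"): 1300,
}

def outflow_ratio(src, dst):
    """Ratio > 1 suggests trucks (and people) are flowing src -> dst."""
    return quotes[(src, dst)] / quotes[(dst, src)]

print(round(outflow_ratio("Silicon Valley", "Austin"), 2))  # 2.55
print(round(outflow_ratio("Seattle", "LA"), 2))             # 0.62
```

Because each ratio divides two quotes for the same route, distance and truck size cancel out, leaving only the demand imbalance.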

The winners (highest in-flows 1st): Austin, Raleigh, Boulder, DC, Seattle.

The losers (highest out-flows 1st): LA, Silicon Valley, NY, San Diego.

Worth noting, Boston was nearly net-neutral, but it's interesting where it's picking up people in droves: Silicon Valley, LA, and San Diego. Most cities are poaching from California. This can't be a good thing for California's urban real estate markets, municipal bond ratings, or state income tax receipts -- right at a time when revenues are falling short of projections.

Disclosure: no positions

Thursday, April 9, 2009

Portable Linux future using LLVM

Imagine a single Linux distribution that adapts to whatever hardware you run it on. Run it on an Atom netbook, and all the software shapes and optimizes itself to the feature set and processor characteristics of that specific version of Atom. Want to run it as a VM on a new Nehalem-based server? No problem; it re-shapes itself to fit the Nehalem (Core i7) architecture. And when I say "re-shape", I don't mean at the GUI level. I mean the software is effectively re-targeted to your processor's architecture, as if it had been re-compiled.

Today, Linux vendors choose which processor generation (or set of generations, and thus which features) to target at compile time for the variety of apps included in a distro. This forces least-common-denominator decisions, based on the processor and compiler features known when a given distro version is created. In other words, the Linux distributor makes the decisions for you, and you're stuck with them. Run a distro on new hardware, and you may well be unable to access new features of the hardware you purchased. Conversely, a distro targeted at advanced hardware may not run on your older hardware. Such is the current state of affairs (or disarray, as I like to think of it).

Enter LLVM, a compiler infrastructure that compiles programs to an intermediate representation (IR; think Java byte codes), applies optimizations during many phases (compile, link, and run time), and ultimately transforms the IR into native machine code for your platform. I see the IR enabling powerful opportunities beyond its current usage. One opportunity: distribute Linux apps and libraries in IR form rather than as machine code, provide the infrastructure (including back-ends specific to your platform) to transform the IR into native machine code, and potentially cache that translation for optimization's sake. The obvious benefit is much more adaptable/portable apps. But it also orthogonalizes compiler issues, such as which optimizations are used (and the stability of those optimizations), so that those can be updated in the back-end without touching the apps in IR form. Modularization is a good thing.

A resulting question is: how low in the software stack could we push this? At an LLVM talk I attended, there was mention that some folks have experimented with compiling the Linux kernel using LLVM. I'd like to see that become a concerted effort from the processor vendors, as it would allow Linux to become more adaptive, perhaps more efficient, and to show off the features of new processors without having to adapt all the software. I have many related ideas on how to make this work out.

Especially given the exciting convergence trend between CPU and GPU-like processing, and the many cores/threads now available, I think it's time we take a hard look at keeping distributed code in an IR-like form, and orthogonalizing the optimization and conversion to native code. In a lot of ways, this will let processor vendors innovate more rapidly without breaking existing code, while exposing the benefits of those innovations more quickly, without necessitating code updates.

Disclosure: no positions