Saturday, August 15, 2009

Cloud Pipeline: the future of inter-cloud-provider sneaker-nets

One of the notable frictions in using cloud computing providers has been the difficulty of getting large data sets into and out of the provider's domain. Once your data set grows beyond a certain size, it's just not feasible to transfer it over the public network. Amazon began addressing this friction in May 2009 by offering an import feature, whereby one can ship them data (on an external SATA/USB drive) and they'll load it into their S3 storage service. And just recently, Amazon added a similar export feature. This is extremely useful between the customer and Amazon, but I believe it's only the beginning of a trend in inter-cloud "sneaker nets".

There is a slew of interesting use cases for transferring data sets between various kinds of providers without the customer ever touching the data or shipping physical devices. This, of course, would require some set of standards/formats for inter-provider transfer. There are obvious and well-known uses, such as shipping data off-site for DR (Disaster Recovery) purposes. If the DR site is also a cloud provider, the transfer should optimally occur between the two providers without the customer being involved in sending media devices. Under the right circumstances, data could be sent directly from the source provider to the destination provider, eliminating the need for the destination to send a media device at all; that would remove one delivery day from the equation. Doing this would likely require some combination of the data being encrypted and the media being properly scrubbed of its previous contents. Alternatively, the source provider could start with a pool of fresh devices and the destination provider could ship the used device back to the customer, with the extra cost of the device added.
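
To make that hand-off concrete, here's a minimal sketch in Python of how a customer might prepare a data set for such a transfer: encrypt it so neither provider ever sees plaintext, and record a checksum in a manifest the destination can verify on arrival. This is purely illustrative; it assumes the 'cryptography' package, and all file names and manifest fields are invented.

```python
# Hypothetical sketch: prepare a data set for provider-to-provider shipment.
# Assumes the 'cryptography' package; all names and fields are illustrative.
import hashlib
import json
from cryptography.fernet import Fernet

def prepare_shipment(src_path, enc_path, manifest_path):
    """Encrypt the data set and write a manifest the destination provider can verify."""
    key = Fernet.generate_key()   # kept by the customer, never shipped with the media
    cipher = Fernet(key)

    with open(src_path, "rb") as f:
        ciphertext = cipher.encrypt(f.read())   # fine for a sketch; real transfers would stream
    with open(enc_path, "wb") as f:
        f.write(ciphertext)

    manifest = {
        "dataset": src_path,
        "encrypted_file": enc_path,
        "sha256": hashlib.sha256(ciphertext).hexdigest(),  # destination verifies integrity
        "encryption": "AES-128 (Fernet token format)",
    }
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2)
    return key

# key = prepare_shipment("genomics.tar", "genomics.tar.enc", "manifest.json")
```

The key stays with the customer (or is exchanged out-of-band), so the providers only ever handle opaque ciphertext plus a manifest they can check.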

"Cloud Pipeline"

A more efficient and seamless transport of large data sets will be a key enabler that allows the cloud computing landscape to evolve, both in usage and in number of providers. With that evolution will come a lot of other "touch-less" uses, such as exporting your database via FedEx from a provider such as Amazon to a provider specializing in database analytics, housing an army of specialized columnar analytics database engines. Or perhaps to a provider that specializes in render-farm activities, real-time 3D server-side rendering, massively parallel HPC (High Performance Computing), compliance, de-duplication, compression, bio-informatics, or a host of other specialties. One could actually set up a 'pipeline' of cloud services in this way, moving data to the stage in the pipeline where it is most efficiently processed, based on capabilities, geographies, current pricing, etc. Perhaps the next big Pixar-like animation studio will make use of a Cloud Pipeline. Or perhaps the next big biotech company. It wouldn't be surprising if they start out with some stages of the pipeline in-house, and farm out an increasing amount of work as the cloud evolves.
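
As a rough illustration of what routing a stage to where it is "most efficiently processed" could mean in practice, here's a hedged Python sketch; the provider names, prices, and criteria are all invented.

```python
# Hypothetical sketch of cloud-pipeline routing; providers and prices are invented.
PROVIDERS = [
    {"name": "ColumnarCo",  "capabilities": {"analytics"},                "region": "US", "price_per_tb": 40},
    {"name": "RenderFarmX", "capabilities": {"render"},                   "region": "EU", "price_per_tb": 55},
    {"name": "BigCloud",    "capabilities": {"analytics", "render", "hpc"}, "region": "US", "price_per_tb": 70},
]

def route_stage(capability, allowed_regions):
    """Pick the cheapest provider that can run this stage in an acceptable region."""
    candidates = [p for p in PROVIDERS
                  if capability in p["capabilities"] and p["region"] in allowed_regions]
    if not candidates:
        raise LookupError(f"no provider offers {capability!r} in {allowed_regions}")
    return min(candidates, key=lambda p: p["price_per_tb"])

# A three-stage pipeline: where should each hop of the data set go?
pipeline = [("analytics", {"US"}), ("hpc", {"US"}), ("render", {"US", "EU"})]
for capability, regions in pipeline:
    chosen = route_stage(capability, regions)
    print(f"{capability}: ship data to {chosen['name']} ({chosen['region']})")
```

The same decision could weigh geography, current spot pricing, or compliance constraints; the point is simply that each hop of the data has a destination chosen by policy, not by habit.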

Cloud Marketplace

Ultimately, there will be marketplaces for cloud computing. But initially, many things must be developed and normalized/standardized before the compute side can reach its full potential of being "marketized". One example is e-metering, which can't simply be bolted on as an afterthought, but needs to be deeply integrated into the layers of the cloud fabric. That may take quite some time to become marketplace-friendly.
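
To see why metering can't just be bolted on, consider a hedged sketch of a usage record being emitted from inside the resource-allocation path itself, rather than reconstructed after the fact. Everything here (class, fields, rates) is hypothetical.

```python
# Hypothetical sketch: metering emitted from inside the allocation path, not bolted on.
import time
import uuid

class MeteredAllocator:
    """Allocates a resource and records who used what, for how long, at which rate."""

    def __init__(self, meter_log):
        self.meter_log = meter_log   # in a real fabric this would be a durable, auditable stream

    def allocate(self, customer, resource, rate_per_hour):
        lease_id = str(uuid.uuid4())
        self.meter_log.append({"lease": lease_id, "customer": customer,
                               "resource": resource, "rate": rate_per_hour,
                               "start": time.time()})
        return lease_id

    def release(self, lease_id):
        for record in self.meter_log:
            if record["lease"] == lease_id and "end" not in record:
                record["end"] = time.time()
                hours = (record["end"] - record["start"]) / 3600
                record["charge"] = round(hours * record["rate"], 4)
                return record["charge"]
        raise KeyError(f"unknown or already-closed lease {lease_id}")

# log = []
# alloc = MeteredAllocator(log)
# lease = alloc.allocate("acme", "columnar-db-node", rate_per_hour=2.50)
# ...later... charge = alloc.release(lease)
```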

But inter-modal data transport (aka the "sneaker net" or "FedEx net") is a level of abstraction at which a cloud marketplace gets interesting in the shorter term. Here we have the opportunity for a given data set to be copied or multiplexed to a set of receiving providers, based on pre-arranged or real-time market criteria. By the time the data movement occurs, a provider may have become available that can process the data more efficiently, at a lower price for the same efficiency, or perhaps just in a more geographically or politically friendly locale. Perhaps a given provider just rolled in a bank of analytics database engines, or maybe they added banks of GPGPUs. These are the kinds of events that could make one provider much more competitive in the market (10x or 100x). As long as a customer can periodically and inexpensively transport copies of their data to another (backup) site, much of the lock-in problem vanishes. It becomes more of a data-format standards issue, one that the customer has more influence over.

Keep in mind that a related cloud marketplace would require APIs; orchestrating workflows across a cloud pipeline with real-time, market-based routing would need them. ;-)
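
To hint at what such APIs might look like, including the multiplexing idea above, here is a hedged sketch of a hypothetical orchestration interface. Nothing like this exists today; every class, method, and field name is invented.

```python
# Hypothetical orchestration API for a cloud-pipeline marketplace; every name is invented.
from dataclasses import dataclass

@dataclass
class Quote:
    provider: str
    price_per_tb: float
    region: str
    earliest_start: str   # e.g. "2009-08-20": when the media/device could be received

class PipelineMarketplace:
    """Sketch of the calls a market-based pipeline orchestrator might need."""

    def request_quotes(self, dataset_id: str, capability: str) -> list[Quote]:
        """Ask participating providers to bid on processing this data set."""
        raise NotImplementedError

    def schedule_transfer(self, dataset_id: str, provider: str, mode: str = "courier") -> str:
        """Arrange the physical (or network) hand-off; returns a tracking/job id."""
        raise NotImplementedError

    def multiplex(self, dataset_id: str, providers: list[str]) -> list[str]:
        """Copy one data set to several receiving providers at once."""
        raise NotImplementedError

# A customer-side policy could then be: take quotes, filter by region,
# and schedule (or multiplex to) the cheapest one or two bidders.
```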

Disclosure: no positions

Monday, August 10, 2009

Server-side Android: a Google version of Amazon's EC2

While everyone contemplates the place Android will hold on mobile devices, in home entertainment, and on netbooks, there is another interesting use case for Android that hasn't yet been talked about. There's no reason that Android, as a complete OS, application stack, and ecosystem (including the app market), has to run on the client side. In environments where multiple users might want to use the same client hardware (monitor, keyboard, mouse, etc.), such as at the office, the thin-client model could be a very useful way to access any given user's Android session. This way, the Android session can be displayed at any endpoint, be it a desktop, notebook, meeting-room projector, or even a smartphone. Using a VPN or even an SSL-protected web browser session from home, a user could also bring up their work Android session.

And of course, as soon as one contemplates serving Android sessions from a server farm, virtualization springs to mind. While one could put each Android session in its own VM, Android is ripe for an application style of virtualization, having only one kernel and multiple application group boundaries; one can achieve much higher consolidation ratios that way. Whichever style of virtualization is chosen, one can then imagine that the Android sessions are not necessarily constrained to any one company's private datacenter/cloud, but could also be served from a public cloud. If a public cloud provider can put sessions close enough to a given user's current location (networking-latency wise), this proposition gets really interesting. For one, because Android could work its way into the consumer and enterprise VDI spaces. And two, because Google owns a lot of datacenters and could potentially go beyond the OS/application stack space and into owning the execution of user sessions as well as maintaining all their data. This would likely be a recurring-revenue (rental) type of service, and would open the door to some premium options such as backups, latency/bandwidth QoS, execution locality zones, etc. Kind of the Android desktop version of Amazon's EC2/S3 web services.
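
As a toy illustration of that "close enough, latency-wise" placement decision, a session scheduler might do something like the following; the datacenter names, latencies, and threshold are all invented.

```python
# Hypothetical sketch: place an Android session in the datacenter with the lowest
# measured round-trip time to the user's current endpoint. All numbers are invented.
MAX_ACCEPTABLE_RTT_MS = 40   # beyond this, an interactive remote display feels sluggish

def place_session(user_id, rtt_by_datacenter):
    """rtt_by_datacenter: mapping of datacenter name -> measured RTT in milliseconds."""
    datacenter, rtt = min(rtt_by_datacenter.items(), key=lambda item: item[1])
    if rtt > MAX_ACCEPTABLE_RTT_MS:
        # Fall back: run (or migrate) the session on the local device instead of a server.
        return ("local-device", rtt)
    return (datacenter, rtt)

# Example: a user in Europe probing three datacenters.
print(place_session("alice", {"us-east": 95, "eu-west": 22, "asia-east": 180}))
# -> ('eu-west', 22)
```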

There are a number of interesting ways to enhance the server-side Android model. For example, one could allow an Android session to seamlessly migrate from execution on a server to execution within a local hardware environment (using a VM or otherwise). So if you want to "snap to local device", execution migrates to your local device and the display interface originates from the local hardware rather than being remoted. There's no reason the user has to see or care about this transition. If you want to "snap to home entertainment device", your Android session moves seamlessly to your TV. Ditto for the display in your car or on your netbook. To pull this off, it helps if the environments synchronize in the background automatically. And of course, doing all this at any real scale means having access to a hearty, Google-sized infrastructure.

Adding one more piece to the puzzle, a thin-client-style tablet (or other form factor), which I wrote about recently, would be an excellent way to access a server-side Android session without any hardware smarts or locally resident data (which can be lost or stolen), and yet would also provide a larger interface than a smartphone. This kind of device could be manufactured inexpensively at mass scale because it has very little in the way of hardware requirements (it runs only firmware). But it would be a big opportunity for Google branding and penetration into new markets, and would be a gateway to the evolution of the for-pay Google services mentioned above. Perhaps this would be something manufactured by ODMs such as Compal.

But the discussion isn't nearly complete until we talk about gaming! Server-side rendering is a new trend that decouples the amount of compute power from the endpoint device, allowing less capable devices to display amazing server-rendered games (see my previous article). And it has some of the same requirements as above, in terms of placing sessions close to the endpoint (latency-wise), having enough data centers to cover important geographic areas, etc. A hearty share of popular smartphone apps are games, which makes for great synergy with a "cloud-based Android". This style of computing could usher in a new era of phenomenal photo-realistic gaming, decoupled altogether from the underlying client-side hardware. Write once, game anywhere...

Disclosure: no positions

Monday, August 3, 2009

Fault tolerance: a new key feature for virtualization

VM migration has been a key feature and enabling technology that has differentiated VMware from Microsoft's Hyper-V. As you may know, though, Windows Server 2008 R2 is slated for broad availability on or before October 22, 2009 (also the Windows 7 GA date), and Hyper-V will then support VM migration. So you may be wondering: what key new high-tech features will constitute the next battleground for differentiation amongst the virtualization players?

Five-Nines (99.999%) Meets Commodity Hardware

One such key feature is very likely to be fault tolerance (FT) -- the ability for a running VM to suffer a hardware failure on one machine and continue running on another machine without losing any state. This is not just HA (High Availability), it's CA (Continuous Availability)! And I believe it'll be part of the cover charge that virtualization vendors (VMware, Citrix/XenSource, Microsoft, et al.) and providers such as Amazon will have to offer to stay competitive. When I talk about fault tolerance, I don't mean special/exotic hardware solutions -- I'm talking about software-only solutions that handle fault tolerance in the hypervisor and/or other parts of the software stack.

Here's a quick summary of where the key vendors are with respect to fault tolerance. Keep watch on this space, because the VM migration battle is nearly over now.

VMware's product line now offers Fault Tolerance, which the company conceptually introduced at VMworld 2008 -- perhaps the biggest wow-factor feature VMware talked about at that show. FT is not supported in the VMware Essentials, Essentials Plus, or vSphere Standard editions; it's supported in the more advanced (and more expensive) editions.

In the Xen camp, there are two distinct FT efforts, Kemari and Remus. Integration/porting into Xen 4.0 is on the roadmap. If/when that occurs, the whole Xen ecosystem will benefit. After battle-testing, it's easy to conceive of Amazon offering FT as a premium service; it does, after all, chew through more network capacity and will necessitate extra high-level logic on their part. There's also a commercial FT solution for XenServer from Marathon, called everRun VM.
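
For a feel of how the Remus approach works, here's a heavily simplified sketch of its checkpoint-replication loop. The real mechanism lives inside the hypervisor; everything below is an invented placeholder showing only the conceptual control flow.

```python
# Conceptual sketch of Remus-style fault tolerance: checkpoint the primary VM at a
# high frequency, replicate dirty state to a backup host, and buffer outbound network
# packets until the backup acknowledges the checkpoint. All classes/methods are
# invented placeholders, not a real hypervisor API.
CHECKPOINT_INTERVAL_MS = 25

class PrimaryHost:
    """Placeholder stand-ins for hypervisor-level operations on the primary."""
    def run_vm_for(self, ms): pass             # execute; outbound packets go to a buffer
    def pause_vm(self): pass
    def capture_dirty_state(self): return b""  # dirty memory pages + device state
    def resume_vm(self): pass                  # keep executing speculatively
    def release_buffered_output(self): pass    # let buffered packets leave the host

class BackupLink:
    def send(self, state): pass                # ship the checkpoint to the backup host
    def wait_for_ack(self): pass               # backup confirms it holds a consistent copy

def replication_loop(primary: PrimaryHost, link: BackupLink, rounds: int = 3):
    for _ in range(rounds):
        primary.run_vm_for(CHECKPOINT_INTERVAL_MS)
        primary.pause_vm()
        state = primary.capture_dirty_state()
        primary.resume_vm()
        link.send(state)
        link.wait_for_ack()
        # Only now is it safe for the outside world to see this epoch's output:
        primary.release_buffered_output()

replication_loop(PrimaryHost(), BackupLink())
```

If the primary host fails, the backup resumes the VM from the last acknowledged checkpoint; because no unacknowledged output ever left the machine, clients can't observe lost state. It also shows why FT consumes so much network capacity.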

Microsoft appears to be leveraging a partnership with Marathon for its initial virtualization FT solution. This is probably smart, given that it allows Microsoft to quickly compete on fault tolerance with a partner that has been doing FT for a living. One would imagine this option will come at a premium, though -- perhaps a revenue opportunity for Microsoft with big-money customers, but with an associated disadvantage vis-à-vis similar features based on free Xen technology and massive-scale virtualization (clouds). That may make Marathon a strategic M&A target.

Licensing Issues, Part II

Just when you thought software-licensing-in-a-VM issues were mostly resolved, the same questions may be raised again for FT, given that there is effectively a shadow copy of any given FT-protected VM. It's not hard to imagine Microsoft aggressively taking advantage of this situation, given that it lives at both the virtualization/OS and application layers of the stack.

Networking is Key

Fault tolerance for VMs is yet another consumer and driver of high-bandwidth, low-latency networking. The value in the data center is trending from the compute hardware to the networking, and FT is another waypoint in that trend, allowing continuous availability on commodity hardware. You probably won't run it on all your workloads (FT-protected VMs run with a performance penalty), but you might start out with your most critical stateful workloads. If you want to do this at any scale, or with flexibility, architect for plenty of networking capacity. For zero-sum IT budgets, that means cheaper hardware and better networking -- something that might be a little bittersweet for Cisco, given its entrance into the server market.

Disclosure: no positions