Wednesday, July 8, 2009

kaChing: the sound of money flowing away from mutual funds

This year's Finovate Startup 09 Conference in San Francisco hosted some 56 financially oriented startups. I attended as a blogger from Seeking Alpha (a conference sponsor) and as a serial startup guy, and it's hard to beat merging the best of two worlds. If you didn't have a chance to attend, the big picture painted by Finovate Startup '09 was best summarized by a remark from a fellow Seeking Alpha blogger and hedge fund manager: there are no areas of finance left which will not be heavily disrupted. Some of these startups represent such disruptions to current financial business models. If you're in the biz, I highly recommend attending the flagship Finovate in NYC on September 29, 2009. Your life is about to change.

My runner-up favorite theme at Finovate was peer-to-peer lending. This has a lot of potential in that it creates a new asset class which takes banks out of the equation, allowing (pools of) borrowers to directly borrow from (pools of) lenders. And as a general rule, startups in this space tend to tout more transparency in the process. There was a great wrap-up of related startups here. If I were a bank, I'd think about buying into these startups early. And note that transparency is the new trend.

But the one startup that stood out from the pack, and stopped me dead in my tracks as an entrepreneur, was kaChing. If I had to describe kaChing in one phrase, I'd say "look out mutual funds!" Here's how it works. Portfolio managers make investment decisions on the kaChing site, and will also be able to import past decisions from previous trading activities to establish more history (assuming it's good!). They also declare their investment strategy, perform research, and blog. To some extent, these kinds of things have been done by various services/sites. But here's where kaChing's magic begins. One of the most prevalent (and toughest to solve) problems of such a service is assigning ratings to portfolio managers that have real meaning. Basing ratings purely on returns can be a very poor way of assigning a metric, as returns can change quickly and don't reflect important facets of investing such as risk taken or consistency with a strategy. By contrast, kaChing believes they have "broken the code", creating a meaningful rating system called Investing IQ, which ranks managers by the same criteria that premier endowments look for:
  1. Great risk-adjusted returns
  2. Compelling investment rationale
  3. Adherence to a stated strategy
This of course is not easy to do on any scale. But it's absolutely essential, if you want to provide a service which retail investors can bank on. But then comes the brilliant part of kaChing's story -- one won't have to passively follow a portfolio manager; instead, in September you will be able to actively and automatically mirror their trades in real brokerage accounts!!! And portfolio managers will earn fees from followers, as in any real fund scenario. But to earn money as a manager, and to protect mirroring investors, managers have to earn a score above 70, which turns out to be very hard to do. Click on "Find Managers" on the kaChing site to see how few make the cut so far. Fwiw, kaChing thinks not too many mutual funds would make the cut either, but it's hard to tell due to lack of transparency.

I can see a whole new order of portfolio managers being created. If you're consistently good, you can live and manage from wherever you want. And the more prolific you are in your research and interactions with your followers (blogging, answering questions, etc), the more followers you can acquire ("the twitter effect"), which in turn scales your income nicely for the same amount of work! This is also a great venue for promising finance students and investment clubs to cut their teeth and show what they're made of -- get benchmarked with Investing IQ against the brightest. Ultimately, putting an Investing IQ on your resume may be the cover charge, like taking the GMAT is for entering an MBA program. What I'd like to see is a time when talking heads in the media have to disclose their Investing IQ -- what a great way to sort out hot air.

Going mobile? kaChing's iPhone app is now available for free. You can get real-time quotes for US exchange listed equities/ETFs, check out your account, get top-rated research, look for investing ideas, etc. I recommend checking out the "Investing Ideas" screen. You can see investment ideas where the smart money differs from the rest of the pack. Want another idea, just shake the iPhone. This alone is worth using kaChing.

To stay tuned for the trade mirroring feature coming in September, send an email to 'advanced_notice@kaching.com'. This is the company to watch. And it's worth noting, kaChing is an SEC Registered Investment Advisor.

Disclosure: no positions

Tuesday, June 9, 2009

Airbus looking at descent until it adds pilot override

I think we'll see accelerating cancellations of Airbus orders. It will have little to do with whether Airbus can identify and correct sensor and/or other problems with Air France Flight 447, which went down a week ago while en route from Rio de Janeiro to Paris. Rather, I believe it will be because the Flight 447 crash will make the public aware of something that must be very unsettling to pilots -- Airbus has a design philosophy of using computer fly-by-wire, without the ability of pilots to override! Of course, Boeing jets also operate with fly-by-wire, but at least pilots' inputs can override the computer.

I don't know if this design philosophy truly reflects a difference between American and European cultures. But I would say, to the public, this is not a story about culture or mechanical sensors. It's a story about whether the pilots in the cockpit have a fighting chance to bring you back alive, if something goes very wrong. That is an epic human story, and I would not want to be on the wrong marketing side of it. If I were an airline, I would have already cancelled any Airbus orders possible. That's exactly what I expect to see happen.

Note to Orbitz et al. Offer an extra search option to select for aircraft with pilot override. It would be a great differentiator.

Disclosure: no positions

Sunday, May 10, 2009

Hadoop should target C++/LLVM, not Java (because of watts)

Over the years, there have been many contentious arguments about the performance of C++ versus Java. Oddly, every one I found addressed only one kind of performance (work/time). I can't find any benchmarking of something at least as important in today's massive-scale-computing environments: work/watt. A dirty little secret about JIT technologies like Java is that they throw a lot more CPU resources at the problem, trying to get up to par with native C++ code. JITs use more memory, and periodically run background optimizer tasks. These overheads are somewhat offset in work/time performance by extra optimizations which can be performed with more dynamic information. But the result is a hungrier appetite for watts. Another dirty little secret about Java vs C++ benchmarks is that they compare single workloads. Try running 100 VMs, each with a Java and C++ benchmark in it, and Java's hungrier appetite for resources (MHz, cache, RAM) will show. But of course, Java folks don't mention that.

But let's say for the sake of (non-)argument, that Java can achieve a 1:1 work/time performance relative to C++, for a single program. If Java consumes 15% more power doing it, does it matter on a PC? Most people don't care. Does it matter for small-scale server environments? Maybe not. Does it matter when you deploy Hadoop on a 10,000 node cluster, and the holistic inefficiency (multiple things running concurrently) goes to 30%? Ask the people who sign the checks for the power bill. Unfortunately, inefficiency scales really well.
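To put rough numbers on that scaling, here is a minimal Python sketch. The 15% and 30% overheads and the 10,000-node cluster come from the paragraph above; the 250 W per-node draw and the $0.10/kWh electricity price are purely illustrative assumptions, not measurements, and cooling is ignored.

```python
HOURS_PER_YEAR = 24 * 365

def annual_overhead_cost(nodes, watts_per_node, overhead, dollars_per_kwh):
    """Yearly electricity cost of the *extra* power attributable to overhead.

    overhead is a fraction (0.15 means 15% more power for the same work).
    """
    extra_kw = nodes * watts_per_node * overhead / 1000.0
    return extra_kw * HOURS_PER_YEAR * dollars_per_kwh

# Assumed figures (hypothetical): 250 W per node, $0.10 per kWh.
WATTS = 250
PRICE = 0.10

single_pc = annual_overhead_cost(1, WATTS, 0.15, PRICE)
cluster = annual_overhead_cost(10_000, WATTS, 0.30, PRICE)

print(f"One PC at 15% overhead:        ~${single_pc:,.0f} per year")
print(f"10,000 nodes at 30% overhead:  ~${cluster:,.0f} per year")
```

Under those assumptions the single PC wastes a few tens of dollars a year, while the cluster wastes several hundred thousand. Plug in your own wattage and rates; the shape of the result is what matters.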

Btw, Google's MapReduce framework is C++ based. So is Hypertable, the clone of Google's Bigtable distributed data storage system. The rationale for choosing C++ for Hypertable is explained here. I realize that Java's appeal is the write-once, run anywhere philosophy as well as all the class libraries that come with it. But there's another way to get at portability. And that's to compile from C/C++/Python/etc to LLVM intermediate representation, which can then be optimized for whatever platform comprises each node in the cluster. A bonus in using LLVM as the representation to distribute to nodes is that OpenCL can also be compiled to LLVM. This retains a nice GPGPU abstraction across heterogeneous nodes (including those with GPGPU-like processing capabilities), without the Java overhead.

Now I don't have a problem with Java being one of the workloads that can be run on each Hadoop node (even script languages have their time and place). But I believe Hadoop's Java infrastructure will prove to be a competitive disadvantage, and will result in a massive amount of wasted watts. "Write once, waste everywhere..." In the way that Intel tends to retain a process advantage over other CPU vendors, I believe Google will retain a power advantage over others with their MapReduce (and well, their servers are well-tuned too).

Disclosure: no positions

Friday, May 8, 2009

50% of Cloud Compute Power & Carbon Waste Solved by Software Technique



Ever wonder why virtualized servers are usually run at nominal loads of 30-40%? It's because the speed at which VMs can be live migrated is much slower than the speed at which load spikes can arise. Thus, a large "head room" is built into the target loads at which IT is generally comfortable running their virtualized servers.

Faster networking would seem to save the day, but per-server VM density is increasing exponentially, along with the size of VMs (see latest VMworld talks). Thus there are many more (increasingly bigger) VMs to migrate, when load spikes occur.

Besides the drastic under-utilization of capital equipment, it's a shame that the load "head room" is actually in the most efficient sweet spot of the curve. The first VMs tasked on a server are the most power-costly, and the last ones are nearly free (except we only use that band for spike management).

If loads could comfortably be advanced from 40 to 80%, half (or more) of the compute-oriented wasted power consumption, resultant carbon footprint, and excess capital equipment could be saved. And that's exactly what my R&D showed can be done, with a software-only solution. And as an added value, the same technique accelerates VM migrations across data centers, which is where cloud computing is going.
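The "half" figure follows from simple arithmetic, sketched below in Python. The 400 units of work is a made-up number (one unit being what a single server does at 100% load), and the sketch counts servers only; it deliberately ignores the non-linear power curve described above, which would make the savings larger.

```python
def servers_needed(work_units, target_load_pct):
    """Servers required to carry work_units when each runs at target_load_pct.

    Uses integer ceiling division so a partial server rounds up to a whole one.
    """
    return (work_units * 100 + target_load_pct - 1) // target_load_pct

# Hypothetical aggregate demand: 400 fully-loaded servers' worth of work.
WORK = 400

at_40 = servers_needed(WORK, 40)
at_80 = servers_needed(WORK, 80)
saved = 1 - at_80 / at_40

print(f"Servers at 40% load: {at_40}")
print(f"Servers at 80% load: {at_80}")
print(f"Fraction of servers saved: {saved:.0%}")
```

Doubling the comfortable load from 40% to 80% halves the number of powered-on servers for the same work, whatever the actual demand is.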

Essentially, we're blowing an extraordinary amount of money and trashing the atmosphere because of outmoded virtualization software. None of the virtualization solutions from VMware, Citrix, or KVM have this new technique yet. And it's not available via Amazon's EC2. Here's to hoping the future is greener. As well, a high-end networking infrastructure throughout will be essential. Time is money, and time to migrate VMs translates to a lot of money! For the next cloud computing decade, "networking is power". Consider getting long cloud providers and virtualization vendors that support accelerated VM migration, and networking vendors that supply the goods. It nearly goes without mention that Cisco has entered the server market...

Please help spread the word, and feel free to bug your providers.

Disclosure: no positions, related patents pending

Thursday, April 23, 2009

Boosting server utilization 100% by accelerated VM migration (MemoryMotion™)

Recently I guest-wrote an article on virtualization.info about a technology I've been researching for some time, called MemoryMotion™. By providing a distributed memory disambiguation fabric, VM live migration time is greatly accelerated. Besides the obvious benefits, ultra-quick migration unlocks a healthy chunk of additional server utilization, currently inaccessible due to slow migration times in current hypervisors.

It's worth stepping back a moment and asking the question, "why can we only run a physical server at say a nominal 40% (non-VDI) / 60% (VDI) maximum load?" Based on intuition, one might answer that it's due to workload spikes, which cause us to provision conservatively enough to absorb spikes to an extent (the micro level), and fall back to distributed scheduling across servers to balance out at the macro level. But that's really how we deal with the problem, rather than an identification of the problem. The truth is that a good chunk of that nominal under-utilization comes from the slowness of live VM migration. It's easy to demonstrate this with a question. What could the nominal maximum load factor be, if we had infinitely fast VM migrations (0.0 seconds)? Probably 90%+ or so. Statistically, there are likely to be physical servers that are under-utilized at any one time. If we can quickly move VMs to those servers, we can adapt more quickly to load spikes, and thus achieve higher holistic utilization rates. The more servers one has, the more this is true.

Now, if we're using distributed VM scheduling for the purpose of power management, we want to pack as many VMs on as few powered-on physical servers as possible. In that case, we often have powered-off servers, which can be spun up quickly. And thus, we have even more head-room available. With infinitely fast migration and server power-ups, one could push physical server utilization near 100%, without losing quality of service. There's not much reason not to. This would yield a huge gain in power efficiency (and thus power costs).

The article highlights MemoryMotion™, a technology which, by providing much faster VM migrations, offers a mechanism to greatly boost server utilization. Using high speed network fabrics, we can finally look at distributed VM scheduling like an Operating System scheduler, because the scheduling time-frame drops into the sub-1-second range. This is how we can tap the next 100% (relative) of ROI/power-consumption improvements, as highlighted in the following slide.

As far as I'm aware, this could be the next killer virtualization feature since the advent of live VM migration. Feel free to contact me if you'd like to know more. Note that it is patent pending technology.

Disclosure: no positions

Tuesday, April 14, 2009

Inflows and outflows of US cities using a U-Haul barometer



Expanding on the U-Haul metric idea in the article "Fleeing Silicon Valley", I gathered costs of moving a household via U-Haul throughout a matrix of cities in the US. The thesis is that moving truck rentals will fairly accurately reflect the supply/demand equation, likely on a near real-time basis, giving us a current indicator of where the inflows and outflows are. This is interesting for a whole host of investment (and personal) themes, such as which direction commercial and residential real estate values will trend in various areas.

I picked 10 cities, which represent top startup locations in the US (a personal bias). To net out differences in travel distance, and to normalize values so that one can quickly compare and see trends, I gathered data for one-way moves in both directions between every pair of the 10 cities, and then divided the price of source-to-destination by the price of destination-to-source. What remains are ratios which show which way people are likely moving. The results are very enlightening.
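For anyone who wants to repeat the exercise, here is a minimal Python sketch of the ratio calculation. The prices below are invented placeholders, not the quotes I gathered, and averaging the inbound ratios per city is just one assumed way of turning the matrix into a ranking.

```python
# Hypothetical one-way quotes in dollars, keyed by (source, destination).
quotes = {
    ("Silicon Valley", "Austin"): 2000.0,
    ("Austin", "Silicon Valley"): 500.0,
    ("Silicon Valley", "Boston"): 3000.0,
    ("Boston", "Silicon Valley"): 1500.0,
    ("Austin", "Boston"): 1200.0,
    ("Boston", "Austin"): 1200.0,
}

def flow_ratios(quotes):
    """price(src -> dst) / price(dst -> src) for every pair quoted both ways.

    A ratio above 1 suggests trucks are in higher demand heading src -> dst.
    """
    ratios = {}
    for (src, dst), price in quotes.items():
        reverse = quotes.get((dst, src))
        if reverse:
            ratios[(src, dst)] = price / reverse
    return ratios

def inflow_scores(quotes):
    """Average of the inbound ratios per destination city (assumed ranking metric)."""
    inbound = {}
    for (_src, dst), ratio in flow_ratios(quotes).items():
        inbound.setdefault(dst, []).append(ratio)
    return {city: sum(r) / len(r) for city, r in inbound.items()}

scores = inflow_scores(quotes)
for city in sorted(scores, key=scores.get, reverse=True):
    print(f"{city}: {scores[city]:.2f}")
```

Because each ratio compares the same route in opposite directions, distance cancels out, which is the point of the normalization.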

The winners (highest in-flows 1st): Austin, Raleigh, Boulder, DC, Seattle.

The losers (highest out-flows 1st): LA, Silicon Valley, NY, San Diego.

Worth noting, Boston was nearly net-neutral, but it's interesting where it's picking up people from in droves (Silicon Valley, LA and San Diego). Most cities are poaching from California. This can't be a good thing for California's urban real estate markets, municipal bond ratings, nor for its state income tax receipts -- right at a time when revenues are falling short of projections.

Disclosure: no positions

Thursday, April 9, 2009

Portable Linux future using LLVM

Imagine a single Linux distribution that adapts to whatever hardware you run it on. When run on an Atom netbook, all the software shapes and optimizes to the feature set and processor characteristics of the specific version of Atom. Want to run it as a VM on a new Nehalem-based server? No problem, it re-shapes to fit the Nehalem (Core i7) architecture. And here, when I say "re-shape", I don't mean at the GUI level. Rather, I mean the software is effectively re-targeted for your processor's architecture, like it had been re-compiled.

Today, Linux vendors tend to make choices about which one or set of processor generations (and thus features) are targeted during compile-time, for the variety of apps included in the distro. This forces some least-common-denominator decisions, based on processor and compiler features, as known at the time of creation of a given distro version. In other words, decisions are made for you by the Linux distributor, and you're stuck with that. When you run a distro on new hardware, you may well not be able to access new features in the hardware you purchased. Or, a distro targeted for advanced hardware may not run on your older hardware. Such is the current state of affairs (or disarray as I like to think of it).

Enter LLVM, a compiler infrastructure which allows for compilation of programs to an intermediate representation (IR; think Java byte codes), optimizations during many phases (compile, link and run-times), and ultimately the transformation of IR to native machine code for your platform. I see the IR enabling powerful opportunities beyond the current usage -- one opportunity being, distributing Linux apps and libraries in IR format, rather than in machine code. The infrastructure on each machine (including back-ends specific to your platform) would then transform the IR to native machine code, potentially caching this translation for optimization's sake. The obvious benefit here is to enable much more adaptable/portable apps. But it also orthogonalizes compiler issues, such as which optimizations are used (and the stability of those optimizations), such that those can be updated in the back-end without touching the apps in IR form. Modularization is a good thing.

A resulting question is, how low-level could we push this in terms of software? At an LLVM talk I attended, there was mention that some folks played with compiling the Linux kernel using LLVM. I'd like to see that become a concerted effort from the processor vendors, as it would allow Linux to become more adaptive, perhaps more efficient, and show off features of new processors without having to adapt all the software. I have many related ideas on how to make this work out.

Especially given the trend towards some exciting convergence between CPU and GPU-like processing, and many available cores/threads, I think it's time we take a hard look at keeping distributed code in an IR-like form, and orthogonalizing the optimizations and conversion to native code. In a lot of ways, this will allow the processor vendors to innovate more rapidly without breaking existing code, while exposing the benefits of the innovations more rapidly without necessitating code updates.

Disclosure: no positions