After creating some script magic, I ran my thesis against the CrunchBase database (a kind of crowdsourced Wikipedia for startup information). The run showed a promising correlation between the failure predictor (I call it the f factor) and startups that hit the "dead pool". It was suggested that I run a logistic regression to quantify the relationship between my failure predictor f and the binary outcome of fail/win. With those results in hand, one can plug the predictive advantage into the Angel Investor Performance Project (AIPP) data and simulate the returns using a healthy-sized portfolio. The improvement takes the returns from an average payout multiple of 2.4x and an IRR of 30% to a multiple of 3.8x and an IRR of 46%!
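For readers who want to see what the regression step looks like, here is a minimal sketch. It is not my actual script: the data is synthetic and made up, the names (fit_logistic, f_values, failed) are hypothetical, and the coefficients used to generate the fake outcomes are arbitrary. It only illustrates fitting P(fail | f) with a plain gradient-descent logistic regression.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(f_values, failed, lr=0.5, epochs=2000):
    """Fit P(fail | f) = sigmoid(b0 + b1 * f) by batch gradient descent on log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(f_values)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for f, y in zip(f_values, failed):
            err = sigmoid(b0 + b1 * f) - y  # gradient of log-loss w.r.t. the linear term
            g0 += err
            g1 += err * f
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# ASSUMPTION: synthetic stand-in data, not CrunchBase records.
# Each fake startup gets an f score in [0, 1]; higher f means failure is more likely.
rng = random.Random(0)
f_values = [rng.uniform(0.0, 1.0) for _ in range(400)]
failed = [1 if rng.random() < sigmoid(-2.0 + 4.0 * f) else 0 for f in f_values]

b0, b1 = fit_logistic(f_values, failed)
p_low = sigmoid(b0 + b1 * 0.1)   # estimated failure probability at a low f
p_high = sigmoid(b0 + b1 * 0.9)  # estimated failure probability at a high f
print(f"intercept={b0:.2f} slope={b1:.2f} P(fail|f=0.1)={p_low:.2f} P(fail|f=0.9)={p_high:.2f}")
```

A positive slope means a higher f goes with a higher estimated chance of landing in the dead pool. The gap between the two probabilities is the kind of predictive advantage one would then feed into a portfolio simulation.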
Now, that makes a bucket of assumptions, e.g. how the predictability of f is distributed across investments. At any rate, the bigger point here is that even a crowdsourced database can enhance returns. Adding prediction markets in a crowdfunding environment, where all the players are involved, could increase this even further. As I mentioned in the epilogue of the book:
"Human versus the machine: will we create algorithms to make early picks of winning startups, which are better than many humans (like has been done in chess)?"Disclosure: no positions