A predictive model from machine learning service BigML claims it can help predict whether a Kickstarter campaign will be successful. The model analyzed nearly 17,000 campaigns and seems to show that target goals and the number of backers matter a lot. It’s hard to be certain how accurate the model is, but it is fun to play with.
Before digging into the Kickstarter data, though, a refresher on how BigML works is helpful. It’s a machine-learning service that discovers patterns within large datasets and then generates predictive models based on the data. The models are displayed as decision trees that place the factors most highly correlated with the target outcome up top and work their way down to less-predictive factors.
There’s also a feature that lets users predict the outcome of any given situation (a Kickstarter campaign, in this instance) by entering specific data points and receiving a prediction. Alternatively, BigML can ask users a series of questions based on the correlations it discovered, almost like a game of 20 questions, and the system will answer once it has enough info to make a prediction.
The Kickstarter predictions (available to play around with here) can get pretty complicated, but there are some strong indicators of success. The most important is the number of backers: the model is 80.5 percent confident that campaigns with more than 34 backers will succeed, while it’s 65 percent confident those with fewer than 34 backers will fail.
Up next is the target goal, with smaller amounts resulting in a higher likelihood of success. The model is 92.6 percent confident that campaigns with more than 34 backers and aiming for less than $8,815.85 will succeed, while the confidence level is only 62 percent for those aiming for more. And if a successful campaign is all that matters, the target goal really should be less than $4,844.48 (a confidence score of 96.3 percent versus 86 percent for higher goals).
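To make those top splits concrete, here is a minimal Python sketch of just the branches described above. It is not BigML's model or its API: the function name predict_top_branches is made up for illustration, the handling of a campaign with exactly 34 backers or a goal sitting exactly on a threshold is an assumption, and I'm reading the 62 percent figure as confidence in success.

```python
from typing import Tuple

# Thresholds and confidence figures are the ones quoted in the text above.
BACKER_SPLIT = 34
GOAL_SPLIT_HIGH = 8815.85
GOAL_SPLIT_LOW = 4844.48


def predict_top_branches(backers: int, goal: float) -> Tuple[str, float]:
    """Return (predicted outcome, confidence in percent) using only the top splits.

    Hypothetical helper for illustration; the real model has many more branches.
    """
    if backers <= BACKER_SPLIT:
        # 65 percent confident that campaigns with fewer than 34 backers fail.
        # Assumption: exactly 34 backers is grouped with this branch.
        return ("fail", 65.0)
    if goal >= GOAL_SPLIT_HIGH:
        # More than 34 backers but a goal of $8,815.85 or more.
        # Assumption: the 62 percent figure is confidence in success.
        return ("succeed", 62.0)
    if goal < GOAL_SPLIT_LOW:
        # More than 34 backers and a goal under $4,844.48.
        return ("succeed", 96.3)
    # More than 34 backers, goal between $4,844.48 and $8,815.85.
    return ("succeed", 86.0)


if __name__ == "__main__":
    print(predict_top_branches(222, 10000.0))
    print(predict_top_branches(20, 3000.0))
```

For instance, a campaign with 222 backers and a $10,000 goal lands in the 62 percent branch here, though the full model keeps splitting on other factors from that point, so its final answer can differ.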
However, few things, if any, are actually deal-breakers, according to the BigML model. For every decision a Kickstarter entrepreneur makes about his or her campaign, or however many backers a campaign attracts, there’s still a branch of factors that can suggest success or failure. The amount of money each backer pledges can make a big difference, obviously, and in some cases not choosing categories such as food, publishing or gaming can make a difference.
The likelihood that my hypothetical campaign with 222 backers, a goal of $10,000 and an average of $25 per backer will succeed: High. The likelihood that my hypothetical campaign with 100 backers, a $20,000 goal, an average of $40 per backer and 6,697 Facebook followers will succeed: Low.
Kickstarter, for what it’s worth, gives a less-nuanced prediction of success that doesn’t always jibe with the model’s predictions: “Of the projects that have reached 20% of their funding goal, 82% were successfully funded. Of the projects that have reached 60% of their funding goal, 98% were successfully funded.”
Perhaps this apparent discrepancy has something to do with the limited dataset that the model (created by BigML’s Justin Donaldson using data compiled by Dan Misener at Kickback Machine) uses. It takes into account fewer than 17,000 campaigns that had a 52 percent success rate, whereas Kickstarter has handled more than 84,000 campaigns with a 43.6 percent success rate. Of the 35,282 that have been successful as of 11:22 a.m. PT on Jan. 25, approximately 80 percent raised less than $9,999.
Even if it’s not entirely accurate, the BigML model is a lot of fun to play around with. And I think the service itself is indicative of a forthcoming wave of consumer-friendly web services that let anyone with some interesting data try to play data scientist. We’ll be talking a lot about the cutting edge of machine learning and big data at our Structure: Data conference (March 20-21 in New York), but there’s also a need to bring these concepts down to the masses (or at least non-data-scientists). And while machine learning might be a difficult thing to make foolproof, just being able to start working with it for free is pretty powerful.