Big data startup Skytree emerged from stealth mode on Thursday with a product designed to democratize the science of machine learning while significantly improving on the speed and scale of existing options. Skytree has raised a $1.5 million Series A investment round from Javelin Venture Partners.
Machine learning is a particularly complex approach to big data, and one that has largely been confined to the most advanced companies, such as financial institutions or large web properties. The technique enables systems to get smarter the more data they ingest, which is particularly useful for tasks such as finding hidden patterns or accurately classifying data without human interaction. The libraries and algorithms are out there for anyone with sufficient skill to use, but deploying a system that can perform the task on large data sets with reasonable performance is the hard part.
That’s the problem Skytree thinks it has solved with its eponymous Skytree Server. R, Informatica and Matlab have been around, as have open source libraries such as Apache Mahout, but “we’ve basically introduced massively scalable machine learning,” said Co-Founder and CEO Martin Hack. It’s one thing to have the best algorithms (something Skytree’s team of machine-learning Ph.D.s has been working on), but it’s something else to offer a server designed from the ground up to perform the task across thousands of machines, as today’s big data environments require.
Skytree Server connects to any number of existing data stores, including Hadoop, and, says Hack, is tens of thousands of times faster than existing tools, performing in minutes tasks that would have taken hours or days. As of now, it’s tuned to five specific use cases the company says are the most common — recommendation systems, anomaly/outlier identification, predictive analytics, clustering and market segmentation, and similarity search. Hack says most interest from early adopters has come from financial services institutions, as well as from web companies that want to analyze information on par with what Google does, or to improve the speed and accuracy of their ad-placement systems.
Co-Founder and CTO Alexander Gray, who’ll be speaking at our Structure: Data conference next month in New York, said Skytree’s greatest value might be its utility. Whereas most machine-learning projects today focus on a single task, Skytree’s server provides a toolbox that can do many things. “Real data analysis requires a toolbox,” he said, “not just one highly optimized solution.”