If you have a lot of unstructured data, don’t have (or want) a Hadoop cluster and can write Python jobs, Mortar Data has got the service for you. The New York-based startup, which just emerged from stealth mode with an undisclosed seed round, is jumping into the fray with possibly the most lightweight Hadoop-in-the-cloud service yet.
The way Co-Founder and CEO K Young described it to me during a recent call, Mortar Data resides in the big data nether region between infrastructure and visualization. It’s an area, he said, where companies still need skilled people to run good analytics jobs, but where the skills to manage the requisite big data environments tend to be lacking. The result is that some companies either aren’t adopting Hadoop or aren’t fully utilizing it.
It’s something Young and his co-founders experienced during their shared past life at Wireless Generation. Hadoop was a godsend because it let the team scale the data pipelines it was building, but they learned fast that operating an optimally functioning Hadoop cluster is hard. Where it solved one problem, Young noted, it created a new one.
And so came the idea for Mortar Data: Solve the Hadoop management and deployment problem “so [users] can be using Hadoop productively in an hour.” Actually, the company’s homepage points out, “[I]f you’re familiar with Pig, it’s more like 6 minutes.”
Although it uses Amazon EC2 for processing and Amazon S3 for storage, Mortar Data differs from Amazon’s Elastic MapReduce, or services like Sungard’s new Unified Analytics Solution, on the development side. Rather than requiring users to write complicated MapReduce jobs, Mortar Data lets them run Hadoop jobs using a custom-built combination of Python and the SQL-like Pig language. In that regard, Mortar Data is more akin to the Infochimps Platform, which offers its own Wukong tool for creating Hadoop jobs with Ruby scripts.
Mortar Data scored its first paying customer about six weeks ago after forming in August, Young said, and now counts a number of users, primarily in the advertising, mobile and web spaces where companies have lots of unstructured data and already use Amazon S3.
As great as it might sound, though, Young isn’t short-sighted enough to think Mortar Data has created the perfect Hadoop service. No one cares if you just process data and put the results in S3, he said; users will demand services that make it easy to get data out of Hadoop and “into something [they] can use.” With that in mind, the company is undertaking a partnership strategy with third-party analytics and visualization companies.
It has also had requests to make its software run on companies’ on-premises Hadoop clusters. There’s definitely value in doing that for large-enough customers, Young acknowledged, but Mortar Data is too young to expand into the software-provider space just yet: “We’re denying those requests right now.”