
Summary:

Mapr, a stealth-mode start-up with about 30 employees, is developing a version of Hadoop and plans to compete with the likes of Cloudera. The company is likely to launch later this year and has been funded by Lightspeed Venture Partners and NEA.

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther

The Hadoop and Beyond Panel at Structure: Big Data

Hadoop, the open-source file system and MapReduce implementation for massive-scale data, was the talk of our Structure Big Data conference in New York on Wednesday. From new Hadoop distributions to end-customers’ plans, Hadoop was all anyone could talk about. One of the names that cropped up in conversations was Mapr, a stealth-mode company that is building a proprietary version of Hadoop and is likely to launch later this year.

Mapr, based in San Jose, Calif., has been in the works for nearly two years. Securities and Exchange Commission filings show the company has raised about $9 million in funding from Barry Eggers of Lightspeed Venture Partners and Peter Sonsini of New Enterprise Associates. On its web site, the company says it’s “engineering game changing Map/Reduce related technologies.” Its ambitions aren’t limited by that somewhat ambiguous statement.

People Behind Mapr:

  • M.C. Srivas, an ex-Googler, is the founder and CTO of the company.
  • John Schroeder, formerly of Lightspeed VC and former CEO of Calista Technologies (acquired by Microsoft) and Rainfinity (acquired by EMC), is the CEO and co-founder of Mapr.
  • The company has close to 30 employees, many of them based in India.
  • Ted Dunning, chief scientist at Site Tuner and Veoh Networks, is the chief application architect at Mapr. He created the recommendation engine for Musicmatch, a music service that was popular before iTunes came on the scene. He is also one of the key guys behind the Apache Mahout data-mining project.

What Is Mapr Doing?

They are said to be building a proprietary replacement for the Hadoop Distributed File System that’s allegedly three times faster than the current open-source version. It comes with snapshots and no NameNode single point of failure (SPOF), and is supposed to be API-compatible with HDFS, so it can be a drop-in replacement.

The Road Ahead

Mapr might have an edge over Apache Hadoop in the interim, but Apache is working to improve the HDFS architecture in its distribution, and should have its own snapshot feature sometime in 2012. Also, Appistry sells a NameNode-free HDFS alternative based on its distributed CloudIQ Storage offering. As for the speed advantage, I don’t have any details for now, but if you have some thoughts, please share them with us.

On a broader canvas, I think Mapr is up against a whole lot of major competitors. Cloudera has a lead in the commercial marketplace, and the Apache Hadoop distribution on which it’s based keeps improving thanks to upgrades from contributors like Facebook and Yahoo. Apache Hadoop gives companies more control over their data, as they are not held hostage by a vendor, and surveys and anecdotal evidence alike suggest that Apache Hadoop is still the most widely used version.

  1. The MapR website says San Jose, CA (not Saratoga), then again it also says (C) 2009… maybe their site is out of date?

    1. Thank you for the comment. The site shows San Jose, but the first address listed on the SEC filing is Saratoga. We changed it to acknowledge that the paperwork was probably filed before the company had office space.

    2. I was going with the first/only official address there is on a formal document.

  2. small correction, CTO is M.C. Srivas, not M. C. Srinivas.

    1. Thanks for the comment; we’ve made the change.

  3. No mention of Pervasive datarush, which is an already shipping parallel and Hadoop-compatible programming environment?

  4. +1 DataStax’s new Brisk project to integrate Cassandra with Hadoop.

  5. Couple of companies are already trying out MapR product and loving it

    1. Can you tell us which are those companies?

      1. Now you know :)

