<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; Hbase</title>
	<atom:link href="http://gigaom.com/tag/hbase/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Fri, 24 May 2013 21:33:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; Hbase</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>WibiData gets $15M to help it become the Hadoop application company</title>
		<link>http://gigaom.com/2013/05/23/wibidata-gets-15m-to-help-it-become-the-hadoop-application-company/</link>
		<comments>http://gigaom.com/2013/05/23/wibidata-gets-15m-to-help-it-become-the-hadoop-application-company/#comments</comments>
		<pubDate>Thu, 23 May 2013 11:31:17 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[OPower]]></category>
		<category><![CDATA[predictive analytics]]></category>
		<category><![CDATA[WibiData]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=648663</guid>
		<description><![CDATA[Startup WibiData has raised another $15 million and wants to turn the lessons it has learned in the field into generic software that can let anyone build predictive applications on Hadoop.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.wibidata.com/">WibiData</a> &#8212; the big data startup from Cloudera Co-founder Christophe Bisciglia and Aaron Kimball &#8212; doesn&#8217;t have <em>overly</em> big plans. It only wants to become one of the first, if not the first, company selling off-the-shelf software that lets other companies build valuable, customer-facing applications on Hadoop. On Thursday, WibiData announced $15 million in Series B funding from Canaan Partners, as well as existing investors NEA and Google Chairman Eric Schmidt, to help make the goal a reality. </p>
<p>Kidding aside, that&#8217;s actually quite an ambitious goal in a Hadoop market that&#8217;s big and growing, but that&#8217;s exemplified by expensive consulting arrangements and purpose-built applications. Even more so for companies that want to do something other than transform unstructured data into structured data (often called ETL) or run back-office analytics jobs. In fact, WibiData has spent the last 18 months doing just this type of deal, and Bisciglia says every single customer has already engaged with one of the big three Hadoop vendors (Cloudera, Hortonworks and MapR). </p>
<p>Home energy-management startup <a href="http://gigaom.com/2012/11/19/opower-the-big-data-energy-player-to-beat/">Opower</a> is a good example of this process. It&#8217;s actually one of Cloudera&#8217;s banner customers, but &#8220;when they wanted to take [their software-as-a-service tool] beyond batch analysis and ETL workloads,&#8221; Bisciglia said, Opower came to WibiData. So whereas the Opower service was originally focused on nightly data analysis comparing users&#8217; energy usage against that of other users, it&#8217;s now working on dynamic recommendations for users and letting them engage with the application in new ways.</p>
<div id="attachment_648685" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/05/wibi-kiji.jpg"><img  alt="The WibiData architecture" src="http://gigaom2.files.wordpress.com/2013/05/wibi-kiji.jpg?w=300&#038;h=224" width="300" height="224" class="size-medium wp-image-648685" /></a><p class="wp-caption-text">The WibiData architecture</p></div>
<p>During these engagements, WibiData <a href="http://gigaom.com/2012/03/22/wibidata-structure-data-2012/">has been building up its core technology</a> for connecting those brawny back-office Hadoop environments to predictive customer-facing applications &#8211; a collection of HBase, data-formatting tools and machine learning algorithms that the company <a href="http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/">has been slowly open-sourcing under the Kiji banner</a>. It has also been learning the similarities among the applications it&#8217;s building for customers in the same field, figuring out what&#8217;s repeatable. What does any given company in the retail space, for example, need to get started on <a href="http://gigaom.com/2013/05/08/why-3-celebrity-data-scientists-are-willing-to-work-for-free-for-you/">its own recommendation engine</a>? </p>
<p>And now, Bisciglia says, WibiData is going to double down on building application software based on what it has learned. The first two industries it targets will likely be financial services and retail, two areas where the company has seen a lot of traction. He envisions the finished product including some pre-defined schema for formatting data and some pre-built predictive models, both broadly applicable across that industry rather than specific to a single user. </p>
<p>There will also be different interfaces that allow different types of users (e.g., data scientists, systems engineers and business users) to interact with the data in the ways they need to. </p>
<p>Time will tell if WibiData can actually accomplish its goal of turning Hadoop into a collection of somewhat specialized software packages, but someone has to. Even industry heavyweights like Cloudera see the need, but their hands are full just getting Hadoop integrated into existing environments and getting those early uses up and running. As Cloudera CEO Mike Olson <a href="http://gigaom.com/2012/03/21/cloudera-structure-data-2012/">said at Structure: Data in 2012</a> to anyone ambitious enough to tackle the Hadoop-application gap, &#8220;Call me, I’ll connect you with funding. The money is out there.&#8221; </p>
<p>If you want to hear more about the need for Hadoop applications, check out this panel from Structure: Data 2013, where I speak with WibiData&#8217;s Omer Trajman, Continuuity&#8217;s Jonathan Gray and Pivotal&#8217;s Muddu Sudhakar. <span class='embed-youtube' style='text-align:center; display: block;'><iframe class='youtube-player' type='text/html' width='604' height='370' src='http://www.youtube.com/embed/z7BhGEQX9BQ?version=3&#038;rel=1&#038;fs=1&#038;showsearch=0&#038;showinfo=1&#038;iv_load_policy=1&#038;wmode=transparent' frameborder='0'></iframe></span></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/23/wibidata-gets-15m-to-help-it-become-the-hadoop-application-company/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/wibi-founders.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/wibi-founders.png?w=150" medium="image">
			<media:title type="html">wibi founders</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/wibi-kiji.jpg?w=300" medium="image">
			<media:title type="html">The WibiData architecture</media:title>
		</media:content>
	</item>
		<item>
		<title>Database startup Drawn to Scale is closing down</title>
		<link>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/</link>
		<comments>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/#comments</comments>
		<pubDate>Fri, 17 May 2013 21:24:03 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Drawn to Scale]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[SQL on Hadoop]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=646718</guid>
		<description><![CDATA[Database startup Drawn to Scale, creator of the SQL-on-Hadoop technology called Spire, is closing down. The company's product, Spire, was one of the first SQL-on-Hadoop technologies.]]></description>
				<content:encoded><![CDATA[<p>Database startup Drawn to Scale, creator of the SQL-on-Hadoop technology called Spire, is closing down. Co-founder and CEO Bradford Stephens officially <a href="http://www.roadtofailure.com/?p=11">announced the closure in a blog post</a> on Friday.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png"><img  alt="spirearchitecture-015-e1361407038325" src="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png?w=300&#038;h=185" width="300" height="185" class="alignleft size-medium wp-image-646740" /></a>The company&#8217;s product, Spire, which provided full SQL support on top of the HBase NoSQL database, was one of the first products to <a href="http://gigaom.com/2012/07/24/how-one-startup-wants-to-inject-hadoop-into-your-sql/">try to blend Hadoop&#8217;s scalability with the robustness and familiarity of SQL</a>. That&#8217;s now <a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">an increasingly crowded space</a> (and has grown since that linked graphic was created). In March, Drawn to Scale <a href="http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/">expanded its support to MongoDB</a>, as well.</p>
<p>I wasn&#8217;t shocked when Stephens told me the news &#8212; questions about the four-year-old company&#8217;s financial health had been swirling for a while &#8212; but to hear of its financial woes was a bit surprising. His account in the post pretty much echoes what I had heard from others:</p>
<blockquote id="quote-it-seemed-we-had-eve"><p>&#8220;It seemed we had everything going for us — paid customers such as American Express, Orange Telecom, Flurry, and 4 others. Our technology worked brilliantly, we had a big hiring pipeline, and we had great media presence against our competitors who raised 10-100x more cash.&#8221;</p></blockquote>
<p>He added:</p>
<blockquote id="quote-yet-five-days-before2"><p>&#8220;Yet five days before we signed term sheets for a big A round or sold the company, we started getting hit by a series of black swans — and we just didn’t have what we needed to recover. I’ll leave the public detail at that level, but I will say that paying employees’ health insurance out of your meager savings is a powerful incentive to change course.&#8221;</p></blockquote>
<p>Up to this point, the company <a href="http://gigaom.com/2012/03/08/drawn-to-scale-raises-money-to-make-sql-big-data-ready/">had raised $925,000</a> from RTP Ventures, IA Ventures and SK Ventures. There&#8217;s no word yet on what will come of the company&#8217;s intellectual property.</p>
<p>As Stephens &#8212; who&#8217;s now doing an entrepreneur-in-residence gig at Ping Identity and helping out other startups (including popular wardrobe app <a href="http://www.clothapp.com/">Cloth</a>) &#8212; succinctly put it during a phone discussion, &#8220;We just don&#8217;t have the horsepower to keep running the company.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/dtsdragon.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/dtsdragon.png?w=150" medium="image">
			<media:title type="html">dtsdragon</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png?w=300" medium="image">
			<media:title type="html">spirearchitecture-015-e1361407038325</media:title>
		</media:content>
	</item>
		<item>
		<title>MapR releases M7, its commercial HBase distro</title>
		<link>http://gigaom.com/2013/05/01/mapr-releases-m7-its-commercial-hbase-distro/</link>
		<comments>http://gigaom.com/2013/05/01/mapr-releases-m7-its-commercial-hbase-distro/#comments</comments>
		<pubDate>Wed, 01 May 2013 23:21:07 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=641425</guid>
		<description><![CDATA[MapR on Wednesday released its commercial version of HBase called M7, the first such product on the market, that the company claims is bigger, faster and better than the open source version.]]></description>
				<content:encoded><![CDATA[<p>MapR didn&#8217;t miss the memo about the key to success in the Hadoop space being the creation of a data platform that can do many things. And on Wednesday, the company released its take on HBase, <a href="http://www.mapr.com/products/mapr-editions/m7-edition">called M7.</a></p>
<p>Last week, I <a href="http://gigaom.com/2013/04/22/how-hbase-converted-myspaces-mysql-champion-and-is-driving-hadoop-mainstream/">explained how HBase is fast becoming the star of the Hadoop ecosystem</a> because it allows users to build more real-time, almost transactional applications on top of Hadoop. True to its form with its other products, MapR has taken HBase even further with M7 by promising greater availability (99.999 percent), instant recovery, faster operations and the ability to handle 1 trillion tables in a single cluster. In open source versions of HBase, MapR VP of Marketing Jack Norris told me, the accepted table limit per cluster is several hundred.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/m7.jpg"><img  alt="m7" src="http://gigaom2.files.wordpress.com/2013/05/m7.jpg?w=300&#038;h=265" width="300" height="265" class="alignright size-medium wp-image-641471" /></a>Additionally, M7 shares a single data layer with the Hadoop file system, meaning less performance overhead and, presumably, easier management.</p>
<p>As we&#8217;re seeing with other Hadoop vendors, including Cloudera (which <a href="http://gigaom.com/2013/04/30/with-impala-now-ga-clouderas-ceo-sizes-up-the-sql-on-hadoop-market/">released its Impala SQL query engine on Tuesday</a>), the Hadoop market is fast becoming one where each vendor is trying to set itself apart from the rest by building the best platform with the broadest set of capabilities. In furtherance of that mission, MapR also announced on Wednesday full-text search on its Hadoop distribution thanks to a partnership with Lucene specialist LucidWorks. It already has its own Hadoop distribution complete with proprietary code to bolster the file system and speed up MapReduce, as well as an <a href="http://gigaom.com/2012/08/17/for-fast-interactive-hadoop-queries-drill-may-be-the-answer/">open source SQL-on-Hadoop project called Drill</a> in the works.</p>
<p>MapR employees are probably sleeping a lot easier these days as a result of this platform push. Others in the Hadoop market used to talk about the fear of fragmentation and then point at MapR as the example of a company helping foment that outcome with its proprietary software. Now, however, even if everyone else is building open source products, they&#8217;re all still backing their own and largely dismissing the others.</p>
<p>I suspect the result is feature lock-in even if there&#8217;s no technological lock-in, kind of <a href="http://gigaom.com/2011/03/16/how-amazon-is-following-apples-lead-to-rule-cloud-computing/">like using Amazon Web Services for cloud computing</a> and then hoping to replicate its various services elsewhere. It might be easy enough to move your data, but impossible or very difficult to replicate those additional capabilities elsewhere. If MapR can build a better version of HBase and companies are willing to pay for it, then so be it.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/01/mapr-releases-m7-its-commercial-hbase-distro/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/09/shutterstock_110961494.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/09/shutterstock_110961494.jpg?w=150" medium="image">
			<media:title type="html">Database rows</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/m7.jpg?w=300" medium="image">
			<media:title type="html">m7</media:title>
		</media:content>
	</item>
		<item>
		<title>How HBase converted MySpace&#8217;s MySQL champion and is driving Hadoop mainstream</title>
		<link>http://gigaom.com/2013/04/22/how-hbase-converted-myspaces-mysql-champion-and-is-driving-hadoop-mainstream/</link>
		<comments>http://gigaom.com/2013/04/22/how-hbase-converted-myspaces-mysql-champion-and-is-driving-hadoop-mainstream/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 18:14:03 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Gravity]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Myspace]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[NoSQL]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=632738</guid>
		<description><![CDATA[Gravity CTO Jim Benedetto knows his way around MySQL after managing a 600-instance cluster at MySpace, but he has found HBase religion as his real-time content-recommendation platform grew. And he's not alone.]]></description>
				<content:encoded><![CDATA[<p>How&#8217;s this for an understatement: Operational databases are important for many, if not the majority of, web applications. And if you&#8217;re doing big business on the web, finding one that can scale with your data volumes and still perform like you need it to is critical. MapReduce for batch data processing and analysis? Not so much, actually.</p>
<p>That&#8217;s why as Hadoop keeps <a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">thundering toward its destination as the de facto data platform</a> for next-generation applications, companies such as Cloudera and Hortonworks that are making a killing off it might want to stop and thank <a href="http://www.searchenginecaffe.com/2007/05/hbase-powersets-bigtable.html">the guys from Powerset for building HBase</a>. Because the database &#8212; <a href="http://hbase.apache.org/">a columnar Google BigTable clone that runs on top of the Hadoop Distributed File System</a> &#8212; is so fast and scalable, it&#8217;s helping Hadoop find a home in companies and with applications that HDFS and MapReduce alone might not have been able to penetrate so easily.</p>
<p>The latest HBase user I&#8217;ve come across is <a href="http://www.gravity.com/">Gravity</a>, the <a href="http://gigaom.com/2012/03/15/the-personalized-web-is-just-an-interest-graph-away/">interest graph</a> company that powers content recommendations for some of the biggest publishers on the web.</p>
<h2 id="from-big-mysql-at-myspace-to-b">From big MySQL at MySpace to big data with HBase</h2>
<p>Its co-founders were all senior executives at MySpace, including Gravity CTO Jim Benedetto, who was SVP of technology for the social networking pioneer. He was actually MySpace&#8217;s first architect and helped build the platform&#8217;s MySQL database. Although MySpace never reached <a href="http://gigaom.com/2011/12/06/facebook-shares-some-secrets-on-making-mysql-scale/">Facebook&#8217;s scale</a>, it did have 150 million users at its peak, all able to store unlimited numbers of wall posts, messages and photos. Benedetto eventually oversaw a 600-instance cluster that required about 30 database administrators to keep it up and running.</p>
<div id="attachment_603574" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/01/1z5o2256.jpg"><img  alt="Structure Data 2012: Jim Benedetto – CTO, Gravity Ashlie Beringer – Partner, Gibson, Dunn &amp; Crutcher" src="http://gigaom2.files.wordpress.com/2013/01/1z5o2256.jpg?w=300&#038;h=200" width="300" height="200" class="size-medium wp-image-603574" /></a><p class="wp-caption-text">Benedetto (center) at Structure: Data 2012. (c) Pinar Ozger</p></div>
<p>So naturally, when it came time to build out the Gravity architecture, Benedetto opted for the MySQL he knew so well. Until about three years ago, he told me recently, that database held about 95 percent of the company&#8217;s data. At some point, though, Benedetto and his team realized they were spending way too much time keeping their MySQL environment up instead of building new things, so it was time for a change.</p>
<p>Gravity ultimately opted for HBase, but the decision wasn&#8217;t easy. &#8220;For us,&#8221; Benedetto said, &#8220;our data and algorithms are our company,&#8221; so making the move from a relational database to a column-based database that can serve MapReduce jobs was nerve-racking. After all, he explained, &#8220;You never want to migrate your data &#8230; and if you have to, you never want to migrate it more than once.&#8221; In fact, he added, &#8220;you&#8217;re not going back.&#8221;</p>
<p>But Benedetto says the move to HBase as Gravity&#8217;s primary data store has been &#8220;life-saving,&#8221; and it&#8217;s arguably a more important component of the company&#8217;s infrastructure than is Hadoop MapReduce. HBase handles the company&#8217;s real-time recommendation algorithms, and it does it across the entire Gravity platform rather than on a site-by-site basis. And although it&#8217;s not banking-grade when it comes to the consistency of transactions, Benedetto says it&#8217;s about 99.95 percent consistent in real time. Later on, batch MapReduce jobs swoop in and pick up whatever HBase dropped earlier, and process it all against the company&#8217;s graph algorithms.</p>
<div id="attachment_633095" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/04/canvas-copy.jpg"><img  alt="interest graph" src="http://gigaom2.files.wordpress.com/2013/04/canvas-copy.jpg?w=708&#038;h=708" width="708" height="708" class="size-large wp-image-633095" /></a><p class="wp-caption-text">An example of an interest graph from Gravity.</p></div>
<h2 id="scalable-for-sure-and-getting-">Scalable for sure, and getting easier to use</h2>
<p>And although it took some serious engineering effort to get HBase operational when Gravity began working with it three years ago, Benedetto thinks HBase is getting to the point (as is rival NoSQL database Cassandra, he acknowledged) where one could safely call it &#8220;enterprise-ready.&#8221; Right now, he noted, &#8220;You&#8217;re not gonna see HBase in a company that just buys Oracle because Oracle is the name and Oracle has been around for 20 years,&#8221; but for web startups that hope to reach a certain scale and even for existing companies that are running into the MySQL wall, he sees a shift occurring.</p>
<p>&#8220;The web farm is the easiest part of your infrastructure to scale because all it does is cost more money,&#8221; Benedetto explained. Databases, on the other hand, require a lot of thinking about how to migrate data, shard the database and otherwise make a piece of software likely designed for a handful of servers, max, spread across dozens or hundreds. HBase really eases the scaling process, as well as the subsequent management, he said. Now, Gravity&#8217;s 100-node HBase cluster has only two operations engineers dedicated to it.</p>
<p>Indeed, there are startups trying to capitalize on HBase by <a href="http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/">using it to power SQL and even MongoDB-compliant databases</a> that can scale beyond what most relational databases can do.</p>
<p>Aside from scale, HBase might soon start catching on because of the work companies like Gravity have been doing to make it more user-friendly. It might scale easily, but, as Benedetto noted, it&#8217;s not always easy to get started with &#8212; especially without some deep understanding of the intricacies of the underlying HDFS infrastructure. Last year, eBay VP of Experience, Search and Platforms Hugh Williams <a href="http://gigaom.com/2012/01/31/under-the-covers-of-ebays-big-data-operation/">told me that although HBase is one of the big data tools the company is most excited about</a>, it&#8217;s also the area where he&#8217;d like to see the most improvement.</p>
<p>To help ease the learning curve, Gravity has <a href="http://www.gravity.com/labs/hpaste/">developed an open-source tool called HPaste</a> that lets developers access data and run jobs on HBase data using Scala rather than the more-bloated Java programming language on which Hadoop and HBase are built. One of the biggest benefits of HPaste, Benedetto said, is that it lets new HBase developers see the data in a way that makes sense to them: HBase stores everything in byte arrays, he explained, and &#8220;when a human tries to read a byte array, it looks like ancient hieroglyphics.&#8221;</p>
<div id="attachment_633093" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/04/kiji-org-architecture1.png"><img  alt="Kiji architecture" src="http://gigaom2.files.wordpress.com/2013/04/kiji-org-architecture1.png?w=300&#038;h=275" width="300" height="275" class="size-medium wp-image-633093" /></a><p class="wp-caption-text">The Kiji architecture</p></div>
<p>Elsewhere, a startup called WibiData has <a href="http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/">created an open-source framework called Kiji</a> that aims to provide a collection of high-level APIs that should make it easier to store different data types in and develop applications on HBase. The company envisions Kiji being to HBase what the Spring Framework has become to Java over the course of the past decade.</p>
<h2 id="hadoops-weapon-for-the-mainstr">Hadoop&#8217;s weapon for the mainstream?</h2>
<p>But user experience aside, a lot of companies already invested in Hadoop &#8212; beyond <a href="http://gigaom.com/2011/03/04/how-facebook-is-powering-real-time-analytics/">expert users such as Facebook</a> &#8212; are starting to see the promise of HBase and are incorporating it into their architectures.</p>
<p>WibiData co-founder Christophe Bisciglia, who also co-founded Hadoop pioneer Cloudera in 2008, gave me his take on the state of HBase while <a href="http://gigaom.com/2013/03/12/hadoops-past-present-and-future-a-gigaom-special-report/">discussing its role in the future of Hadoop</a> earlier this year. &#8220;If you talk to anyone from Cloudera or any of the platform vendors, I think they will tell you that a large percentage of their customers use HBase. It&#8217;s something that I only expect to see increasing,&#8221; he explained. &#8220;&#8230; HBase is gonna be what takes Hadoop from an ETL and BI platform into a real-time application platform.&#8221;</p>
<div id="attachment_633120" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/04/cloudera_enterprise_diagram.png"><img  alt="The Cloudera Hadoop stack (Gravity uses Cloudera's distro)." src="http://gigaom2.files.wordpress.com/2013/04/cloudera_enterprise_diagram.png?w=300&#038;h=165" width="300" height="165" class="size-medium wp-image-633120" /></a><p class="wp-caption-text">The Cloudera Hadoop stack (Gravity uses Cloudera&#8217;s distro).</p></div>
<p>Benedetto appears to agree. He considers Hadoop as a whole incredibly important, almost on par with what Amazon Web Services did for computing resources, because it lets startups use commercial-grade open source software to do data storage and processing that previously was only available to massive web companies. &#8220;More and more &#8230; the shining star in that suite is HBase,&#8221; he said. &#8220;If I were Oracle, I&#8217;d be scared.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/04/22/how-hbase-converted-myspaces-mysql-champion-and-is-driving-hadoop-mainstream/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" medium="image">
			<media:title type="html">Shiny database</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/01/1z5o2256.jpg?w=300" medium="image">
			<media:title type="html">Structure Data 2012: Jim Benedetto – CTO, Gravity Ashlie Beringer – Partner, Gibson, Dunn &#38; Crutcher</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/04/canvas-copy.jpg?w=708" medium="image">
			<media:title type="html">interest graph</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/04/kiji-org-architecture1.png?w=300" medium="image">
			<media:title type="html">Kiji architecture</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/04/cloudera_enterprise_diagram.png?w=300" medium="image">
			<media:title type="html">The Cloudera Hadoop stack (Gravity uses Cloudera&#039;s distro).</media:title>
		</media:content>
	</item>
		<item>
		<title>Sector RoadMap: SQL-on-Hadoop platforms in 2013</title>
		<link>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/</link>
		<comments>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/#comments</comments>
		<pubDate>Wed, 20 Mar 2013 12:00:16 +0000</pubDate>
		<dc:creator>Joseph Turian</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[apache-hive]]></category>
		<category><![CDATA[aster]]></category>
		<category><![CDATA[Aster Big Analytics Appliance]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[BigInsights]]></category>
		<category><![CDATA[Citus Data]]></category>
		<category><![CDATA[CitusDB]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Concurrent]]></category>
		<category><![CDATA[Database theory]]></category>
		<category><![CDATA[Dremel]]></category>
		<category><![CDATA[Drill]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hadoop Distributed File System]]></category>
		<category><![CDATA[HAWQ]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[HCatalog]]></category>
		<category><![CDATA[HDFS]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Impala]]></category>
		<category><![CDATA[JethroData]]></category>
		<category><![CDATA[karmasphere]]></category>
		<category><![CDATA[Lingual]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[MemSQL]]></category>
		<category><![CDATA[microstrategy]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[MPP]]></category>
		<category><![CDATA[NewSQL]]></category>
		<category><![CDATA[Optiq]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[PostGIS]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[RainStor]]></category>
		<category><![CDATA[Salesforce.com]]></category>
		<category><![CDATA[SAP]]></category>
		<category><![CDATA[SAP HANA]]></category>
		<category><![CDATA[Splice Machine]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQL 92]]></category>
		<category><![CDATA[SQL-H]]></category>
		<category><![CDATA[SQLStream]]></category>
		<category><![CDATA[Stinger]]></category>
		<category><![CDATA[tableau]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[VoltDB]]></category>
		<category><![CDATA[zookeeper]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?post_type=go-report&#038;p=171512/</guid>
		<description><![CDATA[Today’s most successful companies are the ones with the ability to capture and analyze all data available to them. Enter SQL-on-Hadoop solutions, which increase the accessibility of Hadoop and allow organizations to reuse their investment in learning SQL.]]></description>
				<content:encoded><![CDATA[<p>Today’s most successful companies are the ones with the ability to capture and analyze all data available to them. Enter SQL-on-Hadoop solutions, which increase the accessibility of Hadoop and allow organizations to reuse their investment in learning SQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/elephant.jpg?w=150" />
		<media:content url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/elephant.jpg?w=150" medium="image">
			<media:title type="html">elephant</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>Drawn to Scale wants to make MongoDB scale like Hadoop</title>
		<link>http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/</link>
		<comments>http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/#comments</comments>
		<pubDate>Tue, 19 Mar 2013 17:00:13 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Drawn to Scale]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=621885</guid>
		<description><![CDATA[Database startup Drawn to Scale has extended its Spire distributed data platform from SQL to MongoDB. That means users can get high performance from the latter even across hundreds of terabytes.]]></description>
				<content:encoded><![CDATA[<p>If you love MongoDB but are tired of trying to scale it past a handful of machines and a few hundred gigabytes, database startup <a href="http://drawntoscale.com/">Drawn to Scale</a> says it has you covered. The company has <a href="http://drawntoscale.com/announcing-spire-for-mongo/">expanded the functionality of its distributed data platform from SQL to MongoDB</a>, meaning users of the popular NoSQL database can import their data to Spire and see high performance on hundreds of terabytes.</p>
<p>Drawn to Scale’s flagship product, called Spire, is a distributed data platform that’s built atop an optimized version of the Hadoop-based HBase database. HBase is what lets Spire scale out cheaply and easily. Its fully distributed index is what lets Spire read and write data at speeds that other approaches to scaling databases (e.g., sharding) can’t match while still supporting rich queries.</p>
<p>To date, <a href="http://gigaom.com/2012/07/24/how-one-startup-wants-to-inject-hadoop-into-your-sql/">the company has been focused on letting users run massive SQL databases</a>, but it has finally completed a lengthy process of rewriting parts of MongoDB to work with Spire, Founder and CEO Bradford Stephens (who’ll be participating in our <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=621885+drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems&amp;utm_content=dharrisstructure">Structure: Data event</a> this week in New York) told me. The company had been keeping the work under tight wraps “because we didn’t know how long it was going to take to build,” he added.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/03/spiremongo-230x300.png"><img alt="SpireMongo-230x300" src="http://gigaom2.files.wordpress.com/2013/03/spiremongo-230x300.png?w=708"   class="alignright size-full wp-image-621963"></a>“Our big vision for the market is providing people with a universal data platform,” Stephens said. After SQL — which accounts for the vast majority of databases in existence — MongoDB is a logical next step (although Spire also supports queries using Hadoop MapReduce). It’s the most widely used NoSQL database by a long shot, but although many users love its functionality and tooling, <a href="http://gigaom.com/2012/05/29/with-42m-more-10gen-wants-to-take-mongodb-mainstream/">the database is notoriously poor at scaling</a> to meet the demands of big data or high performance.</p>
<p>“You just sort of top out once you max out the memory,” Stephens explained, adding that MongoDB often starts getting inefficient as it’s forced to scale across 50 or 100 servers. “[T]hat’s where we <em>start</em> getting efficient.”</p>
<p>Now, without changing a single line of code, he claims, MongoDB users can import their data onto Spire and start handling 200-plus terabytes with ease. Of course, he noted, this doesn’t mean MongoDB users will abandon the database entirely. They might keep it for running applications that don’t require it to scale beyond a single server, and then use Spire to store big data for analytical purposes.</p>
<p>Initially, Spire will just support data importation and the basic CRUD (create, read, update, delete) functions of MongoDB, Stephens said. Later this year, assuming users want it, Drawn to Scale will implement MongoDB’s native MapReduce functionality as well as its management features.</p>
<p>As data volumes and data stores continue to proliferate, though, Drawn to Scale isn’t the only startup trying to provide a one-stop shop experience. At least for analytics, Citus Data is building a Postgres-based database <a href="http://gigaom.com/2013/02/19/citusdb-today-sql-on-hadoop-tomorrow-the-world/">capable of analyzing SQL, Hadoop and MongoDB data</a>, although each data store remains external. And there’s a <a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">whole group of companies merging SQL and Hadoop</a> for analytic workloads that might be wise to consider supporting operational data stores such as MongoDB, as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" medium="image">
			<media:title type="html">Shiny database</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/spiremongo-230x300.png" medium="image">
			<media:title type="html">SpireMongo-230x300</media:title>
		</media:content>
	</item>
		<item>
		<title>In battle for Hadoop, MapR raises $30M</title>
		<link>http://gigaom.com/2013/03/18/in-battle-for-hadoop-mapr-raises-30m/</link>
		<comments>http://gigaom.com/2013/03/18/in-battle-for-hadoop-mapr-raises-30m/#comments</comments>
		<pubDate>Mon, 18 Mar 2013 22:30:50 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=621549</guid>
		<description><![CDATA[Hadoop vendor MapR is racking up customers, and on Monday it announced a $30 million venture-capital investment that brings its total funding to $59 million since launching in 2011.]]></description>
				<content:encoded><![CDATA[<p>There’s a lot of positioning within the Hadoop community over who has the most contributors to Apache Hadoop and whose distribution is the most open source. Depending on the source, MapR <a href="http://gigaom.com/2012/02/27/hadoop-bigger-than-spring-jboss-and-mysql-combined/">might be singled out as the antithesis of what Hadoop should be</a>. But MapR doesn’t mind the digs: <a href="http://www.mapr.com/">The company</a> is racking up customers and just closed a $30 million venture-capital investment that brings its total funding to $59 million since launching in 2011.</p>
<p>Because <a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">its roots are as an open-source project</a>, some members of the Hadoop community are rightfully concerned about keeping it as open as possible. This gives customers more flexibility in moving from product to product, they argue, and could help prevent a technological splintering like the one that hit Unix in the 1980s and significantly slowed the popular operating system’s uptake and rise to ubiquity.</p>
<div id="attachment_621723" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/03/mapr_control_system2.png"><img alt="MapR's feature list" src="http://gigaom2.files.wordpress.com/2013/03/mapr_control_system2.png?w=300&#038;h=300" width="300" height="300" class="size-medium wp-image-621723"></a><p class="wp-caption-text">MapR’s feature list</p></div>
<p>MapR catches some flak because it has made its name pushing a pair of Hadoop distributions (one free and one not) that are based on the company’s proprietary file system that it claims is significantly faster than the standard Hadoop Distributed File System that many of its competitors use. Last year, it announced a commercial version of the usually HDFS-based HBase database, currently in beta, that also includes many of MapR’s homegrown improvements around performance and reliability.</p>
<p>According to MapR VP of Marketing Jack Norris, though, the criticisms of its semi-proprietary approach aren’t entirely fair. He told me during a recent call that there are more than a dozen open-source packages within the company’s Hadoop distribution, and noted that allowing data access via <a href="http://en.wikipedia.org/wiki/Network_File_System">NFS</a> is hardly a tool of vendor lock-in.</p>
<p>The company is also <a href="http://gigaom.com/2012/08/17/for-fast-interactive-hadoop-queries-drill-may-be-the-answer/">spearheading the Apache Drill project</a>, an open-source re-envisioning of <a href="http://gigaom.com/2013/03/14/google-bigquery-is-now-even-bigger/">Google’s Dremel</a> for SQL-like queries on Hadoop data. Tomer Shiran, MapR’s director of product management, will be discussing the project during a panel at our <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=621549+in-battle-for-hadoop-mapr-raises-30m&amp;utm_content=dharrisstructure">Structure: Data</a> conference this week in New York.</p>
<p>But at the end of the day, MapR is a business and it’s doing what it can to make money in the new world of big data. If customers want features they can’t get from open-source versions of Hadoop, MapR will gladly supply them. In fact, Norris said, open source is “really not a core issue that comes up during the sales cycle.” (Norris took a more-defensive tone in a discussion about this topic last year: “No one can name the top 5 or 10 engineers on Oracle’s database,” he told me, “and no one really cares.”)</p>
<p>Norris points to <a href="http://blogs.gartner.com/merv-adrian/2013/03/09/open-source-purity-hadoop-and-market-realities/">a recent blog post from Gartner analyst Merv Adrian</a> in defending his company’s position. Addressing the concern over open source and Hadoop — particularly as it relates to MapR and <a href="http://gigaom.com/2013/02/25/emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya/">former OEM partner EMC</a> — Adrian wrote: “Having some components of your solution stack provided by the open source community is a fact of life and a benefit for all. So are roads, but nobody accuses Fedex or your pizza delivery guy of being evil for using them without contributing some asphalt.”</p>
<p>But MapR could just as easily point to its customer list and partnerships to prove the effectiveness of its approach, at least. Norris said its customers in fields such as advertising and retail analyze data on more than 90 percent of the internet population monthly and more than a trillion dollars in transactions every year. (It’s pretty mum on naming customers, although Norris did cite ComScore and Ancestry.com as users.) <a href="http://gigaom.com/2012/06/13/amazon-taps-mapr-for-high-powered-elastic-mapreduce/">Both Amazon Web Services</a> and <a href="http://www.mapr.com/company/press-releases/google-compute-engine-and-mapr-technologies-crush-minutesort-record">Google have partnered</a> with MapR to boost Hadoop performance on their cloud platforms.</p>
<p>Still, Hadoop is relatively young as a commercial technology and it’s very early on for Hadoop as an IT market all its own. What customers like now might not be what they like forever, and there’s plenty of competition for those workloads and dollars. When you look at its bigger, better-funded and better-known competitors such as <a href="http://gigaom.com/2012/12/06/cloudera-snares-big-65m-more-to-boost-international-enterprise-growth/">Cloudera</a>, Hortonworks, <a href="http://gigaom.com/2013/02/25/emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya/">EMC Greenplum</a> and <a href="http://gigaom.com/2013/02/26/cloudera-who-intel-announces-its-own-hadoop-distribution/">now Intel</a>, it’s easy to see just how tough a fight MapR has in front of it.</p>
<p>Norris isn’t sweating it, though. “The big major weakness that needs to be addressed [with Hadoop] is the dynamic read/write capability of HDFS,” he told me. As long as the other players keep relying on HDFS at the storage layer, MapR will at least have a strong point of differentiation.</p>
<p>Mayfield Fund led MapR’s latest investment round, and existing investors Lightspeed Venture Partners, NEA and Redpoint Ventures also participated.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/18/in-battle-for-hadoop-mapr-raises-30m/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_70904386.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_70904386.jpg?w=150" medium="image">
			<media:title type="html">Fighting elephants</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/mapr_control_system2.png?w=300" medium="image">
			<media:title type="html">MapR&#039;s feature list</media:title>
		</media:content>
	</item>
		<item>
		<title>5 reasons why the future of Hadoop is real-time (relatively speaking)</title>
		<link>http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/</link>
		<comments>http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/#comments</comments>
		<pubDate>Thu, 07 Mar 2013 13:00:37 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[real-time processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=616972</guid>
		<description><![CDATA[In Part III of our look at all things Hadoop, we examine the trends driving Hadoop's future. At the end of the day, everything is pushing Hadoop toward being just generally faster and easier to consume.]]></description>
				<content:encoded><![CDATA[<p>In some ways, Hadoop is like a fine wine: It gets better with age as rough edges (or flavor profiles) are smoothed out, and those who wait to consume it will probably have a better experience. The only problem with this is that Hadoop exists in a world that’s more about <a href="http://www.urbandictionary.com/define.php?term=md+20%2F20">MD 20/20</a> than it is about <a href="http://www.winespectator.com/display/show?id=47374">Relentless Napa Valley 2008</a>: Companies often want to drink their big data fast, get drunk on insights, and then have some more, maybe something even stronger. And with data, unlike technology and tannins, it turns out older isn’t always better.</p>
<p>That’s a crude analogy, of course, but it gets at the essence of what’s currently plaguing Hadoop adoption and what will propel it forward in the next couple years. The work being done by companies like Cloudera and Hortonworks at the distribution level is great and important, as is MapReduce as a processing framework for certain types of batch workloads. But not every company can afford to be concerned about managing Hadoop on a day-to-day basis. And <a href="http://gigaom.com/2012/07/07/why-the-days-are-numbered-for-hadoop-as-we-know-it/">not every analytic job pairs well with MapReduce</a>.</p>
<p>In Part I of our four-part series on Hadoop, we <a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">looked at how the technology was born</a> and grew into the juggernaut it is today. In Part II, <a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">we laid out the map of the current products and projects</a> that comprise the Hadoop ecosystem. In this installment, we’ll take a closer look at some of them and how they’re positioning themselves to be important players down the road. Finally, <a href="http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/">Part IV will highlight some of the best Hadoop applications and seminal moments in Hadoop history</a>, as reported by GigaOM over the years.</p>
<p>If there’s one big Hadoop theme at our <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=616972+5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking&amp;utm_content=dharrisstructure">Structure: Data conference</a> March 20-21 in New York, it’s the new realization that people shouldn’t be asking “What’s next after Hadoop?” but rather “What will Hadoop become next?”. Based on what’s transpiring today, the answer to that question is that Hadoop will become faster in all regards and more useful as a result.</p>
<h2 id="interactivity-big-data-style">Interactivity, big-data-style</h2>
<div id="attachment_612788" class="wp-caption alignright" style="width: 310px"><img alt="Source: Shutterstock user hauhu." src="http://gigaom2.files.wordpress.com/2013/02/shutterstock_37622056.jpg?w=300&#038;h=225" width="300" height="225" class="size-medium wp-image-612788"><p class="wp-caption-text">Source: Shutterstock user hauhu.</p></div>
<p>As I explained in some detail a couple weeks ago, <a href="http://gigaom.com/2013/02/21/sql-is-whats-next-for-hadoop-heres-whos-doing-it/">SQL is what’s next for Hadoop</a>, and that’s not because of familiarity alone or the types of queries permitted by SQL on relational data. It’s also because the types of massively parallel processing engines developed to analyze relational data over the years are very fast. That means analysts can ask questions and get answers at speeds much closer to the speed of their intuitions than is possible when querying entire data sets using standard MapReduce.</p>
<p>But just as SQL and its processing techniques bring something to Hadoop, Hadoop (the Hadoop Distributed File System, specifically) brings something to the table, too. Namely, it brings scale and flexibility that don’t exist in the traditional data warehouse world, where new hardware and licenses can be expensive; so only the “valuable” data makes its way inside and only after it has been fitted to a pre-defined structure. Hadoop, on the other hand, provides virtually unlimited scale and schema-free storage, so companies can store however much information they want in whatever format they want and worry later about what they’ll actually use it for. (Actually, though, most Hadoop jobs do require some sort of structure in order to run, and Hadoop co-creator Mike Cafarella is <a href="http://cloudera.github.com/RecordBreaker/">working on a project called RecordBreaker</a> that aims to automate this process for certain data types.)</p>
<p>How hot is the SQL-on-Hadoop space? I profiled the companies and projects working on it on Feb. 21, and since then EMC Greenplum <a href="http://gigaom.com/2013/02/25/emc-to-hadoop-competition-see-ya-wouldnt-wanna-be-ya/">announced a completely rewritten Hadoop distribution</a> that fuses its analytic database to Hadoop, and an entirely new player called <a href="http://jethrodata.com/">JethroData</a> emerged along with $4.5 million in funding. Even if there’s a major shakeout, there will be a few lucky companies left standing to capitalize on a shift to Hadoop as <em>the</em> center of data gravity that EMC Greenplum’s Scott Yara (albeit a biased source) thinks will be the data equivalent of the mainframe’s demise.</p>
<h2 id="this-is-your-database-this-is-">This is your database. This is your database on HDFS</h2>
<p>The SQL versus NoSQL debate appears to be dying down as companies and developers begin to realize there’s definitely a place for both in most environments, but a new debate — with Hadoop at the center — might be about to start up. At its core is <a href="http://datagravity.org/">the concept of data gravity</a> and the large, attractive (in a gravitational sense) entity that is HDFS. Here’s the underlying question that might be posed: If I’m already storing my unstructured data in HDFS and am expected to replace my data warehouse with it, too, why would I also run a handful of other databases that require a separate data store?</p>
<p>This is in part why <a href="http://hbase.apache.org/">HBase</a> has attracted such a strong following despite its relative technical and commercial immaturity compared with the comparable NoSQL database <a href="http://cassandra.apache.org/">Cassandra</a>. For applications that would benefit from a relational database, startups such as <a href="http://gigaom.com/2012/07/24/how-one-startup-wants-to-inject-hadoop-into-your-sql/">Drawn to Scale</a> and <a href="http://gigaom.com/2012/10/17/batten-down-the-analysts-its-a-big-data-bi-storm/">Splice Machine</a> have turned HBase into a transactional SQL system. WibiData, the <a href="http://gigaom.com/2012/02/07/hadoop-startup-wibidata-raises-5m-to-power-web-analytics/">new startup from Cloudera co-founder Christophe Bisciglia and Aaron Kimball</a>, is <a href="http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/">pushing an open source framework called Kiji</a> to make it easier to develop applications that use HBase.</p>
<p>“If you talk to anyone from Cloudera or any of the platform vendors, I think they will tell you that a large percentage of their customers use HBase,” Bisciglia said. “It’s something that I only expect to see increasing.”</p>
<p>MapR seems to think so, too: the Hadoop-distribution vendor is getting ahead of the game by <a href="http://www.mapr.com/products/mapr-editions/m7-edition">selling an enterprise-grade version of HBase called M7</a>. Should hot startups such as <a href="http://gigaom.com/2012/04/13/meet-tempodb-a-database-startup-with-an-eye-for-time/">TempoDB</a> and <a href="http://gigaom.com/2013/01/16/has-ayasdi-turned-machine-learning-into-a-magic-bullet/">Ayasdi</a> decide to take their HBase-reliant cloud services into the data center, they’ll tap into Hadoop clusters, too.</p>
<p>And the National Security Agency built <a href="http://accumulo.apache.org/">Apache Accumulo</a>, a key-value database similar to HBase but designed for fine-grained security and massive scale. It’s now <a href="http://sqrrl.com/">being sold commercially by a startup called Sqrrl</a>. There’s even a graph-processing project called <a href="http://incubator.apache.org/giraph/">Giraph</a> that relies on HBase or Accumulo as the database layer.</p>
<h2 id="whatever-real-time-means-to-yo">Whatever “real-time” means to you</h2>
<p>Real-time is one of those terms that means different things to different people and different applications. The interactivity that SQL-on-Hadoop technologies promise is one definition, as is the type of stream processing <a href="http://gigaom.com/2011/08/04/twitter-to-open-source-hadoop-like-tool/">enabled by technologies like Storm</a>. When it comes to the latter, there’s a lot of excitement around YARN as the innovation that will make it happen.</p>
<p><a href="http://hortonworks.com/blog/introducing-apache-hadoop-yarn/">YARN, aka MapReduce 2.0</a>, is a resource scheduler and distributed application framework that allows Hadoop users to run processing paradigms other than MapReduce. This could mean many things, from traditional parallel-processing methods such as MPI to graph processing to newly developed stream-processing engines such as Storm and <a href="http://incubator.apache.org/s4/">S4</a>. Considering how many years <em>Hadoop</em> meant <em>HDFS and MapReduce</em>, this type of flexibility is certainly a big deal.</p>
<p><img alt="figure1" src="http://gigaom2.files.wordpress.com/2013/03/figure1.gif?w=300&#038;h=216" width="300" height="216" class="size-medium wp-image-617741 alignleft">Stream processing, of course, is the antithesis of batch processing, for which Hadoop is known, and which is inherently too slow for workloads such as serving real-time ads or monitoring sensor data. And even if Storm and other stream-processing platforms somehow don’t make their way onto Hadoop clusters, <a href="http://gigaom.com/2013/02/14/hstreaming-ready-to-show-the-world-its-real-time-hadoop/">a startup called HStreaming has made it its mission</a> to deliver stream processing to Hadoop, and <a href="http://www.continuuity.com/technology">it’s on other companies’ radars, as well</a>.</p>
<p>For what it’s worth, though, <a href="http://verticloud.com/">VertiCloud</a> Founder and CEO and former Yahoo CTO Raymie Stata thinks we should do away with terms such as batch, real-time and interactive altogether. Instead, he prefers the terms synchronous and asynchronous to describe the human experience with the data rather than the speed of processing it. Synchronous computing happens at the speed of human activity, generally speaking, while asynchronous computing is largely decoupled from the idea of someone sitting in front of a computer screen awaiting a result.</p>
<p>The change in terms is associated with a change in how you manage SLAs for applications. Uploading photos to Flickr: synchronous. Running a MapReduce job: most likely asynchronous. Ironically, according to Stata, stream processing data with Storm is often asynchronous, too. That’s because there’s probably not someone on the other end waiting for a page to update or a query to return. And unless you’re doing something where guaranteed real-time latency is <em>necessary</em>, the occasional difference between milliseconds and 1 second probably isn’t critical.</p>
<iframe width="100%" height="166" scrolling="no" frameborder="no" src="http://w.soundcloud.com/player?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F80972108%253Fsecret_token%253Ds-1QBTa"></iframe>
<h2 id="time-to-insight-starts-at-the-">Time to insight starts at the planning phase</h2>
<p>Even when MapReduce is the answer, though, not everyone is game for a long Hadoop deployment process coupled with a consulting deal to identify uses and build applications or workflows. Sometimes, you just want to buy some software and get going.</p>
<p>Already, companies such as WibiData and Continuuity are trying to make it easier for companies to build Hadoop applications specific to their own needs, and WibiData’s Bisciglia said his company is doing less and less customization the more it deals with customers in the same vertical markets. “I think it’s still a couple years out before you can buy a generic application that runs on Hadoop,” he told me, but he does see opportunity for billion-dollar businesses at this level, possibly selling the Hadoop equivalent of an ERP or CRM application.</p>
<div id="attachment_603561" class="wp-caption alignright" style="width: 310px"><img alt="Structure Data 2012: Michael Olson – CEO, Cloudera" src="http://gigaom2.files.wordpress.com/2013/01/1z5o1503.jpg?w=300&#038;h=200" width="300" height="200" class="size-medium wp-image-603561"><p class="wp-caption-text">Cloudera CEO Mike Olson at Structure: Data 2012<br>(c) 2012 Pinar Ozger pinar@pinarozger.com</p></div>
<p>And Cloudera CEO Mike Olson <a href="http://gigaom.com/2012/03/21/cloudera-structure-data-2012/">told the audience at our Structure: Data conference last year</a> that he’ll connect startups trying to build Hadoop-based applications with funding opportunities. In fact, Cloudera backer Accel Partners <a href="http://gigaom.com/2011/11/08/accel-forms-100m-fund-to-feed-big-data-apps/">launched a Big Data Fund in 2011</a> with the sole purpose of funding application-level big data startups.</p>
<p>But maybe Cloudera, like database vendor Oracle before it, will just get into the application space itself. According to Hadoop creator and Cloudera chief architect Doug Cutting:</p>
<blockquote id="quote-i-wouldnt-be-surpris"><p>“I wouldn’t be surprised if you see vendors, like Cloudera, starting to creep up the stack and sell some applications. You’ve seen that before from Red Hat, from Oracle. You could argue that the relational database is a platform for Oracle and they’ve sold a lot of applications on top. So I think that happens as the market matures. When it’s young, we don’t want to stomp on potential collaborators at this point, we want to open that up to other people to really enhance the platform.”</p></blockquote>
<p>Cloud computing is proving to be a big help in getting Hadoop projects off the ground, too. Even low-level services such as Amazon Elastic MapReduce can <a href="http://gigaom.com/2012/02/22/how-infochimps-wants-to-become-heroku-for-hadoop/">ease the burden of managing a physical Hadoop cluster</a>, and there are already a handful of cloud services <a href="http://gigaom.com/2012/04/05/kontagent-turns-data-mining-into-saas-for-mobile-apps/">exposing Hadoop as a SaaS application</a> for business intelligence and analytics. The easier it gets to store, process and analyze data in the cloud, the more appealing Hadoop looks to potential users who can’t be bothered to invest in yet another IT project.</p>
<h2 id="google-and-microsoft-a-guiding">Google (and Microsoft): A guiding light</h2>
<p>Lest we forget, Hadoop is based on a set of Google technologies, and it seems likely its future will also be influenced by what Google is doing. Already, <a href="http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/Federation.html">improvements to HDFS</a> seem to mirror <a href="http://www.theregister.co.uk/2009/08/12/google_file_system_part_deux/">changes to the Google File System a few years back</a>, and YARN will enable some new types of non-MapReduce processing similar to what <a href="http://research.google.com/pubs/pub36726.html">Google’s new Percolator framework</a> does. (Google claims Percolator lets it “process the same number of documents per day, while reducing the average age of documents in Google search results by 50%.”) The MapR-led Apache Drill project <a href="http://gigaom.com/2012/08/17/for-fast-interactive-hadoop-queries-drill-may-be-the-answer/">is a Hadoop-based version of Google’s Dremel tool</a>; Giraph was likely inspired by Google’s <a href="http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html">Pregel graph-processing technology</a>.</p>
<p>Cutting is particularly excited about Google Spanner, a database system that <a href="http://gigaom.com/2012/09/17/googles-spanner-a-database-that-knows-what-time-it-is/">spans data geographies while still maintaining transactional consistency</a>. “It’s a matter of time before somebody implements that in the Hadoop ecosystem,” he said. “That’s a huge change.”</p>
<p>It’s possible Microsoft could be an inspiration to the Hadoop community, too, especially if it begins to surface pieces of its Bing search infrastructure as products, as a couple of company executives have told me it will. Bing <a href="http://research.microsoft.com/en-us/events/fs2011/helland_cosmos_big_data_and_big_challenges.pdf">runs on a combination of tools called Cosmos, Tiger and Scope</a>, and it’s part of the Online Services division run by former Yahoo VP and Hadoop backer Qi Lu. Lu said that Microsoft (like Google) is looking beyond just search (Hadoop’s original function) and into building an information fabric that changes how data is indexed, searched for and presented.</p>
<p>However it evolves, though, it’s becoming pretty obvious that Hadoop is no longer just a technology for doing cheap storage and some MapReduce processing. “I think there’s still some doubt in people’s minds about whether Hadoop is a flash in the pan … and I think they’re missing the point,” Cutting said. “I think that’s going to be proven to people in the next year.”</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" medium="image">
			<media:title type="html">gigaom hadoop icon final</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/02/shutterstock_37622056.jpg?w=300" medium="image">
			<media:title type="html">Source: Shutterstock user hauhu.</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/figure1.gif?w=300" medium="image">
			<media:title type="html">figure1</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/01/1z5o1503.jpg?w=300" medium="image">
			<media:title type="html">Structure Data 2012: Michael Olson – CEO, Cloudera</media:title>
		</media:content>
	</item>
		<item>
		<title>How to use big data to make better business decisions</title>
		<link>http://pro.gigaom.com/report/how-to-use-big-data-to-make-better-business-decisions/</link>
		<comments>http://pro.gigaom.com/report/how-to-use-big-data-to-make-better-business-decisions/#comments</comments>
		<pubDate>Thu, 07 Mar 2013 07:55:06 +0000</pubDate>
		<dc:creator>Paul Miller</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Apache Mahout]]></category>
		<category><![CDATA[Apache Solr]]></category>
		<category><![CDATA[apache-hadoop]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[business]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[cloud-infrastructure]]></category>
		<category><![CDATA[dashboard reporting]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[data management]]></category>
		<category><![CDATA[data storage]]></category>
		<category><![CDATA[data volume]]></category>
		<category><![CDATA[data-analytics]]></category>
		<category><![CDATA[data-management tools]]></category>
		<category><![CDATA[database management systems]]></category>
		<category><![CDATA[decision-making]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[enterprise IT]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Hewlett-Packard]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[LucidWorks]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[META Group]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Neo4j]]></category>
		<category><![CDATA[open-web]]></category>
		<category><![CDATA[Opera Solutions]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Organizational memory]]></category>
		<category><![CDATA[paypal]]></category>
		<category><![CDATA[Piyanka Jain]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[structured data]]></category>
		<category><![CDATA[Transaction processing]]></category>
		<category><![CDATA[unstructured data]]></category>
		<category><![CDATA[Variety]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[Vertica]]></category>
		<category><![CDATA[Vertica and Autonomy]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[Voldemort]]></category>
		<category><![CDATA[volume]]></category>
		<category><![CDATA[Wired]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?post_type=go-report&#038;p=170651/</guid>
		<description><![CDATA[Companies are rushing to embrace the promise of big data to understand both their businesses and the ways in which customers interact with them. But effective data-based decisions are not made in response to simplistic data reporting; they are made in response to considered and ongoing data analysis.]]></description>
				<content:encoded><![CDATA[<p>Companies are rushing to embrace the promise of big data to understand both their businesses and the ways in which customers interact with them. But effective data-based decisions are not made in response to simplistic data reporting; they are made in response to considered and ongoing data analysis.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/report/how-to-use-big-data-to-make-better-business-decisions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/big-data-on-computer-image.jpg?w=150" />
		<media:content url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/big-data-on-computer-image.jpg?w=150" medium="image">
			<media:title type="html">big data on computer image</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/7c1b4afa924d36a76027fe2be0543eeb?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">cloudofdata</media:title>
		</media:content>
	</item>
		<item>
		<title>WibiData open sources Kiji to make HBase easier</title>
		<link>http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/</link>
		<comments>http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/#comments</comments>
		<pubDate>Wed, 14 Nov 2012 15:42:53 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[application development]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[WibiData]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=584564</guid>
		<description><![CDATA[HBase is a great option for developing big data applications, but it's not necessarily easy to use. WibiData is addressing this by open sourcing a portion of its predictive analytics infrastructure that adds structure to data, followed eventually by a whole HBase development framework called Kiji.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.wibidata.com/">WibiData</a>, the Hadoop-based <a href="http://gigaom.com/cloud/hadoop-startup-wibidata-raises-5m-to-power-web-analytics/">user analytics startup from Cloudera co-founder Christophe Bisciglia</a>, has open sourced part of its software stack that&#8217;s designed to make it easier for developers to build big data apps on the HBase NoSQL database. Called <a href="http://www.kiji.org/">KijiSchema</a>, the technology is a Java API for adding schema to data flowing into HBase so that applications needing to analyze the data can actually know something about it.</p>
<p>As WibiData product manager Devjit Chakravarti told me during a recent call, KijiSchema essentially &#8220;takes the &#8216;No&#8217; out of NoSQL.&#8221; What he means is that although NoSQL databases such as HBase are lauded in part because they can store unstructured data and don&#8217;t require rigid rules for data formatting like relational databases do, having some structure is actually necessary once you want to do meaningful analysis on it. That&#8217;s why some commercial products, such as <a href="http://gigaom.com/cloud/how-one-startup-wants-to-inject-hadoop-into-your-sql/">Drawn to Scale&#8217;s Spire</a> and <a href="http://gigaom.com/data/batten-down-the-analysts-its-a-big-data-bi-storm/">Splice Machine&#8217;s Splice SQL Engine</a>, already have built functional SQL databases on top of HBase.</p>
<div id="attachment_584629" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/11/kimball.jpg"><img  title="kimball" alt="" src="http://gigaom2.files.wordpress.com/2012/11/kimball.jpg?w=708"   class="size-full wp-image-584629" /></a><p class="wp-caption-text">Kimball speaking at Structure: Data in 2012<br />(c) 2012 Pinar Ozger. pinar@pinarozger.com</p></div>
<p>&#8220;If you can&#8217;t store data in an organized way, you can&#8217;t analyze it effectively,&#8221; WibiData Co-Founder and CTO Aaron Kimball explained. KijiSchema isn&#8217;t part of WibiData&#8217;s secret sauce around predictive analytics for user data, he added, but nothing gets done without it.</p>
<p>Here&#8217;s how Kimball describes how KijiSchema manages data <a href="http://www.wibidata.com/2012/11/14/the-kiji-project-an-open-source-framework-for-building-big-data-applications-with-apache-hbase/">in a blog post announcing the project</a>:</p>
<blockquote><p>&#8220;KijiSchema gives developers the ability to easily store both structured and unstructured data within HBase using Avro serialization. It supports a variety of rich schema features, including complex, compound data types, HBase column key and time-series indexing, as well as cell-level evolving schemas that dynamically encode version information.</p>
<p>&#8220;KijiSchema promotes the use of entity-centric data modeling, where all information about a given entity (user, mobile device, ad, product, etc.), including dimensional and transaction data, is encoded within the same row. This approach is particularly valuable for user-based analytics such as targeting, recommendations, and personalization.&#8221;</p></blockquote>
<div id="attachment_584626" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/11/wibi-kiji.jpg"><img  title="wibi kiji" alt="" src="http://gigaom2.files.wordpress.com/2012/11/wibi-kiji.jpg?w=300&#038;h=224" height="224" width="300" class="size-medium wp-image-584626" /></a><p class="wp-caption-text">Kiji resides in the lower left section</p></div>
<p>The coolest part for HBase developers or prospective HBase developers, however, might be that KijiSchema isn&#8217;t just code but is already pre-packaged and ready to deploy. WibiData has created what it calls the Kiji BentoBox &#8212; &#8220;a fully-functional HBase mini-cluster with KijiSchema on your machine with minimal configuration in under 15 minutes&#8221; &#8212; that&#8217;s <a href="http://www.kiji.org/getstarted/#Downloads">available for download on GitHub</a>.</p>
<p>KijiSchema is also part of a broader Kiji framework for HBase that WibiData plans to open source over the next year or so. People perceive HBase as being complicated to set up and having a steep learning curve, Kimball said, and his team wants to make it more accessible and lower the barrier to getting started. The ultimate goal is to put the types of HBase applications <a href="http://gigaom.com/cloud/how-facebook-is-powering-real-time-analytics/">that folks at Facebook</a>, <a href="http://gigaom.com/cloud/under-the-covers-of-ebays-big-data-operation/">eBay</a> and other large web shops are building within reach of any developer.</p>
<p>WibiData&#8217;s Omer Trajman, formerly VP of technology solutions at Cloudera, describes the ultimate Kiji framework as being akin to what the <a href="http://www.springsource.org/spring-framework">Spring framework</a> is for Java. Despite its complexity, &#8220;there are also tens of thousands of developers who have been able to figure [HBase] out,&#8221; he said, but getting there might take weeks of intensive training on the low-level guts of the Hadoop Distributed File System and other underlying components. Why learn to build an enterprise Java application from scratch, Trajman asked, when you can just use Spring?</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/11/14/wibidata-open-sources-kiji-to-make-hbase-more-useful/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/11/wibi-kiji.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/11/wibi-kiji.jpg?w=150" medium="image">
			<media:title type="html">wibi kiji</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/11/kimball.jpg" medium="image">
			<media:title type="html">kimball</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/11/wibi-kiji.jpg?w=300" medium="image">
			<media:title type="html">wibi kiji</media:title>
		</media:content>
	</item>
	</channel>
</rss>
