<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; Hadapt</title>
	<atom:link href="http://gigaom.com/tag/hadapt/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Wed, 19 Jun 2013 14:30:43 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; Hadapt</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>If the future of BI is Hadoop, SQL and the cloud are the glue</title>
		<link>http://gigaom.com/2013/03/21/if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue/</link>
		<comments>http://gigaom.com/2013/03/21/if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue/#comments</comments>
		<pubDate>Thu, 21 Mar 2013 18:18:10 +0000</pubDate>
		<dc:creator>Kevin C. Tofel</dc:creator>
				<category><![CDATA[Ashish Thusoo]]></category>
		<category><![CDATA[Ben Werther]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Justin Borgman]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[Qubole]]></category>
		<category><![CDATA[Ravi Murthy]]></category>
		<category><![CDATA[Structure Data 2013]]></category>
		<category><![CDATA[Tomer Shiran]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=622888</guid>
		<description><![CDATA[There's no doubt that Hadoop is the data tool of the present and future, but more can be done to make it really shine for business intelligence.]]></description>
				<content:encoded><![CDATA[<p>Starting with the well-known quote “A good way to predict the future is to invent it,” Ravi Murthy, engineering manager at Facebook, kicked off an interesting panel discussion at <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=622888+if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue&amp;utm_content=kevintofel">GigaOM Structure:Data 2013</a> Thursday with four industry experts on business intelligence (BI) and Hadoop. Hadoop has a big place in that future, but not by itself. The panel’s conclusion? Applications and SQL databases built atop Hadoop are needed for better BI.</p>
<p>“Why are so many systems being built in the BI landscape? If Hadoop can deliver the promise, why have all these other solutions?” asked Murthy.</p>
<p>Ashish Thusoo, co-founder and CEO at <a href="http://www.qubole.com/">Qubole</a>, said that putting SQL on top of Hadoop just makes sense. “As a system, Hadoop is not a low-latency system, opening the need for faster SQL-based systems to query the data. And there’s probably only space for half-dozen of these solutions in the market; not dozens.”</p>
<p>Agreeing with Thusoo was Tomer Shiran, director of product management at <a href="http://www.mapr.com/">MapR Technologies</a>. “With our open source Apache Drill we’re enabling lots of differing BI use cases allowing companies to do different things with Hadoop. One use case is ability to interactively query and explore data.” Apache Drill is an interactive, low-latency SQL way to get at the data reservoir in Hadoop. Ben Werther, founder and CEO of <a href="http://www.platfora.com/">Platfora</a>, completely agreed, saying that customers are looking for much more agile approaches to data exploration without creating more IT work.</p>
<p>But Hadoop is still an important underlying part of the puzzle. Justin Borgman, CEO of <a href="http://hadapt.com/">Hadapt</a>, noted that “Hadoop scales so cost effectively; it’s a landfill where you can dump everything. That opens up new opportunities to explore that data including indexing to boost performance and interactivity across a broader data set.”</p>
<p>When asked for a use case of the benefits, Werther pointed out an unnamed customer. “They had 50 analysts working against SQL stores in a very siloed fashion. We moved them to a Hadoop-based stack and built a data reservoir. Only 5 of the 50 were able to be productive before. Within a week, all 50 became productive.”</p>
<p>Of course, the cloud is also part of BI’s future, although it’s not without risks. Sure, running Hadoop in the cloud is very elastic so that you can use as many resources as you need in near real-time. But the issues of security and data gravity in particular are worth noting: Generating data in the cloud could make it tough to move out in the future and may require more apps built on this data to also be in the cloud.</p>
<p>Check out <a href="http://gigaom.com/2013/03/20/structuredata-2013-live-coverage/">the rest of our Structure:Data 2013 live coverage here</a>, and a video embed of the session follows below:</p>
<p><span class="embed-youtube" style="text-align:center; display: block;"><iframe class="youtube-player" type="text/html" width="560" height="315" src="http://www.youtube.com/embed/neo6TE41I8I?version=3&amp;rel=1&amp;fs=1&amp;showsearch=0&amp;showinfo=1&amp;iv_load_policy=1&amp;wmode=transparent" frameborder="0"></iframe></span><br>
A transcription of the video follows on the next page</p>
<p><a href="http://gigaom.com/2013/03/21/if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue/2/">Go to page 2 (of 2) on GigaOM.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/21/if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/2jdr1lkkt5mzdquv3eohac6hpes7cwccwjhxtsbri0g.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/2jdr1lkkt5mzdquv3eohac6hpes7cwccwjhxtsbri0g.jpg?w=150" medium="image">
			<media:title type="html">Justin Borgman Hadapt Tomer Shiran MapR Technologies Ashish Thusoo Qubole Ben Werther Platfora Structure Data 2013</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/6cbb45abac59965c2626e40155358d1b?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">Kevin C. Tofel</media:title>
		</media:content>
	</item>
		<item>
		<title>Sector RoadMap: SQL-on-Hadoop platforms in 2013</title>
		<link>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/</link>
		<comments>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/#comments</comments>
		<pubDate>Wed, 20 Mar 2013 12:00:16 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/josephturian/" rel="author">Joseph Turian</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[apache-hive]]></category>
		<category><![CDATA[aster]]></category>
		<category><![CDATA[Aster Big Analytics Appliance]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[BigInsights]]></category>
		<category><![CDATA[Citus Data]]></category>
		<category><![CDATA[CitusDB]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Concurrent]]></category>
		<category><![CDATA[Database theory]]></category>
		<category><![CDATA[Dremel]]></category>
		<category><![CDATA[Drill]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hadoop Distributed File System]]></category>
		<category><![CDATA[HAWQ]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[HCatalog]]></category>
		<category><![CDATA[HDFS]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Impala]]></category>
		<category><![CDATA[JethroData]]></category>
		<category><![CDATA[karmasphere]]></category>
		<category><![CDATA[Lingual]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[MemSQL]]></category>
		<category><![CDATA[microstrategy]]></category>
		<category><![CDATA[MongoDB]]></category>
		<category><![CDATA[MPP]]></category>
		<category><![CDATA[NewSQL]]></category>
		<category><![CDATA[Optiq]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[PostGIS]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[RainStor]]></category>
		<category><![CDATA[Salesforce.com]]></category>
		<category><![CDATA[SAP]]></category>
		<category><![CDATA[SAP HANA]]></category>
		<category><![CDATA[Splice Machine]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQL 92]]></category>
		<category><![CDATA[SQL-H]]></category>
		<category><![CDATA[SQLStream]]></category>
		<category><![CDATA[Stinger]]></category>
		<category><![CDATA[tableau]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[VoltDB]]></category>
		<category><![CDATA[zookeeper]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?post_type=go-report&#038;p=171512/</guid>
		<description><![CDATA[Today’s most successful companies are the ones with the ability to capture and analyze all data available to them. Enter SQL-on-Hadoop solutions, which increase the accessibility of Hadoop and allow organizations to reuse their investment in learning SQL.]]></description>
				<content:encoded><![CDATA[<p>Today’s most successful companies are the ones with the ability to capture and analyze all data available to them. Enter SQL-on-Hadoop solutions, which increase the accessibility of Hadoop and allow organizations to reuse their investment in learning SQL.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/elephant.jpg?w=150" />
		<media:content url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/04/elephant.jpg?w=150" medium="image">
			<media:title type="html">elephant</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>The downside of Upstart: mentors come at a price</title>
		<link>http://gigaom.com/2012/11/08/the-downside-of-upstart-mentors-come-at-a-price/</link>
		<comments>http://gigaom.com/2012/11/08/the-downside-of-upstart-mentors-come-at-a-price/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 16:17:42 +0000</pubDate>
		<dc:creator>Barb Darrow</dc:creator>
				<category><![CDATA[Chris Lynch]]></category>
		<category><![CDATA[Daniel Abadi]]></category>
		<category><![CDATA[David Girouard]]></category>
		<category><![CDATA[hack/reduce]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[upstart]]></category>
		<category><![CDATA[yale]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=581893</guid>
		<description><![CDATA[VC-backed Upstart aims to help smart college graduates fund the startups of their dreams in return for a slice of their future earnings. So what's so wrong with that? Plenty, according to Daniel Abadi, chief scientist of Hadapt and associate professor at Yale.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.upstart.com/">Upstart</a> offers a way for smart college grads who want to start their own companies to get funding &#8212; typically up to $50,000 &#8212; along with an experienced mentor who supplies that funding. That can be enough cash to buy computers (or rent Amazon EC2 time), pay off some school debt, or rent an office. So what&#8217;s so bad about that?</p>
<p>It turns out that entrepreneur and college prof <a href="http://cs-www.cs.yale.edu/homes/dna/">Daniel Abadi</a> has a bone to pick with that model.</p>
<p>In <a href="http://dbmsmusings.blogspot.com/2012/11/is-upstart-right-way-to-get-college.html">a blog post</a>, Abadi, who is a noted Hadoop expert, chief scientist at Hadapt, and an associate professor of computer science at Yale, says he was initially intrigued by Upstart but then saw some things that concerned him.</p>
<blockquote>
<div>&#8220;A deeper look at the Upstart Website reveals a problematic clause that is attached with the funding of the student start-up ideas. This is not a traditional crowdfunding model where investors receive equity in the start-up in exchange for their investment dollars. Instead, the investors get a percentage of the student’s income for a 10-year period in exchange for the investment. This way, in the likely event that the student’s start-up idea does not work out, the investor is able to receive a nice return on investment by taking a cut from the student’s hard-earned salary when the student enters the workforce.&#8221;</div>
</blockquote>
<p>The problem is that many students have unrealistic expectations of success, and the failure rate of startups is high (some say that 11 out of 12 fail). Even when the startup fails, the entrepreneur still has to pay the Upstart mentor the agreed-upon portion of annual earnings, whatever their source. Upstart itself gets 3 percent of the patron’s initial contribution. And later, should the startup succeed, Upstart collects a 1.5 percent service fee when the debt is repaid.</p>
<h2>Mentorship over money</h2>
<p>Upstart CEO and Google veteran David Girouard said the program is almost more about the mentorship than the money. &#8220;It&#8217;s possible (and actually likely) that startups fail,&#8221; he said via email. &#8220;The idea of Upstart is you have backers that help you well beyond any particular startup or initiative &#8212; they are with you for at least the duration of your commitment.&#8221;</p>
<p>And, there are some protections in place. The deal is negotiated up front and the percentage taken from yearly income is capped at 7 percent. If the entrepreneur makes less than $30,000 in a given year, payment is waived and another year is added to the payout timeline, which is capped at 15 years.</p>
<p>Of course, naysayers contend that 7 percent is a pretty hefty take for young people who are also likely saddled with school debt.</p>
<p>Girouard said the program was specifically designed to facilitate investment in a person, rather than in a company. &#8220;We aren&#8217;t really a substitute for venture capital or angel investing &#8212; funds are typically used to retire debt or pay living expenses rather than corporate expenses,&#8221; he said via email.</p>
<p>Others in the tech world have other ideas about helping startups. The new Cambridge, Mass.-based <a href="http://gigaom.com/data/bostons-preps-big-kickoff-for-big-data-hub/">hack/reduce program</a>, for example, is a 501(c)(3) charity that will give a select set of big data entrepreneurs a leg up without taking a piece of the action, according to co-founder Chris Lynch, who posted a comment to that effect on Abadi&#8217;s blog.</p>
<p><em>Feature photo courtesy of Shutterstock user <a href="http://www.shutterstock.com/gallery-76219p1.html">wavebreakmedia</a></em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/11/08/the-downside-of-upstart-mentors-come-at-a-price/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/11/shutterstock_107331713.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/11/shutterstock_107331713.jpg?w=150" medium="image">
			<media:title type="html">college grads</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4af03439988d64f816da72496325cb73?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigabarb</media:title>
		</media:content>
	</item>
		<item>
		<title>Cloudera makes SQL a first-class citizen in Hadoop</title>
		<link>http://gigaom.com/2012/10/24/cloudera-makes-sql-a-first-class-citizen-in-hadoop/</link>
		<comments>http://gigaom.com/2012/10/24/cloudera-makes-sql-a-first-class-citizen-in-hadoop/#comments</comments>
		<pubDate>Wed, 24 Oct 2012 13:00:52 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytic database]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[Drill]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[Impala]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=576626</guid>
		<description><![CDATA[Cloudera has joined the fray of Hadoop companies trying to turn the big data platform into an engine for exploring data interactively using standard SQL. As the biggest company in the space, its new technology called Impala could go a long way toward changing Hadoop's image.]]></description>
				<content:encoded><![CDATA[<p>Not content to watch its competitors leave it in the dust, veteran big data startup Cloudera is fundamentally changing the face of its flagship Hadoop distribution into something much more appealing. The company has developed a real-time <a href="http://www.cloudera.com/content/cloudera/en/products/cloudera-enterprise-core/cloudera-enterprise-RTQ.html">SQL query engine called Impala</a> that will sit alongside MapReduce as a native processing option within Cloudera&#8217;s version of Hadoop. Cloudera is the biggest and best-known Hadoop vendor around, so opening its platform up to the wide world of SQL-trained data analysts is a really big deal, even if Cloudera is a bit late to the SQL party.</p>
<h2>From batch processing to data interaction</h2>
<p>The business world regularly laments the circumstances that spurred Impala&#8217;s creation. I summed them up last week and again yesterday when reporting similar products <a href="http://gigaom.com/data/hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future/">from startups Hadapt</a> and <a href="http://gigaom.com/data/platfora-shows-a-whole-new-way-to-do-business-intelligence-on-big-data/">Platfora</a>, but the gist is that although Hadoop is more scalable and more flexible than traditional data warehouses or analytic databases, it&#8217;s also slower, harder to learn and designed for batch processing an entire data set rather than interactively querying a data set. Until now, the common methods for querying Hadoop were to <a href="http://hive.apache.org/">use a custom-built language such as Hive</a>, or to transport data to a data warehouse from Hadoop and then analyze it using traditional business intelligence software.</p>
<p>However, Cloudera&#8217;s VP of Products Charles Zedlewski was quick to point out during a recent conversation that Impala isn&#8217;t a replacement for other BI tools, just a new data source into which they can connect. If anything, it&#8217;s a replacement for Hive, which Facebook built to bring data warehouse capabilities to Hadoop, but which wasn&#8217;t really developed for public consumption as a software product. For the sake of uniformity, Impala actually uses the same SQL set as Hive, but is on average 10 times faster thanks to its purpose-built query engine that forgoes reliance on MapReduce. Small queries, Zedlewski said, can run in less than a second.</p>
<p>Impala has been in the making for almost two years, and Cloudera &#8220;took a lot of pains to stitch this really well in with the rest of the Hadoop stack,&#8221; Zedlewski said. Users still store data in the Hadoop Distributed File System or the HBase database, and they can still store whatever types of structured, semi-structured or unstructured data they please. Impala uses the same metadata as the other Hadoop components, the same drivers and, like almost everything else in the Hadoop world, is open source under the Apache Software Foundation license.</p>
<p>Unlike some other Hadoop startups, though, Cloudera isn&#8217;t interested in selling BI or other analytic applications. Impala (which is called Real-Time Query for customers who pay for support) is the execution engine, but it still relies on software from Cloudera partners such as <a href="http://gigaom.com/cloud/thanks-to-consumerization-its-ipo-season-in-analytics/">Tableau, QlikTech</a> and MicroStrategy in order to ask questions and visualize the results. &#8220;We&#8217;re sticking to our knitting as a platform vendor,&#8221; said Zedlewski, echoing a sentiment on which his boss, Cloudera CEO Mike Olson, <a href="http://gigaom.com/2012/03/21/cloudera-structure-data-2012/">has been bullish for years</a>.</p>
<p style="text-align: center;"><a href="http://gigaom2.files.wordpress.com/2012/10/impala.jpg"><img  title="impala" alt="" src="http://gigaom2.files.wordpress.com/2012/10/impala.jpg?w=708"   class="size-full wp-image-576688 aligncenter" /></a></p>
<h2>Different strokes move the world</h2>
<p>I can&#8217;t underscore enough how critical all of this innovation is for Hadoop, which in order to add substance to its unparalleled hype needed to become far more useful to far more users. But the sudden shift from Hadoop as a batch-processing engine built on MapReduce into an ad hoc SQL querying engine might leave industry analysts and even Hadoop users scratching their heads.</p>
<p>Cloudera, now with more than 300 employees and annual revenue rumored to be in the hundreds of millions, is the 800-pound gorilla in the Hadoop market, and its implementation of Impala has to make it look even better for prospective customers. But Cloudera doesn&#8217;t have this space to itself. Assuming your goal is to use Hadoop as the platform for running SQL queries (as opposed to, for example, <a href="http://gigaom.com/data/metamarkets-open-sources-druid-its-in-memory-database/">using it for ETL before putting it in an in-memory system</a>), there are plenty of choices on the table. And everyone&#8217;s approach is different.</p>
<p>For starters, bitter distribution-level rival MapR announced in August that it&#8217;s <a href="http://gigaom.com/cloud/for-fast-interactive-hadoop-queries-drill-may-be-the-answer/">leading an open source project called Drill</a> that provides essentially the same functionality as Impala. MapR is <a href="http://gigaom.com/cloud/amazon-taps-mapr-for-high-powered-elastic-mapreduce/">getting a lot of love from Hadoop users right now</a>, and a future implementation of Drill into its product lineup would add even more legitimacy. Not wanting to cede the innovation edge to Cloudera or MapR, one has to suspect <a href="http://gigaom.com/cloud/hortonworks-teams-with-vmware-to-keep-hadoop-running/">Yahoo spinoff Hortonworks</a> will also get into the query engine game at some point. (We&#8217;ll leave the debate over whether the myriad different flavors of Hadoop constitute the beginning of a community fracture for another day.)</p>
<p>Like Cloudera, however, if MapR and Hortonworks decide to integrate query engines in their products, they&#8217;ll likely rely on application providers to deliver the user experience on top. For better or worse, that presently means reliance on legacy vendors until startups can get familiar with the source code and start building BI products designed to take advantage of the new capabilities. When asked about Impala as a technology for disrupting the traditional data warehouse market, Cloudera&#8217;s Zedlewski noted that existing products are often very good at what they do.</p>
<p>&#8220;I think it&#8217;s highly unlikely that something like Impala would really be considered an alternative of that,&#8221; he said. Those vendors don&#8217;t seem to think so either, as companies like Teradata and EMC Greenplum are <a href="http://gigaom.com/cloud/emc-throws-lots-of-hardware-at-hadoop/">telling always-improving stories</a> about integrating their existing product lines with Hadoop.</p>
<div id="attachment_576706" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-11.jpg"><img  title="2-drill_down-1" alt="" src="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-11.jpg?w=300&#038;h=144" height="144" width="300" class="size-medium wp-image-576706" /></a><p class="wp-caption-text">Running a sentiment analysis in Tableau with Hadapt</p></div>
<p>On the other end of the spectrum are startups such as Hadapt, Platfora and <a href="http://gigaom.com/data/batten-down-the-analysts-its-a-big-data-bi-storm/">Birst</a>, which have built Hadoop-based query engines on their own, independent of loyalty to any particular Hadoop distribution. These companies have a lot of smart people on board, and their technologies are for real. Platfora CEO Ben Werther, in particular, makes no bones about his goal of unseating the BI incumbents with analytics applications built from the ground up to analyze big data stored in Hadoop.</p>
<p>Similar, although not necessarily competitive, technologies include <a href="http://gigaom.com/cloud/how-one-startup-wants-to-inject-hadoop-into-your-sql/">Spire (from Drawn to Scale)</a> and <a href="http://www.splicemachine.com">Splice Machine</a>. Both support some level of SQL querying and/or BI integration, although their real value comes in leveraging HBase to provide transactional capabilities that analytic databases aren&#8217;t designed to do.</p>
<p>Even though all these choices and approaches might add to the confusion over how to use Hadoop and which products to choose, the result is a net gain for Hadoop <a href="http://gigaom.com/cloud/the-state-of-hadoop-strong-and-poised-to-explode/">as the de facto platform for big data environments</a> even in the face of some alternative approaches. It has changed from a batch system to an interactive query engine pretty much overnight, so although he wouldn&#8217;t comment on the competition, Zedlewski wasn&#8217;t just blowing vendor smoke when he told me, &#8220;I would argue Impala is a proof point that Hadoop as a platform has an ability to grow that no other data management platform has.&#8221;</p>
<p><strong>Related research and analysis from GigaOM Pro:</strong></p><ul><li><a href="http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=576626+cloudera-makes-sql-a-first-class-citizen-in-hadoop&utm_content=dharrisstructure">Sector RoadMap: SQL-on-Hadoop platforms in 2013</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=576626+cloudera-makes-sql-a-first-class-citizen-in-hadoop&utm_content=dharrisstructure">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2012/03/why-service-providers-matter-for-the-future-of-big-data/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=576626+cloudera-makes-sql-a-first-class-citizen-in-hadoop&utm_content=dharrisstructure">Why service providers matter for the future of big data</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/10/24/cloudera-makes-sql-a-first-class-citizen-in-hadoop/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/impala1-e1351083747709.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/impala1-e1351083747709.jpg?w=150" medium="image">
			<media:title type="html">impala</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/10/impala.jpg" medium="image">
			<media:title type="html">impala</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-11.jpg?w=300" medium="image">
			<media:title type="html">2-drill_down-1</media:title>
		</media:content>
	</item>
		<item>
		<title>Hadapt hints at the future of Hadoop and BI</title>
		<link>http://gigaom.com/2012/10/16/hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future/</link>
		<comments>http://gigaom.com/2012/10/16/hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future/#comments</comments>
		<pubDate>Tue, 16 Oct 2012 14:00:56 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[tableau]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=572521</guid>
		<description><![CDATA[Hadoop startup Hadapt has made its unified Hadoop-and-SQL analytic architecture even easier by adding native advanced analytic functions and integrating tightly with Tableau's powerful BI software. It's a sign of things to come as Hadoop and traditional SQL-based BI become cozy across the board.]]></description>
				<content:encoded><![CDATA[<p>Boston-based startup <a href="http://www.hadapt.com">Hadapt</a> is practicing its own form of polygamy &#8212; bringing as many pieces of the data ecosystem as possible under its roof &#8212; and the resulting union looks pretty nice. In version 2.0 of its Adaptive Analytical Platform, the company is expanding on its original premise of a unified architecture for Hadoop and SQL by adding advanced analytic functions and a tight integration with business-intelligence favorite Tableau Software.</p>
<p>The new capabilities are a big step for Hadapt, which <a href="http://gigaom.com/cloud/hadapt-raises-9-5m-for-hadoop-data-warehouse/">launched less than two years ago</a> at our inaugural Structure: Data conference, and foretell a coming trend in the Hadoop marketplace. In a nutshell, the new version ships with some standard analytical functions such as sentiment analysis and funnel analysis, as well as a development kit to help users write their own. And users can write the functions in standard SQL rather than trying to compose them in MapReduce or any other framework designed specifically to work with Hadoop. (For a more-technical explanation of Hadapt 2.0 and how it stacks up against traditional analytic databases, you might want to check out database industry analyst Curt Monash&#8217;s <a href="http://www.dbms2.com/2012/10/16/hadapt-version-2/">blog post on it</a>.)</p>
<p>The real beauty, however, might be the integration with Tableau, which has <a href="http://gigaom.com/cloud/thanks-to-consumerization-its-ipo-season-in-analytics/">become a darling of business analysts everywhere</a> who appreciate its point-and-click analytic functions and beautiful visualizations. Hadapt CTO Philip Wickline told me during a recent call that Tableau is &#8220;overwhelmingly the choice&#8221; for BI within Hadapt customer accounts. I received a demo of Hadapt functions exposed via Tableau&#8217;s interface, and it was impressive to watch someone run and visualize an Apache Mahout-based sentiment analysis literally by making a few mouse clicks.</p>
<div id="attachment_572955" class="wp-caption aligncenter" style="width: 614px"><a href="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-1.jpg"><img  title="2-drill_down (1)" alt="" src="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-1.jpg?w=604&#038;h=290" height="290" width="604" class="size-large wp-image-572955" /></a><p class="wp-caption-text">Drilling down on a social-media analysis job with Tableau</p></div>
<h2>Why Hadoop and BI need each other</h2>
<p>Hadapt isn&#8217;t perfect, and Tableau isn&#8217;t the be-all, end-all of analytics and visualization software, but this new feature set is a sign of how we should expect to see Hadoop evolve in the near future. Historically speaking, Hadoop is slow compared with the interactive query capabilities of relational analytic databases. It&#8217;s also not easy to use if you&#8217;re relegated to writing MapReduce jobs, and there&#8217;s no native capability for visualizing the results of those jobs.</p>
<p>On the other hand, specialized analytic tools such as Tableau and even machine-data master Splunk have been known to crumple under the weight of the massive data sets that Hadoop was designed to handle. Splunk has <a href="http://gigaom.com/cloud/splunk-connects-with-hadoop-to-master-machine-data/">built Hadoop into its Splunk Enterprise product</a> in order to enable easier processing of huge data sets. Furthermore, Hadoop is almost a necessity for adding structure to unstructured data and making it analyzable by BI tools and relational databases at all.</p>
<p>All this has some industry watchers <a href="http://gigaom.com/cloud/why-the-days-are-numbered-for-hadoop-as-we-know-it/">wondering whether Hadoop is really the answer</a> as big data users increasingly require capabilities such as real-time processing and rapid interaction with data. Startups such as Precog are <a href="http://gigaom.com/data/startup-precog-says-big-data-doesnt-need-to-be-so-complex/">trying to answer these concerns by building their own analytic tools</a> from scratch that don&#8217;t rely on Hadoop at all to handle even rather large data sets. Google has evolved from its MapReduce processing roots and built tools such as Dremel (<a href="http://gigaom.com/cloud/google-opens-up-its-biq-query-data-analytics-service-to-all/">productized as Big Query</a>), Percolator and Pregel.</p>
<p>Of course, no amount of work wholly outside the world of Hadoop is going to change the amount being done with Hadoop as the foundation of a BI and SQL revolution. A startup called Platfora is <a href="http://gigaom.com/cloud/platfora-gets-5-7m-to-make-hadoop-mainstream/">promising a product</a> that turns Hadoop into an engine for an entirely new type of BI experience. Hadoop-distribution vendor MapR is <a href="http://gigaom.com/cloud/for-fast-interactive-hadoop-queries-drill-may-be-the-answer/">driving an open source project called Apache Drill</a> that replicates Google&#8217;s Dremel on top of Hadoop. Outside Hadoop-based startups, large companies such as <a href="http://gigaom.com/cloud/emc-throws-lots-of-hardware-at-hadoop/">EMC Greenplum</a>, <a href="http://gigaom.com/cloud/microsofts-hadoop-play-is-shaping-up-and-it-includes-excel/">Microsoft</a>, <a href="http://gigaom.com/cloud/teradata-taps-hortonworks-to-improve-hadoop-story/">Teradata</a> and others have at least decided to make Hadoop a first-class citizen in their big data work, even where they have other legacy products to push as well.</p>
<p>Daniel Abadi, Hadapt co-founder and chief scientist, and Yale professor, says that just because you might circumvent MapReduce to increase the speed of processing different types of jobs, &#8220;that doesn&#8217;t mean you have to circumvent Hadoop itself.&#8221; Hadapt, he added, actually emerged from a research project called <a href="http://db.cs.yale.edu/hadoopdb/hadoopdb.html">HadoopDB</a> that had been trying to make Hadoop more interactive since its inception in 2009. Three years later, it&#8217;s clear Abadi and company were onto something.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-607447p1.html">Shutterstock user FWStudio</a>.</em></p>
<p><strong>Related research and analysis from GigaOM Pro:</strong></p><ul><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=572521+hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future&utm_content=dharrisstructure">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/04/infrastructure-q1-cloud-and-big-data-woo-the-enterprise/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=572521+hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future&utm_content=dharrisstructure">Infrastructure Q1: Cloud and big data woo enterprises</a></li><li><a href="http://pro.gigaom.com/report/how-to-manage-big-data-without-breaking-the-bank/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=572521+hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future&utm_content=dharrisstructure">How to manage big data without breaking the bank</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/10/16/hadapt-does-big-love-for-big-data-and-hints-at-hadoops-future/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_109440755-e1350299777271.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_109440755-e1350299777271.jpg?w=150" medium="image">
			<media:title type="html">Elephant sunlight</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/10/2-drill_down-1.jpg?w=604" medium="image">
			<media:title type="html">2-drill_down (1)</media:title>
		</media:content>
	</item>
		<item>
		<title>Calvin: A fast, cheap database that isn&#8217;t a database at all</title>
		<link>http://gigaom.com/2012/05/16/calvin-a-fast-cheap-database-that-isnt-a-database-at-all/</link>
		<comments>http://gigaom.com/2012/05/16/calvin-a-fast-cheap-database-that-isnt-a-database-at-all/#comments</comments>
		<pubDate>Wed, 16 May 2012 18:12:47 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Calvin]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Hadapr]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[NewSQL]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[Relational database]]></category>
		<category><![CDATA[scale-out]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=522185</guid>
		<description><![CDATA[Yale researchers Daniel Abadi and Alexander Thomson think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn't even technically a database. The two have created a system they think can level the playing field.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/05/databases.jpg"><img  title="databases" src="http://gigaom2.files.wordpress.com/2012/05/databases.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="alignleft size-medium wp-image-522290" /></a><del>Yale researchers Daniel Abadi and Alexander Thomson</del> A team of Yale researchers think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn&#8217;t even technically a database. In a blog post Wednesday morning written by team members Daniel Abadi and Alexander Thomson (and in a related research paper), <a href="http://dbmsmusings.blogspot.com/2012/05/if-all-these-new-dbms-technologies-are.html">the two researchers detail Calvin,</a> a &#8220;transaction scheduling and replication coordination service&#8221; that they think can level the playing field between high-cost distributed relational databases and less-expensive, but limited, NoSQL and NewSQL databases.</p>
<p><del>Abadi and Thomson</del> The researchers aren&#8217;t dismissing either NoSQL or NewSQL, but rather attempting to address the type of use case on which the popular <a href="http://www.tpc.org/tpcc/detail.asp">TPC-C database performance benchmark</a> is based. That benchmark, which simulates an online retail application, requires ACID compliance &#8212; which NoSQL options can&#8217;t meet &#8212; and the ability to update records across database shards in the same transaction &#8212; something the authors claim <a href="http://gigaom.com/cloud/is-stonebraker-right-why-sql-isnt-the-choice-du-jour-for-many-apps/">NewSQL databases</a> can&#8217;t do.</p>
<p>Why not just stick with Oracle Database and IBM DB2? Cost, especially at scale. As Abadi and Thomson point out in the blog, an Oracle system capable of handling 500,000 transactions per second costs $30 million in hardware and software expenditures.</p>
<p>So, what is Calvin? In a nutshell, it&#8217;s software that sits above a scale-out storage system and turns it into a transaction-processing system by capturing, scheduling and executing transactions. Here&#8217;s how Abadi and Thomson describe it in the blog post, although <a href="http://cs-www.cs.yale.edu/homes/dna/papers/calvin-sigmod12.pdf">the paper goes into much more detail</a>.</p>
<blockquote><p>Calvin requires all transactions to be executed fully server-side and sacrifices the freedom to non-deterministically abort or reorder transactions on-the-fly during execution. In return, Calvin gets scalability, ACID-compliance, and extremely low-overhead multi-shard transactions over a shared-nothing architecture. In other words, Calvin is designed to handle high-volume OLTP throughput on sharded databases on cheap, commodity hardware stored locally or in the cloud. &#8230; Calvin allows user transaction code to access the data layer freely, using any data access language or interface supported by the underlying storage engine (so long as Calvin can observe which records user transactions access).</p></blockquote>
<p><a href="http://gigaom2.files.wordpress.com/2012/05/calvin.jpg"><img  title="calvin" src="http://gigaom2.files.wordpress.com/2012/05/calvin.jpg?w=708" alt=""   class="aligncenter size-full wp-image-522292" /></a></p>
<p>Calvin, the researchers claim, can match Oracle&#8217;s 500,000 transaction-per-second performance running on commodity servers on Amazon EC2. The cost of the resources to run their benchmark was only $300. (Although, obviously, that doesn&#8217;t account for the cost of running the system continuously for years, potentially. Commodity physical hardware might be a better bet in the long term.)</p>
<p>Ultimately, <del>Abadi and Thomson</del> the researchers conclude, for transactions that can execute entirely on the server side, Calvin could be the foundation for an end to the current OLTP regime. The world certainly is hungry for something that can do what Oracle and IBM can do, but that costs what NoSQL databases cost (i.e., nothing, often). And Abadi has some distributed database street cred &#8212; the HadoopDB project he led is the foundation of <a href="http://gigaom.com/cloud/hadapt-raises-9-5m-for-hadoop-data-warehouse/">Hadapt&#8217;s Hadoop-and-data-warehouse hybrid</a> &#8212; so, especially if it&#8217;s open sourced, one can&#8217;t dismiss Calvin out of hand.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-219154p1.html">Shutterstock user Semisatch</a>.</em></p>
<p><strong>Related research and analysis from GigaOM Pro:</strong></p><ul><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=522185+calvin-a-fast-cheap-database-that-isnt-a-database-at-all&utm_content=dharrisstructure">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/report/cloud-and-data-first-quarter-2013-analysis-and-outlook/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=522185+calvin-a-fast-cheap-database-that-isnt-a-database-at-all&utm_content=dharrisstructure">Cloud and data first-quarter 2013: analysis and outlook</a></li><li><a href="http://pro.gigaom.com/report/sql-on-hadoop-roadmap-2013/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=522185+calvin-a-fast-cheap-database-that-isnt-a-database-at-all&utm_content=dharrisstructure">Sector RoadMap: SQL-on-Hadoop platforms in 2013</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/05/16/calvin-a-fast-cheap-database-that-isnt-a-database-at-all/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/05/databases.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/05/databases.jpg?w=150" medium="image">
			<media:title type="html">databases</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/05/databases.jpg?w=300" medium="image">
			<media:title type="html">databases</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/05/calvin.jpg" medium="image">
			<media:title type="html">calvin</media:title>
		</media:content>
	</item>
		<item>
		<title>The importance of putting the U and I in visualization</title>
		<link>http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/</link>
		<comments>http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/#comments</comments>
		<pubDate>Fri, 04 May 2012 06:55:34 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[1010data]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[apache-hive]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Aster Data Systems]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[BigQuery]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[ClearStory]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[Datameer]]></category>
		<category><![CDATA[datamine]]></category>
		<category><![CDATA[DataStax]]></category>
		<category><![CDATA[dive]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[infochimps-platform]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[jaspersoft]]></category>
		<category><![CDATA[Kontagent]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Mortar Data]]></category>
		<category><![CDATA[odbc]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[pentaho]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[qliktech]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[spreadsheets]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Stanford]]></category>
		<category><![CDATA[Startups]]></category>
		<category><![CDATA[tableau]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[UI]]></category>
		<category><![CDATA[UIS]]></category>
		<category><![CDATA[User interface]]></category>
		<category><![CDATA[vc]]></category>
		<category><![CDATA[venture capital]]></category>
		<category><![CDATA[wukong]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=104734</guid>
		<description><![CDATA[Ask a VC about big data and she will probably tell you about visualization of the user interface. We're talking about intuitive UIs that let users visually work with data using charts and tools, not algorithms. It's hard to do right, but the payoff could be huge.]]></description>
				<content:encoded><![CDATA[<p>Ask a venture capitalist about big data and she will probably tell you about visualization. Only it won&#8217;t be visualization in the usual sense. Instead, it will be about visualization of the user interface. We&#8217;re talking about strikingly intuitive UIs that let users visually work with data using charts and tools instead of with algorithms and code. It&#8217;s hard work to do right — especially when you&#8217;re talking about massive data sets and complex computations — but the payoff could be huge for businesses.</p>
<p><strong>Related research and analysis from GigaOM Pro:</strong></p><ul><li><a href="http://pro.gigaom.com/2012/04/infrastructure-q1-cloud-and-big-data-woo-the-enterprise/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=517773+the-importance-of-putting-the-u-and-i-in-visualization&utm_content=gigaguest">Infrastructure Q1: Cloud and big data woo enterprises</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=517773+the-importance-of-putting-the-u-and-i-in-visualization&utm_content=gigaguest">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2011/03/defining-hadoop-the-players-technologies-and-challenges-of-2011/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=517773+the-importance-of-putting-the-u-and-i-in-visualization&utm_content=gigaguest">Defining Hadoop: the Players, Technologies and Challenges of 2011</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/4411542bbd7a2a9a2fc2a1b38809e45c?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaguest</media:title>
		</media:content>
	</item>
		<item>
		<title>Why service providers matter for the future of big data</title>
		<link>http://pro.gigaom.com/2012/03/why-service-providers-matter-for-the-future-of-big-data/</link>
		<comments>http://pro.gigaom.com/2012/03/why-service-providers-matter-for-the-future-of-big-data/#comments</comments>
		<pubDate>Thu, 22 Mar 2012 06:55:34 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/derrickharris/" rel="author">Derrick Harris</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[33across]]></category>
		<category><![CDATA[algorithm-specialists]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[analytics-as-a-service]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[big-data-outsourcing]]></category>
		<category><![CDATA[big-data-service-providers]]></category>
		<category><![CDATA[BloomReach]]></category>
		<category><![CDATA[Cetas]]></category>
		<category><![CDATA[Cisco]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[data scientists]]></category>
		<category><![CDATA[data-analytics]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[impetus]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[logicworks]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[metascale]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[mu-sigma]]></category>
		<category><![CDATA[Netflix]]></category>
		<category><![CDATA[nuevora]]></category>
		<category><![CDATA[Opera Solutions]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parse.ly]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[profitero]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[redgiant-analytics]]></category>
		<category><![CDATA[scale-unlimited]]></category>
		<category><![CDATA[sears]]></category>
		<category><![CDATA[SGI]]></category>
		<category><![CDATA[Skytree]]></category>
		<category><![CDATA[software as a service]]></category>
		<category><![CDATA[Sourcefire]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[think-big-analytics]]></category>
		<category><![CDATA[web analytics]]></category>
		<category><![CDATA[WibiData]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=102032</guid>
		<description><![CDATA[One solution to the big data skills shortage has been consulting firms that specialize in deploying big data systems companies need to make sense of their information. These companies will continue to play a vital role in helping us make sense of the data deluge.]]></description>
				<content:encoded><![CDATA[<p>One major solution to the big data skills shortage has been the emergence of consulting and outsourcing firms specializing in deploying big data systems that companies need in order to actually derive value from their information. These companies will continue to play a vital role in helping the greater corporate world make sense of the mountains of data they are collecting. However, if the current wave of democratizing big data lives up to its ultimate potential, today’s consultants and outsourcers will have to find a way to keep a few steps ahead of the game in order to remain relevant.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2012/03/why-service-providers-matter-for-the-future-of-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>A near-term outlook for big data</title>
		<link>http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/</link>
		<comments>http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/#comments</comments>
		<pubDate>Wed, 21 Mar 2012 06:55:20 +0000</pubDate>
		<dc:creator>Krish</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[33across]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[AOL]]></category>
		<category><![CDATA[Apache Foundation]]></category>
		<category><![CDATA[apache-hadoop]]></category>
		<category><![CDATA[apixio]]></category>
		<category><![CDATA[AppFog]]></category>
		<category><![CDATA[AstraZeneca]]></category>
		<category><![CDATA[AT&T]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[big-data-outsourcing]]></category>
		<category><![CDATA[BloomReach]]></category>
		<category><![CDATA[Blue Button]]></category>
		<category><![CDATA[Bristol-Myers Squibb]]></category>
		<category><![CDATA[BYD]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[CBRE Group]]></category>
		<category><![CDATA[cdata-quality]]></category>
		<category><![CDATA[Cetas]]></category>
		<category><![CDATA[Cisco]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Cloudant]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Comcast]]></category>
		<category><![CDATA[connected devices]]></category>
		<category><![CDATA[Consert]]></category>
		<category><![CDATA[data privacy]]></category>
		<category><![CDATA[data processing]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[data scientists]]></category>
		<category><![CDATA[data storage]]></category>
		<category><![CDATA[data-analytics]]></category>
		<category><![CDATA[data-as-a-service]]></category>
		<category><![CDATA[data-governance]]></category>
		<category><![CDATA[data-markets]]></category>
		<category><![CDATA[data-obesity]]></category>
		<category><![CDATA[data-quality]]></category>
		<category><![CDATA[data-quality-dimensions]]></category>
		<category><![CDATA[data-security]]></category>
		<category><![CDATA[DataFlux]]></category>
		<category><![CDATA[DataStax]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[DuPont]]></category>
		<category><![CDATA[E-ZPass]]></category>
		<category><![CDATA[EcoFactor]]></category>
		<category><![CDATA[Ecologic Analytics]]></category>
		<category><![CDATA[Electronic Medical Records]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[eMeter]]></category>
		<category><![CDATA[emrs]]></category>
		<category><![CDATA[ENBALA Power Networks]]></category>
		<category><![CDATA[energy-internet]]></category>
		<category><![CDATA[Enterprise Mobility]]></category>
		<category><![CDATA[enterprise-control-language]]></category>
		<category><![CDATA[enterprises]]></category>
		<category><![CDATA[Explorys]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Forbes]]></category>
		<category><![CDATA[Geisinger Health Systems]]></category>
		<category><![CDATA[ginger-io]]></category>
		<category><![CDATA[Global Pulse]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Greenplum]]></category>
		<category><![CDATA[grid storage]]></category>
		<category><![CDATA[GridMobility]]></category>
		<category><![CDATA[GroundedPower]]></category>
		<category><![CDATA[Group Health Cooperative]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hadoop-stack]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[HDFS]]></category>
		<category><![CDATA[health care]]></category>
		<category><![CDATA[Hewlett-Packard]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Honeywell]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[hpcc]]></category>
		<category><![CDATA[Humedica]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[ibm-watson]]></category>
		<category><![CDATA[IDC]]></category>
		<category><![CDATA[impetus]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[informatica]]></category>
		<category><![CDATA[infrastructure as a service]]></category>
		<category><![CDATA[intelligent-applications]]></category>
		<category><![CDATA[Intermountain Healthcare]]></category>
		<category><![CDATA[jeopardy]]></category>
		<category><![CDATA[kaiser-permanente]]></category>
		<category><![CDATA[Landis+Gyr]]></category>
		<category><![CDATA[lexisnexis]]></category>
		<category><![CDATA[LinkedIn]]></category>
		<category><![CDATA[logicworks]]></category>
		<category><![CDATA[M2M]]></category>
		<category><![CDATA[machine-to-machine]]></category>
		<category><![CDATA[MapR Technologies]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[mayo clinic]]></category>
		<category><![CDATA[McKinsey]]></category>
		<category><![CDATA[metascale]]></category>
		<category><![CDATA[meter-data-management-systems]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[mobile carriers]]></category>
		<category><![CDATA[mobile health]]></category>
		<category><![CDATA[mu-sigma]]></category>
		<category><![CDATA[National Cancer Institute]]></category>
		<category><![CDATA[Netflix]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[Nuance Communications]]></category>
		<category><![CDATA[nuevora]]></category>
		<category><![CDATA[oozie]]></category>
		<category><![CDATA[Opera Solutions]]></category>
		<category><![CDATA[OPower]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Parse.ly]]></category>
		<category><![CDATA[patientslikeme]]></category>
		<category><![CDATA[Pervasive]]></category>
		<category><![CDATA[Pfizer]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[Platform as a Service]]></category>
		<category><![CDATA[private clouds]]></category>
		<category><![CDATA[profitero]]></category>
		<category><![CDATA[Public Clouds]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[Recurve]]></category>
		<category><![CDATA[Red Hat]]></category>
		<category><![CDATA[redgiant-analytics]]></category>
		<category><![CDATA[Regulated Industries]]></category>
		<category><![CDATA[Salesforce.com]]></category>
		<category><![CDATA[SAP]]></category>
		<category><![CDATA[scale-unlimited]]></category>
		<category><![CDATA[scienergy]]></category>
		<category><![CDATA[Sears Holding Corporation]]></category>
		<category><![CDATA[service providers]]></category>
		<category><![CDATA[SGI]]></category>
		<category><![CDATA[Siemens]]></category>
		<category><![CDATA[Silver Spring Networks]]></category>
		<category><![CDATA[Skytree]]></category>
		<category><![CDATA[Smart Grid]]></category>
		<category><![CDATA[smart meters]]></category>
		<category><![CDATA[software as a service]]></category>
		<category><![CDATA[Sourcefire]]></category>
		<category><![CDATA[Sprint]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[structured data]]></category>
		<category><![CDATA[Systemcon]]></category>
		<category><![CDATA[T-Mobile]]></category>
		<category><![CDATA[talend]]></category>
		<category><![CDATA[Target]]></category>
		<category><![CDATA[targeted-advertising]]></category>
		<category><![CDATA[Tendril]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[The Internet of Things]]></category>
		<category><![CDATA[think-big-analytics]]></category>
		<category><![CDATA[Toshiba]]></category>
		<category><![CDATA[Trillium]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[unstructured data]]></category>
		<category><![CDATA[utilities]]></category>
		<category><![CDATA[Verizon Wireless]]></category>
		<category><![CDATA[VoltDB]]></category>
		<category><![CDATA[Wal-Mart]]></category>
		<category><![CDATA[WellPoint]]></category>
		<category><![CDATA[whirlpool]]></category>
		<category><![CDATA[WibiData]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[yelp]]></category>
		<category><![CDATA[zettaset]]></category>
		<category><![CDATA[zookeeper]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=101786</guid>
		<description><![CDATA[Big data now touches everything from enterprises to smart-meter startups, while Hadoop is fast becoming the leading tool to analyze that data, and debates around privacy abound. GigaOM Pro analysts offer insights on what to consider when it comes to big data decisions for your business.]]></description>
				<content:encoded><![CDATA[<p>Big data now touches everything from enterprises and hospitals to smart-meter startups and connected devices in the home. Hadoop, meanwhile, is fast becoming the leading tool to analyze that data, and there is the ever-lingering question of privacy and how we, the technology industry, are responsible for teaching ethical ways to collect and regulate our data. This report, composed of eight different sections each written by a GigaOM Pro analyst, offers insights on what to consider when it comes to big data decisions for your business.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/03/datacenter.jpg?w=150" />
		<media:content url="https://gigaom-pro-files.s3.amazonaws.com/files/2012/03/datacenter.jpg?w=150" medium="image">
			<media:title type="html">datacenter</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/04f327f032df043846baa7474b8e6aff?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">Krish</media:title>
		</media:content>
	</item>
		<item>
		<title>Hadoop jumps through hoops, becomes mainstream</title>
		<link>http://gigaom.com/2012/03/03/hadoop-jumps-through-hoops-becomes-mainstream/</link>
		<comments>http://gigaom.com/2012/03/03/hadoop-jumps-through-hoops-becomes-mainstream/#comments</comments>
		<pubDate>Sat, 03 Mar 2012 17:01:24 +0000</pubDate>
		<dc:creator>Matt Howard, Norwest Venture Partners</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Hadapt]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Hortonworks]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[Matt Howard]]></category>
		<category><![CDATA[Norwest Venture Partners]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=493240</guid>
		<description><![CDATA[Matt Howard of Norwest Venture Partners predicts that 2012 and 2013 will be Hadoop’s breakout years. Howard gives us insight into the five factors that will accelerate Hadoop’s mainstream adoption over the next 18 months.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom.com/cloud/ibm-doing-hadoop-as-a-service-in-its-cloud/hadoop-6/" rel="attachment wp-att-426524"><img  title="hadoop" src="http://gigaom2.files.wordpress.com/2011/10/hadoop1.jpg?w=708" alt=""   class="alignleft size-full wp-image-426524" /></a>One of the things I love most about the software industry is the way new technologies can materialize from unlikely places and get applied in unexpected ways. <a href="http://hadoop.apache.org/">Hadoop</a> is a great example of this. Conceived by the open source community, <a href="http://www.google.com/">Google</a>, <a href="http://www.yahoo.com">Yahoo</a> and others, this programming framework has emerged as a promising solution to the big data problem.</p>
<p>I expect Hadoop to become enterprise-ready within the next 18 months. Encouraged by the arrival of innovative Hadoop vendors, many Fortune 500 companies — including <a href="http://www.ebay.com">eBay</a>, <a href="http://www.bofa.com">Bank of America</a> and <a href="http://www.jpmorgan.com">JP Morgan</a> — are experimenting with Hadoop deployments. As a technologist and an investor in this sector (<a href="http://www.nvp.com/">Norwest Venture Partners</a>, where I am a general partner, is an investor in <a href="http://www.hadapt.com/">Hadapt</a>), I believe these investigations are quickly evolving into serious roll-outs. The following five key factors will accelerate mainstream adoption, making 2012 and 2013 Hadoop’s breakout years.</p>
<ul>
<li><strong>1. SQL provides a “fast pass” to Hadoop</strong></li>
</ul>
<p>The first hurdle Hadoop must clear is the stigma of its origins. As a product of the open source community, Hadoop and its countless siblings are regarded by traditional IT shops with confusion, suspicion, or even abject terror. Whatever their potential, these revolutionary interlopers threaten huge investments in expensive applications and proprietary technologies.</p>
<p>An SQL interface can help bridge the gap between future, current and legacy technologies. Organizations are already purchasing Hadoop tools that offer various levels of SQL compatibility. We expect Hadoop to acquire deeper and deeper SQL support, and <a href="http://hive.apache.org/">Hive</a>, an open source SQL interface for Hadoop, is a good start.</p>
<p>In the next 18 months, I think we will see large retailers, financial services, Wall Street and the government using this “fast pass” SQL option to initiate much broader Hadoop deployments.</p>
<ul>
<li><strong>2. Hadoop performance gets a big boost</strong></li>
</ul>
<p>One of the leading reasons to use Hadoop is its extreme scalability. To date, that scalability has often come with significant performance penalties, including <a href="http://www.mapreduce.org/">MapReduce</a> query overhead and a storage layer that requires broad scans across file systems. If big data can’t produce information on demand, then it’s just an albatross.</p>
<p>Fortunately, the entire Hadoop industry, including a rapidly proliferating group of startups (<a href="http://www.cloudera.com">Cloudera</a>, <a href="http://www.hadapt.com">Hadapt</a>, <a href="http://www.hortonworks.com">Hortonworks</a>, <a href="http://www.mapr.com">MapR</a>), the amazingly innovative open source community, and such established vendors as <a href="http://www.ibm.com">IBM</a>, is aggressively tackling these performance issues. The forthcoming Hadoop v0.23 and subsequent releases will include enhancements to basic file system performance, minimum MapReduce job latency, and the performance of higher-level query interfaces (e.g. Hive, <a href="http://pig.apache.org/">Apache Pig</a>).</p>
<ul>
<li><strong>3. Hadoop becomes increasingly reliable</strong></li>
</ul>
<p>To avoid having a single point of failure, Hadoop needs to address topology and deployment concerns left over from its initial incarnation. Hadoop employs a master node to keep track of data and to determine how to access it. If this “brain” goes down, everything could be at risk without the correct topology and redundancy. Over time, the Hadoop community will make improvements in this area. Cloudera, Hortonworks, MapR and other commercial vendors are already addressing this.</p>
<ul>
<li><strong>4. Mainstream case studies emerge</strong></li>
</ul>
<p>Hadoop is a grassroots phenomenon that emerged in the social networking and consumer Internet world. As always, there are early adopters who take risks on the cutting edge, and there are more conservative organizations watching the pioneers from the sidelines.</p>
<p>This played out in 2011 as early customer experiences with Hadoop were shared via conferences, online forums and vendor white papers. Experts think Hadoop is on the edge of a tipping point, as some of the earliest Hadoop implementers move from experiments to adoption. As a result, people implementing Hadoop today are benefiting from the lessons learned by the early pioneers.</p>
<p>In 2012 and 2013, we will see a growing body of case studies and the emergence of best practices as Hadoop technology matures and gets deployed in traditional enterprise environments. In short, Hadoop’s momentum will grow exponentially in the next 18 months.</p>
<p>If becoming mainstream is step four in the technology adoption process, Hadoop will move through step two and into step three this year and next.</p>
<ul>
<li><strong>5. The architecture evolves</strong></li>
</ul>
<p>Hadoop applications process vast amounts of data in parallel across many computers, relying upon MapReduce as the enabling distribution framework. Currently, Hadoop tightly couples distributed resource management and a single distributed programming paradigm (MapReduce) into one package. The Hadoop community is now decoupling the two functions. Separating these will provide more control over the different system functions and free up query processing.</p>
<p>Future releases of Hadoop will have an enhanced MapReduce framework and will feature a growing array of alternative distributed computing paradigms. Likely candidates include <a href="http://www.mcs.anl.gov/research/projects/mpi/">Message Passing Interface (MPI)</a>, distributed shell systems, <a href="http://code.google.com/p/dremel/">OpenDremel</a> and <a href="http://www.bsp-worldwide.org/">Bulk Synchronous Parallel (BSP)</a>. With these additional programming and distribution options, Hadoop will be able to support an even greater variety of workloads.</p>
<p><strong>Hadoop is here to stay</strong></p>
<p>Over the next few years, Hadoop will become a common component of the standard IT tool belt. To meet this demand, vendors are starting to package Hadoop into commercial off-the-shelf (COTS) software.</p>
<p>Hadoop adoption will build on itself as organizations augment Hadoop solutions and grow ecosystems around them. Before our very eyes, Hadoop is becoming a platform.</p>
<p><a href="http://www.nvp.com/Team/Partners/Matthew%20Howard.aspx"><em>Matt Howard</em></a><em> is a general partner at </em><a href="http://www.nvp.com"><em>Norwest Venture Partners (NVP)</em></a><em>, where he invests in mobile and wireless, big data, security, rich media, networking and storage sectors. He currently serves on the boards of </em><a href="http://www.averesystems.com"><em>Avere Systems</em></a><em>, </em><a href="http://www.bluejeans.com"><em>Blue Jeans Network</em></a><em>, </em><a href="http://www.contextream.com"><em>ConteXtream</em></a><em>, </em><a href="http://www.hadapt.com/"><em>Hadapt</em></a><em>, </em><a href="http://www.mobileiron.com"><em>MobileIron</em></a><em>, </em><a href="http://www.retrevo.com"><em>Retrevo</em></a><em> and </em><a href="http://www.summitmicro.com/"><em>Summit Microelectronics</em></a><em>. He blogs at </em><a href="http://www.nvp-blog.com/"><em>NVP Blog</em></a><em>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/03/hadoop-jumps-through-hoops-becomes-mainstream/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/10/hadoop1.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/10/hadoop1.jpg?w=150" medium="image">
			<media:title type="html">hadoop</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/f61183cf1974afda4981596f4a1e7cde?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">aprilkilcrease</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/10/hadoop1.jpg" medium="image">
			<media:title type="html">hadoop</media:title>
		</media:content>
	</item>
	</channel>
</rss>
