<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; Structure Data</title>
	<atom:link href="http://gigaom.com/tag/structure-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Wed, 22 May 2013 05:32:44 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; Structure Data</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>Structure:Data 2013 live coverage</title>
		<link>http://gigaom.com/2013/03/20/structuredata-2013-live-coverage/</link>
		<comments>http://gigaom.com/2013/03/20/structuredata-2013-live-coverage/#comments</comments>
		<pubDate>Wed, 20 Mar 2013 09:00:59 +0000</pubDate>
		<dc:creator>Tom Krazit</dc:creator>
				<category><![CDATA[EMC]]></category>
		<category><![CDATA[Paul Maritz]]></category>
		<category><![CDATA[Structure Data]]></category>
		<category><![CDATA[Structure Data 2013]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=622057</guid>
		<description><![CDATA[You can find all of our coverage of Structure:Data 2013 here, along with links to more info on the conference and a livestream of the action.]]></description>
				<content:encoded><![CDATA[<p>If you’re a regular reader of GigaOM you know that <a href="http://gigaom.com/channel/data/">we’re big believers in the power of data</a> — both big and, well, bigger — to change the way we think about our world. Until relatively recently, it hasn’t been possible to gather enough data at scale to truly understand the patterns around us. But as our smartphones turn us into walking sensors and we live ever-increasing amounts of our lives online, we’re generating massive amounts of data that can improve the health of a business (<a href="http://gigaom.com/2012/11/20/how-aetna-is-using-big-data-to-improve-patient-health/">or a person</a>), help us make <a href="http://gigaom.com/2012/10/23/a-hot-trend-in-cleantech-startups-targeting-energy-data-and-analytics/">rational decisions about resource prioritization</a> and finally convince NFL coaches that <a href="http://gigaom.com/2013/03/17/statwing-wants-to-make-your-data-and-armchair-quarterback-dreams-come-true/">they should go for it more often on fourth down</a>.</p>
<p><a href="http://pro.gigaom.com/do/structuredata2013-livestream-signup/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=622057+structuredata-2013-live-coverage&amp;utm_content=tkrazit"><img alt="GigaOM Structure:Data: Watch Live" src="http://gigaom2.files.wordpress.com/2013/03/structure-data_in-article_livestream_300x200_watch-live.png?w=708"   class="alignleft size-full wp-image-622420"></a>The key is developing <a href="http://gigaom.com/2013/03/12/hadoops-past-present-and-future-a-gigaom-special-report/">the tools and techniques</a> to properly collect, harvest, and analyze that data. On Wednesday and Thursday this week, we’ve gathered some of the top minds in this emerging field in New York at Structure:Data to help you understand how data and data analysis can improve your business and your life. They include <a href="http://gigaom.com/2013/03/19/the-world-is-ready-for-the-consumer-grade-enterprise/">Paul Maritz, chief strategist at EMC</a> and a leader of <a href="http://gigaom.com/2013/03/13/the-pivotal-initiative-in-case-you-were-wondering-is-now-official/">The Pivotal Initiative</a>; data experts from companies like Facebook, Microsoft, IBM and a host of startups; and even the CTO for the Central Intelligence Agency, Ira “Gus” Hunt.</p>
<p>More information about the conference <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=622057+structuredata-2013-live-coverage&amp;utm_content=tkrazit">can be found here</a>. If you can’t join us in New York, <a href="http://pro.gigaom.com/do/structuredata2013-livestream-signup/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=622057+structuredata-2013-live-coverage&amp;utm_content=tkrazit">follow along with the livestream here</a>, or join the conversation on Twitter with #dataconf. We’ll also post a roundup of all our coverage from what should be a very interesting two days of data.</p>
<p><strong>Wednesday</strong>:</p>
<ul><li><a href="http://gigaom.com/2013/03/20/forget-fico-how-data-is-changing-the-rules-of-credit-and-underwriting/">Forget FICO: how data is changing the rules of credit and underwriting </a></li>
<li><a href="http://gigaom.com/2013/03/20/kleiner-perkins-michael-abbott-it-takes-two-teams-to-build-a-successful-app/">Kleiner Perkins’ Michael Abbott: It takes two (teams) to build a successful app</a></li>
<li><a href="http://gigaom.com/2013/03/20/without-human-input-augmentation-algorithms-alone-are-making-us-dumber/">Without human input augmentation, algorithms alone are making us dumber</a></li>
<li><a href="http://gigaom.com/2013/03/20/sdn-can-turn-the-network-into-a-big-data-curator-claims-juniper/">SDN can turn the network into a big data “curator,” claims Juniper</a></li>
<li><a href="http://gigaom.com/2013/03/20/its-not-skynet-yet-in-machine-learning-theres-still-a-role-for-humans/">It’s not Skynet yet: In machine learning there’s still a role for humans </a></li>
<li><a href="http://gigaom.com/2013/03/20/data-science-is-not-enough-we-need-data-intelligence-too/">Data science is not enough. We need data intelligence too</a></li>
<li><a href="http://gigaom.com/2013/03/20/if-you-think-big-data-is-big-now-just-wait-for-the-internet-of-things/">If you think big data is big now, just wait for the internet of things</a></li>
<li><a href="http://gigaom.com/2013/03/20/emcs-paul-maritz-it-takes-leadership-to-move-companies-toward-a-data-driven-future/">EMC’s Paul Maritz: it takes leadership to move companies toward a data-driven future</a></li>
<li><a href="http://gigaom.com/2013/03/20/how-aetna-uses-patient-data-to-prevent-diabetes-and-heart-attacks/">How Aetna uses patient data to prevent diabetes and heart attacks</a></li>
<li><a href="http://gigaom.com/2013/03/20/even-the-cia-is-struggling-to-deal-with-the-volume-of-real-time-social-data/">Even the CIA is struggling to deal with the volume of real-time social data</a></li>
<li><a href="http://gigaom.com/2013/03/20/how-coding-contests-can-be-better-at-solving-problems-than-harvard/">How coding contests can be better at solving problems than Harvard</a></li>
<li><a href="http://gigaom.com/2013/03/20/for-big-data-achievements-it-and-analysts-need-to-work-together/">For big data achievements, IT and analysts need to work together</a></li>
<li><a href="http://gigaom.com/2013/03/20/big-data-is-still-hard-but-it-gets-better/">Big data is still hard, but it gets better</a></li>
<li><a href="http://gigaom.com/2013/03/20/from-amazons-top-data-geek-data-has-got-to-be-big-and-reproducible/">From Amazon’s top data geek: data has got to be big — and reproducible</a></li>
<li><a href="http://gigaom.com/2013/03/20/beyond-the-like-button-putting-social-networks-to-work-for-us/">Beyond the Like button: Putting social networks to work for us</a></li>
<li><a href="http://gigaom.com/2013/03/20/cloud-data-portability/">Do we need internet exchanges for public cloud data portability?</a></li>
<li><a href="http://gigaom.com/2013/03/20/people-will-give-up-their-personal-info-if-you-give-them-a-good-reason/">People will give up their personal info if you give them a good reason</a></li>
<li><a href="http://gigaom.com/2013/03/20/six-ideas-from-entrepreneurs-for-solving-your-big-data-problems/">Six ideas from entrepreneurs for solving your big-data problems</a></li>
</ul><p><strong>Thursday</strong>:</p>
<ul><li><a href="http://gigaom.com/2013/03/21/why-nuance-sees-the-semantic-web-as-a-key-to-smarter-natural-language-interfaces/">Why Nuance sees the semantic web as a key to smarter natural language interfaces</a></li>
<li><a href="http://gigaom.com/2013/03/21/big-data-analytics-is-great-but-its-no-a-silver-bullet/">Big data analytics is great but it’s not a silver bullet</a></li>
<li><a href="http://gigaom.com/2013/03/21/its-not-enough-to-just-have-information-intelligence-requires-context/">It’s not enough to just have information — intelligence requires context</a></li>
<li><a href="http://gigaom.com/2013/03/21/hadoop-its-damn-hard-to-use/">Hadoop: “It’s damn hard to use”</a></li>
<li><a href="http://gigaom.com/2013/03/21/want-a-bettergreeneragile-data-center-use-the-data/">Want a bigger/greener/more agile data center? Use the data</a></li>
<li><a href="http://gigaom.com/2013/03/21/getting-beyond-the-cult-of-big-data/">Getting beyond the cult of big data</a></li>
<li><a href="http://gigaom.com/2013/03/21/how-search-can-solve-big-data-problems/">How search can solve big data problems</a></li>
<li><a href="http://gigaom.com/2013/03/21/hadoop-applications-abound-but-hadoop-still-needs-improvement/">Hadoop applications abound, but Hadoop still needs improvement</a></li>
<li><a href="http://gigaom.com/2013/03/21/if-the-future-of-bi-is-hadoop-sql-and-the-cloud-are-the-glue/">If the future of BI is Hadoop, SQL and the cloud are the glue</a></li>
<li><a href="http://gigaom.com/2013/03/21/why-guavus-analyzes-lots-of-telecommunications-data-before-storing-it-all/">Why Guavus analyzes lots of telecommunications data before storing it all</a></li>
<li><a href="http://gigaom.com/2013/03/21/pursuing-big-data-utopia-what-realtime-interactive-analytics-could-mean-to-you/">Pursuing big data utopia: What realtime interactive analytics could mean to you</a></li>
<li><a href="http://gigaom.com/2013/03/21/nasdaq-on-the-virtues-of-the-public-cloud/">Nasdaq on the virtues of the public cloud</a></li>
<li><a href="http://gigaom.com/2013/03/21/no-not-every-database-was-created-equal-heres-how-theyre-stand-out/">No, not every database was created equal. Here’s how they stand out</a></li>
</ul>
<p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=622057+structuredata-2013-live-coverage&utm_content=tkrazit">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/04/aws-storage-gateway-jolts-cloud-storage-ecosystem/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=622057+structuredata-2013-live-coverage&utm_content=tkrazit">AWS Storage Gateway jolts cloud-storage ecosystem</a></li><li><a href="http://pro.gigaom.com/2011/12/why-the-big-data-startup-boom-will-likely-be-short-lived/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=622057+structuredata-2013-live-coverage&utm_content=tkrazit">Why the big data startup boom will likely be short-lived</a></li><li><a href="http://pro.gigaom.com/2010/09/the-red-hot-data-warehouse-market-whos-buying-next/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=622057+structuredata-2013-live-coverage&utm_content=tkrazit">The Red-Hot Data Warehouse Market: Who&#8217;s Buying Next?</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/20/structuredata-2013-live-coverage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/structure-data-2013.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/structure-data-2013.jpg?w=150" medium="image">
			<media:title type="html">Structure Data 2013 audience</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/98a6e059487f51246e6d79c13e773447?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">tkrazit</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/structure-data_in-article_livestream_300x200_watch-live.png" medium="image">
			<media:title type="html">GigaOM Structure:Data: Watch Live</media:title>
		</media:content>
	</item>
		<item>
		<title>Hadoop&#8217;s past, present, and future: A GigaOM special report</title>
		<link>http://gigaom.com/2013/03/12/hadoops-past-present-and-future-a-gigaom-special-report/</link>
		<comments>http://gigaom.com/2013/03/12/hadoops-past-present-and-future-a-gigaom-special-report/#comments</comments>
		<pubDate>Tue, 12 Mar 2013 21:48:11 +0000</pubDate>
		<dc:creator>Tom Krazit</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Structure Data]]></category>
		<category><![CDATA[Structure Data 2013]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=619807</guid>
		<description><![CDATA[Check out our special retrospective on the history of Hadoop, one of the most powerful open-source data tools ever developed, in this post.]]></description>
				<content:encoded><![CDATA[<p>As we gear up for our Structure:Data conference next week in New York, we wanted to create a home for our four-part special report published over the course of last week on Hadoop, the big data engine that could. Links to the series follow below, and you can find <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=619807+hadoops-past-present-and-future-a-gigaom-special-report&amp;utm_content=tkrazit">more information on Structure:Data here</a>.</p>
<ul><li><a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">The history of Hadoop: From 4 nodes to the future of data</a></li>
<li><a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">The Hadoop ecosystem: the (welcome) elephant in the room (infographic)</a></li>
<li><a href="http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/">5 reasons why the future of Hadoop is real-time (relatively speaking)</a></li>
<li><a href="http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/">Hadoop through the years: A GigaOM retrospective</a></li>
</ul>
<p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=619807+hadoops-past-present-and-future-a-gigaom-special-report&utm_content=tkrazit">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=619807+hadoops-past-present-and-future-a-gigaom-special-report&utm_content=tkrazit">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/03/dont-hold-your-breath-for-a-single-big-data-stack/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=619807+hadoops-past-present-and-future-a-gigaom-special-report&utm_content=tkrazit">Don&#8217;t hold your breath for a single big data stack</a></li><li><a href="http://pro.gigaom.com/2012/03/big-data-budgets-on-the-rise/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=619807+hadoops-past-present-and-future-a-gigaom-special-report&utm_content=tkrazit">Big data budgets on the rise</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/12/hadoops-past-present-and-future-a-gigaom-special-report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" medium="image">
			<media:title type="html">gigaom hadoop icon final</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/98a6e059487f51246e6d79c13e773447?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">tkrazit</media:title>
		</media:content>
	</item>
		<item>
		<title>Hadoop through the years: A GigaOM retrospective</title>
		<link>http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/</link>
		<comments>http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/#comments</comments>
		<pubDate>Fri, 08 Mar 2013 13:00:11 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[Structure]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=614707</guid>
		<description><![CDATA[We were there very early on for the birth of Hadoop and its maturation into a vital data analysis tool. Here's a look back at some of our best stories.]]></description>
				<content:encoded><![CDATA[<p>A few years before we had a Structure:Data conference dedicated to big data (and, by proxy, Hadoop), GigaOM spotted Hadoop’s promise and began trying to spread the word about and advance the discussion around this groundbreaking technology. Now that Hadoop is 10 years old (give or take), we thought it would be a good time to look back on how Hadoop has influenced our events and editorial over the years. This is the final installment in our four-part Hadoop anthology that has already covered its <a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">birth</a>, <a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">present</a> and <a href="http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/">future</a>.</p>
<p>Think of this as Hadoop’s greatest hits, but know that there will be more to come. Although the big data discussion is moving away from Hadoop somewhat, it’s still an integral (if not the integral) part of the discussion around data infrastructure. We have two great panels on Hadoop at our <a href="http://event.gigaom.com/structuredata/schedule/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=614707+hadoop-through-the-years-a-gigaom-retrospective&amp;utm_content=dharrisstructure">Structure:Data conference</a> March 20-21 in New York, with participants from Facebook, Platfora, Continuuity and EMC’s Pivotal Initiative (whose leader, Paul Maritz, will also be speaking), among others, and we will keep up with all things Hadoop and data for the next 10 years.</p>
<h2 id="the-biggest-news">The biggest news</h2>
<ol><li><a href="http://gigaom.com/2009/03/15/hadoop-focussed-startup-cloudera-raises-5-million/">Hadoop-focused startup Cloudera raises $5 million</a> (March 15, 2009)</li>
<li><a href="http://gigaom.com/2009/08/10/friends-on-the-move-hadoop-aol-paypal/">Friends on the move: Hadoop, AOL &amp; PayPal</a> (Aug. 10, 2009)</li>
<li><a href="http://gigaom.com/2010/09/29/survey-hadoop-is-great-but-challenges-remain-2/">Survey: Hadoop is great, but challenges remain</a> (Sept. 29, 2010)</li>
<li><a href="http://gigaom.com/2011/03/17/yahoo-suggests-mapreduce-overhaul-to-improve-hadoop-performance/">Yahoo suggests MapReduce overhaul to improve Hadoop performance</a> (March 17, 2011)</li>
<li><a href="http://gigaom.com/2011/03/24/meet-mapr-a-competitor-to-hadoop-leader-cloudera/">Meet MapR, a competitor to Hadoop leader Cloudera</a> (March 24, 2011)</li>
<li><a href="http://gigaom.com/2011/05/09/emc-hadoop/">EMC makes a big bet on Hadoop</a> (May 9, 2011)</li>
<li><a href="http://gigaom.com/2011/06/27/exclusive-yahoo-launching-hadoop-spinoff-this-week/">Exclusive: Yahoo launching Hadoop spinoff this week</a> (June 27, 2011)</li>
<li><a href="http://gigaom.com/2012/02/28/microsofts-hadoop-play-is-shaping-up-and-it-includes-excel/">Microsoft’s Hadoop play is shaping up, and it includes Excel</a> (Feb. 28, 2012)</li>
<li><a href="http://gigaom.com/2012/06/13/vmware-aims-for-hadoop-on-vms-with-serengeti-project/">VMware aims for Hadoop on VMs with ‘Serengeti’ project</a> (June 13, 2012)</li>
<li><a href="http://gigaom.com/2012/10/24/cloudera-makes-sql-a-first-class-citizen-in-hadoop/">Cloudera makes SQL a first-class citizen in Hadoop</a> (Oct. 24, 2012)</li>
</ol><h2 id="the-best-analysis">The best analysis</h2>
<ol><li><a href="http://gigaom.com/2009/04/10/the-data-mining-renaissance/">The data mining renaissance</a> (April 10, 2009)</li>
<li><a href="http://gigaom.com/2009/10/02/is-hadoop-champion-cloudera-the-next-red-hat/">Is Hadoop champion Cloudera the next Red Hat?</a> (Oct. 2, 2009)</li>
<li><a href="http://gigaom.com/2010/08/01/meet-big-data-equivalent-of-the-lamp-stack/">Meet the big data equivalent of the LAMP stack</a> (Aug. 1, 2010)</li>
<li><a href="http://gigaom.com/2011/03/25/as-big-data-takes-off-the-hadoop-wars-begin/">As big data takes off, the Hadoop wars begin</a> (March 25, 2011)</li>
<li><a href="http://gigaom.com/2011/10/07/hadoops-civil-war-does-it-matter-who-contributes-most/">Hadoop’s civil war: Does it matter who contributes the most?</a> (Oct. 7, 2011)</li>
<li><a href="http://gigaom.com/2012/01/28/5-low-profile-startups-that-could-change-the-face-of-big-data/">5 low-profile startups that could change the face of big data</a> (Jan. 28, 2012)</li>
<li><a href="http://gigaom.com/2012/02/06/what-it-really-means-when-someone-says-hadoop/">What it really means when someone says Hadoop</a> (Feb. 6, 2012)</li>
<li><a href="http://gigaom.com/2012/03/03/hadoop-jumps-through-hoops-becomes-mainstream/">Hadoop jumps through hoops, becomes mainstream</a> (March 3, 2012)</li>
<li><a href="http://gigaom.com/2012/07/07/why-the-days-are-numbered-for-hadoop-as-we-know-it/">Why the days are numbered for Hadoop as we know it</a> (July 7, 2012)</li>
<li><a href="http://gigaom.com/2012/11/09/a-few-stats-rumors-and-stories-on-on-hadoops-rapid-growth/">A few stats, rumors and stories on Hadoop’s rapid growth</a> (Nov. 9, 2012)</li>
</ol><h2 id="the-coolest-users-aside-from-y">The coolest users … aside from Yahoo</h2>
<table width="530" border="1" cellspacing="0" cellpadding="0"><tbody><tr><td valign="top" width="40%"><b>Facebook</b>
<ul><li><a href="http://gigaom.com/2012/11/08/facebook-open-sources-corona-a-better-way-to-do-webscale-hadoop/">Facebook open sources Corona — a better way to do webscale Hadoop </a></li>
<li><a href="http://gigaom.com/2011/07/27/facebook-hadoop-cluster/">How Facebook moved 30 petabytes of Hadoop data</a></li>
<li><a href="http://gigaom.com/2012/08/22/facebook-is-collecting-your-data-500-terabytes-a-day/">Facebook is collecting your data — 500TB a day</a></li>
<li><a href="http://gigaom.com/2012/06/13/how-facebook-keeps-100-petabytes-of-hadoop-data-online/">How Facebook keeps 100 petabytes of Hadoop data online</a></li>
</ul></td>
<td valign="top" width="30%"><b>Netflix</b>
<ul><li><a href="http://gigaom.com/2012/06/14/netflix-analyzes-a-lot-of-data-about-your-viewing-habits/">Netflix analyzes <i>a lot </i>of data about your viewing habits</a></li>
<li><a href="http://gigaom.com/2013/01/10/netflix-shows-off-its-hadoop-architecture/">Netflix shows off how it does Hadoop in the cloud</a></li>
</ul></td>
<td valign="top" width="29%"><b>Etsy</b>
<ul><li><a href="http://gigaom.com/2012/08/31/etsy-unveils-its-infrastructure-and-its-supermicro-love/">Etsy unveils its infrastructure (and its SuperMicro love)</a></li>
<li><a href="http://gigaom.com/2011/11/02/how-etsy-handcrafted-a-big-data-strategy/">How Etsy handcrafted a big data strategy</a></li>
</ul></td>
</tr><tr><td valign="top" width="40%"><b>eBay</b>
<ul><li><a href="http://gigaom.com/2012/01/31/under-the-covers-of-ebays-big-data-operation/">Under the covers of eBay’s big data operation</a></li>
<li><a href="http://gigaom.com/2012/04/06/making-the-web-more-efficient-a-thousand-servers-at-a-time/">Making the web more efficient 1,000 servers at a time</a></li>
</ul></td>
<td valign="top" width="30%"><b>The smart grid world</b>
<ul><li><a href="http://gigaom.com/2009/06/02/how-to-use-open-source-hadoop-for-the-smart-grid/">How to use open-source Hadoop for the smart grid</a></li>
<li><a href="http://gigaom.com/2012/11/19/opower-the-big-data-energy-player-to-beat/">Opower, the big data energy player to beat</a></li>
</ul></td>
<td valign="top" width="29%"><b>Obama for America</b>
<ul><li><a href="http://gigaom.com/2012/03/11/vote-for-me-how-data-will-change-the-2012-elections/">Vote for me: How data will change the 2012 election</a></li>
<li><a href="http://gigaom.com/2012/12/08/how-obamas-data-scientists-built-a-volunteer-army-on-facebook/">How Obama’s data scientists built a volunteer army on Facebook</a></li>
</ul></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2012/12/02/pinterest-flipboard-and-yelp-tell-how-to-save-big-bucks-in-the-cloud/"><b>Yelp</b></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/02/22/bloomreach-wants-to-save-your-site-with-big-data/"><b>BloomReach</b></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/06/12/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/"><b>Ancestry.com</b></a></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2013/03/03/how-and-why-linkedin-is-becoming-an-engineering-powerhouse/"><b>LinkedIn</b></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/09/27/quantcast-releases-bigger-faster-stronger-hadoop-file-system/"><b>Quantcast</b></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/09/16/how-disney-built-a-big-data-platform-on-a-startup-budget/"><b>Disney</b></a></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2011/11/22/big-data-reveals-mac-users-book-pricier-hotels/"><b>Orbitz</b></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/03/06/why-klout-is-making-its-bed-with-hadoop-and-microsoft/"><b>Klout</b></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/03/07/how-twitter-is-doing-its-part-to-democratize-big-data/"><b>Twitter</b></a></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/"><b>The medical world</b></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/05/02/how-climate-corp-is-pitting-big-data-against-mother-nature/"><b>Climate Corporation</b></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/04/17/satellite-imagery-and-hadoop-mean-70m-for-skybox/"><b>Skybox Imaging</b></a></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2012/02/13/how-tumblr-went-from-wee-to-webscale/"><b>Tumblr</b></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/05/25/how-intuit-uses-big-data-to-delight-you/"><b>Intuit</b></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/03/23/walmart-labs-is-building-big-data-tools-and-will-then-open-source-them/"><b>@Walmartlabs</b></a></td>
</tr><tr><td valign="top" width="40%"><a href="http://gigaom.com/2012/03/08/how-hadoop-can-help-keep-your-money-in-the-bank/"><strong>Zions Bancorporation</strong></a></td>
<td valign="top" width="30%"><a href="http://gigaom.com/2012/05/20/can-i-help-you-how-liveperson-decides-whos-worth-the-personal-touch/"><strong>LivePerson</strong></a></td>
<td valign="top" width="29%"><a href="http://gigaom.com/2012/11/15/6-ways-big-data-is-helping-reinvent-enterprise-security/"><strong>The enterprise security world</strong> </a></td>
</tr></tbody></table><h2 id="taking-hadoop-to-the-stage">Taking Hadoop to the stage</h2>
<p><strong><a href="http://gigaom.com/2008/04/27/gigaom-pm-the-hadoop-meetup-may-1/">The Hadoop Meetup (May 1, 2008)</a></strong></p>
<div id="attachment_618096" class="wp-caption aligncenter" style="width: 718px"><a href="http://www.flickr.com/photos/joeywan/2467450286/"><img alt="Cutting (center) flanked by Baldeschwieler and Om Malik at GigaOM’s Hadoop Meetup in 2008." src="http://gigaom2.files.wordpress.com/2013/03/2467450286_db547ef9ef_b1.jpg?w=708&#038;h=366" width="708" height="366" class="size-full wp-image-618096"></a><p class="wp-caption-text">Cutting (center) flanked by Baldeschwieler and Om Malik at GigaOM’s Hadoop Meetup in 2008.</p></div>
<p><strong>Next-generation data stores (Structure 2008; start at 57:00)<br></strong></p>
		<form id="wpcom-iframe-form-d4f61b1638dd6a050b7f88589dd98847" target="wpcom-iframe-d4f61b1638dd6a050b7f88589dd98847" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="708"><input type="hidden" name="height" value="295"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNzA4IiBoZWlnaHQ9IjI5NSIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL3N0cnVjdHVyZTA4P2xheW91dD00JmNsaXA9cGxhXzM4NzgxNDM1NjA0MDEyNDIxMzQmY29sb3I9MHhlN2U3ZTcmYXV0b1BsYXk9ZmFsc2UmbXV0ZT1mYWxzZSZpY29uQ29sb3JPdmVyPTB4ODg4ODg4Jmljb25Db2xvcj0weDc3Nzc3NyZhbGxvd2NoYXQ9dHJ1ZSZoZWlnaHQ9Mjk1JndpZHRoPTcwOCIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZToxMXB4O3BhZGRpbmctdG9wOjEwcHg7dGV4dC1hbGlnbjpjZW50ZXI7d2lkdGg6NzA4cHgiPldhdGNoIDxhIGhyZWY9aHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS8/dXRtX3NvdXJjZT1sc3BsYXllciZhbXA7dXRtX21lZGl1bT1lbWJlZCZhbXA7dXRtX2NhbXBhaWduPWZvb3RlcmxpbmtzIHRpdGxlPWxpdmUgc3RyZWFtaW5nIHZpZGVvPmxpdmUgc3RyZWFtaW5nIHZpZGVvPC9hPiBmcm9tIDxhIGhyZWY9aHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9zdHJ1Y3R1cmUwOD91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MgdGl0bGU9V2F0Y2ggc3RydWN0dXJlMDggYXQgbGl2ZXN0cmVhbS5jb20+c3RydWN0dXJlMDg8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,788149533ec5c5687e7bb36f4bf9b905b6b60cd8"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="d4f61b1638dd6a050b7f88589dd98847"></form>
		<iframe name="wpcom-iframe-d4f61b1638dd6a050b7f88589dd98847" width="708" height="295" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-d4f61b1638dd6a050b7f88589dd98847').submit();</script><p><strong>Hadoop, NoSQL and webscale data (Structure 2009)</strong></p>
		<form id="wpcom-iframe-form-23d3006138547495ed32527646496dd4" target="wpcom-iframe-23d3006138547495ed32527646496dd4" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="708"><input type="hidden" name="height" value="295"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNzA4IiBoZWlnaHQ9IjI5NSIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbXR2P2xheW91dD00JmNsaXA9cGxhXzY2NzQ0NDMyMjQzNzY3MDExMzQmY29sb3I9MHhlN2U3ZTcmYXV0b1BsYXk9ZmFsc2UmbXV0ZT1mYWxzZSZpY29uQ29sb3JPdmVyPTB4ODg4ODg4Jmljb25Db2xvcj0weDc3Nzc3NyZhbGxvd2NoYXQ9dHJ1ZSZoZWlnaHQ9Mjk1JndpZHRoPTcwOCIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZToxMXB4O3BhZGRpbmctdG9wOjEwcHg7dGV4dC1hbGlnbjpjZW50ZXI7d2lkdGg6NzA4cHgiPldhdGNoIDxhIGhyZWY9aHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS8/dXRtX3NvdXJjZT1sc3BsYXllciZhbXA7dXRtX21lZGl1bT1lbWJlZCZhbXA7dXRtX2NhbXBhaWduPWZvb3RlcmxpbmtzIHRpdGxlPWxpdmUgc3RyZWFtaW5nIHZpZGVvPmxpdmUgc3RyZWFtaW5nIHZpZGVvPC9hPiBmcm9tIDxhIGhyZWY9aHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb210dj91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MgdGl0bGU9V2F0Y2ggZ2lnYW9tdHYgYXQgbGl2ZXN0cmVhbS5jb20+Z2lnYW9tdHY8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,b4c17ac86fa594e10c2537dee162b96194047c2a"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="23d3006138547495ed32527646496dd4"></form>
		<iframe name="wpcom-iframe-23d3006138547495ed32527646496dd4" width="708" height="295" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-23d3006138547495ed32527646496dd4').submit();</script><p><b>The big data tsunami (Structure 2010)<br></b></p>
		<form id="wpcom-iframe-form-a4195be60b3b94cd2e3c14fb4c2a96d4" target="wpcom-iframe-a4195be60b3b94cd2e3c14fb4c2a96d4" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="560"><input type="hidden" name="height" value="340"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNTYwIiBoZWlnaHQ9IjM0MCIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbXR2P2xheW91dD00JmFtcDtjbGlwPXBsYV82MDhiMTQyZC0wNmU4LTQ2NzAtODBhOC1hZDNkZTRmZjIwMzUmYW1wO2hlaWdodD0zNDAmYW1wO3dpZHRoPTU2MCZhbXA7YXV0b3BsYXk9ZmFsc2UiIHN0eWxlPSJib3JkZXI6MDtvdXRsaW5lOjAiIGZyYW1lYm9yZGVyPSIwIiBzY3JvbGxpbmc9Im5vIj48L2lmcmFtZT48ZGl2IHN0eWxlPSJmb250LXNpemU6IDExcHg7cGFkZGluZy10b3A6MTBweDt0ZXh0LWFsaWduOmNlbnRlcjt3aWR0aDo1NjBweCI+PGEgaHJlZj0iaHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb210dj91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJXYXRjaCBnaWdhb210diI+Z2lnYW9tdHY8L2E+IG9uIGxpdmVzdHJlYW0uY29tLiA8YSBocmVmPSJodHRwOi8vd3d3LmxpdmVzdHJlYW0uY29tLz91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJCcm9hZGNhc3QgTGl2ZSBGcmVlIj5Ccm9hZGNhc3QgTGl2ZSBGcmVlPC9hPjwvZGl2Pg==,a66a2b63054dcab32505b7a32fb09fde9b215c8b"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="a4195be60b3b94cd2e3c14fb4c2a96d4"></form>
		<iframe name="wpcom-iframe-a4195be60b3b94cd2e3c14fb4c2a96d4" width="560" height="340" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-a4195be60b3b94cd2e3c14fb4c2a96d4').submit();</script><p><strong>Hadoop and beyond (Structure: Data 2011)</strong></p>
		<form id="wpcom-iframe-form-8f0174d8011e8a91ca3cf5d3281fc56f" target="wpcom-iframe-8f0174d8011e8a91ca3cf5d3281fc56f" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="560"><input type="hidden" name="height" value="340"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNTYwIiBoZWlnaHQ9IjM0MCIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbWJpZ2RhdGE/bGF5b3V0PTQmYW1wO2NsaXA9cGxhXzc3MGZiOWQyLTVlZTAtNDA5NC05NDZmLTA5YzNhMmM0NDMxZSZhbXA7aGVpZ2h0PTM0MCZhbXA7d2lkdGg9NTYwJmFtcDthdXRvcGxheT1mYWxzZSIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZTogMTFweDtwYWRkaW5nLXRvcDoxMHB4O3RleHQtYWxpZ246Y2VudGVyO3dpZHRoOjU2MHB4Ij5XYXRjaCA8YSBocmVmPSJodHRwOi8vd3d3LmxpdmVzdHJlYW0uY29tLz91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJsaXZlIHN0cmVhbWluZyB2aWRlbyI+bGl2ZSBzdHJlYW1pbmcgdmlkZW88L2E+IGZyb20gPGEgaHJlZj0iaHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb21iaWdkYXRhP3V0bV9zb3VyY2U9bHNwbGF5ZXImYW1wO3V0bV9tZWRpdW09ZW1iZWQmYW1wO3V0bV9jYW1wYWlnbj1mb290ZXJsaW5rcyIgdGl0bGU9IldhdGNoIGdpZ2FvbWJpZ2RhdGEgYXQgbGl2ZXN0cmVhbS5jb20iPmdpZ2FvbWJpZ2RhdGE8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,eae1d97f8130c40580a2e8b90cf8299a919bbb3c"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="8f0174d8011e8a91ca3cf5d3281fc56f"></form>
		<iframe name="wpcom-iframe-8f0174d8011e8a91ca3cf5d3281fc56f" width="560" height="340" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-8f0174d8011e8a91ca3cf5d3281fc56f').submit();</script><p><strong>What’s next for Hadoop? (Structure: Data 2012)</strong></p>
		<form id="wpcom-iframe-form-7d49aa65bff3fe7274c505f1cb895b40" target="wpcom-iframe-7d49aa65bff3fe7274c505f1cb895b40" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="560"><input type="hidden" name="height" value="340"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNTYwIiBoZWlnaHQ9IjM0MCIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbWJpZ2RhdGE/bGF5b3V0PTQmYW1wO2NsaXA9cGxhX2FmYjNlY2JlLWNhMzMtNGExNy04MWViLTAyYTFhMTRlYzJmZSZhbXA7aGVpZ2h0PTM0MCZhbXA7d2lkdGg9NTYwJmFtcDthdXRvcGxheT1mYWxzZSIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZTogMTFweDtwYWRkaW5nLXRvcDoxMHB4O3RleHQtYWxpZ246Y2VudGVyO3dpZHRoOjU2MHB4Ij5XYXRjaCA8YSBocmVmPSJodHRwOi8vd3d3LmxpdmVzdHJlYW0uY29tLz91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJsaXZlIHN0cmVhbWluZyB2aWRlbyI+bGl2ZSBzdHJlYW1pbmcgdmlkZW88L2E+IGZyb20gPGEgaHJlZj0iaHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb21iaWdkYXRhP3V0bV9zb3VyY2U9bHNwbGF5ZXImYW1wO3V0bV9tZWRpdW09ZW1iZWQmYW1wO3V0bV9jYW1wYWlnbj1mb290ZXJsaW5rcyIgdGl0bGU9IldhdGNoIGdpZ2FvbWJpZ2RhdGEgYXQgbGl2ZXN0cmVhbS5jb20iPmdpZ2FvbWJpZ2RhdGE8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,e9312aa180969ee6bde32e59f6d5e8c62f9c2967"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="7d49aa65bff3fe7274c505f1cb895b40"></form>
		<iframe name="wpcom-iframe-7d49aa65bff3fe7274c505f1cb895b40" width="560" height="340" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-7d49aa65bff3fe7274c505f1cb895b40').submit();</script><p><strong>Mike Olson on Hadoop (Structure: Data 2012)<br></strong></p>
		<form id="wpcom-iframe-form-3f40a9a1dc899b887d48f5b2247fc295" target="wpcom-iframe-3f40a9a1dc899b887d48f5b2247fc295" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="560"><input type="hidden" name="height" value="340"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNTYwIiBoZWlnaHQ9IjM0MCIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbWJpZ2RhdGE/bGF5b3V0PTQmYW1wO2NsaXA9cGxhX2ZiN2EzOGFlLTczODMtNDljNi05ZDUxLWJmY2NkOGFkZDJjZiZhbXA7aGVpZ2h0PTM0MCZhbXA7d2lkdGg9NTYwJmFtcDthdXRvcGxheT1mYWxzZSIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZTogMTFweDtwYWRkaW5nLXRvcDoxMHB4O3RleHQtYWxpZ246Y2VudGVyO3dpZHRoOjU2MHB4Ij5XYXRjaCA8YSBocmVmPSJodHRwOi8vd3d3LmxpdmVzdHJlYW0uY29tLz91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJsaXZlIHN0cmVhbWluZyB2aWRlbyI+bGl2ZSBzdHJlYW1pbmcgdmlkZW88L2E+IGZyb20gPGEgaHJlZj0iaHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb21iaWdkYXRhP3V0bV9zb3VyY2U9bHNwbGF5ZXImYW1wO3V0bV9tZWRpdW09ZW1iZWQmYW1wO3V0bV9jYW1wYWlnbj1mb290ZXJsaW5rcyIgdGl0bGU9IldhdGNoIGdpZ2FvbWJpZ2RhdGEgYXQgbGl2ZXN0cmVhbS5jb20iPmdpZ2FvbWJpZ2RhdGE8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,af6115d20008e856efd03159816bcad98a726bfa"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="3f40a9a1dc899b887d48f5b2247fc295"></form>
		<iframe name="wpcom-iframe-3f40a9a1dc899b887d48f5b2247fc295" width="560" height="340" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-3f40a9a1dc899b887d48f5b2247fc295').submit();</script><p><strong>Analyzing data with HBase (Structure: Data 2012)</strong></p>
		<form id="wpcom-iframe-form-8c8355955139c9602e1bebb181e36d43" target="wpcom-iframe-8c8355955139c9602e1bebb181e36d43" method="post" action="http://wpcomwidgets.com">
							<input type="hidden" name="frameborder" value="0"><input type="hidden" name="scrolling" value="no"><input type="hidden" name="resize" value="0"><input type="hidden" name="replace_attributes" value="1"><input type="hidden" name="fallback" value='&lt;p class="protected-embed-fallback"&gt;This embed is invalid&lt;/p&gt;'><input type="hidden" name="width" value="560"><input type="hidden" name="height" value="340"><input type="hidden" name="style" value="border:0;outline:0"><input type="hidden" name="_data" value="PGlmcmFtZSB3aWR0aD0iNTYwIiBoZWlnaHQ9IjM0MCIgc3JjPSJodHRwOi8vY2RuLmxpdmVzdHJlYW0uY29tL2VtYmVkL2dpZ2FvbWJpZ2RhdGE/bGF5b3V0PTQmYW1wO2NsaXA9cGxhXzI3NTFhZDgzLWRmMjQtNGYzMS04OGVhLWVjMWNjOTU2ZGRkNSZhbXA7aGVpZ2h0PTM0MCZhbXA7d2lkdGg9NTYwJmFtcDthdXRvcGxheT1mYWxzZSIgc3R5bGU9ImJvcmRlcjowO291dGxpbmU6MCIgZnJhbWVib3JkZXI9IjAiIHNjcm9sbGluZz0ibm8iPjwvaWZyYW1lPjxkaXYgc3R5bGU9ImZvbnQtc2l6ZTogMTFweDtwYWRkaW5nLXRvcDoxMHB4O3RleHQtYWxpZ246Y2VudGVyO3dpZHRoOjU2MHB4Ij5XYXRjaCA8YSBocmVmPSJodHRwOi8vd3d3LmxpdmVzdHJlYW0uY29tLz91dG1fc291cmNlPWxzcGxheWVyJmFtcDt1dG1fbWVkaXVtPWVtYmVkJmFtcDt1dG1fY2FtcGFpZ249Zm9vdGVybGlua3MiIHRpdGxlPSJsaXZlIHN0cmVhbWluZyB2aWRlbyI+bGl2ZSBzdHJlYW1pbmcgdmlkZW88L2E+IGZyb20gPGEgaHJlZj0iaHR0cDovL3d3dy5saXZlc3RyZWFtLmNvbS9naWdhb21iaWdkYXRhP3V0bV9zb3VyY2U9bHNwbGF5ZXImYW1wO3V0bV9tZWRpdW09ZW1iZWQmYW1wO3V0bV9jYW1wYWlnbj1mb290ZXJsaW5rcyIgdGl0bGU9IldhdGNoIGdpZ2FvbWJpZ2RhdGEgYXQgbGl2ZXN0cmVhbS5jb20iPmdpZ2FvbWJpZ2RhdGE8L2E+IGF0IGxpdmVzdHJlYW0uY29tPC9kaXY+,424d09678ec8f8ae1abb2fb66f04ec8947b15b69"><input type="hidden" name="_tag" value="protected-iframe"><input type="hidden" name="_hash" value="8c8355955139c9602e1bebb181e36d43"></form>
		<iframe name="wpcom-iframe-8c8355955139c9602e1bebb181e36d43" width="560" height="340" frameborder="0" scrolling="no"></iframe>
		<script type="text/javascript">document.getElementById('wpcom-iframe-form-8c8355955139c9602e1bebb181e36d43').submit();</script>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=614707&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=926217"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=926217" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=614707+hadoop-through-the-years-a-gigaom-retrospective&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/11/unlocking-big-datas-potential-with-search/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=614707+hadoop-through-the-years-a-gigaom-retrospective&utm_content=dharrisstructure">How search can unlock the power of big data</a></li><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=614707+hadoop-through-the-years-a-gigaom-retrospective&utm_content=dharrisstructure">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/04/infrastructure-q1-cloud-and-big-data-woo-the-enterprise/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=614707+hadoop-through-the-years-a-gigaom-retrospective&utm_content=dharrisstructure">Infrastructure Q1: Cloud and big data woo enterprises</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" medium="image">
			<media:title type="html">gigaom hadoop icon final</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/2467450286_db547ef9ef_b1.jpg" medium="image">
			<media:title type="html">Cutting (center) flanked by Baldeschwieler and Om Malik at GigaOM’s Hadoop Meetup in 2008.</media:title>
		</media:content>
	</item>
		<item>
		<title>Data? What is it good for? Absolutely &#8230; something</title>
		<link>http://gigaom.com/2013/03/05/data-what-is-it-good-for-absolutely-something/</link>
		<comments>http://gigaom.com/2013/03/05/data-what-is-it-good-for-absolutely-something/#comments</comments>
		<pubDate>Tue, 05 Mar 2013 23:30:06 +0000</pubDate>
		<dc:creator>Om Malik</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Om Says]]></category>
		<category><![CDATA[Smart Data]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=616948</guid>
		<description><![CDATA[It is fashionable these days either to like big data or to malign it. Regardless of what your personal feelings are, the question has always been and will always be: what is data good for? Here are three stories to illustrate that question.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=616948&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Data has been a subject of my deliberations, both public and private, for a long time, almost a decade. Long before the bulge bracket consultants discovered its virtue and long before short-term trumpeters of data showed up, data was something that helped shape my thinking and approach to decision making. It was not big data, or smart data, or little data or panda data. It was just data, and what one could do with it, that influenced my thinking.</p>
<p>With more network end-points and more digitization, it goes without saying that the amount of data in our lives and at work is only going to increase. But the size of the data isn’t the issue; instead, it’s “what you do with the data” that will be the key to success in the emerging economy. The companies (and individuals) who don’t think accordingly will find themselves on the losing side. Let me tell you three personal stories that will illustrate my point.</p>
<div id="attachment_617263" class="wp-caption alignnone" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/03/lufthansa2.jpg"><img alt="Getty Images" src="http://gigaom2.files.wordpress.com/2013/03/lufthansa2.jpg?w=708&#038;h=472" width="708" height="472" class="size-large wp-image-617263"></a><p class="wp-caption-text">Getty Images</p></div>
<h2 id="grounded">Grounded</h2>
<p>The first story involves an airline — Lufthansa, the German giant. I recently visited my parents in Delhi and a day before I returned I fell sick. I was quite feverish and somewhat of a pain to my fellow travelers. I developed a nasty cough and, well, I thought it might be a good idea to buy an upgrade and go to sleep on a lengthy (22-hour) flight. Instead of trying to use my points  — I know better — I offered to buy an upgrade. But a Lufthansa official declined to sell me the upgrade. It was not that there weren’t any empty seats — there were many. But since I had an unchangeable ticket, he refused. It was mildly irritating because I had been patronizing Lufthansa for nearly two decades and was hoping for a little compassion.</p>
<p>That episode made me wonder why Lufthansa was so rigid and refused to use the historical data it had on me to make a smart decision to appease a returning customer, especially since doing so would have allowed it to monetize empty seats. The inflexible policies basically led the airline to leave money on the table.</p>
<p>In the age of big data and smart enterprises, how can a company not have a way to make smarter, real-time business decisions? I wonder if “out data-ing” will be the right way for a competitor to eat the German carrier for lunch. This inability to know the customer is, I believe, what we will mean when we say “big, dumb company.”</p>
<p>I, for one, would like my airline to know me, know my tastes and, if possible, have enough data on me to offer me a quasi-personalized experience. Yes, I do live in the future and sometimes get carried away about the possibilities of data, sensors and the notion of hyper-personalization. But still, I am not talking about mining on the moon; I am talking about tactics that little companies like Uber are using to make smart customer decisions.</p>
<h2 id="disconnected">Disconnected</h2>
<p>The second story involves a wireless company: Verizon Wireless. Every time I leave the country, I call them up and sign up for the traveling data plan, letting them know where I am traveling and for how long. I actually don’t mind doing that because it makes life a lot easier when I land in a new country. A week later when I return, I get ominous-sounding SMS alerts followed by a phone call from one of their agents who, in an alarmist tone, asks me if I have my phone and whatnot. Most of these calls come in the early hours of the day when I am trying to deal with jet lag.</p>
<p>I cannot understand why the carrier cannot figure out, using location data it obviously has, that I am actually back in the United States and in my city and perhaps even in my own neighborhood. As a customer, I would certainly find that more convenient. These guys are willing to sift through my location data and my phone calls to do targeted advertising, so why can’t they reconcile my location with their other databases to automatically update the records?</p>
<h2 id="shoe-side-story">Shoe Side Story</h2>
<p>Now let me tell you another story, about a little store in Manhattan. In sharp contrast to my Lufthansa experience, I had a great visit to a shoe store on my last trip to New York. It proved to be educational. I had about half an hour between meetings and I walked through Soho, where I spotted Varda.</p>
<p>The last time I was there was about 10 years ago, a few weeks before I moved to San Francisco. I had bought a pair of boots at the store. I was surprised that the store had survived the test of time and was still going strong (it had <a href="http://www.vardashoes.com/about.php">started in 1981</a>). The fact that they make shoes that last forever might have something to do with it, I imagine.</p>
<p>As luck would have it, I was wearing the very same boots. They have given me excellent service and, except for being comfortable the way old shoes can be, they are almost new. I decided to duck into the store; after all, I had a little bit of time. I saw that they had a pair identical to the one I was wearing, except that it was made with suede of a different hue. And I am a sucker for suede and boots.</p>
<p>The salesperson and I ended up in a conversation about shoes and when she swiped the credit card, she noticed that I had done business with the store previously on a couple of occasions. She gave me an instant discount — without me asking for it. It wasn’t a lot, but it was a nice feeling of being appreciated for my loyalty.</p>
<h2 id="data-designs-experiences">Data designs experiences</h2>
<p><a href="http://gigaom2.files.wordpress.com/2013/03/shutterstock_125574617.jpg"><img alt="Big Data" src="http://gigaom2.files.wordpress.com/2013/03/shutterstock_125574617.jpg?w=300&#038;h=257" width="300" height="257" class="alignleft size-medium wp-image-616721"></a>A small store like Varda created an experience that was all-encompassing and got my money. Lufthansa just alienated me, after twenty years of blindly buying from them. I don’t think there was any big data involved at the shoe store: the aging PC probably was pulling data off Excel or something similar. It was micro-data if there was any.</p>
<p>A few weeks ago, I wrote about <a href="http://om.co/2013/01/16/user-experience-is-immersive/">how a brand experience is multitouch and multimodal.</a> I don’t think large industrial-era dinosaurs such as the airlines Lufthansa and American Airlines quite understand that. And that is why it doesn’t matter how much data they have collected about their customers or how many millions of dollars they spend on their computing and data infrastructure. They don’t know one simple truth: <b>it is not the data, it is what you do with it, stupid.</b></p>
<p>Asking the right questions of the data and then creating an experience befitting customer happiness, or drawing conclusions that are not obvious, involves a level of humanity, something that is unfortunately missing from all the buzz about data. It is a pervasive problem across the industrial landscape.</p>
<p>As for me, I am shopping for a more-intelligent airline — one that values relationships and creates tailored experiences for me, the customer.</p>
<p>[<strong>Structure Data 2013</strong>: We will be discussing data, and what you do with it, at length at our upcoming <a href="http://event.gigaom.com/structuredata/?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=616948+data-what-is-it-good-for-absolutely-something&amp;utm_content=om">Structure Data conference</a>. For instance, we will have <a href="http://event.gigaom.com/structuredata/speakers/?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=616948+data-what-is-it-good-for-absolutely-something&amp;utm_content=om#mohan_namboodiri">Mohan Namboodiri</a>, VP, Customer Analytics, Williams-Sonoma, talk about how the lifestyle company uses data. The conference is being <a href="http://event.gigaom.com/structuredata/registration/?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=616948+data-what-is-it-good-for-absolutely-something&amp;utm_content=om">held on March 20 &amp; 21 in New York. More details here</a>.]</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=616948&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=739647"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=739647" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=616948+data-what-is-it-good-for-absolutely-something&utm_content=om">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=616948+data-what-is-it-good-for-absolutely-something&utm_content=om">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2012/01/12-tech-leaders-resolutions-for-2012/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=616948+data-what-is-it-good-for-absolutely-something&utm_content=om">12 tech leaders’ resolutions for 2012</a></li><li><a href="http://pro.gigaom.com/2012/09/listening-platforms-finding-the-value-in-social-media-data/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=616948+data-what-is-it-good-for-absolutely-something&utm_content=om">Listening platforms: finding the value in social media data</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/05/data-what-is-it-good-for-absolutely-something/feed/</wfw:commentRss>
		<slash:comments>24</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/big-data.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/big-data.jpg?w=150" medium="image">
			<media:title type="html">big data magnifying glass</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/89c6ff98059617751fcf312690965fa0?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">om</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/lufthansa2.jpg?w=708" medium="image">
			<media:title type="html">Getty Images</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/shutterstock_125574617.jpg?w=300" medium="image">
			<media:title type="html">Big Data</media:title>
		</media:content>
	</item>
		<item>
		<title>The Hadoop ecosystem: the (welcome) elephant in the room (infographic)</title>
		<link>http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/</link>
		<comments>http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/#comments</comments>
		<pubDate>Tue, 05 Mar 2013 13:00:23 +0000</pubDate>
		<dc:creator>Rani Molla</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=613888</guid>
		<description><![CDATA[How big an impact has Hadoop had on the technology world? Check out our infographic on the reach of the most important big data tool of our time.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=613888&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>To say Hadoop has become really big business would be to understate the case. At a broad level, it’s the focal point of an immense big data movement, but Hadoop itself is now a software and services market of its very own. In this graphic, we aim to map out the current ecosystem of Hadoop software and services (application and infrastructure software, as well as open source projects) and where those products fall in terms of use cases and delivery model. Click on a company name for more information about how that company is using this technology.</p>
<p>A couple of points about the methodology might be valuable: The first is that these are products and projects that are built with Hadoop in mind and that aim to either extend its utility in some way or expose its core functions in a new manner. Another is that the “Hadoop Repackaged” category is used to characterize companies that are reselling Hadoop in some way (e.g., in an appliance or as part of their existing product suites) but that haven’t developed their own technology at the Hadoop level and instead rely on existing distribution software from companies such as Hortonworks or Cloudera.</p>
<p>This is the second installment of our four-part series on the past, present and future of Hadoop. <a href="http://gigaom.com/2013/03/04/the-history-of-hadoop-from-4-nodes-to-the-future-of-data/">Part I is the history of Hadoop</a> from the people who willed it into existence and took it mainstream. <a href="http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/">Part III will look into the future of Hadoop</a> and serve as an opening salvo for much of the discussion at our <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=613888+the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic&amp;utm_content=ranimolla">Structure: Data conference</a> March 20-21 in New York. Finally, <a href="http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective/">part IV will highlight some of the best Hadoop applications and seminal moments in Hadoop history</a>, as reported by GigaOM over the years.</p>
<p><img alt="Hadoop-final-4" src="http://gigaom2.files.wordpress.com/2013/03/hadoop-final-4.jpg?w=600&#038;h=1400" usemap="#Hadoop-final-4" width="600" height="1400" class=""></p>
<map name="Hadoop-final-4"><area coords="150,257,295,275" shape="rect" href="http://www.mortardata.com/"><area coords="150,277,305,290" shape="rect" href="http://www.infochimps.com/"><area coords="149,292,290,319" shape="rect" href="https://www.hadooponazure.com/"><area coords="149,321,287,334" shape="rect" href="http://www-01.ibm.com/software/data/infosphere/biginsights/"><area coords="150, 336,298,350" shape="rect" href="http://aws.amazon.com/elasticmapreduce/"><area coords="306,252,457,274" shape="rect" href="http://joyent.com/products/joyent-cloud/features/hadoop"><area coords="305,277,457,289" shape="rect" href="http://www.skytap.com/blog/create-a-cloudera-hadoop-cluster-in-skytap-cloud"><area coords="305,291, 457,302" shape="rect" href="http://www.verticloud.com/"><area coords="305,305,457,332" shape="rect" href="http://www.sungardas.com/NewsandEvents/PressReleases/Pages/SunGardAvailabilityServicesAnnouncesTechnicalPreviewofEnterprise-Ready,ApacheHadoop-basedUnifiedAnalyticsService.aspx"><area coords="305,336,457, 350" shape="rect" href="http://www.gogrid.com/solutions/big-data"><area coords="151,373,290,400" shape="rect" href="http://www.kontagent.com/solutions/datamine/"><area coords="151,404,284,416" shape="rect" href="http://www.qubole.com/"><area coords="151,417,287,434" shape="rect" href="http://www.treasure-data.com/"><area coords="310,378,457,398" shape="rect" href="http://www.birst.com/product/technology/big-data-services"><area coords="310,400,457,412" shape="rect" href="http://www.cetas.net/"><area coords="310,415,455,439" shape="rect" href="https://www.packetloop.com/"><area coords="151,466, 232,487" shape="rect" href="http://www.wibidata.com/"><area coords="151,490,240,502" shape="rect" href="http://www.platfora.com/"><area coords="151,505,242,517" shape="rect" href="http://www.continuuity.com/"><area coords="151,520,243,533" shape="rect" href="http://www.datameer.com/"><area coords="248,460,335,488" shape="rect" 
href="http://www.karmasphere.com/"><area coords="248,491,341,503" shape="rect" href="http://www.hstreaming.com/"><area coords="248,505,344,517" shape="rect" href="http://www.tresata.com/"><area coords="248,519,336,534" shape="rect" href="http://www.ngdata.com/site/home.html"><area coords="344,465,454,487" shape="rect" href="http://0xdata.com/"><area coords="344,489,455,505" shape="rect" href="http://www.radoop.eu/"><area coords="344,506,456,518" shape="rect" href="http://blog.packetloop.com/2012/03/packetpig-open-source-big-data-security.html"><area coords="344,522,455,534" shape="rect" href="http://mahout.apache.org/"><area coords="150,573,245,600" shape="rect" href="http://hbase.apache.org/"><area coords="150, 602,247,616" shape="rect" href="http://drawntoscale.com/"><area coords="150,618,248,630" shape="rect" href="http://www.splicemachine.com/"><area coords="150,630,247,645" shape="rect" href="http://tempo-db.com/"><area coords="150,648,247,659" shape="rect" href="http://www.lilyproject.org/lily/index.html"><area coords="150,662,247,673" shape="rect" href="http://accumulo.apache.org/"><area coords="150,675,247,690" shape="rect" href="http://sqrrl.com/"><area coords="253,575,348,600" shape="rect" href="http://hive.apache.org/"><area coords="253,602,348,616" shape="rect" href="http://hadapt.com/"><area coords="253,617,348,631" shape="rect" href="http://rainstor.com/solutions/big-data-analytics-on-hadoop/"><area coords="253,633,347,646" shape="rect" href="//blog.cloudera.com/blog/2012/10/cloudera-impala-real-time-queries-in-apache-hadoop-for-real/"><area coords="253,646,347,659" shape="rect" href="http://incubator.apache.org/drill/"><area coords="253,661,347,675" shape="rect" href="http://www.citusdata.com/"><area coords="253,676,347,689" shape="rect" href="http://incubator.apache.org/giraph/"><area coords="253,691,348,704" shape="rect" href="http://www.greenplum.com/products/pivotal-hd"><area coords="253,706,348,717" shape="rect" 
href="http://www.cascading.org/lingual/"><area coords="253,720,348,731" shape="rect" href="http://hortonworks.com/blog/100x-faster-hive/"><area coords="253,733,348,748" shape="rect" href="https://github.com/forcedotcom/phoenix"><area coords="351,574,455,619" shape="rect" href="http://www.concurrentinc.com/cascading/"><area coords="351,621,455,632" shape="rect" href="http://pig.apache.org/"><area coords="351,634,455,648" shape="rect" href="https://github.com/twitter/scalding"><area coords="351,650,455,661" shape="rect" href="http://www.mortardata.com/"><area coords="351,664,455,678" shape="rect" href="http://hama.apache.org/"><area coords="351,679,455,705" shape="rect" href="http://wiki.apache.org/incubator/TezProposal"><area coords="151,774,291,797" shape="rect" href="http://www.zettaset.com/"><area coords="151,800,291,811" shape="rect" href="http://incubator.apache.org/ambari/"><area coords="151,815,291,831" shape="rect" href="http://incubator.apache.org/mesos/"><area coords="151,828,291,859" shape="rect" href="http://www.wandisco.com/hadoop"><area coords="304,776,455,797" shape="rect" href="https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona"><area coords="304,800,455,830" shape="rect" href="http://www.stackiq.com/products/stackiq-enterprise-data/"><area coords="151,853,285,885" shape="rect" href="http://www.cloudera.com/content/cloudera/en/home.html"><area coords="151,888,285,898" shape="rect" href="http://hortonworks.com/"><area coords="151,901,285,915" shape="rect" href="http://www.mapr.com/"><area coords="151,916,274,929" shape="rect" href="http://hadoop.intel.com/"><area coords="299,870,454,884" shape="rect" href="http://www.greenplum.com/"><area coords="299,886,455,899" shape="rect" href="http://hadoop.apache.org/"><area coords="299,901,455,922" shape="rect" href="http://www.ibm.com/us/en/"><area coords="40,1000,160,1027" shape="rect" href="http://www.oracle.com/index.html"><area coords="40,1028,160,1042" shape="rect" 
href="http://www.asterdata.com/"><area coords="40,1043,160,1057" shape="rect" href="http://www.ddn.com/"><area coords="40,1058,160,1070" shape="rect" href="http://www.microsoft.com/en-us/sqlserver/solutions-technologies/business-intelligence/big-data.aspx"><area coords="40,1072,160,1085" shape="rect" href="http://www.hp.com/go/hadoop"><area coords="40,1087,160,1099" shape="rect" href="http://go.nutanix.com/TechGuideNutanixHadoopReferenceArchitecture_LP.html"><area coords="40,1100,160,1114" shape="rect" href="http://www.sgi.com/solutions/bigdata/hadoop/?/"><area coords="40,1116,160,1145" shape="rect" href="http://www.dell.com/Learn/us/en/555/by-service-type-application-services-business-intelligence-hadoop?c=us&amp;l=en&amp;s=biz"><area coords="257,972,382,1003" shape="rect" href="http://hpccsystems.com/"><area coords="257,1005,382,1023" shape="rect" href="http://spark-project.org/"><area coords="384,980,561,1004" shape="rect" href="http://bigdata.pervasive.com/"><area coords="384,1007,561,1023" shape="rect" href="http://discoproject.org/"><area coords="258,1044,395,1091" shape="rect" href="http://www.cleversafe.com/overview/why-object-storage"><area coords="258,1095,395,1108" shape="rect" href="http://ceph.com/"><area coords="258,1111,394,1137" shape="rect" href="http://www.netapp.com/us/solutions/big-data/hadoop.aspx"><area coords="258,1140,395,1155" shape="rect" href="http://www.emc.com/domains/isilon/index.htm"><area coords="258,1158,393,1170" shape="rect" href="https://www.quantcast.com/inside-quantcast/2012/09/introducing-quantcast-file-system-1-0/"><area coords="395,1048,560,1093" shape="rect" href="http://www.datastax.com/products/enterprise"><area coords="395,1093,560,1109" shape="rect" href="http://www-03.ibm.com/systems/software/gpfs/"><area coords="395,1109,560,1121" shape="rect" href="http://wiki.lustre.org/index.php/Main_Page"><area coords="395,1125,560,1166" shape="rect" href="https://access.redhat.com/knowledge/videos/214803"></map>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=613888&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=471445"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=471445" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=613888+the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic&utm_content=ranimolla">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=613888+the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic&utm_content=ranimolla">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/03/dont-hold-your-breath-for-a-single-big-data-stack/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=613888+the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic&utm_content=ranimolla">Don&#8217;t hold your breath for a single big data stack</a></li><li><a href="http://pro.gigaom.com/2012/03/big-data-budgets-on-the-rise/?utm_source=data&utm_medium=editorial&utm_campaign=auto3&utm_term=613888+the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic&utm_content=ranimolla">Big data budgets on the rise</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/gigaom-hadoop-icon-final.jpg?w=150" medium="image">
			<media:title type="html">gigaom hadoop icon final</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/f8fd0100aa0bc8966c428ba10b037712?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">ranimolla</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/03/hadoop-final-4.jpg" medium="image">
			<media:title type="html">Hadoop-final-4</media:title>
		</media:content>
	</item>
		<item>
		<title>Can LexisNexis build a Hadoop-killer?</title>
		<link>http://gigaom.com/2012/03/22/lexis-nexis-structure-data-2012/</link>
		<comments>http://gigaom.com/2012/03/22/lexis-nexis-structure-data-2012/#comments</comments>
		<pubDate>Thu, 22 Mar 2012 15:47:37 +0000</pubDate>
		<dc:creator>Mathew Ingram</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Armando Escalante]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Lexis-Nexis]]></category>
		<category><![CDATA[Structure Data]]></category>
		<category><![CDATA[Structure:Data 2012]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=502566</guid>
		<description><![CDATA[Hadoop may be the current leader of the pack when it comes to handling big data, but LexisNexis says at Structure:Data the system it developed for its own internal data use -- and recently open-sourced -- is a viable alternative and in some cases is superior.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=502566&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>LexisNexis is a giant company, part of the Reed Elsevier information empire, but when it comes to handling big data, it is in the unusual position of being the underdog. The leader of the pack is Hadoop, which has already amassed a large and rapidly-growing following for its ability to manage large databases — but Armando Escalante of LexisNexis told attendees at GigaOM’s <a href="http://event.gigaom.com/structuredata/?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=502566+lexis-nexis-structure-data-2012&amp;utm_content=mathewingram">Structure:Data</a> conference in New York on Thursday that he believes the company has built what <del datetime="2012-03-22T20:24:41+00:00">could be</del> others might call a Hadoop killer. Originally developed to handle LexisNexis’ own internal data needs, the HPCC system was open-sourced nine months ago, and Escalante said it is already outperforming Hadoop in a number of ways.</p>
<div id="attachment_502557" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom.com/2012/03/22/lexis-nexis-structure-data-2012/1z5o2297/" rel="attachment wp-att-502557"><img title="Armando Escalante of LexisNexis at Structure:Data 2012" src="http://gigaom2.files.wordpress.com/2012/03/1z5o2297.jpg?w=300&#038;h=200" alt="Armando Escalante of LexisNexis at Structure:Data 2012" width="300" height="200" class="size-medium wp-image-502557"></a><p class="wp-caption-text">(c) 2012 Pinar Ozger. pinar@pinarozger.com</p></div>
<p>Because LexisNexis has so much data that it needs to analyze and provide to clients for its legal and government services, Escalante said that the company began building its own internal data-handling platform almost a decade ago, before “big data” even became a buzzword. “We already run our business on this, end-to-end,” he said. Once it became obvious that Hadoop was becoming a popular solution, LexisNexis decided to open-source the project and use the knowledge of a community of users and developers to improve and expand it.</p>
<p>Escalante said the LexisNexis system offers a number of features that Hadoop doesn’t, including a big-data delivery engine, and that the company is building a layer that will allow its system to handle data from Hadoop. In fact, he said, in a recent test a single LexisNexis node was 20 percent faster than a multi-node Hadoop configuration. But the biggest advantage LexisNexis has, according to Escalante, is that because it is a large company and has already been using the system internally for years, the banks and insurance companies that make up a majority of its clients are more likely to want to use it than Hadoop.</p>
<p>“We have most of the banks and insurance companies as clients, and we are doing proof-of-concept tests with many of them now, and I think they may be more comfortable working with a company that’s not a startup,” Escalante said. “Big companies want a neck to squeeze sometimes, and LexisNexis has a big neck.”</p>
<p><a href="http://pro.gigaom.com/do/structuredata2012-livestream-signup?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=502566+lexis-nexis-structure-data-2012&amp;utm_content=mathewingram">Watch the livestream</a> of Structure:Data here.</p>
<p><strong>Update:</strong> This post was updated to reflect that moderator Derrick Harris described the project as a “Hadoop killer,” not Escalante.</p>
<p><iframe width="560" height="340" src="http://cdn.livestream.com/embed/gigaombigdata?layout=4&amp;clip=pla_dbc41d19-3b0f-4aa6-a4ec-ce575fa09e2c&amp;height=340&amp;width=560&amp;autoplay=false" style="border:0;outline:0" frameborder="0" scrolling="no"></iframe>
</p><div style="font-size: 11px;padding-top:10px;text-align:center;width:560px">Watch <a href="http://www.livestream.com/?utm_source=lsplayer&amp;utm_medium=embed&amp;utm_campaign=footerlinks" title="live streaming video">live streaming video</a> from <a href="http://www.livestream.com/gigaombigdata?utm_source=lsplayer&amp;utm_medium=embed&amp;utm_campaign=footerlinks" title="Watch gigaombigdata at livestream.com">gigaombigdata</a> at livestream.com</div>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=502566&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=412638"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=412638" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502566+lexis-nexis-structure-data-2012&utm_content=mathewingram">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502566+lexis-nexis-structure-data-2012&utm_content=mathewingram">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502566+lexis-nexis-structure-data-2012&utm_content=mathewingram">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2011/11/dissecting-the-data-5-issues-for-our-digital-future/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502566+lexis-nexis-structure-data-2012&utm_content=mathewingram">Dissecting the data: 5 issues for our digital future</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/22/lexis-nexis-structure-data-2012/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/03/1z5o2297.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/03/1z5o2297.jpg?w=150" medium="image">
			<media:title type="html">Armando Escalante of LexisNexis at Structure:Data 2012</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/0bdf7ab171ade0708a11fa3378e6d8cb?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">Mathew</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/1z5o2297.jpg?w=300" medium="image">
			<media:title type="html">Armando Escalante of LexisNexis at Structure:Data 2012</media:title>
		</media:content>
	</item>
		<item>
		<title>Big data allows your employer to be big brother</title>
		<link>http://gigaom.com/2012/03/22/charnock-structure-data-2012/</link>
		<comments>http://gigaom.com/2012/03/22/charnock-structure-data-2012/#comments</comments>
		<pubDate>Thu, 22 Mar 2012 15:02:19 +0000</pubDate>
		<dc:creator>Stacey Higginbotham</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Cataphora]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=502508</guid>
		<description><![CDATA[Your corporation is watching you, and it might be using Cataphora's software, which mines employees' emails, IMs and other electronic communications to determine how big a risk a corporation might face from one bad apple.
<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=502508&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div id="attachment_502560" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom.com/2012/03/22/charnock-structure-data-2012/1z5o2226/" rel="attachment wp-att-502560"><img title="Elizabeth Charnock of Cataphora at Structure:Data 2012" src="http://gigaom2.files.wordpress.com/2012/03/1z5o2226.jpg?w=300&#038;h=200" alt="Elizabeth Charnock of Cataphora at Structure:Data 2012" width="300" height="200" class="size-medium wp-image-502560"></a><p class="wp-caption-text">(c) 2012 Pinar Ozger. pinar@pinarozger.com</p></div>
<p>Your corporation is watching you, and it might be using Cataphora’s software, which mines employees’ emails, IMs and other electronic communications to determine how big a risk a corporation might face from one bad apple.</p>
<p>Calling it software that can detect “people who are weird along many different dimensions,” Elizabeth Charnock, CEO of Cataphora, claimed that the software isn’t intruding on an employee’s right to privacy, because that right can’t really exist in today’s office environment, where 90 percent to 95 percent of employer communication is electronic and thus hidden from managers.</p>
<p>She compared today’s environment to a few decades past, when office workers would have to send communications in open mailers that passed through the mailroom and take phone calls within earshot of many other people. But today, someone can’t manage by walking around and trying to overhear problems. Now those problems have migrated to Facebook and email, so it makes sense that managers follow them there, she said.</p>
<p>“When we do hear people call it creepy, it’s a result of people being misinformed. Seventy percent of companies monitor employees’ electronic communications,” Charnock said.</p>
<p>Cataphora doesn’t just monitor those communications, however. The technology helps employers filter those conversations and establish the digital tone of an employee, which is slightly different from looking for triggering keywords in an email or IM conversation. Charnock said this tool, like any tool, can be used for an employee’s benefit. She cited the example of a French company that is using it to help determine who the best managers are, despite that organization having a bunch of regional offices with few employees. There, now don’t you feel better?</p>
<p><a href="http://pro.gigaom.com/do/structuredata2012-livestream-signup?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=502508+charnock-structure-data-2012&amp;utm_content=shigginbotham">Watch the livestream</a> of Structure:Data here.</p>
<p><iframe width="560" height="340" src="http://cdn.livestream.com/embed/gigaombigdata?layout=4&amp;clip=pla_3d2c4413-f5c0-488e-b6ff-d6ae4e384076&amp;height=340&amp;width=560&amp;autoplay=false" style="border:0;outline:0" frameborder="0" scrolling="no"></iframe>
</p><div style="font-size: 11px;padding-top:10px;text-align:center;width:560px"><a href="http://www.livestream.com/gigaombigdata?utm_source=lsplayer&amp;utm_medium=embed&amp;utm_campaign=footerlinks" title="Watch gigaombigdata">gigaombigdata</a> on livestream.com. <a href="http://www.livestream.com/?utm_source=lsplayer&amp;utm_medium=embed&amp;utm_campaign=footerlinks" title="Broadcast Live Free">Broadcast Live Free</a></div>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=502508&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=435776"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=435776" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502508+charnock-structure-data-2012&utm_content=shigginbotham">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502508+charnock-structure-data-2012&utm_content=shigginbotham">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/04/aws-storage-gateway-jolts-cloud-storage-ecosystem/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502508+charnock-structure-data-2012&utm_content=shigginbotham">AWS Storage Gateway jolts cloud-storage ecosystem</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=tech&utm_medium=editorial&utm_campaign=auto3&utm_term=502508+charnock-structure-data-2012&utm_content=shigginbotham">A near-term outlook for big data</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/22/charnock-structure-data-2012/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/03/1z5o2226.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/03/1z5o2226.jpg?w=150" medium="image">
			<media:title type="html">Elizabeth Charnock of Cataphora at Structure:Data 2012</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/aee37121e18bf76bb9fee4494bab237a?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">shigginbotham</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/1z5o2226.jpg?w=300" medium="image">
			<media:title type="html">Elizabeth Charnock of Cataphora at Structure:Data 2012</media:title>
		</media:content>
	</item>
		<item>
		<title>Exclusive: EMC Buys Pivotal Labs</title>
		<link>http://gigaom.com/2012/03/16/exclusive-emc-buys-pivotal-labs/</link>
		<comments>http://gigaom.com/2012/03/16/exclusive-emc-buys-pivotal-labs/#comments</comments>
		<pubDate>Fri, 16 Mar 2012 22:47:41 +0000</pubDate>
		<dc:creator>Om Malik</dc:creator>
				<category><![CDATA[EMC]]></category>
		<category><![CDATA[Pivotal Labs]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=500538</guid>
		<description><![CDATA[EMC Corp., the Hopkinton, Mass.-based storage and cloud hardware company, has bought Pivotal Labs, a San Francisco-based consulting firm well known for its tool Pivotal Tracker and also for its pioneering work on agile development methodology. I had first reported on this earlier this week.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=500538&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>EMC Corp., the Hopkinton, Mass.-based storage and cloud hardware company, has bought Pivotal Labs, a San Francisco-based consulting firm well known for its tool Pivotal Tracker and also for its pioneering work on agile development methodology. I have confirmed the news after talking to sources familiar with the company. <a href="http://gigaom.com/2012/03/13/pivotal-labs-is-in-takeover-talks/">I had first reported on this earlier this week</a>.</p>
<p>Pivotal has been involved with many web and mobile startups, including Twitter. The company’s agile development chops made it a favorite hunting ground for other companies such as Square. It is not clear to me why EMC acquired Pivotal, though a reasonable guess would be to take agile development and Pivotal methodology into the enterprise.</p>
<p><strong>Changing Landscape</strong></p>
<p>It would be a different, and a smart, kind of services division for a company like EMC. As I have said before, large hardware vendors are facing a unique and <a href="http://gigaom.com/cloud/so-what-happens-to-storage/">new challenge from the likes</a> of Amazon, Microsoft and Google, who are getting the attention of the next generation of businesses. In order for these hardware vendors to stay relevant, they need to offer their hardware as a service – storage or computing for example – and then attract others such as startups and enterprises to their platforms.</p>
<p>PS: EMC Greenplum’s Scott Yara will be speaking with me at our <a href="http://event.gigaom.com/structuredata/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=500538+exclusive-emc-buys-pivotal-labs&amp;utm_content=om">Structure Data conference</a> in <a href="http://event.gigaom.com/structuredata/registration/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=500538+exclusive-emc-buys-pivotal-labs&amp;utm_content=om">New York on March 21st &amp; March 22nd</a>. I guess he and I will have something to talk about!</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=500538&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=814652"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=814652" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=500538+exclusive-emc-buys-pivotal-labs&utm_content=om">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/04/infrastructure-q1-cloud-and-big-data-woo-the-enterprise/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=500538+exclusive-emc-buys-pivotal-labs&utm_content=om">Infrastructure Q1: Cloud and big data woo enterprises</a></li><li><a href="http://pro.gigaom.com/2012/04/aws-storage-gateway-jolts-cloud-storage-ecosystem/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=500538+exclusive-emc-buys-pivotal-labs&utm_content=om">AWS Storage Gateway jolts cloud-storage ecosystem</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=500538+exclusive-emc-buys-pivotal-labs&utm_content=om">A near-term outlook for big data</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/16/exclusive-emc-buys-pivotal-labs/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2010/10/pivotallabs-e1286492608699.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2010/10/pivotallabs-e1286492608699.png?w=150" medium="image">
			<media:title type="html">PivotalLabs</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/89c6ff98059617751fcf312690965fa0?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">om</media:title>
		</media:content>
	</item>
		<item>
		<title>Can Visual.ly spawn the data scientists of the future?</title>
		<link>http://gigaom.com/2012/03/12/can-visual-ly-spawn-the-data-scientists-of-the-future/</link>
		<comments>http://gigaom.com/2012/03/12/can-visual-ly-spawn-the-data-scientists-of-the-future/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 06:00:17 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@CNN]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[Infographics]]></category>
		<category><![CDATA[Lee Sherman]]></category>
		<category><![CDATA[mint.com]]></category>
		<category><![CDATA[Structure Data]]></category>
		<category><![CDATA[Visual.ly]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=497835</guid>
		<description><![CDATA[We live in a big data world, full of complex algorithms calculating trends and patterns among any type of information one can imagine. Gaining the skills to work with it requires a lot of work, however -- and the first step in changing that might be realizing that data can be fun.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=497835&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>We live in a big data world, full of complex algorithms calculating trends and patterns among any type of data one can imagine — but gaining the skills to do big data requires a lot of work. The first step to changing that might be in realizing that data can be fun.</p>
<div id="attachment_498000" class="wp-caption alignright" style="width: 159px"><a href="http://gigaom2.files.wordpress.com/2012/03/database.jpg"><img title="database" src="http://gigaom2.files.wordpress.com/2012/03/database.jpg?w=149&#038;h=604" alt="" width="149" height="604" class="size-large wp-image-498000"></a><p class="wp-caption-text">My web database comparison using Visual.ly.</p></div>
<p>On Monday, infographic hub <a href="http://visual.ly">Visual.ly</a> began its quest to bring this idea to the masses, releasing a new feature that lets users instantly create infographics based on statistics from competing Twitter accounts or from their own Facebook accounts. While the tool’s current utility is questionable, Visual.ly Create could have some profound effects on the future of visualization as it opens its doors to different data sets.</p>
<p>Experimenting with the new <a href="http://create.visual.ly/">Visual.ly Create</a> feature as it currently works, it’s easy to spot the limitations. Showdown-style infographics comparing Twitter followers aren’t particularly useful for anyone not somehow connected to the social media field, and visualizing data from one’s own Facebook page isn’t exactly an ideal vehicle for spreading meaningful information. They’re fun, yes (so fun, in fact, that the system was hammered all day and resulted in some serious processing delays), but fun only goes so far.</p>
<p>However, co-founder and Chief Content Officer Lee Sherman told me the company is planning some serious customization options that I think could help introduce a generation of children and teenagers to working with data and visualization. We’re not talking about visualizations as unique or advanced as what you see certain data journalists doing — the type of stuff that pops up on <a href="http://flowingdata.com/">Flowing Data</a> every day — just a tool that makes it easy, fun and rewarding to spend a little time thinking about data.</p>
<p>The problem is that <a href="http://gigaom.com/cloud/spread-the-word-math-is-the-new-sexiness-in-it/">we need more people with math skills</a> to meet growing employer demand for data scientists and data analysts. But how do you get started caring about data in the first place when the barriers are so high? Really working with data requires a deep understanding of both math and statistics, and Excel isn’t exactly a barrel of monkeys (nor are the charts it creates). Applications like Tableau <a href="http://gigaom.com/cloud/thanks-to-consumerization-its-ipo-season-in-analytics/">make it a little easier and the results more visually stunning</a>, but they’re still business intelligence products aimed at professionals. At our <a href="http://event.gigaom.com/structuredata/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=497835+can-visual-ly-spawn-the-data-scientists-of-the-future&amp;utm_content=dharrisstructure">Structure:Data event</a> next week in New York, we’ll talk about all sorts of new tools for making big data easier, but none are apps in the sense that today’s teens are using them.</p>
<div id="attachment_498003" class="wp-caption aligncenter" style="width: 614px"><a href="http://gigaom2.files.wordpress.com/2012/03/us-wind-patterns-625x335.jpg"><img title="US-Wind-Patterns-625x335" src="http://gigaom2.files.wordpress.com/2012/03/us-wind-patterns-625x335.jpg?w=708" alt=""   class="size-full wp-image-498003"></a><p class="wp-caption-text">This wasn't made with Visual.ly. (<a href="http://blog.thejit.org/2012/02/27/wind-motion-patterns/" rel="nofollow">http://blog.thejit.org/2012/02/27/wind-motion-patterns/</a>)</p></div>
<p>Think about every lame report you ever wrote or every essay you turned in that was based on little more than your own opinions. If it were relatively simple to produce a visually captivating graphic that used data to illustrate the point you were trying to make, or to pose a point of comparison for consideration, wouldn’t you have used it? Quantified-self apps are hot right now in part because <a href="http://gigaom.com/2011/12/26/will-you-track-your-health-data-with-an-app-or-a-device/">people get addicted to data once it’s made easy to digest</a>. But, as with visualizing your Facebook profile, that type of data isn’t very useful beyond your own body.</p>
<p>That’s where Visual.ly, whose founders came from personal finance startup Mint.com, comes in. Sherman said they’re trying to marry the ease of use that exemplifies Mint with the greater world of data out there. And while right now that means just making some instant graphics using Twitter and Facebook data, that will soon mean access to various APIs and publicly available data sets, as well as letting users upload their own data and even mash up data sources. Ultimately, Sherman said, users will be able to move away from prepackaged infographics and actually edit the fields themselves.</p>
<p>Whether it’s Visual.ly or something similar that ends up making infographics and other visualizations mainstream, it has to happen. The data revolution is too important to remain the haunt of only the best and the brightest. But it will take something to convince everyone else to give data a chance — and the ability to create a worthwhile visualization via an app could be that something.</p>
<p><em>Wind-motion patterns image <a href="http://blog.thejit.org/2012/02/27/wind-motion-patterns/">courtesy of Nicolas Garcia Belmonte</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/12/can-visual-ly-spawn-the-data-scientists-of-the-future/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/03/splash_preview.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/03/splash_preview.jpg?w=150" medium="image">
			<media:title type="html">splash_preview</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/database.jpg?w=149" medium="image">
			<media:title type="html">database</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/us-wind-patterns-625x335.jpg" medium="image">
			<media:title type="html">US-Wind-Patterns-625x335</media:title>
		</media:content>
	</item>
		<item>
		<title>Can big data fix a broken system for software patents?</title>
		<link>http://gigaom.com/2012/03/11/can-big-data-fix-a-broken-system-for-software-patents/</link>
		<comments>http://gigaom.com/2012/03/11/can-big-data-fix-a-broken-system-for-software-patents/#comments</comments>
		<pubDate>Sun, 11 Mar 2012 22:00:59 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[legal issues]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[Patent Law]]></category>
		<category><![CDATA[semantic analysis]]></category>
		<category><![CDATA[semantic search]]></category>
		<category><![CDATA[software patents]]></category>
		<category><![CDATA[Structure Data]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=496853</guid>
		<description><![CDATA[Legal scholars are always searching for ways to improve the patent system, sometimes via sweeping changes, but big data -- especially techniques such as machine learning and natural-language processing -- could help provide a technological fix to a big part of the problem.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/03/campus.jpg"><img title="campus" src="http://gigaom2.files.wordpress.com/2012/03/campus.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="alignleft size-medium wp-image-497151"></a>Legal scholars are always searching for ways to improve the U.S. patent system, sometimes via sweeping changes, but big data could help provide a technological fix to a big part of the problem.</p>
<p>The patent system is broken, and on that <a href="http://gigaom.com/2011/08/17/patent-reform-is-coming-who-should-care/">almost everyone agrees</a>. There’s a backlog of applications that results in exorbitant wait times to get a patent issued, and <a href="http://cyberlaw.stanford.edu/blog/2012/03/rosenhan-experiment-pto">the merit of patents that do get granted is often questionable</a>. If you’re forced to litigate a patent-infringement suit, <a href="http://www.law.com/jsp/cc/PubArticleCC.jsp?id=1322399109049">an increasingly likely scenario</a>, the costs can be crippling.</p>
<p>When it comes to software patents, the situation is particularly dire, which leads many critics to argue that software patents should be abolished altogether. <a href="http://gigaom.com/cloud/red-hats-secret-patent-deal-and-the-fate-of-jboss-developers/">Patent trolls are a widely cited nuisance</a>, but there’s a more fundamental problem. Litigation is expensive, yet all too common, because there are so many software patents out there, and it can be very difficult, and very expensive, to find out whether a new invention possibly infringes on even one of them.</p>
<p>As we’ll discuss in depth at our <a href="http://event.gigaom.com/structuredata/?utm_source=tech&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=496853+can-big-data-fix-a-broken-system-for-software-patents&amp;utm_content=dharrisstructure">Structure:Data conference</a> in New York later this month, techniques such as machine learning and natural-language processing are already having transformative effects in a number of fields. Why not the patent system, too?</p>
<h2>Software patents don’t scale …</h2>
<p>Timothy B. Lee, a Cato Institute fellow (and frequent <em>Ars Technica</em> contributor), and Christina Mulligan of Yale’s Information Society Project explore one big software-patent problem in a new research paper titled <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2016968">“Scaling the Patent System.”</a> The gist of Lee and Mulligan’s argument is simple: software is such a wide-ranging and nebulous topic that it’s nearly impossible to index software patents in a manner that would make it easier to search for them. The system just doesn’t scale.</p>
<div id="attachment_497150" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/03/uspto-search.jpg"><img title="uspto search" src="http://gigaom2.files.wordpress.com/2012/03/uspto-search.jpg?w=300&#038;h=152" alt="" width="300" height="152" class="size-medium wp-image-497150"></a><p class="wp-caption-text">Current USPTO search engine</p></div>
<p>Property records are easily searchable because county recorders organize them in a logical manner based on geography. Even chemical patents, the authors point out, are relatively easy to search by chemical formula. With software patents, however, there’s no such luck:</p>
<blockquote><p>[I]n the absence of a precise, standardized scheme for classifying software inventions, patent applicants are free to use any terms they like — or even make up new ones — to describe their software inventions. The scope of a patent’s claims will not always be obvious from a patent’s title or abstract. And a single software patent can claim multiple applications that are only loosely connected to each other.</p></blockquote>
<p>Lee and Mulligan’s paper doesn’t even touch on the problems that arise with <em><a href="http://en.wikipedia.org/wiki/Prior_art">prior art</a></em>, generally defined as “all information that has been disclosed to the public in any form about an invention before a given date.” Searching the USPTO database is hard enough; the problem only compounds when attorneys or patent examiners must also search articles, presentations and anything else that might negate the novelty of a proposed invention.</p>
<p>Unfortunately, the authors conclude, “Only dramatic reforms — such as excluding industries with high discovery costs from patent protection, establishing an independent invention defense, or eliminating injunctions — can return the patent system to its proper role of promoting innovation.”</p>
<h2>… but big data does</h2>
<p>Looking outside the law, though, and into the world of big data analytics, one needn’t look too hard to find some methods for making it easier to search for patents. The answer lies in semantics. If the problem is that keyword searches aren’t effective, then build a search engine that addresses a wide variety of sources and that takes into account related terms based on how frequently they’re linked, or based on the ontologies present in different industries.</p>
<ul><li>A startup called Apixio is already <a href="http://gigaom.com/cloud/apixio-is-bringing-big-data-to-medical-records-in-the-cloud/">doing something similar in the field of medical records</a>. It uses natural-language processing, machine learning and semantic association to make its Medical Information Navigation Engine (MINE) as easy to use as possible. Describing the service last April, I wrote that “when a doctor types a patient’s name and ‘chest pain’ into the search box, MINE is able to find ontological references to chest pain that bear little resemblance to the actual term.”</li>
<li>
<div id="attachment_497147" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/03/gravity.jpg"><img title="gravity" src="http://gigaom2.files.wordpress.com/2012/03/gravity.jpg?w=300&#038;h=218" alt="" width="300" height="218" class="size-medium wp-image-497147"></a><p class="wp-caption-text">Factually accurate, but irrelevant connections for Vanessa Laine</p></div>
<p>Another method for doing this comes from Gravity, a startup that uses a hybrid man-machine process to personalize content for readers of sites such as the <em>Wall Street Journal</em>. <a href="http://www.gravity.com/technology#overview">Gravity’s system</a> is complex to say the least (<a href="http://vimeo.com/38074957">here’s a video tutorial</a> that explains part of it), but the gist is that humans first serve as guides for machine-learning algorithms by determining connections between terms within large data sets, then the algorithms take over to complete the job faster than humans ever could. When they’re done, the humans step in one more time to kill any bad connections between terms. The result is a system that can determine with high accuracy that a person tweeting about Vanessa Laine (Los Angeles Laker Kobe Bryant’s ex-wife), for example, is probably more interested in basketball than in Laine’s date of birth or other accurate but irrelevant information.</p></li>
<li>Even IBM’s <a href="http://gigaom.com/cloud/what-watson-taught-us-humans-are-very-smart/">now-famous Watson question-answering machine</a> could prove beneficial if the USPTO were to leverage its capabilities. The system has actually been <a href="http://yalelawjournal.org/the-yale-law-journal-pocket-part/legislation/judges-in-jeopardy!:-could-ibm%E2%80%99s-watson-beat-courts-at-their-own-game?%2F=">suggested as an aid to help judges better interpret statutes</a> against the Constitution, but loaded with patent data, it could help identify potential infringements and even answer with some certainty which ones might be the most relevant to any given application.</li>
</ul><p>Indeed, a startup company called <a href="http://ipstreet.com">IP Street</a> is already attempting to bring the benefits of semantic technology to bear on the patent field. Founder and CEO Lewis Lee told me that, by analyzing the entire library of patents issued by the USPTO, IP Street is able to extract meaning from patents using information from the patent claims. The company’s website explains succinctly that “[the core] technology, known as LSI or latent semantic indexing, uses complicated mathematics and matrix decomposition (SVD) to identify similarities among documents. This allows you to enter an entire document (such as a product description, idea for a patent, etc.) and compare it to the universe of patents and patent applications—comparing across just the claims or the entire document.”</p>
<p>Big data won’t solve all the complaints people have about patents, but it could make life a lot easier for the inventors, attorneys and examiners tasked with determining whether a patent infringes a previous patent, or is even patent-worthy in the first place. The question now is whether the USPTO wants to leave simplification of the process in the hands of private parties like IP Street, or if the agency wants to bring a few big data experts on board to improve what it’s able to offer those who rely on it.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/11/can-big-data-fix-a-broken-system-for-software-patents/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/03/campus1.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/03/campus1.jpg?w=150" medium="image">
			<media:title type="html">campus</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/campus.jpg?w=300" medium="image">
			<media:title type="html">campus</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/uspto-search.jpg?w=300" medium="image">
			<media:title type="html">uspto search</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/gravity.jpg?w=300" medium="image">
			<media:title type="html">gravity</media:title>
		</media:content>
	</item>
	</channel>
</rss>
