<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; big data</title>
	<atom:link href="http://gigaom.com/tag/big-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Wed, 22 May 2013 10:33:32 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; big data</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>New algorithm maps cancer cells like nodes on a social network</title>
		<link>http://gigaom.com/2013/05/20/new-algorithm-maps-cancer-cells-like-nodes-on-a-social-network/</link>
		<comments>http://gigaom.com/2013/05/20/new-algorithm-maps-cancer-cells-like-nodes-on-a-social-network/#comments</comments>
		<pubDate>Mon, 20 May 2013 20:58:53 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[cancer research]]></category>
		<category><![CDATA[graph analysis]]></category>
		<category><![CDATA[health care]]></category>
		<category><![CDATA[medical research]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=647256</guid>
		<description><![CDATA[A group of researchers from Columbia and Stanford have created a method for turning complex cellular datasets into visualizations that map the similarities between tens of thousands of cells within a tissue sample.]]></description>
				<content:encoded><![CDATA[<p>Oftentimes, the best way to get a sense of your data is to look at it. A bunch of numbers or words might not mean anything sitting within a table, but they start to make a lot more sense when they’re turned into a chart. In fields like mass cytometry, though, where doctors might want to analyze dozens of biological markers for each of tens of thousands of cells in a tissue sample, creating an easy-to-understand chart is easier said than done.</p>
<p>That’s why a group of researchers from Columbia University and Stanford University developed an algorithm that can do just that, turning those cells into something that resembles your social graph. This lets researchers see how the various cells are related to each other so they know, for example, where to focus cancer treatment and what to track as that treatment progresses.</p>
<p>The idea of representing large or complex data as a graph is nothing new, but it has taken on more prominence thanks to the rise of social media and those ubiquitous social graphs that map out who’s connected to whom. As we highlighted recently, however, <a href="http://gigaom.com/2013/05/14/were-witnessing-the-rise-of-the-graph-in-big-data/">graph analysis is becoming more popular</a> outside the realm of social networks, and is being applied to problems that are more complex than just figuring out simple relationships within a network. In cases such as medical research, especially, graphs can provide a very effective way of seeing how potentially hundreds of thousands of data points spanning perhaps hundreds of variables are similar to each other.</p>
<p>That’s exactly what the team at Columbia and Stanford has done with a new algorithm that they’ve demonstrated within the realm of mass cytometry. According to <a href="http://newsroom.cumc.columbia.edu/2013/05/20/computational-tool-translates-complex-data-into-simplified-2-dimensional-images/">a press release announcing the research</a> (which is <a href="http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.2594.html">available via paid download</a> at Nature Biotechnology):</p>
<blockquote id="quote-the-method-called-vi"><p>“The method, called viSNE (visual interactive Stochastic Neighbor Embedding), is based on a sophisticated algorithm that translates high-dimensional data (e.g., a dataset that includes many different simultaneous measurements from single cells) into visual representations similar to two-dimensional ‘scatter plots’ ….</p>
<p>“The viSNE software can analyze measurements of dozens of molecular markers. In the two-dimensional maps that result, the distance between points represents the degree of similarity between single cells. The maps can reveal clearly defined groups of cells with distinct behaviors (e.g., drug resistance) even if they are only a tiny fraction of the total population. This should enable the design of ways to physically isolate and study these cell subpopulations in the laboratory.”</p></blockquote>
<p>I assume they say <em>similar</em> to scatter plots because the algorithm is analyzing data across more than two dimensions, although the resulting chart is essentially the same (i.e., data points with similar characteristics will form clusters).</p>
<div id="attachment_647346" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/screen-shot-2013-05-20-at-9-42-09-am.png"><img alt="The results of viSNE, showing cell densities in diagnosis and relapse samples." src="http://gigaom2.files.wordpress.com/2013/05/screen-shot-2013-05-20-at-9-42-09-am.png?w=708&#038;h=403" width="708" height="403" class="size-large wp-image-647346"></a><p class="wp-caption-text">The results of viSNE, showing cell densities in diagnosis and relapse samples.</p></div>
<p>Whether or not they’re technically similar, this research <a href="http://gigaom.com/2013/01/16/has-ayasdi-turned-machine-learning-into-a-magic-bullet/">seems similar to what Ayasdi is doing</a> with its new data-analysis software based on a technique called topological data analysis. In both cases, though, the algorithms aren’t necessarily concerned with how data points interact with one another (like in network graphs), but rather what similar characteristics the points share. Ayasdi’s software has been used in cancer research, too, including on datasets spanning hundreds of patients and tens of thousands of variables.</p>
<p>In theory — although not likely in practice considering the complexity of the datasets medical researchers are dealing with — these approaches are similar to clustering approaches that are also popular among data scientists working with web companies. In areas such as e-commerce or <a href="http://gigaom.com/2013/05/05/how-mailchimp-learned-to-treat-data-like-orange-juice-and-rethink-email-in-the-process/">email management</a>, for example, where there isn’t a strong social element, companies can broadly break customers into distinct groups based on their behavior or interests.</p>
<div id="attachment_642360" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/marriedknit-tiff.jpg"><img alt="A sample cluster of subscribers." src="http://gigaom2.files.wordpress.com/2013/05/marriedknit-tiff.jpg?w=708&#038;h=427" width="708" height="427" class="size-large wp-image-642360"></a><p class="wp-caption-text">A sample cluster of MailChimp subscribers.</p></div>
<p>Of course, curing cancer is a slightly more compelling — and difficult — goal than targeted advertising. The algorithms have to be precise so as not to miss similarities hidden within the mass of data. In the case of viSNE, the researchers say they’ve been able to spot small groups of cells (like 20 out of tens of thousands) that might be able to survive chemotherapy and increase the likelihood of a recurring tumor.</p>
<p>But we probably shouldn’t be too quick to discount the work that web companies do as somehow less valuable than that of cancer researchers, for example. The big data era arguably started with the web, and web companies have generated some of the most important data-analysis techniques and technologies around today (see, for example, Google’s Jeff Dean, with whom I’ll be speaking at our <a href="http://event.gigaom.com/structure/schedule/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=647256+new-algorithm-maps-cancer-cells-like-nodes-on-a-social-network&amp;utm_content=dharrisstructure">Structure conference</a> next month). As <a href="http://gigaom.com/2012/11/27/why-data-is-the-key-to-better-medicine-and-maybe-a-cure-for-cancer/">medical researchers start generating more and more data</a> via cytometry, genome sequencing and even electronic medical records, it will be critical for individuals in all fields to keep track of what data scientists in other fields are doing and <a href="http://gigaom.com/2013/03/26/how-researchers-are-fighting-lung-cancer-using-pagerank/">figure out how that might apply to their own work</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/20/new-algorithm-maps-cancer-cells-like-nodes-on-a-social-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/screen-shot-2013-05-20-at-9-42-09-am1-e1369079018409.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/screen-shot-2013-05-20-at-9-42-09-am1-e1369079018409.png?w=150" medium="image">
			<media:title type="html">Screen-Shot-2013-05-20-at-9.42.09-AM</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/screen-shot-2013-05-20-at-9-42-09-am.png?w=708" medium="image">
			<media:title type="html">The results of viSNE, showing cell densities in diagnosis and relapse samples.</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/marriedknit-tiff.jpg?w=708" medium="image">
			<media:title type="html">A sample cluster of subscribers.</media:title>
		</media:content>
	</item>
		<item>
		<title>Alteryx raises $12M to make predictive analytics user-friendly</title>
		<link>http://gigaom.com/2013/05/20/alteryx-raises-12m-to-make-predictive-analytics-user-friendly/</link>
		<comments>http://gigaom.com/2013/05/20/alteryx-raises-12m-to-make-predictive-analytics-user-friendly/#comments</comments>
		<pubDate>Mon, 20 May 2013 14:55:03 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Alteryx]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[predictive analytics]]></category>
		<category><![CDATA[statistical analysis]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=647059</guid>
		<description><![CDATA[Analytics provider Alteryx has raised another $12 million as it tries to make statistical analysis a more consumer-friendly experience.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.alteryx.com/">Alteryx</a>, an Irvine, Calif.-based startup trying to be a hybrid of Tableau and statistical analysis software like SAS or R, raised $12 million in an extended Series A round. Newcomer firm Toba Capital led the round, with existing investor SAP Capital also contributing.</p>
<p>President and COO George Mathew says the company&#8217;s mission is to be a one-stop shop for statistical analysis. It wants to be the one place where analysts and data scientists can blend their data, model it and then visualize it. Often, he noted, that same process might require two or three separate products.</p>
<p>Another feature that Alteryx hopes will set it apart is its collection of prebuilt models in what the company calls an analytics gallery. Users can share their own work or find models others have built for tackling similar issues. Alteryx also offers up its own pre-formatted datasets for analysis, often public data <a href="http://www.alteryx.com/module-exchange-details/614">such as the U.S. census</a>.</p>
<p>&#8220;The canvas for creating an analytics application should never be blank for the analyst when they&#8217;re getting started,&#8221; Mathew explained. They often need to understand external data as well as their internal data, so Alteryx&#8217;s software gives them easy access to it.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/gallery.jpg"><img src="http://gigaom2.files.wordpress.com/2013/05/gallery.jpg?w=708&#038;h=392" alt="gallery" width="708" height="392"  class="aligncenter size-large wp-image-647099" /></a></p>
<p>Because it&#8217;s based on the R statistical-programming language, heavy R user Walmart has been able to transition some workloads to Alteryx when employees need an easier user experience. McDonald&#8217;s uses it to analyze data about franchisees and about its growth strategy in China, and Bloomin&#8217; Brands (parent company of Outback Steakhouse and other restaurants) is using it to help build menus that take into account what diners in various parts of the country prefer to eat. Nine of the 10 leading wireless providers are also users, Mathew said, trying to blend actual call data with traditional sources such as customer service data.</p>
<p>Mathew describes Alteryx&#8217;s current growth as analogous to that of software-as-a-service applications like Salesforce.com in the CRM space, or even <a href="http://gigaom.com/2013/05/17/tableau-closes-day-1-as-a-2-9-billion-public-company-up-64-percent/">Tableau in the traditional business-intelligence space</a>. In a business world increasingly driven by at least the idea of big data, one might expect any vendor pushing a more consumer-like purchase and consumption experience to get interest from companies tired of dealing with legacy software or never wanting to experience it in the first place.</p>
<p>&#8220;The disruption that&#8217;s happening is creating a new space for ourselves,&#8221; Mathew said, &#8220;without having to go head to head, frankly, with the status quo out there.&#8221;</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-896311p1.html">Shutterstock user ramcreations</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/20/alteryx-raises-12m-to-make-predictive-analytics-user-friendly/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/shutterstock_114471748.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/shutterstock_114471748.jpg?w=150" medium="image">
			<media:title type="html">analytics</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/gallery.jpg?w=708" medium="image">
			<media:title type="html">gallery</media:title>
		</media:content>
	</item>
		<item>
		<title>Tableau closes Day 1 as a $2.9B public company, up 64 percent</title>
		<link>http://gigaom.com/2013/05/17/tableau-closes-day-1-as-a-2-9-billion-public-company-up-64-percent/</link>
		<comments>http://gigaom.com/2013/05/17/tableau-closes-day-1-as-a-2-9-billion-public-company-up-64-percent/#comments</comments>
		<pubDate>Fri, 17 May 2013 22:59:24 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[ipo]]></category>
		<category><![CDATA[tableau]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=646748</guid>
		<description><![CDATA[Tableau had a successful IPO, closing the trading day up 64 percent and raking in $254 million. CEO Christian Chabot says the company is now set to make itself known around the world.]]></description>
				<content:encoded><![CDATA[<p>Data analytics star Tableau had a successful initial public offering on Friday, <a href="http://data.cnbc.com/quotes/DATA">closing the day up nearly 64 percent</a> at $50.75 per share. That means the company brought in about $254 million (it sold 5 million shares, while stockholders sold 3.4 million) and has a market cap of $2.9 billion. Shares have remained relatively steady in after-hours trading, trending down only slightly.</p>
<p>&#8220;We&#8217;re thrilled,&#8221; Tableau co-founder and CEO Christian Chabot told me during a call after the market closed. One should hope so.</p>
<p>Chabot and his fellow co-founders stand to make a lot of money if today&#8217;s closing price holds up, as does its sole investor NEA. The firm has put $15 million into Tableau since it launched in 2003, and the company has ridden that sum to profitability and more than $127 million in annual revenue.</p>
<p>Here&#8217;s a quick chart (made using Tableau Public) showing who owns how many shares and what they&#8217;re potentially worth.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/tabipo.jpg"><img src="http://gigaom2.files.wordpress.com/2013/05/tabipo.jpg?w=708&#038;h=443" alt="tabipo" width="708" height="443"  class="aligncenter size-large wp-image-646811" /></a></p>
<p>The company didn&#8217;t really need more capital to operate, Chabot said, but one of the primary drivers of the IPO was to raise awareness of the company. It has about 12,000 customers, he said, but there are millions more possible users. As part of attracting them, the company is going to expand globally and is working to improve its reach across mobile devices, the cloud and the Mac operating system.</p>
<p>&#8220;I don&#8217;t believe in this whole &#8216;or&#8217; philosophy with computers,&#8221; Chabot said. &#8220;It&#8217;s &#8216;and&#8217;&#8221; &#8212; meaning people will use desktops and tablets and smartphones.</p>
<p>More prominence and more users singing its praises might also dispel the notion that Tableau is just about visualization. It has some fairly advanced features under the covers (as a commenter <a href="http://gigaom.com/2013/05/16/tableau-prices-its-stock-at-31-per-share-for-fridays-ipo/">to my earlier post</a> about the company&#8217;s influence pointed out), even if they&#8217;re hidden by the relatively simple user experience. </p>
<p>&#8220;Tableau is not a visualization company, per se, it&#8217;s really an analytics company,&#8221; Chabot said.</p>
<p>However, if the company really wants to expand its reach to everyone who wants to gain knowledge from data &#8212; something Chabot calls a &#8220;timeless human need&#8221; &#8212; <a href="http://gigaom.com/2013/04/07/we-need-a-data-democracy-not-a-benevolent-data-dictatorship/">it might actually need to get simpler</a>. More marketing can let potential business users know about new features like forecasting and data-extraction, but it won&#8217;t make a dentist in Des Moines better at formatting his data.</p>
<p>After raising $254 million in its IPO, though, Tableau is in a good place to do whatever it has to.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/17/tableau-closes-day-1-as-a-2-9-billion-public-company-up-64-percent/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/tabyahoo.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/tabyahoo.png?w=150" medium="image">
			<media:title type="html">tabyahoo</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/tabipo.jpg?w=708" medium="image">
			<media:title type="html">tabipo</media:title>
		</media:content>
	</item>
		<item>
		<title>Database startup Drawn to Scale is closing down</title>
		<link>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/</link>
		<comments>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/#comments</comments>
		<pubDate>Fri, 17 May 2013 21:24:03 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Drawn to Scale]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[SQL on Hadoop]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=646718</guid>
		<description><![CDATA[Database startup Drawn to Scale is closing down. The company's product, Spire, was one of the first SQL-on-Hadoop technologies.]]></description>
				<content:encoded><![CDATA[<p>Database startup Drawn to Scale, creator of the SQL-on-Hadoop technology called Spire, is closing down. Co-founder and CEO Bradford Stephens officially <a href="http://www.roadtofailure.com/?p=11">announced the closure in a blog post</a> on Friday.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png"><img  alt="spirearchitecture-015-e1361407038325" src="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png?w=300&#038;h=185" width="300" height="185" class="alignleft size-medium wp-image-646740" /></a>The company&#8217;s product, Spire, which provided full SQL support on top of the HBase NoSQL database, was one of the first products to <a href="http://gigaom.com/2012/07/24/how-one-startup-wants-to-inject-hadoop-into-your-sql/">try to blend Hadoop&#8217;s scalability with the robustness and familiarity of SQL</a>. That&#8217;s now <a href="http://gigaom.com/2013/03/05/the-hadoop-ecosystem-the-welcome-elephant-in-the-room-infographic/">an increasingly crowded space</a> (and has grown since that linked graphic was created). In March, Drawn to Scale <a href="http://gigaom.com/2013/03/19/drawn-to-scale-wants-to-solve-your-mongodb-scalability-problems/">expanded its support to MongoDB</a>, as well.</p>
<p>I wasn&#8217;t shocked when Stephens told me the news &#8212; questions about the four-year-old company&#8217;s financial health had been swirling for a while &#8212; but to hear of its financial woes was a bit surprising. His account in the post pretty much echoes what I had heard from others:</p>
<blockquote id="quote-it-seemed-we-had-eve"><p>&#8220;It seemed we had everything going for us — paid customers such as American Express, Orange Telecom, Flurry, and 4 others. Our technology worked brilliantly, we had a big hiring pipeline, and we had great media presence against our competitors who raised 10-100x more cash.&#8221;</p></blockquote>
<p>He added:</p>
<blockquote id="quote-yet-five-days-before2"><p>&#8220;Yet five days before we signed term sheets for a big A round or sold the company, we started getting hit by a series of black swans — and we just didn’t have what we needed to recover. I’ll leave the public detail at that level, but I will say that paying employees’ health insurance out of your meager savings is a powerful incentive to change course.&#8221;</p></blockquote>
<p>Up to this point, the company <a href="http://gigaom.com/2012/03/08/drawn-to-scale-raises-money-to-make-sql-big-data-ready/">had raised $925,000</a> from RTP Ventures, IA Ventures and SK Ventures. There&#8217;s no word yet on what will come of the company&#8217;s intellectual property.</p>
<p>As Stephens &#8212; who&#8217;s now doing an entrepreneur-in-residence gig at Ping Identity and helping out other startups (including popular wardrobe app <a href="http://www.clothapp.com/">Cloth</a>) &#8212; succinctly put it during a phone discussion, &#8220;We just don&#8217;t have the horsepower to keep running the company.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/17/database-startup-drawn-to-scale-is-closing-down/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/dtsdragon.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/dtsdragon.png?w=150" medium="image">
			<media:title type="html">dtsdragon</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/spirearchitecture-015-e1361407038325.png?w=300" medium="image">
			<media:title type="html">spirearchitecture-015-e1361407038325</media:title>
		</media:content>
	</item>
		<item>
		<title>Tableau prices its stock at $31 per share for Friday&#8217;s IPO</title>
		<link>http://gigaom.com/2013/05/16/tableau-prices-its-stock-at-31-per-share-for-fridays-ipo/</link>
		<comments>http://gigaom.com/2013/05/16/tableau-prices-its-stock-at-31-per-share-for-fridays-ipo/#comments</comments>
		<pubDate>Fri, 17 May 2013 00:03:48 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[tableau]]></category>
		<category><![CDATA[Visualization]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=646412</guid>
		<description><![CDATA[Tableau's initial public offering is on Friday, and expectations are high. The company has inspired much of the next-generation analytics space, and how it fares could be telling about just how powerful the data movement is.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.tableausoftware.com/">Tableau Software</a> has priced shares for its initial public offering on Friday at $31. The company is offering up 5 million shares, while stockholders are offering 3.2 million shares. Tableau co-founder and CEO Christian Chabot will ring the opening bell on the New York Stock Exchange, where the company will list under the symbol &#8220;DATA.&#8221;</p>
<p>That&#8217;s an apt ticker symbol for a company that is in some ways a bellwether for the current fascination with all things data. Tableau isn&#8217;t a big data company, per se, but its visualization software breathes life into many big data calculations. Its <a href="http://gigaom.com/2013/04/07/we-need-a-data-democracy-not-a-benevolent-data-dictatorship/">focus on making software that&#8217;s easy to use</a> and that creates visually captivating charts has turned people from numerous professions into amateur data analysts. (I&#8217;ve even used it in the past, <a href="http://gigaom.com/2011/10/25/google-shows-the-limits-of-a-free-web/">including for the first time</a> in 2011.)</p>
<div id="attachment_646423" class="wp-caption alignright" style="width: 298px"><a href="http://gigaom2.files.wordpress.com/2013/05/und-leadership-christian-small.jpg"><img  alt="Christian Chabot" src="http://gigaom2.files.wordpress.com/2013/05/und-leadership-christian-small.jpg?w=708"   class="size-full wp-image-646423" /></a><p class="wp-caption-text">Christian Chabot</p></div>
<p>As Chabot <a href="http://gigaom.com/2012/02/23/thanks-to-consumerization-its-ipo-season-in-analytics/">told me during a conversation in 2011</a>, &#8220;In any field of human endeavor &#8230; there are a hundred to a thousand more people who understand the data of that field more than they understand reporting and analytics.&#8221;</p>
<p>Anytime you read about a hot new visualization or analytics startup promising the moon, you&#8217;re also seeing the results of what Tableau has sown in terms of the user experience. Many of those same companies will be quick to tell you how limited Tableau&#8217;s capabilities are. It&#8217;s memory-bound, it doesn&#8217;t have a database, it&#8217;s not available in the cloud (or on the Mac operating system), it can&#8217;t do predictive analytics. All true.</p>
<p>Of course, if it raises the kind of capital it expects to by going public, it can build and buy a lot of those capabilities. If pricing stays flat all day Friday, Tableau stands to make $155 million from its 5 million shares. Previous estimates <a href="http://www.forbes.com/sites/tomiogeron/2013/05/16/tableau-software-raises-ipo-price-range/">had Tableau&#8217;s market cap at around $1.7 billion</a> at a price of $29 per share (the company&#8217;s S-1 filing <a href="http://edgar.sec.gov/Archives/edgar/data/1303652/000119312513138700/d469057ds1.htm#rom469057_17">is available here</a>).</p>
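<p>The arithmetic behind those figures can be sketched in a few lines. This is a rough, back-of-the-envelope illustration using only the numbers quoted above. The implied share count is an assumption derived from the earlier $1.7 billion estimate at $29 per share, not a figure taken from the S-1, and the proceeds are gross, before any fees.</p>

```python
# Back-of-the-envelope IPO arithmetic using the figures quoted in the article.
IPO_PRICE = 31.00              # dollars per share
COMPANY_SHARES = 5_000_000     # shares offered by Tableau itself
HOLDER_SHARES = 3_200_000      # shares offered by existing stockholders


def gross_proceeds(shares: int, price: float) -> float:
    """Gross proceeds before underwriting fees and expenses."""
    return shares * price


company_raise = gross_proceeds(COMPANY_SHARES, IPO_PRICE)
holder_raise = gross_proceeds(HOLDER_SHARES, IPO_PRICE)

# Assumption: the earlier $1.7 billion estimate at $29 implies a share count,
# which we reuse to re-price the company at the final $31 offer price.
implied_shares = 1_700_000_000 / 29.00
implied_cap_at_ipo_price = implied_shares * IPO_PRICE

print(f"Company proceeds:     ${company_raise / 1e6:.1f} million")
print(f"Stockholder proceeds: ${holder_raise / 1e6:.1f} million")
print(f"Implied market cap:   ${implied_cap_at_ipo_price / 1e9:.2f} billion")
```

<p>Only the first figure goes to Tableau. The proceeds from the 3.2 million stockholder shares go to the selling stockholders.</p>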
<p>If investors have really bought into the company and the concept of a data-driven world, then who knows. Machine-data expert Splunk went public in 2012, flying the big data banner, and <a href="http://gigaom.com/2012/04/19/splunk-ipo-kills-lives-up-to-expectations/">saw shares peak at 91 percent above</a> its original asking price of $17.</p>
<p>I&#8217;m not suggesting Tableau is the biggest name in data, or even that it will someday become that. This next-generation analytics field is very young, with startups and larger vendors alike often competing among themselves to win wholly new accounts rather than trying to displace legacy vendors within large enterprises. And every month, it seems, <a href="http://gigaom.com/2013/05/13/visualization-is-the-future-6-startups-re-imagining-how-we-consume-data/">I come across some new startup</a> that was built with the same principles in mind as Tableau, but with the advantage of having today&#8217;s best practices baked into its software.</p>
<p>But Tableau definitely commands a lot of the mindshare. How it fares as a public company <a href="http://gigaom.com/2013/04/03/a-tableau-ipo-could-validate-the-big-data-visualization-push-or-not/">could be a strong indicator</a> of just how powerful the data movement is, and how well it capitalizes on a new influx of cash will determine how long it stays on the top of customers&#8217; minds.</p>
<p><em>This post was updated at 7:01 p.m. to include previous estimates of the company&#8217;s market capitalization and a link to its S-1 filing.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/16/tableau-prices-its-stock-at-31-per-share-for-fridays-ipo/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/products_desktop.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/products_desktop.jpg?w=150" medium="image">
			<media:title type="html">products_desktop</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/und-leadership-christian-small.jpg" medium="image">
			<media:title type="html">Christian Chabot</media:title>
		</media:content>
	</item>
		<item>
		<title>This is why big data is the sweet spot for SaaS</title>
		<link>http://gigaom.com/2013/05/14/this-is-why-big-data-is-the-sweet-spot-for-saas/</link>
		<comments>http://gigaom.com/2013/05/14/this-is-why-big-data-is-the-sweet-spot-for-saas/#comments</comments>
		<pubDate>Wed, 15 May 2013 01:10:22 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[BloomReach]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[saas]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=645189</guid>
		<description><![CDATA[When it comes to using big data technology effectively, there's a lot to like about SaaS. When companies like BloomReach create and analyze massive web-wide data sets, they automate insights that almost no individual company could discover on its own.]]></description>
				<content:encoded><![CDATA[<p>People often ask me where the smart money is in big data. I often tell them that’s a foolish question, because I’m not an investor — but if I were, I’d look to software as a service.</p>
<p>There are two primary reasons why, the first of which is obvious: Companies are tired of managing applications and infrastructure, so something that optimizes a common task using techniques they don’t know on servers they don’t have to manage is probably compelling. It’s called cloud computing.</p>
<p>The other reason is that <a href="http://gigaom.com/2013/04/29/google-research-director-and-ai-expert-peter-norvig-elected-into-aaas/">the <em>big </em>part of big data really is important</a> if you want to get a really clear picture of what’s happening in any given space. While no single end-user company can (or likely would) address search-engine optimization, for example, by building a massive store comprised of data from hundreds or thousands of companies as well as the entire web, a cloud service dedicated to that specific task can.</p>
<p>From <a href="http://gigaom.com/2012/11/28/log-data-startup-sumo-logic-raises-30m/">web security</a> to <a href="http://gigaom.com/2012/06/21/how-collective-intelligence-is-reshaping-systems-management/">systems management</a>, we’re already seeing how centralized data stores provide SaaS companies a broad view into what’s happening that can then be filtered down to serve each individual customer’s specific situation. <a href="http://www.bloomreach.com/">BloomReach</a>, a SaaS startup that helps companies optimize web-page content, is another good example of this principle in action.</p>
<h2 id="how-do-you-say-cotton-maxi-dre">How do <em>you</em> say, “cotton maxi dress”</h2>
<p>Ideally, BloomReach Head of Marketing Joelle Kaufman told me, the company wants to help customers ensure they get found in web searches by making sure they’re not invisible (buried deep down), irrelevant (not saying anything meaningful on their sites) or incompatible (not speaking their consumers’ language). On Tuesday, the company <a href="http://www.bloomreach.com/buzz/media-center-pr/continuous-quality-management/">announced a new feature called Continuous Quality Management</a>, which lets customers continuously monitor their pages to ensure they’re still featuring the right products and the right terminology. It’s the latest addition to a seemingly useful service that’s built atop a big data foundation few — if any — of its customers would ever attempt to build themselves.</p>
<p>BloomReach is able to help companies optimize their sites because it’s constantly crawling the web in order to figure out how everyone else is describing their content, laying out their pages and structuring their links. Running on the Amazon Web Services cloud, BloomReach runs more than 1,000 Hadoop jobs a day that process about 5 terabytes of data and a billion data points about users’ site behavior. With the latter, co-founder and CTO Ashutosh Garg explained, the company is trying to figure out who’s visiting sites, what they’re doing, how long they’re spending there and how they’re related in terms of behavior.</p>
<p>“You need to have the right amount of data and from the right places before we can do anything with it,” he said. “… It’s a massive machine learning problem.”</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/br-stack.png"><img alt="BR stack" src="http://gigaom2.files.wordpress.com/2013/05/br-stack.png?w=708&#038;h=531" width="708" height="531" class="aligncenter size-large wp-image-645359"></a></p>
<p>When you consider all the possible ways something could be described or formatted, the scale of the problem becomes more evident. Simple semantic analysis like associating “desk” and “table” is easy, Garg explained, but what if someone wants a lightweight camera and you only have its exact weight listed without any indication of how it compares to other options? What if people searching for “smartphones” really mean “Android phones,” but you’re top-loading your results with BlackBerry phones and Windows phones?</p>
<p>Another of Garg’s hypotheticals has to do with consumers’ presentation biases. If, for example, they’re looking at a lot of websites that look the same or focus on the same things (e.g., megapixels for digital cameras), they’ll expect to see the same things from every site.</p>
<h2 id="10-nonillion-possibilities-cho">10 nonillion possibilities: Choose 1.</h2>
<p>From a sheer numbers perspective, things get even hairier when you’re trying to determine the relationship between any two pages in order to figure out the best path for links to take. Garg said this is what computer scientists call an <a href="http://en.wikipedia.org/wiki/NP-complete">NP-complete problem</a>, which in practice means the time it takes to compute an exact answer grows much faster than the amount of content you’re analyzing. So, for example, analyzing 40 pages doesn’t take 10 times as long as analyzing 4 pages, but more like 100 times longer.</p>
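<p>One simple way to see where a figure like that comes from is to count page pairs. The sketch below assumes the work is proportional to the number of distinct pairs of pages, which is an illustrative simplification and not a description of BloomReach code. Finding the best overall link structure is harder still than counting pairs.</p>

```python
from math import comb


def page_pairs(n_pages: int) -> int:
    """Number of distinct unordered pairs among n_pages pages."""
    return comb(n_pages, 2)


small = page_pairs(4)     # pairs to compare on a 4-page site
large = page_pairs(40)    # pairs to compare on a 40-page site
growth = large / small    # how much more work 10x the pages creates

print(f"4 pages:  {small} pairs")
print(f"40 pages: {large} pairs")
print(f"10x the pages means {growth:.0f}x the pairs")
```

<p>Ten times the pages yields well over 100 times the pairs under this assumption, which is in line with the rough figure quoted above.</p>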
<p>Actually, BloomReach CEO Raj De Datta gave me another example of this problem <a href="http://gigaom.com/2012/02/22/bloomreach-wants-to-save-your-site-with-big-data/">when we spoke in early 2012</a>. Here’s how I described it then:</p>
<blockquote id="quote-if-a-company-wants-t"><p>[I]f a company wants to display just 1,000 products across 100 pages, De Datta explained, there are 10-to-the-28th-power (10 octillion) possibilities for how to do that. When it comes time to describe those products, there are 10-to-the-30th-power (10 nonillion) possibilities.</p></blockquote>
<p>If a website has a million pages, Garg said, “it will take you longer than the life of the universe to solve that problem.”</p>
<p>Where this type of problem arises, BloomReach turns to <a href="http://en.wikipedia.org/wiki/Monte_Carlo_method">Monte Carlo simulations</a>, a favorite technique of physicists and Wall Street quants. The method involves running lots of simulations over large data sets in order to determine approximate results in a reasonable time frame. (And if all this isn’t enough computer science and cloud infrastructure for you, I suggest attending our <a href="http://event.gigaom.com/structure/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=645189+this-is-why-big-data-is-the-sweet-spot-for-saas&amp;utm_content=dharrisstructure">Structure conference</a> in June, which features a who’s who list of speakers, including Google’s Jeff Dean, Facebook’s Jay Parikh and Netflix’s Adrian Cockcroft.)</p>
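<p>To make the idea concrete, here is a minimal Monte Carlo-style sketch: instead of scoring every possible layout, it scores a random sample of layouts and keeps the best one seen. Everything in it is hypothetical, including the relevance numbers, the crowding penalty and the function names. It is a toy small enough that the exact answer can still be checked by brute force, not a reconstruction of what BloomReach runs.</p>

```python
import itertools
import random

N_PRODUCTS, N_PAGES = 6, 3

rng = random.Random(42)
# Hypothetical relevance of each product to each page (made-up numbers).
RELEVANCE = [[rng.random() for _ in range(N_PAGES)] for _ in range(N_PRODUCTS)]


def score(layout):
    """Total relevance minus a crowding penalty; layout[i] is product i's page."""
    fit = sum(RELEVANCE[prod][page] for prod, page in enumerate(layout))
    crowding = sum(layout.count(page) ** 2 for page in range(N_PAGES))
    return fit - 0.1 * crowding


def monte_carlo_best(n_samples, rng):
    """Score n_samples random layouts and return the best one seen."""
    best_layout, best_score = None, float("-inf")
    for _ in range(n_samples):
        layout = tuple(rng.randrange(N_PAGES) for _ in range(N_PRODUCTS))
        current = score(layout)
        if current > best_score:
            best_layout, best_score = layout, current
    return best_layout, best_score


# Brute force is only feasible because this toy has 3**6 = 729 layouts.
all_layouts = itertools.product(range(N_PAGES), repeat=N_PRODUCTS)
exact_score = max(score(layout) for layout in all_layouts)

approx_layout, approx_score = monte_carlo_best(200, rng)
print(f"exact best score:   {exact_score:.3f}")
print(f"sampled best score: {approx_score:.3f} using 200 of 729 layouts")
```

<p>The sampled answer can never beat the exact one, but it costs a fixed number of evaluations no matter how large the search space grows, which is the trade-off the technique is built around.</p>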
<h2 id="different-queries-different-pa">Different queries, different pages</h2>
<p>Things get even trickier when you’re trying to change the content of web pages in real time as people are searching for things. This isn’t the best method for organic search, where pages need to stay pretty consistent with the indexed versions, but it can be ideal in situations such as paid search and mobile. There are millions of ways to segment buyers, Garg explained, and how accurately you assess their intent and display your content can make all the difference. Whether someone is a new or repeat visitor often matters, as does whether someone is price-conscious (e.g., the query included “cheap”) or perhaps searching for a particular brand.</p>
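<p>A rule-based version of that kind of segmentation might look like the sketch below. The word lists, the function name and the output fields are all hypothetical and chosen only to mirror the three signals mentioned above. A production system would presumably learn such signals from data instead of hard-coding them.</p>

```python
KNOWN_BRANDS = {"nike", "sony", "canon"}       # hypothetical brand list
PRICE_WORDS = {"cheap", "discount", "sale"}    # hypothetical price cues


def segment_visitor(query, is_repeat_visitor):
    """Tag a search query with the three signals described in the text."""
    words = set(query.lower().split())
    brands = sorted(words & KNOWN_BRANDS)
    return {
        "visitor": "repeat" if is_repeat_visitor else "new",
        "price_conscious": bool(words & PRICE_WORDS),
        "brand": brands[0] if brands else None,
    }


print(segment_visitor("cheap Canon camera", is_repeat_visitor=False))
```

<p>Each extra signal multiplies the number of possible segments, which is how the count climbs into the millions.</p>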
<div id="attachment_645358" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/llbean.png"><img alt="Source: BloomReach" src="http://gigaom2.files.wordpress.com/2013/05/llbean.png?w=708&#038;h=531" width="708" height="531" class="size-large wp-image-645358"></a><p class="wp-caption-text">Source: BloomReach</p></div>
<p>Around the holidays, the company actually realized something interesting: The bounce rate on queries for things like “gifts for dad” or “gifts for co-workers” was pretty high, but so was the conversion rate. The time to conversion was relatively fast, as well. It turns out, Garg explained, that people don’t like to overthink certain gifts too much, so if something is presented in a visually appealing manner and is within their price range, they’ll buy.</p>
<p>But creating these types of models involves more than meets the eye. For all the talk about machine learning — and machines do a majority of the work for BloomReach — people also play a critical role. A person might know better than a machine whether something was likely purchased as a gift, Garg explained, or they might spot the offensive content on the T-shirt the machine decided was ideal.</p>
<p>“Humans are really good at creativity, thinking through stuff,” he said.</p>
<p>Smart humans are also good at knowing when they’re overmatched, which is why SaaS is so valuable in the big data era. CMOs could try doing what BloomReach or <a href="http://gigaom.com/2012/04/24/datapop-scores-7m-for-custom-built-ads/">similar companies such as DataPop</a> are doing, or they could pay someone to do it much better. Guess which route the smart ones will take.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-54269p1.html">Shutterstock user Andrea Danti</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/14/this-is-why-big-data-is-the-sweet-spot-for-saas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/shutterstock_119782672.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/shutterstock_119782672.jpg?w=150" medium="image">
			<media:title type="html">collective intelligence</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/br-stack.png?w=708" medium="image">
			<media:title type="html">BR stack</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/llbean.png?w=708" medium="image">
			<media:title type="html">Source: BloomReach</media:title>
		</media:content>
	</item>
		<item>
		<title>We&#8217;re witnessing the rise of the graph in big data</title>
		<link>http://gigaom.com/2013/05/14/were-witnessing-the-rise-of-the-graph-in-big-data/</link>
		<comments>http://gigaom.com/2013/05/14/were-witnessing-the-rise-of-the-graph-in-big-data/#comments</comments>
		<pubDate>Tue, 14 May 2013 14:33:33 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[graph analysis]]></category>
		<category><![CDATA[graph database]]></category>
		<category><![CDATA[GraphLab]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[open source]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=645059</guid>
		<description><![CDATA[Graph databases and graph-processing applications have been popping up all over the place lately, and now they're starting to go commercial. On Tuesday, popular open source project GraphLab joined the ranks of graph startups.]]></description>
				<content:encoded><![CDATA[<p>GraphLab, a popular <a href="http://graphlab.org/">open source project</a> dedicated to graph analysis and machine learning, is trying to capitalize on the excitement around graphs by spinning off a commercial entity, <a href="http://graphlab.com/">GraphLab Inc.</a> GraphLab creator &#8212; and University of Washington machine learning professor &#8212; Carlos Guestrin will lead the new Seattle-based company, which has raised $6.75 million from Madrona Venture Group and NEA.</p>
<p>Graph analysis is among the hottest techniques around for making sense of large datasets, primarily by determining how tightly different data points are related or how similar they are. The term &#8220;graph&#8221; came into the broader lexicon along with social networks, which built social graphs to <a href="http://gigaom.com/2013/03/14/facebook-tweaks-its-algorithms-to-improve-graph-search-comment-search-coming/">assess the relationships among their millions of users</a>, but the technique has much broader uses.</p>
<div id="attachment_645089" class="wp-caption aligncenter" style="width: 677px"><a href="http://gigaom2.files.wordpress.com/2013/05/lnkdmap-1.jpg"><img  alt="My LinkedIn social graph" src="http://gigaom2.files.wordpress.com/2013/05/lnkdmap-1.jpg?w=708"   class="size-full wp-image-645089" /></a><p class="wp-caption-text">My LinkedIn social graph</p></div>
<p>Guestrin said GraphLab&#8217;s algorithms are used in a lot of recommender systems, but he also cites fraud detection in banking networks and intrusion detection in computer networks as potential applications. We&#8217;ve covered graphs as the analytical model of choice for everything <a href="http://gigaom.com/2013/04/22/how-hbase-converted-myspaces-mysql-champion-and-is-driving-hadoop-mainstream/">from content recommendation</a> to <a href="http://gigaom.com/2013/01/22/biotech-startup-syapse-wants-to-be-salesforce-com-for-our-genomes/">tracking lab work in genomics</a>. Really, though &#8212; especially when combined with machine learning &#8212; graph analysis <a href="http://gigaom.com/2013/01/16/has-ayasdi-turned-machine-learning-into-a-magic-bullet/">can be applied to anything</a> where there&#8217;s too much data for a person to possibly analyze the relationships between every point.</p>
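<p>As a rough illustration of the recommender case, the sketch below stores a tiny graph as adjacency sets, measures how related two users are by the overlap of their neighbors (Jaccard similarity), and recommends what the closest neighbor liked. The data and function names are made up for this example, and this is not GraphLab code or its algorithm.</p>

```python
# Toy graph as adjacency sets: each user is linked to the genres they liked.
LIKES = {
    "ana": {"jazz", "blues", "soul"},
    "ben": {"jazz", "blues", "rock"},
    "cat": {"metal", "rock"},
}


def jaccard(a, b):
    """Shared neighbors divided by all neighbors of either node."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(user):
    """Return the other user whose liked items overlap most with this user."""
    others = (u for u in LIKES if u != user)
    return max(others, key=lambda u: jaccard(LIKES[user], LIKES[u]))


def recommend(user):
    """Suggest items the most similar user liked that this user has not."""
    peer = most_similar(user)
    return sorted(LIKES[peer] - LIKES[user])


print(most_similar("ana"), recommend("ana"))
```

<p>With three users the relationships are obvious at a glance. The point of graph tooling is to do the same thing when there are millions of nodes.</p>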
<div id="attachment_601469" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/01/ayasdi-product-image-2-e1358295341371.jpg"><img  alt="One of Ayasdi's graph-like data maps" src="http://gigaom2.files.wordpress.com/2013/01/ayasdi-product-image-2-e1358295341371.jpg?w=708&#038;h=472" width="708" height="472" class="size-large wp-image-601469" /></a><p class="wp-caption-text">One of Ayasdi&#8217;s graph-like data maps</p></div>
<p>Google also famously uses <a href="http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html">a graph-processing system called Pregel</a> as part of PageRank. Although a number of graph databases and other projects have popped up in the past few years, Guestrin said GraphLab is actually a contemporary of Pregel. He and some colleagues at Carnegie Mellon built a small system for their lab about five years ago, then released it into the open-source world with few expectations that it would catch on. Now, he added, Pandora and WalmartLabs are among the project&#8217;s user base.</p>
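<p>PageRank is the standard teaching example for this vertex-centric style of processing, in which every vertex repeatedly sends a share of its score along its outgoing links and then updates itself from what it receives. The sketch below runs that loop on a single machine. It assumes every vertex has at least one outgoing link, and the graph, the damping value and the number of rounds are illustrative choices. It is not Google or GraphLab code.</p>

```python
def pagerank(out_links, damping=0.85, supersteps=30):
    """Vertex-centric PageRank; assumes every vertex has outgoing links."""
    n = len(out_links)
    rank = {v: 1.0 / n for v in out_links}
    for _ in range(supersteps):
        # Message phase: each vertex splits its rank across its out-links.
        inbox = {v: 0.0 for v in out_links}
        for v, targets in out_links.items():
            for t in targets:
                inbox[t] += rank[v] / len(targets)
        # Update phase: each vertex recomputes its rank from its inbox.
        rank = {v: (1 - damping) / n + damping * inbox[v] for v in out_links}
    return rank


LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}   # made-up three-page web
ranks = pagerank(LINKS)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

<p>In a distributed system the two phases run in parallel across many machines, with messages crossing the network between rounds.</p>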
<p>Among those other projects are <a href="http://giraph.apache.org/">Giraph</a> (an open source, Hadoop-based Pregel clone that Facebook uses and contributes to) and the graph database <a href="http://www.neo4j.org/">Neo4j</a> (which also has a commercial arm, <a href="http://gigaom.com/2012/11/02/graph-startup-neo-raises-11m-as-specialized-databases-take-hold/">called Neo Technology</a>), as well as <a href="http://engineering.twitter.com/2012/03/cassovary-big-graph-processing-library.html">Twitter&#8217;s Cassovary</a> and fellow University of Washington project <a href="http://www.cs.washington.edu/node/4217/">Grappa</a>. Guestrin said GraphLab can work with most of them, particularly if they&#8217;re not designed to do machine learning at scale like GraphLab is. Some efforts, he noted, are focused on simply storing data in graph form (e.g., databases) or in providing simple graph analysis.</p>
<p>As for when we&#8217;ll actually see the results of the effort to commercialize GraphLab, Guestrin said it will be a while. Right now, he&#8217;s focused on the next open source release of GraphLab in July. However, the company will begin engaging with commercial users over the next several months to determine what types of features they would expect in commercial graph-analysis software.</p>
<p>The bigger question to come out of all this graph activity, though, is how big a market we&#8217;ll ultimately see for graph analysis or any other specific technique. As companies get more comfortable with big data from a technical standpoint, they&#8217;re getting more interested in the different types of analysis it allows for too. This is evidenced by the <a href="http://gigaom.com/2013/03/07/5-reasons-why-the-future-of-hadoop-is-real-time-relatively-speaking/">quest to make Hadoop support myriad processing frameworks</a> aside from MapReduce.</p>
<p>We already have a handful of commercial graph products on the market &#8212; including an industrial grade one called <a href="http://www.yarcdata.com/">YarcData</a> from supercomputer maker Cray &#8212; but how many will there eventually be? And if graph analysis is all the rage right now, what comes next?</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/14/were-witnessing-the-rise-of-the-graph-in-big-data/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/graphics2-3_final_cartoon.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/graphics2-3_final_cartoon.jpg?w=150" medium="image">
			<media:title type="html">graphics2-3_final_cartoon</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/lnkdmap-1.jpg" medium="image">
			<media:title type="html">My LinkedIn social graph</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/01/ayasdi-product-image-2-e1358295341371.jpg?w=708" medium="image">
			<media:title type="html">One of Ayasdi&#039;s graph-like data maps</media:title>
		</media:content>
	</item>
		<item>
		<title>With Lucky Sort creators on board, Twitter is officially a data company</title>
		<link>http://gigaom.com/2013/05/13/with-lucky-sort-creators-on-board-twitter-is-officially-a-data-company/</link>
		<comments>http://gigaom.com/2013/05/13/with-lucky-sort-creators-on-board-twitter-is-officially-a-data-company/#comments</comments>
		<pubDate>Mon, 13 May 2013 23:09:57 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[lucky-sort]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[real-time data]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[social-data]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=644866</guid>
		<description><![CDATA[With its acquisition of Lucky Sort, Twitter seems to be acknowledging that it's a data company after all. The plan appears to be to build Twitter's equivalent of services such as Google Trends and Google Analytics.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=644866&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>We all kind of knew that Twitter’s path to making money was paved with data, and the announcement on Monday that it’s buying analytics startup Lucky Sort makes it official. Unless I’m totally misreading the writing on the wall, this move is all about giving advertisers — and anyone, in theory — the tools to learn about what people are talking about.</p>
<p>Word that Lucky Sort is shutting down and that <a href="http://luckysort.com/">several of its team are joining Twitter’s revenue engineering department</a> suggests this is exactly what the acquisition aims to accomplish.</p>
<p>As it stands, companies use Twitter as a way to track how people are talking about them and maybe, if they’re really advanced, do some sentiment analysis. If they’re willing to pay a third party, Datasift and Gnip are more than happy to broaden marketers’ views to <a href="http://gigaom.com/2012/11/13/how-to-handle-a-firehose-an-interview-with-datasifts-ceo/">encompass the entirety of Twitter’s data, both real-time and historical</a>. What companies really can’t do, though, is run their own advanced analytics about topics straight from the Twitter platform.</p>
<div id="attachment_644884" class="wp-caption aligncenter" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/big-data.png"><img alt="big-data" src="http://gigaom2.files.wordpress.com/2013/05/big-data.png?w=708&#038;h=375" width="708" height="375" class="size-large wp-image-644884"></a><p class="wp-caption-text">One view of the Lucky Sort dashboard</p></div>
<p>The value proposition from such a product should be obvious at this point. Facebook, Google and Yahoo all collect a lot of data about how people are using their platforms and what topics are trending, and they all <a href="http://gigaom.com/2013/03/20/google-trends-youtube-data/">offer it up via a variety of products</a> targeting marketing types and the public at large. If Twitter wants to be taken seriously as a venue for advertising budgets and a platform for <a href="http://gigaom.com/2012/10/02/why-the-trick-to-twitter-as-a-data-source-is-more-data/">measuring the pulse of the nation</a>, people need to be able to ask questions of its data without relying on an intermediary or the occasional Twitter blog post.</p>
<p>As a journalist, I’d love to have access to this type of tool to track trending topics in real time and spot possible stories as they’re happening. The appeal to marketers should be obvious. As IBM’s Erick Brethenoux <a href="http://gigaom.com/2013/04/22/how-a-star-trek-convention-explains-the-secret-to-selling-more-stuff/">told me recently</a>, “[Marketers] talk a good game about social data. Very few actually leverage it effectively today.”</p>
<p>At Twitter, though, data is a slightly different beast than at other web companies. Twitter’s value lies largely in real-time data: topics can peak, crest and all but vanish within a 48-hour window. This situation has <a href="http://gigaom.com/2012/06/04/twitter-shows-when-we-tweet-and-explains-why-its-search-sucks/">hampered some of Twitter’s efforts</a> to surface optimal search results, and it has spurred the decision to buy companies such as Backtype (for its <a href="http://gigaom.com/2011/08/04/twitter-to-open-source-hadoop-like-tool/">stream-processing Storm technology</a>) and <a href="http://previously.ubalo.com">parallel-processing startup Ubalo</a>.</p>
<p>The latter move, <a href="https://ubalo.com/">which happened last week</a>, should help Twitter’s development team create new features without worrying about the intricacies of making them run — and run fast — across a cluster of machines. (You can learn a lot more about how companies such as Google, Facebook and Box are rethinking infrastructure to handle their unique data needs at our <a href="http://event.gigaom.com/structure/schedule/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=644866+with-lucky-sort-creators-on-board-twitter-is-officially-a-data-company&amp;utm_content=dharrisstructure">Structure conference</a> next month in San Francisco.)</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/13/with-lucky-sort-creators-on-board-twitter-is-officially-a-data-company/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/big-data.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/big-data.png?w=150" medium="image">
			<media:title type="html">big-data</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/big-data.png?w=708" medium="image">
			<media:title type="html">big-data</media:title>
		</media:content>
	</item>
		<item>
		<title>Visualization is the future: 6 startups re-imagining how we consume data</title>
		<link>http://gigaom.com/2013/05/13/visualization-is-the-future-6-startups-re-imagining-how-we-consume-data/</link>
		<comments>http://gigaom.com/2013/05/13/visualization-is-the-future-6-startups-re-imagining-how-we-consume-data/#comments</comments>
		<pubDate>Mon, 13 May 2013 18:20:25 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[Ayasdi]]></category>
		<category><![CDATA[BeyondCore]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[ClearStory]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data democratization]]></category>
		<category><![CDATA[Datahero]]></category>
		<category><![CDATA[Platfora]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[Zoomdata]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=643727</guid>
		<description><![CDATA[If the big data era is really going to revolutionize our world, visualizations that let more people make sense of data will be critical. Here are six startups trying to change how we interact with and look at our data.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=643727&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Although visualization is hardly the most technologically challenging part of the data-analysis puzzle, it’s arguably the most important.</p>
<p>Storage, databases, query processing and algorithms are all extremely important (heck, visualization is next to nothing without them), but in a data-driven world that is obsessed with insights, they’re just the foundational layers. They are to big data what server and network configurations are to mobile-app development on <a href="http://gigaom.com/2013/04/25/facebook-acquires-mobile-development-platform-parse/">platforms like Parse</a>. If you’re going to find out new things from massive and highly complex data sets, or <a href="http://gigaom.com/2013/04/07/we-need-a-data-democracy-not-a-benevolent-data-dictatorship/">going to give new types of people the ability to analyze even simple data</a>, the presentation of that data and the ability to create consumable presentations are critical.</p>
<p>With that in mind, here are six startups I’ve seen trying to fundamentally change the way that data is visualized. Some are highly complex under the covers, some are not and none are perfect, but they’re all doing their part to make us rethink what it means to look at data and make spreadsheets and static charts look like relics. (And this list is by no means exhaustive, so feel free to add your favorite visualization tools in the comments.) We’ll be highlighting data visualization at our design-focused RoadMap conference in San Francisco in November (<a href="http://event.gigaom.com/gigaomroadmap/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=643727+visualization-is-the-future-6-startups-re-imagining-how-we-consume-data&amp;utm_content=dharrisstructure">sign up here</a> to get first access to tickets this summer).</p>
<h2 id="ayasdi">Ayasdi</h2>
<p>The idea of network graphs isn’t new, but <a href="http://ayasdi.com/">Ayasdi’s</a> approach to it is. Under the covers, there’s an HBase data store, a technique called <del>topographical</del> topological data analysis and hundreds of machine learning algorithms to churn through complex data sets and determine the similarity among the data points. To the end user, though, <a href="http://gigaom.com/2013/01/16/has-ayasdi-turned-machine-learning-into-a-magic-bullet/">there’s a map of the data set that looks a lot like a network graph</a> (only it’s probably not network data) highlighting clusters of related data points that analysts might want to investigate further.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/tcga.png"><img alt="tcga" src="http://gigaom2.files.wordpress.com/2013/05/tcga.png?w=708"   class="aligncenter size-full wp-image-644682"></a></p>
<h2 id="beyondcore">BeyondCore</h2>
<p><a href="http://beyondcore.com/">BeyondCore</a> actually operates under the same basic premise as Ayasdi — show users the significant correlations so they don’t have to think of the queries that will uncover them — but it uses some different techniques to get there. It uses a different visualization method, too: BeyondCore sticks to standard charts, but actually offers the option of <a href="http://gigaom.com/2012/11/20/a-startup-asks-what-if-you-didnt-have-to-analyze-data-at-all/">having an avatar talk users through the correlations</a> the software has discovered.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/animatedbriefing.jpg"><img alt="animatedbriefing" src="http://gigaom2.files.wordpress.com/2013/05/animatedbriefing.jpg?w=708"   class="aligncenter size-full wp-image-644685"></a></p>
<h2 id="clearstory">ClearStory</h2>
<p><a href="http://www.clearstorydata.com/">ClearStory</a> has a pretty unique product in the works, even if it’s keeping many details and all of its screenshots under lock and key until it formally launches. Essentially, though, <a href="http://gigaom.com/2012/12/05/clearstory-data-raises-9m-and-might-actually-make-data-your-friend/">it’s trying to tell stories via visualizations</a> that display mashups of numerous data sources, update automatically when the source data changes, and invoke collaboration and social concepts. Here’s Co-founder and CEO Sharmila Mulligan explaining the idea behind ClearStory at Structure: Data in March.</p>
<span class="embed-youtube" style="text-align:center; display: block;"><iframe class="youtube-player" type="text/html" width="604" height="370" src="http://www.youtube.com/embed/O62VVrKD1NE?version=3&amp;rel=1&amp;fs=1&amp;showsearch=0&amp;showinfo=1&amp;iv_load_policy=1&amp;wmode=transparent" frameborder="0"></iframe></span>
<h2 id="datahero">Datahero</h2>
<p>Unlike so many data startups, <a href="http://www.datahero.com/">Datahero</a> isn’t trying to woo people fed up with business-intelligence software or the difficulties of getting insights from Hadoop data. Rather, it’s <a href="http://gigaom.com/2013/04/23/visualization-startup-datahero-opens-its-doors-and-delivers-data-analysis-for-the-masses/">trying to let people with simple business or personal data make simple charts</a> without ever having to enter an Excel function or worry too much about how their spreadsheets are formatted. Early on, Datahero’s visualizations are still pretty commonplace (bars, pies, plots, etc.), but it’s the ease of creating them that’s so unique.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/dh-10-e1366704037117.jpg"><img alt="dh-10-e1366704037117" src="http://gigaom2.files.wordpress.com/2013/05/dh-10-e1366704037117.jpg?w=708&#038;h=402" width="708" height="402" class="aligncenter size-full wp-image-644697"></a></p>
<h2 id="platfora">Platfora</h2>
<p><a href="http://platfora.com/">Platfora</a> has undertaken the ambitious task of trying to make analyzing mountains of data stored in Hadoop clusters as easy as analyzing their own <a href="https://stripe.com/">Stripe</a> data might be for developers using Datahero. It’s <a href="http://gigaom.com/2012/10/23/platfora-shows-a-whole-new-way-to-do-business-intelligence-on-big-data/">based on a foundation of Hadoop and massively parallel query processing</a>, but is presented like an HTML5 version of <a href="http://gigaom.com/2013/04/03/a-tableau-ipo-could-validate-the-big-data-visualization-push-or-not/">current visualization golden boy Tableau</a> that’s all about dragging, dropping, and visually slicing and dicing through data. The latter capability is actually critical in a big data world where there are likely more data points than you can ever digest at once.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/explore_slide_4.jpg"><img alt="explore_slide_4" src="http://gigaom2.files.wordpress.com/2013/05/explore_slide_4.jpg?w=708&#038;h=375" width="708" height="375" class="aligncenter size-large wp-image-644705"></a></p>
<h2 id="zoomdata">Zoomdata</h2>
<p><a href="http://www.zoomdata.com/">Zoomdata</a> is far from the only analytics company to support mobile devices, but it’s one of the few I know of (<a href="http://www.roambi.com/analytics-overview.html">Roambi</a> also comes to mind) designed primarily for them. Zoomdata connects to standard business data sources, but takes advantage of touch screens and the D3.js visualization project to offer up some visually interesting charts that are <a href="http://gigaom.com/2012/11/13/heres-how-it-looks-when-big-data-goes-mobile-first/">designed to be manipulated like an artist’s palette</a>.</p>
<p><a href="http://gigaom2.files.wordpress.com/2013/05/ticketstatus_101812.jpg"><img alt="ticketstatus_101812" src="http://gigaom2.files.wordpress.com/2013/05/ticketstatus_101812.jpg?w=708&#038;h=531" width="708" height="531" class="aligncenter size-full wp-image-644709"></a></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/13/visualization-is-the-future-6-startups-re-imagining-how-we-consume-data/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/tcga.png?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/tcga.png?w=150" medium="image">
			<media:title type="html">tcga</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/tcga.png" medium="image">
			<media:title type="html">tcga</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/animatedbriefing.jpg" medium="image">
			<media:title type="html">animatedbriefing</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/dh-10-e1366704037117.jpg" medium="image">
			<media:title type="html">dh-10-e1366704037117</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/explore_slide_4.jpg?w=708" medium="image">
			<media:title type="html">explore_slide_4</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/ticketstatus_101812.jpg" medium="image">
			<media:title type="html">ticketstatus_101812</media:title>
		</media:content>
	</item>
		<item>
		<title>So UK carriers are selling anonymized customer data? That may not be a bad thing.</title>
		<link>http://gigaom.com/2013/05/13/so-uk-carriers-are-selling-anonymized-customer-data-that-may-not-be-a-bad-thing/</link>
		<comments>http://gigaom.com/2013/05/13/so-uk-carriers-are-selling-anonymized-customer-data-that-may-not-be-a-bad-thing/#comments</comments>
		<pubDate>Mon, 13 May 2013 16:09:20 +0000</pubDate>
		<dc:creator>David Meyer</dc:creator>
				<category><![CDATA[4G]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Customer data]]></category>
		<category><![CDATA[Data Protection]]></category>
		<category><![CDATA[EE]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[Telefonica]]></category>
		<category><![CDATA[Vodafone]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=644594</guid>
		<description><![CDATA[British privacy advocates have reacted with horror to the idea of EE and market research firm Ipsos Mori selling anonymized customer data. On balance, they shouldn't worry so much.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=644594&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>The news that British 4G carrier EE is trying to sell anonymized user data, in league with market research firm Ipsos Mori, has been greeted with wrinkle-nosed <a href="http://www.information-age.com/technology/mobile-and-networking/123457043/ee-and-ipsos-mori-face-privacy-backlash-over-mobile-data-analysis">outrage</a> &#8212; particularly the part about the Metropolitan Police being a potential customer. After all, the UK has just (<a href="http://www.guardian.co.uk/politics/2013/may/08/queens-speech-snoopers-charter">mostly</a>) dodged proposed legislation that would have led to monolithic registers of citizens&#8217; online communications. This is just a <a href="http://www.bbc.co.uk/news/technology-22510792">privatized version</a> of the same thing, right?</p>
<p>The short answer is <em>no</em>. <a href="http://www.thesundaytimes.co.uk/sto/news/uk_news/Society/article1258380.ece"><em>The Sunday Times</em> (paywall alert)</a> may have billed its story as being about the potential sale of 27 million people&#8217;s details to the cops, but the reality is somewhat less alarming. As Ipsos Mori has been forced to <a href="http://www.ipsos-mori.com/newsevents/latestnews/1390/Ipsos-MORI-response-to-the-Sunday-Times.aspx">explain</a> in response to the exposé:</p>
<blockquote id="quote-in-conducting-this-r"><p>&#8220;In conducting this research we only receive anonymized data without any personally identifiable information… We do not have access to any names, personal address information, nor postcodes or phone numbers. We can see the volume of people who have visited a website domain, but we cannot see the detail of individual visits, nor what information is entered on that domain. We only ever report on aggregated groups of 50 or more customers. We will never release any data that in any way allows an individual to be identified.&#8221;</p></blockquote>
<p>So what <em>does</em> this data tell us? According to the original article, it provides insights based on &#8220;gender, age, postcode, websites visited, time of day text is sent [and] location of customer when call is made&#8221;.</p>
<h2 id="reverse-engineering">Reverse engineering</h2>
<p>Now, as we discussed recently, it is <a href="http://gigaom.com/2013/03/25/why-the-collision-of-big-data-and-privacy-will-require-a-new-realpolitik/">easier than you might think to de-anonymize data</a> due to the uniqueness of our personal movement patterns &#8212; as long as you have the will, the datasets and the pieces of identifying information that can be correlated with the anonymized individuals effectively described in those datasets. So those horrified reactions to the weekend&#8217;s revelations are not entirely groundless. They are over-the-top, though.</p>
<p>There is a significant difference between a register of communications (who contacted whom and when) and a pool of anonymized data where the most fine-grained nugget of information that <em>might</em> be reverse-engineered would tell you that Person X visited the Gmail domain while within a 100-meter radius of the corner of Oxford Street and Tottenham Court Road. To assume equivalence between the two ideas is to ignore the elements of intent, will, data-crunching capacity and, frankly, competence. In short, there are far easier ways for the police to track individuals through their handsets, such as just going to the carrier and demanding to do so.</p>
<p>(<em>The Sunday Times</em> said sources claimed &#8220;officers had been enthusiastic about the potential for tracking users of pay-as-you-go phones,&#8221; but, quality of sources notwithstanding, I suspect those officers may have been slightly overestimating their own data-crunching powers. They may have also overlooked the fact that the operators would have no idea of their pay-as-you-go users&#8217; age or gender, making it near-impossible to tease out an individual from the anonymized mass. Either way, they backed off once the story broke.)</p>
<h2 id="not-damning">Not damning</h2>
<p>And then there&#8217;s the matter of this data&#8217;s innocent utility. Of all the sources of &#8220;big data&#8221; that are both largely untapped and genuinely useful, mobile operators must be among the most potentially fruitful. In societies where everyone is carrying a phone, there can be no better way to establish the density and fluidity of traffic flows and footfall. This data is gold dust, not just for retailers, but also for town planners and councils. It shows us how our cities and roads really work, and it can help us make them more efficient and pleasant to live in or use.</p>
<p>I feel a bit sorry for EE in this particular case. After all, its rivals Telefonica (trading as O2) and Vodafone are also offering up their customer data for analytics purposes – Telefonica&#8217;s <a href="http://dynamicinsights.telefonica.com/view-news/?i=100">&#8220;Dynamic Insights&#8221; program</a> is being carried out in partnership with market research firm GfK, while Voda <a href="http://enterprise.vodafone.com/insight_news/2013-05-10-unleashing-powerful-insights-with-mobile-analytics.jsp">launched its mobile analytics play</a> just last Friday.</p>
<p>&#8220;Everyone is doing it&#8221; would be a lousy apology in itself, but I don&#8217;t think any of these carriers or their partners are doing anything wrong, <em>as long as their datasets are suitably anonymized</em>. If people could feasibly be personally identified from this data, the carriers and their market research partners would instantly find themselves on the wrong side of existing data protection legislation &#8212; the fines in the UK for this stuff are pretty paltry, but they would also quickly lose the trust of their customers, so there&#8217;s little motivation for the telcos and their partners to cross the line.</p>
<p>It&#8217;s great that people are concerned and watchful about their privacy, and long may they continue to be. However, this is a case where the potential benefits of the data are both great and realistically attainable, and where the downsides are so unfeasible as to be worth discounting, at least at this stage. It&#8217;s now up to the carriers to explain this to their customers in understandable and honest terms.</p>
<p>There will be great battles worth fighting in the war over our personal data and its exploitation. This ain&#8217;t one of them.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/13/so-uk-carriers-are-selling-anonymized-customer-data-that-may-not-be-a-bad-thing/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/05/privacy.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/05/privacy.jpg?w=150" medium="image">
			<media:title type="html">Privacy</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/6599daccfd7e897e68744fe0065e5a2e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">superglaze</media:title>
		</media:content>
	</item>
	</channel>
</rss>