<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: The big data world is operating at 1 percent</title>
	<atom:link href="http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/</link>
	<description></description>
	<lastBuildDate>Mon, 20 May 2013 16:01:29 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jack Rivkin</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1319450</link>
		<dc:creator><![CDATA[Jack Rivkin]]></dc:creator>
		<pubDate>Wed, 13 Mar 2013 00:09:42 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1319450</guid>
		<description><![CDATA[Can&#039;t wait until we are using more than 1% of the data available and accumulating. There is a &quot;law&quot; that applies here: &quot;The number of conclusions reached expands proportionately to the data available and inversely to the time required to reach a conclusion.&quot; What one measures or analyzes affects behavior. It does start with strategy and what one is trying to accomplish. Big Data offers some exciting opportunities, but our history of data analysis says &quot;proceed with caution.&quot; See &quot;What is the Big Deal About Big Data?&quot; for some more thoughts on this: http://bit.ly/YCd7DE]]></description>
		<content:encoded><![CDATA[<p>Can&#8217;t wait until we are using more than 1% of the data available and accumulating. There is a &#8220;law&#8221; that applies here: &#8220;The number of conclusions reached expands proportionately to the data available and inversely to the time required to reach a conclusion.&#8221; What one measures or analyzes affects behavior. It does start with strategy and what one is trying to accomplish. Big Data offers some exciting opportunities, but our history of data analysis says &#8220;proceed with caution.&#8221; See &#8220;What is the Big Deal About Big Data?&#8221; for some more thoughts on this: <a href="http://bit.ly/YCd7DE" rel="nofollow">http://bit.ly/YCd7DE</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: gurjeets</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1319191</link>
		<dc:creator><![CDATA[gurjeets]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 22:33:23 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1319191</guid>
		<description><![CDATA[Thanks to everyone for posting such thoughtful comments. I find it interesting that several of you pointed out the value of queries. I agree that queries are important, but only after the entire data set is algorithmically mapped into a topology that a data scientist, or business user, can begin with. In other words, once the relationships are mapped across the entire dataset, the user can “zoom into” specific areas of the visualization to explore the meaning of shape and color. It’s about starting with a machine-generated view of the data that is unbiased and holistic (using 100% of the data). Thomas is correct when he says that “starting with queries is a dead end” is an ethos that “gets increasingly meaningful as we approach huge datasets and very complex correlated sciences where no one has yet dreamt about a possible hypothesis to start with.” We are in the first inning of the Big Data game and I look forward to what we will all discover when we explore new approaches.]]></description>
		<content:encoded><![CDATA[<p>Thanks to everyone for posting such thoughtful comments. I find it interesting that several of you pointed out the value of queries. I agree that queries are important, but only after the entire data set is algorithmically mapped into a topology that a data scientist, or business user, can begin with. In other words, once the relationships are mapped across the entire dataset, the user can “zoom into” specific areas of the visualization to explore the meaning of shape and color. It’s about starting with a machine-generated view of the data that is unbiased and holistic (using 100% of the data). Thomas is correct when he says that “starting with queries is a dead end” is an ethos that “gets increasingly meaningful as we approach huge datasets and very complex correlated sciences where no one has yet dreamt about a possible hypothesis to start with.” We are in the first inning of the Big Data game and I look forward to what we will all discover when we explore new approaches.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: arjkay</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1319045</link>
		<dc:creator><![CDATA[arjkay]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 14:21:12 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1319045</guid>
		<description><![CDATA[Hear, hear! Insights are harder to derive when ingrained assumptions of causality lead us to look for affirming evidence rather than to understand more fully the interactions or impacts of different event chains.
The problems Gurjeet raises reflect the persistence of linear thinking and heuristic bias. The opportunities that genetic algorithms or self-learning models present parallel the processing that occurs in the human brain, but they require additional automation to make the dynamic feedback or output manageable. I&#039;m all in favor of adding new tools that allow us to reliably reconstruct scenarios and iteratively discover what doesn&#039;t fit the expected patterns. What makes us well may be different from what makes us sick. The timing of data capture, which accounts for why we may be only at 1%, doesn&#039;t have to be a limitation. On the contrary, sampling has proven useful.
The difference between data capture and analysis to date and what the future holds hinges on correctly applying extrapolation methods. Only when life and death are at stake does eliminating the confidence interval in your sample-based prediction warrant inclusion of more data.
In a dynamic world, certainty may be overrated, and leaving some room for doubt will help us keep looking, probing and discovering.]]></description>
		<content:encoded><![CDATA[<p>Hear, hear! Insights are harder to derive when ingrained assumptions of causality lead us to look for affirming evidence rather than to understand more fully the interactions or impacts of different event chains.<br />
The problems Gurjeet raises reflect the persistence of linear thinking and heuristic bias. The opportunities that genetic algorithms or self-learning models present parallel the processing that occurs in the human brain, but they require additional automation to make the dynamic feedback or output manageable. I&#8217;m all in favor of adding new tools that allow us to reliably reconstruct scenarios and iteratively discover what doesn&#8217;t fit the expected patterns. What makes us well may be different from what makes us sick. The timing of data capture, which accounts for why we may be only at 1%, doesn&#8217;t have to be a limitation. On the contrary, sampling has proven useful.<br />
The difference between data capture and analysis to date and what the future holds hinges on correctly applying extrapolation methods. Only when life and death are at stake does eliminating the confidence interval in your sample-based prediction warrant inclusion of more data.<br />
In a dynamic world, certainty may be overrated, and leaving some room for doubt will help us keep looking, probing and discovering.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas Chacko</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318991</link>
		<dc:creator><![CDATA[Thomas Chacko]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 08:59:27 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318991</guid>
		<description><![CDATA[I mostly agree with this, Gurjeet. We have enough historical data available in some form within specific domain datastores. This can be mined by tools from Ayasdi and others to usher in critical breakthroughs in cancer research, energy exploration, drug discovery, financial fraud detection and more. However, most entities, unless they are mandated by regulation or are already doing traditional data mining or piloting a big data project, attach a very high cost to data retention. As a result, most of this valuable data keeps spilling outside the retention window and is lost forever to any future application.

&quot;Starting with queries is a dead end&quot;: this ethos gets increasingly meaningful as we approach huge datasets and very complex correlated sciences where no one has yet dreamt of a possible hypothesis to start with. The caveat is that we must identify and steer clear of &#039;false positive&#039; traps and be able to verify the outcome confidently using multiple approaches.
Nevertheless, it is a very interesting era we are all staring at.]]></description>
		<content:encoded><![CDATA[<p>I mostly agree with this, Gurjeet. We have enough historical data available in some form within specific domain datastores. This can be mined by tools from Ayasdi and others to usher in critical breakthroughs in cancer research, energy exploration, drug discovery, financial fraud detection and more. However, most entities, unless they are mandated by regulation or are already doing traditional data mining or piloting a big data project, attach a very high cost to data retention. As a result, most of this valuable data keeps spilling outside the retention window and is lost forever to any future application.</p>
<p>&#8220;Starting with queries is a dead end&#8221;: this ethos gets increasingly meaningful as we approach huge datasets and very complex correlated sciences where no one has yet dreamt of a possible hypothesis to start with. The caveat is that we must identify and steer clear of &#8216;false positive&#8217; traps and be able to verify the outcome confidently using multiple approaches.<br />
Nevertheless, it is a very interesting era we are all staring at.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nico</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318983</link>
		<dc:creator><![CDATA[Nico]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 05:44:55 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318983</guid>
		<description><![CDATA[The most important aspect of big data is the company strategy and how willing the company is to align itself with the insights from big data. First they must have a strategy and know what they want from the data, and &quot;more sales&quot; is not a valid answer. Then, once they have the results, they must be able to act on them, make decisions and be more flexible in implementing those decisions.]]></description>
		<content:encoded><![CDATA[<p>The most important aspect of big data is the company strategy and how willing the company is to align itself with the insights from big data. First they must have a strategy and know what they want from the data, and &#8220;more sales&#8221; is not a valid answer. Then, once they have the results, they must be able to act on them, make decisions and be more flexible in implementing those decisions.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Schills</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318973</link>
		<dc:creator><![CDATA[Schills]]></dc:creator>
		<pubDate>Mon, 11 Mar 2013 02:58:11 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318973</guid>
		<description><![CDATA[I am not a data scientist by any stretch of the imagination, so forgive me if I abuse the terminology. Queries are not valuable solely for finding positives. Their real value lies in finding negatives, and eliminating those from the data being considered.]]></description>
		<content:encoded><![CDATA[<p>I am not a data scientist by any stretch of the imagination, so forgive me if I abuse the terminology. Queries are not valuable solely for finding positives. Their real value lies in finding negatives, and eliminating those from the data being considered.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cappetit60</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318949</link>
		<dc:creator><![CDATA[cappetit60]]></dc:creator>
		<pubDate>Sun, 10 Mar 2013 23:02:55 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318949</guid>
		<description><![CDATA[Test entry today, first day.]]></description>
		<content:encoded><![CDATA[<p>Test entry today, first day.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Brill</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318936</link>
		<dc:creator><![CDATA[Michael Brill]]></dc:creator>
		<pubDate>Sun, 10 Mar 2013 20:37:17 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318936</guid>
		<description><![CDATA[While I understand the sentiment behind &quot;starting with queries is a dead end,&quot; I think that&#039;s exactly where you need to start. More precisely, getting to user intent is critical. Otherwise, we&#039;ll continue down this path where our fancy analytics just generate a steady stream of false positives and otherwise non-actionable output.

Maybe the trick is creating software that can elicit queries from humans in a structure that is more usable than simple database queries against a fixed schema. This gives machines and data scientists something market-driven to aim for rather than guessing what the market might want... kind of lean principles applied to analytics.]]></description>
		<content:encoded><![CDATA[<p>While I understand the sentiment behind &#8220;starting with queries is a dead end,&#8221; I think that&#8217;s exactly where you need to start. More precisely, getting to user intent is critical. Otherwise, we&#8217;ll continue down this path where our fancy analytics just generate a steady stream of false positives and otherwise non-actionable output.</p>
<p>Maybe the trick is creating software that can elicit queries from humans in a structure that is more usable than simple database queries against a fixed schema. This gives machines and data scientists something market-driven to aim for rather than guessing what the market might want&#8230; kind of lean principles applied to analytics.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark McAndrew</title>
		<link>http://gigaom.com/2013/03/10/the-big-data-world-is-operating-at-1-percent/#comment-1318914</link>
		<dc:creator><![CDATA[Mark McAndrew]]></dc:creator>
		<pubDate>Sun, 10 Mar 2013 17:47:52 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=618443#comment-1318914</guid>
		<description><![CDATA[Hear, hear. 

It&#039;s not just 99% of data that isn&#039;t being used. Same applies to the world&#039;s total computing power. Put them together, it really will change the world: http://www.charityengine.com/news/blogs/charity-engine-mission]]></description>
		<content:encoded><![CDATA[<p>Hear, hear. </p>
<p>It&#8217;s not just 99% of data that isn&#8217;t being used. Same applies to the world&#8217;s total computing power. Put them together, it really will change the world: <a href="http://www.charityengine.com/news/blogs/charity-engine-mission" rel="nofollow">http://www.charityengine.com/news/blogs/charity-engine-mission</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
