<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; big data analytics</title>
	<atom:link href="http://gigaom.com/tag/big-data-analytics/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Thu, 23 May 2013 05:14:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; big data analytics</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>Black box software: a problem for science that extends to big data</title>
		<link>http://gigaom.com/2013/05/16/black-box-software-a-problem-for-science-that-extends-to-big-data-2/</link>
		<comments>http://gigaom.com/2013/05/16/black-box-software-a-problem-for-science-that-extends-to-big-data-2/#comments</comments>
		<pubDate>Thu, 16 May 2013 18:00:44 +0000</pubDate>
		<dc:creator>Amanda Alvarez</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[ecology]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[scientific computing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=646192</guid>
		<description><![CDATA[Blind trust in black box, or click-and-run, software is a growing problem in science, and the concern extends to big data and high performance computing.]]></description>
				<content:encoded><![CDATA[<p>You probably don’t need to know how a calculator makes two plus two equal four, or how your favorite smartphone app works, but the way the background software is implemented can make a big difference to the output. Slight rounding errors or slow load times in these cases might be annoying, but when you scale up to big data modeling, for instance, you might want to take a closer look at the software running your calculations before you click go.</p>
<p>Blind trust in black box, or click-and-run, software is a growing problem in science, according to a <a href="http://www.sciencemag.org/lookup/doi/10.1126/science.1231535">commentary published Thursday in the journal <i>Science</i></a>, and the concern extends beyond formal research to other domains that use high performance computing.</p>
<p>The researchers who addressed the “troubling trend in scientific software use” were motivated by a growing unease that the abundance of powerful software is letting scientists derive answers without a thorough understanding of what the software is doing. Software snafus have been responsible for some high-profile <a href="http://www.ligo-wa.caltech.edu/~michael.landry/calibration/S5/getsignright.pdf">data misinterpretations and retractions</a>.</p>
<p>This wouldn’t normally cause a blip on the average citizen’s radar, but now a lot of these scientific conclusions have real-world implications, from climate modeling and weather forecasting to high volume financial trading. In any domain using big data, misplaced trust in the power of software can be problematic, particularly when the decision makers don’t know what the software they are using is doing, said lead author Lucas Joppa of Microsoft Research.</p>
<p>So what does ecology have to do with any of this? Joppa is an ecologist by training, and works on computational techniques in that field that may also have applications for big data more broadly. He and his colleagues surveyed scientists in a sub-field of ecology &#8212; species distribution modeling (SDM) &#8212; to find out how they choose software and how well they understand its inner workings.</p>
<p>“Lots of SDM techniques are only available as computational methods, but there is a lot of discourse going on in the literature about whether the methods themselves are correct,” said Joppa. Scientists use SDM to forecast where plants and animals will be in the future given current numbers, known habitats, and climate change. It’s a niche area of research, but the disquieting survey results should be noted in any domain where forecasting is done by plugging data into software.</p>
<p>Only 8 percent of the more than 400 scientists who responded had validated their modeling software against other methods. “The number speaks for itself,” said Joppa. “The real crux of the problem is the results from software being published in a peer-reviewed journal, versus the software itself having been peer-reviewed,” which is rare. Software packages, whether proprietary or not, are often black box systems that can’t be opened and inspected. Even if you can get under the proverbial hood, like with open source software, said Joppa, most people will still have no idea what they are looking at, or how to judge its quality.</p>
<p><img  alt="catch 22" src="http://gigaom2.files.wordpress.com/2013/05/91201888.jpg?w=347&#038;h=231" width="347" height="231" class="alignleft" /></p>
<p>To top it all off, having confidence in what your software is doing results in a massive computational catch-22: how do you know the software is giving you the right answer, if you can’t get the answer without running the software? The level of confusion over what algorithms are doing in the SDM field is illustrated by a debate over <a href="http://methodsblog.wordpress.com/2013/02/20/some-big-news-about-maxent/">which of two statistical techniques is superior</a>. It turns out, Joppa explained, that the two techniques were mathematically equivalent, but the ways they were implemented in software resulted in big predictive differences.</p>
<p>This sort of mix-up isn’t surprising given the messy nature of software development (if you can even call it that) in research environments. Joppa lauded efforts like Software Carpentry that teach scientists basic software fundamentals for better programming, and said the days of getting a doctorate by merely pushing a button are over.</p>
<p>“Scientists themselves can learn a bare minimum of software engineering,” said Joppa. On the flip side, he said computer science students should have more exposure to scientific methods. “People with traditional software engineering training become uncomfortable with the way scientists want to work with software, where the design and specs are constantly changing. The way that scientific software is built is fundamentally different from consumer apps.”</p>
<p>Developers of scientific software, like MathWorks or SAS, may want to watch this space. If Joppa’s suggestions are implemented, journals may start requiring that even proprietary software be opened up for inspection and peer review. Nearly half of the surveyed ecologists report using the free statistical language R as their primary software, so maybe there is hope yet, both for open, inspectable code and for computational science becoming more accessible while yielding trustworthy, high-impact results.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/16/black-box-software-a-problem-for-science-that-extends-to-big-data-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/146799217.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/146799217.jpg?w=150" medium="image">
			<media:title type="html">black box</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/e37323b74d1f383817d82c9f906b7bcf?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">neuroamanda</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/91201888.jpg?w=708" medium="image">
			<media:title type="html">catch 22</media:title>
		</media:content>
	</item>
		<item>
		<title>Sector RoadMap: Social customer service in 2013</title>
		<link>http://pro.gigaom.com/report/sector-roadmap-social-customer-service-in-2013/</link>
		<comments>http://pro.gigaom.com/report/sector-roadmap-social-customer-service-in-2013/#comments</comments>
		<pubDate>Tue, 23 Apr 2013 06:55:38 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/laurastuart/" rel="author">Laura Stuart</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[business-to-business]]></category>
		<category><![CDATA[cellular telephone]]></category>
		<category><![CDATA[chatter]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[cloud technology]]></category>
		<category><![CDATA[cloud-applications]]></category>
		<category><![CDATA[cloud-based social media]]></category>
		<category><![CDATA[Cloud-based solutions]]></category>
		<category><![CDATA[Collaboration Software]]></category>
		<category><![CDATA[Collective Intellect]]></category>
		<category><![CDATA[CRM]]></category>
		<category><![CDATA[Customer experience]]></category>
		<category><![CDATA[Customer experience management]]></category>
		<category><![CDATA[Customer relationship management]]></category>
		<category><![CDATA[Customer service applications]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[data-analytics]]></category>
		<category><![CDATA[database processing]]></category>
		<category><![CDATA[database.com]]></category>
		<category><![CDATA[disruption vectors]]></category>
		<category><![CDATA[ECRM]]></category>
		<category><![CDATA[Electronic commerce]]></category>
		<category><![CDATA[enterprise customer service applications]]></category>
		<category><![CDATA[enterprise IT]]></category>
		<category><![CDATA[enterprise-applications]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[IBM cloud computing]]></category>
		<category><![CDATA[integration]]></category>
		<category><![CDATA[internet-based customer service]]></category>
		<category><![CDATA[Kana]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Microsoft Dynamics]]></category>
		<category><![CDATA[microsoft dynamics crm]]></category>
		<category><![CDATA[Microsoft Dynamics CRM solutions]]></category>
		<category><![CDATA[mobile devices]]></category>
		<category><![CDATA[mobility]]></category>
		<category><![CDATA[multilatent clouds]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[NetBase]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle CRM]]></category>
		<category><![CDATA[oracle-corporation]]></category>
		<category><![CDATA[oracle-database]]></category>
		<category><![CDATA[predictive analytics]]></category>
		<category><![CDATA[radian6]]></category>
		<category><![CDATA[rightnow]]></category>
		<category><![CDATA[Rightnow Technologies]]></category>
		<category><![CDATA[Salesforce.com]]></category>
		<category><![CDATA[SAP AG]]></category>
		<category><![CDATA[Siebel]]></category>
		<category><![CDATA[smartphone]]></category>
		<category><![CDATA[smartphones]]></category>
		<category><![CDATA[social and contact-center applications]]></category>
		<category><![CDATA[social business]]></category>
		<category><![CDATA[social customer service]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[social media data]]></category>
		<category><![CDATA[social media sites]]></category>
		<category><![CDATA[social tools]]></category>
		<category><![CDATA[social-communications]]></category>
		<category><![CDATA[social-data]]></category>
		<category><![CDATA[social-media tools]]></category>
		<category><![CDATA[software as a service]]></category>
		<category><![CDATA[Tablet computer]]></category>
		<category><![CDATA[tablets]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[work media tools]]></category>
		<category><![CDATA[Yammer]]></category>
		<category><![CDATA[YouTube]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?post_type=go-report&#038;p=172865/</guid>
		<description><![CDATA[“Social customer service” refers to those services that provide customer support via social media channels. Providing such services is no longer merely a niche or specialty sideline. Challengers, or disruptors who were early with the new technology, are working to expand and integrate their offerings into enterprise systems and processes.]]></description>
				<content:encoded><![CDATA[<p>“Social customer service” refers to those services that provide customer support via social media channels. Providing such services is no longer merely a niche or specialty sideline. Challengers, or disruptors who were early with the new technology, are working to expand and integrate their offerings into enterprise systems and processes.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/report/sector-roadmap-social-customer-service-in-2013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://pro.gigaom.com/wp-content/uploads/2013/04/cashregister.jpg?w=150" />
		<media:content url="http://pro.gigaom.com/wp-content/uploads/2013/04/cashregister.jpg?w=150" medium="image">
			<media:title type="html">cashregister</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>For big data achievements, IT and analysts need to work together</title>
		<link>http://gigaom.com/2013/03/20/for-big-data-achievements-it-and-analysts-need-to-work-together/</link>
		<comments>http://gigaom.com/2013/03/20/for-big-data-achievements-it-and-analysts-need-to-work-together/#comments</comments>
		<pubDate>Wed, 20 Mar 2013 19:21:27 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[financial services]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Structure Data 2013]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=622502</guid>
		<description><![CDATA[Representatives at IBM and the New York Stock Exchange laid out a schematic for doing big data analytics and showed how it can work in practice.]]></description>
				<content:encoded><![CDATA[<p>One trend emerging throughout <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=622502+for-big-data-achievements-it-and-analysts-need-to-work-together&amp;utm_content=gigajordan">GigaOM’s Structure:Data conference</a> today is the collaboration between man and machine to solve big-data problems. Speaking with Phil Francisco, vice president of product management for big data at IBM, and Emile Werr, head of enterprise data architecture at the New York Stock Exchange, my colleague Barb Darrow spent a session Wednesday explaining how people — a company’s IT experts and business experts — sometimes need to work in different ways to achieve the same business goals.</p>
<p>Developers need to build systems for crossing many data sets from legacy data warehouses as well as Hadoop clusters, and to provide options for visualizing trends that might not otherwise be obvious, Francisco said. That’s when business experts come into play, asking questions and deriving insights that could lead to new strategies and campaigns.</p>
<p>How does that work in practice? Facing greater volumes of data, the NYSE has trained business analysts as “data architects” to develop a system with IBM products for capacity planning and spotting patterns to detect fraud in billions of transactions each day, Werr said. Analysts also need to be able to figure out if a possible fraud case is a false positive. Those are early-stage use cases for analyzing data in near-real time.</p>
<p>For now, financial deployments tend to play out on premise. Werr pointed out places where public clouds make sense: developers can test out new data architectures for data sets, and the cost advantage of running at production scale on Infrastructure as a Service (IaaS) such as Amazon Web Services is appealing. At least for now, though, bandwidth across multiple data centers is an issue, he said.</p>
<p>Check out <a href="http://gigaom.com/2013/03/20/structuredata-2013-live-coverage/">the rest of our Structure:Data 2013 live coverage here</a>, and a video embed of the session follows below.</p>
<p><iframe src="http://new.livestream.com/accounts/74987/events/1927733/videos/14312357/player?autoPlay=false&amp;height=360&amp;mute=false&amp;width=640" height="360" width="640" frameborder="0" scrolling="no"></iframe><br>
A transcription of the video follows on the next page.</p>
<p><a href="http://gigaom.com/2013/03/20/for-big-data-achievements-it-and-analysts-need-to-work-together/2/">Go to page 2 (of 2) on GigaOM.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/20/for-big-data-achievements-it-and-analysts-need-to-work-together/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/h2nnsc_3zg8sou8aojou25alpipcbzi3j09y4zptwd8.jpeg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/h2nnsc_3zg8sou8aojou25alpipcbzi3j09y4zptwd8.jpeg?w=150" medium="image">
			<media:title type="html">Emile Werr NYSE Structure Data 2013</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>For big data analytics, recall the tried and true old-school rules</title>
		<link>http://gigaom.com/2013/03/08/for-big-data-analytics-recall-the-tried-and-true-old-school-rules/</link>
		<comments>http://gigaom.com/2013/03/08/for-big-data-analytics-recall-the-tried-and-true-old-school-rules/#comments</comments>
		<pubDate>Fri, 08 Mar 2013 18:27:21 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Structure Data 2013]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=618509</guid>
		<description><![CDATA[As companies implement big data analytics strategies, they ought to consider some of the best practices in place before the rise of the term "big data."]]></description>
				<content:encoded><![CDATA[<p>Data analysis didn’t start with <a href="http://gigaom.com/2013/03/08/hadoop-through-the-years-a-gigaom-retrospective">Hadoop</a>. Companies have been working with data to get insights for decades. While technology has changed, some of the rules from the past still apply, or ought to, as data gets bigger and bigger.</p>
<p>Jack Rivkin, an occasional blogger with deep investment experience, recently <a href="http://blog.contracarbon.com/2013/02/18/what-is-the-big-deal-about-big-data/">shared some of the best practices</a> he was exposed to early in his career working on economic forecasts. He offered some sage suggestions for enterprises to bear in mind as they consider and implement big data strategies. Among his insights:</p>
<ul><li>Forecasting models can only be as good as the data inputs.</li>
<li>Be skeptical and hedge when sharing the models by noting factors that could lead to different results.</li>
<li>The less time it takes to process data, the more valuable it is.</li>
<li>Constantly improve models and inputs.</li>
</ul><p>Of course, big data isn’t wholly evolutionary; it does bring its own all-new opportunities and risks. Some of the world’s leading data scientists, IT executives and business users will address them at <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=618509+for-big-data-analytics-recall-the-tried-and-true-old-school-rules&amp;utm_content=gigajordan">GigaOM’s Structure:Data conference</a> in New York on March 20-21.</p>
<p><em>Feature image courtesy of <a href="http://www.flickr.com/photos/75279887@N05/6914441342/">Flickr user luckey_sun</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/03/08/for-big-data-analytics-recall-the-tried-and-true-old-school-rules/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/03/luckey_sun-data.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/03/luckey_sun-data.jpg?w=150" medium="image">
			<media:title type="html">Luckey_sun data</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>The Guardian&#8217;s data journalism is cool, but it can take three weeks to make</title>
		<link>http://gigaom.com/2013/02/26/the-guardians-data-journalism-is-cool-but-it-takes-three-months-to-make/</link>
		<comments>http://gigaom.com/2013/02/26/the-guardians-data-journalism-is-cool-but-it-takes-three-months-to-make/#comments</comments>
		<pubDate>Wed, 27 Feb 2013 01:17:05 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[Data journalism]]></category>
		<category><![CDATA[The Guardian]]></category>
		<category><![CDATA[Visualization]]></category>
		<category><![CDATA[Zoomdata]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=614838</guid>
		<description><![CDATA[Media outlets such as the Guardian take a long time to produce data-backed reports and visualizations, while big data analytics apps move fast but lack a human touch. Is there a happy medium?]]></description>
				<content:encoded><![CDATA[<p><em>The headline and body of this story were corrected at 11:10 p.m. with a more accurate description of <a href="https://twitter.com/fcage/status/306590026144284672">the typical period of time</a> for the deployment of Guardian journalist Feilding Cage’s data visualizations. Also, Guardian Datablog Editor Simon Rogers was incorrectly described as Cage’s boss, and that reference has been removed.</em></p>
<p>Once he finds a suitable topic, Feilding Cage, a New York-based developer and journalist for <em>The Guardian</em>, can easily spend three weeks generating the source information and designing a visualization for what’s become known as data journalism. The results bring understanding and reader engagement to topics that are otherwise discussed with a lot of words or static numbers. Readers can and do play around with the information, share it widely and discuss it for long periods after it appears online.</p>
<div id="attachment_614844" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/02/screen-shot-2013-02-26-at-4-32-07-pm.png"><img src="http://gigaom2.files.wordpress.com/2013/02/screen-shot-2013-02-26-at-4-32-07-pm.png?w=300&#038;h=196" alt="The Guardian's interactive guide to gay rights in the United States" width="300" height="196" class="size-medium wp-image-614844"></a><p class="wp-caption-text"><em>The Guardian’s</em> interactive guide to gay rights in the United States</p></div><p>Cage is one of a handful of <em>Guardian</em> journalists who generate reports that say new things about topics that <a href="http://www.guardian.co.uk/world/interactive/2012/may/08/gay-rights-united-states">pop up</a> <a href="http://www.guardian.co.uk/sport/interactive/2012/jul/31/london-2012-most-popular-athletes">in the news</a> or are just <a href="http://www.guardian.co.uk/news/datablog/2010/jul/16/doctor-who-villains-list">plain old interesting</a>. Cage and Simon Rogers, editor of <em>The Guardian</em> <a href="http://www.guardian.co.uk/news/datablog">Datablog</a> and <a href="http://www.guardian.co.uk/data">Data Store</a>, spoke about their work at the Strata conference at Santa Clara, Calif., on Tuesday.</p>
<p>Along with <em>The Guardian</em>, a few other news organizations have been putting an emphasis on data-driven reporting and visualizations, apps and even games in the past few years, such as <a href="http://www.chicagotribune.com/news/data/">the Chicago Tribune</a>, <a href="http://datadesk.latimes.com/">the Los Angeles Times</a> and <a href="http://www.propublica.org/tools/">ProPublica</a>. (Check out the <a href="http://datajournalismhandbook.org/">Data Journalism Handbook</a> for more information on this sort of work.)</p>
<p>Data journalism and visualization stand out for the verification and occasional gray-area explanations that journalists provide. Cage, for example, accompanied his <a href="http://www.guardian.co.uk/world/interactive/2012/may/08/gay-rights-united-states">interactive visualization of gay rights in the United States</a> with a <a href="http://www.guardian.co.uk/news/datablog/2012/may/10/data-visualisation-us-gay-rights">blog post</a> explaining his methodology and disclosing his assumptions.</p>
<p></p><div id="attachment_614849" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2013/02/app-store-image-2.png"><img src="http://gigaom2.files.wordpress.com/2013/02/app-store-image-2.png?w=300&#038;h=225" alt="Screenshot from the Zoomdata's big data analytics iPad app" width="300" height="225" class="size-medium wp-image-614849"></a><p class="wp-caption-text">Screenshot from the Zoomdata’s big data analytics iPad app</p></div>It’s certainly one way to say something fresh with data, but it’s time-consuming when you consider big data analytics apps that provide users with real-time information users can compare against Hadoop-processed historical data, such as Zoomdata. (That company, which my colleague Derrick Harris <a href="http://gigaom.com/2012/11/13/heres-how-it-looks-when-big-data-goes-mobile-first/">covered last year</a>, released the beta version of its iPad app on Tuesday.)
<p>It would be neat to find a happy medium for enterprises that want original insights that every employee can see, use and act on but that don’t take three weeks to generate. That’s especially true because the return on investment for work like Cage’s is hard to identify, although it’s possible the content could indirectly generate revenue by driving users to content they have to pay for.</p>
<p>Bridging the gap might be a matter of finding the perfect data scientist for the company. Or it might be a matter of time before the kind of work Cage does is automated. A computer already <a href="http://www.wired.com/gadgetlab/2012/04/can-an-algorithm-write-a-better-news-story-than-a-human-reporter/">can write</a> an earnings story, although it might be a few years before computers put wordsmiths out of business.</p>
<p>Maybe it just doesn’t make sense to cross data journalism visualizations with big data analytics apps. But I, for one, would like to play with such a tool.</p>
<p>Entrepreneurs from companies that work with and make visualizations from big data, such as <a href="http://gigaom.com/2012/03/21/quid-structure-data-2012/">Quid</a>, will speak at the <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=614838+the-guardians-data-journalism-is-cool-but-it-takes-three-months-to-make&amp;utm_content=gigajordan">GigaOM Structure:Data conference</a> on March 20-21 in New York.</p>
<p><em>Disclosure: The Guardian is an investor in Giga Omni Media, which publishes GigaOM.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/26/the-guardians-data-journalism-is-cool-but-it-takes-three-months-to-make/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/02/photo1.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/02/photo1.jpg?w=150" medium="image">
			<media:title type="html">photo</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/02/screen-shot-2013-02-26-at-4-32-07-pm.png?w=300" medium="image">
			<media:title type="html">The Guardian&#039;s interactive guide to gay rights in the United States</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/02/app-store-image-2.png?w=300" medium="image">
			<media:title type="html">Screenshot from the Zoomdata&#039;s big data analytics iPad app</media:title>
		</media:content>
	</item>
		<item>
		<title>DataDirect Networks brings out Hadoop appliance for enterprises</title>
		<link>http://gigaom.com/2013/02/26/datadirect-networks-brings-out-hadoop-appliance-for-enterprises/</link>
		<comments>http://gigaom.com/2013/02/26/datadirect-networks-brings-out-hadoop-appliance-for-enterprises/#comments</comments>
		<pubDate>Tue, 26 Feb 2013 08:01:15 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[DataDirect Networks]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[hardware]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=614077</guid>
		<description><![CDATA[Known for its supercomputing storage capabilities, DataDirect Networks is introducing a box for running Hadoop jobs, and a company executive sees the Hadoop hardware trend continuing.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.ddn.com/en/">DataDirect Networks</a>, a hardware vendor with roots in providing storage for high-performance computing, is introducing a Hadoop appliance for enterprises, adding another notch to the trend of going with hardware for big data deployments. </p>
<p>DataDirect built hScaler to meet the speed and performance needs of its high-performance computing customers while offering ease of use for enterprise customers keen on Hadoop. Speed aside, hScaler stands out because it does away with direct-attached storage and incorporates RAID architecture instead. It lets users scale computing and storage resources independently of one another, precluding the chore of swapping out a server when a disk fails, as my colleague Derrick Harris has <a href="http://gigaom.com/2011/11/07/netapp-does-network-attached-hadoop/">written</a>.</p>
<p>The hScaler appliance, which runs with the Hortonworks Data Platform, can move fast with InfiniBand storage capable of operating at 40 gigabytes per second. In a sample configuration, 504 terabytes of storage are possible in a rack. The rack is four times as dense as a conventional data center rack, requiring less spending for cooling and square footage.</p>
<p>Because they aim to speed up and simplify Hadoop deployments, appliances such as hScaler are catching on, and DataDirect Chief Technology Officer Jean-Luc Chatelain expects the trend to continue. <a href="http://gigaom.com/2011/05/09/emc-hadoop/">Greenplum</a>, <a href="http://gigaom.com/2011/09/29/get-ready-for-oracles-takes-on-hadoop-nosql/">Oracle</a>, <a href="http://gigaom.com/2012/10/17/batten-down-the-analysts-its-a-big-data-bi-storm/">Teradata</a> and other companies sell appliances capable of running Hadoop jobs. For the sake of taking advantage of easy and quick data analytics processing, Chatelain sees the Hadoop hardware trend only getting bigger.</p>
<p>Appliances could be useful for enterprises looking to run Hadoop jobs, as employees can save time and focus more on building applications. Big data veterans will talk about innovative uses of Hadoop and other big data technologies at the <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=614077+datadirect-networks-brings-out-hadoop-appliance-for-enterprises&amp;utm_content=gigajordan">GigaOM Structure:Data conference</a> on March 20-21 in New York.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/26/datadirect-networks-brings-out-hadoop-appliance-for-enterprises/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/02/ddn-hadoop-appliance_-hscaler-hi-res-2-25-131.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/02/ddn-hadoop-appliance_-hscaler-hi-res-2-25-131.jpg?w=150" medium="image">
			<media:title type="html">DDN Hadoop appliance_ hScaler hi res 2.25.13</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>VCs pour money into data startups during 2012</title>
		<link>http://gigaom.com/2013/02/22/vcs-pour-money-into-data-startups-during-2012/</link>
		<comments>http://gigaom.com/2013/02/22/vcs-pour-money-into-data-startups-during-2012/#comments</comments>
		<pubDate>Fri, 22 Feb 2013 20:47:42 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[big-data infrastructure]]></category>
		<category><![CDATA[venture capital]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=613205</guid>
		<description><![CDATA[Venture capitalists made more big data investments than ever before in 2012, and a few more deals have already closed in 2013. Entrepreneurs from several venture-backed data startups will speak at Structure:Data next month.]]></description>
				<content:encoded><![CDATA[<p>Venture capitalists made more big data investments in 2012 than in any previous year, according to a <a href="http://www.cbinsights.com/blog/venture-capital/big-data-report">report</a> released Thursday from CB Insights and law firm Orrick.</p>
<p>“As evidenced by financing and deal activity, Big Data is gaining steam,” the report stated. The number of big data venture deals in 2012 rose by about 24 percent compared with the previous year, going from 132 deals to 164. In the fourth quarter of 2012 alone, there were 49 venture deals for big data plays.</p>
<p>Among them are <a href="http://gigaom.com/2012/07/17/nosql-startup-basho-raises-11-1m-and-storms-japan/">Basho</a>, <a href="http://gigaom.com/2012/12/05/clearstory-data-raises-9m-and-might-actually-make-data-your-friend/">ClearStory Data</a>, <a href="http://gigaom.com/2012/11/14/continuuity-gets-10m-to-free-hadoop-from-itself/">Continuuity</a>, <a href="http://gigaom.com/2012/03/08/drawn-to-scale-raises-money-to-make-sql-big-data-ready/">Drawn to Scale</a>, <a href="http://gigaom.com/2012/11/19/mortar-data-closes-1-8m-seed-round-for-python-wrapped-hadoop/">Mortar Data</a> and <a href="http://gigaom.com/2012/02/07/hadoop-startup-wibidata-raises-5m-to-power-web-analytics/">WibiData</a>.</p>
<p>Despite the bump in number of deals, the total amount of money VCs threw at big data startups in 2012 — $1.39 billion — was down by nearly 7 percent year over year. The median deal size decreased slightly, from $6 million to $5.7 million.</p>
<p>The recipients of the biggest big data investments of 2012 were, in descending order, <a href="http://gigaom.com/2012/12/06/cloudera-snares-big-65m-more-to-boost-international-enterprise-growth/">Cloudera</a> ($65 million), Palantir Technologies ($56.1 million), Rocket Fuel ($50 million), <a href="http://gigaom.com/2012/05/29/with-42m-more-10gen-wants-to-take-mongodb-mainstream/">10gen</a> ($42 million) and <a href="http://gigaom.com/2012/09/10/nimble-storage-gets-40m-as-ipo-approaches/">Nimble Storage</a> ($40.7 million).</p>
<p>Which VC firms closed the most big data deals last year? SV Angel led the pack, investing in 14 companies. Sequoia Capital and IA Ventures tied for second, with 13 deals each, followed by New Enterprise Associates with 12 and First Round Capital with 10.</p>
<p>The percentage of deals for big data infrastructure in relation to all big data funding continued to fall, while big data analytics has risen to a high of 48 percent of all deals. (The other category listed in the report is big data applications.)</p>
<p>By far, California has been the leading state for big data venture deals in the past five years, with 230 in 2008-2012. New York, with 67 in that same time window, and Massachusetts, with 57, lag far behind.</p>
<p>Assuming the trend lines continue, then, you would be most likely to get funded if you run a big data analytics company in California.</p>
<p>As for 2013, the report cites a few big data investments VCs have already made: <a href="http://gigaom.com/2013/02/11/email-sender-sailthru-gets-19m-to-expand-custom-content-offerings/">Sailthru</a> and Nomi.</p>
<p>Others include <a href="http://gigaom.com/2013/01/16/has-ayasdi-turned-machine-learning-into-a-magic-bullet/">Ayasdi</a> and <a href="http://gigaom.com/2013/02/05/think-big-analytics-wants-to-help-companies-make-the-most-of-hadoop/">Think Big Analytics</a>, whose founder and CEO, Ron Bodkin, will speak at <a href="http://event.gigaom.com/structuredata?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=613205+vcs-pour-money-into-data-startups-during-2012&amp;utm_content=gigajordan">GigaOM’s Structure:Data conference</a> in New York in a few weeks.</p>
<p>In addition to Bodkin, entrepreneurs from other venture-backed big data startups who will speak at the conference include Justin Sheehy, chief technology officer at Basho; Jonathan Gray, Continuuity’s founder; Bradford Stephens, Drawn to Scale’s CEO; and Doug Daniels, chief technology officer of Mortar Data.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-1068023p1.html">Shutterstock user extradeda</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/22/vcs-pour-money-into-data-startups-during-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/01/shutterstock_1166971421.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/01/shutterstock_1166971421.jpg?w=150" medium="image">
			<media:title type="html">data</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>Open-source search tool Elasticsearch gets $24 million</title>
		<link>http://gigaom.com/2013/02/19/open-source-search-tool-elasticsearch-gets-24m/</link>
		<comments>http://gigaom.com/2013/02/19/open-source-search-tool-elasticsearch-gets-24m/#comments</comments>
		<pubDate>Tue, 19 Feb 2013 13:00:37 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[Elasticsearch]]></category>
		<category><![CDATA[open-source enterprise search]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=611277</guid>
		<description><![CDATA[Elasticsearch's Series B round of funding shows continuing interest in easy-to-use, open-source big-data analytics tools. The funding also heats up the competition to become the market leader.]]></description>
				<content:encoded><![CDATA[<p>Open-source search provider <a href="http://elasticsearch.com/">Elasticsearch</a> has secured $24 million in Series B venture funding, showing business demand for free and simple big-data analytics. Mike Volpi of Index Ventures led the funding round, which included contributions from Benchmark Capital and SV Angel.</p>
<p>Amsterdam-based Elasticsearch, which has now raised a total of $34 million, generates revenue by teaching people how to use the tool in training courses and helping them solve problems by way of support subscriptions. <a href="http://www.elasticsearch.org/blog/2010/02/08/youknowforsearch.html">Introduced in 2010</a> after founder Shay Banon developed it in his free time, the open-source Elasticsearch program today gets downloaded 200,000 times a month. Banon launched the company itself six months ago, when CEO Steven Schuurman got involved, in time to take on a Series A round.</p>
<p>Elasticsearch can make quick work of searching billions of documents and petabytes of data, structured and unstructured alike, said Banon, now the company&#8217;s chief technology officer. A single developer can use it to find needles amid haystacks of tweets and other kinds of data, eliminating the need for a team of data scientists, Banon said.</p>
<p>Like LucidWorks, Elasticsearch was developed on top of open-source Apache Lucene. But LucidWorks (which until August 2012 was named Lucid Imagination and has raised at least $16 million) focuses more on the enterprise, as my colleague Barb Darrow has <a href="http://gigaom.com/2012/05/09/enterprise-search-doesnt-begin-and-end-with-google/">reported</a>, while Elasticsearch has caught on with startups and enterprises alike in several industries, according to Schuurman.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/19/open-source-search-tool-elasticsearch-gets-24m/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/02/shay_banon_creator_of_elasticsearch2-11-131.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/02/shay_banon_creator_of_elasticsearch2-11-131.jpg?w=150" medium="image">
			<media:title type="html">Shay_Banon_Creator_of_Elasticsearch2.11.13</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>Bina launches box to analyze genomes; cloud on the way</title>
		<link>http://gigaom.com/2013/02/18/bina-launches-box-to-analyze-genomes-cloud-on-the-way/</link>
		<comments>http://gigaom.com/2013/02/18/bina-launches-box-to-analyze-genomes-cloud-on-the-way/#comments</comments>
		<pubDate>Tue, 19 Feb 2013 05:01:15 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[Bina Technologies]]></category>
		<category><![CDATA[Genomics]]></category>
		<category><![CDATA[on-premise hardware]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=611236</guid>
		<description><![CDATA[New hardware from Bina Technologies gives analysts another example of a use case in which the public cloud isn't always the most appropriate solution.]]></description>
				<content:encoded><![CDATA[<p><a href="http://binatechnologies.com/">Bina Technologies</a> is launching its Bina Box for on-premise genome processing, enabling researchers to quickly and cheaply analyze genomes and give doctors data-driven suggestions for custom treatments. </p>
<p>Use a genome sequencer to see one person’s DNA profile, and you’ll get 6 billion unique characters, or half a terabyte of data, said Bina co-founder and CEO Narges Bani Asadi. Start processing it to find mutations and variations, and you’ll find yourself with more than one terabyte. It’s not small data. As the price of sequencing a genome keeps dropping, scientists will want to do this more and more. It’s a big data problem, Bani Asadi said. The company wants to solve the problem on premises, with hardware and software.</p>
<p>The Bina Box will run on “high-end Intel processors and very high-bandwidth memory,” Bani Asadi said, and can scale out with additional Bina Boxes as customers’ processing needs change. Price depends on how much processing customers have in mind. If a customer wants to process 100 samples a month, for instance, it would cost $12,500 per month, or $125 per sample, said Mark Sutherland, Bina’s senior vice president of business development.</p>
<p>A Bina Cloud to tie in with the Bina Box will come later this year. The Bina Cloud will host just the needle of genomic data isolated from among the haystack of the entire genome, and it will enable scientists to aggregate many genomes, run data visualizations and collaborate to derive big-picture insights. Early customers are already using a pilot version of the cloud.</p>
<p>The box offering contributes more proof of the notion that, for certain uses, public clouds might not make sense, not yet anyway. (It remains a largely popular perspective in financial services, as my colleague Barb Darrow <a href="http://gigaom.com/2012/12/24/financial-services-and-the-public-cloud-go-or-no-go/">reported</a> a couple of months ago.) The Bina Box, for its part, “provides security that on-premise solutions have, versus cloud solutions, (which) sometimes people in this industry are not completely ready to move into,” Bani Asadi said. Big pharmaceutical companies are a perfect example, as a breach could hamper product development using genomes. Aside from security, there’s the matter of performance. “It’s impossible to send (half a terabyte of raw data from a sequencer) to the cloud easily,” Bani Asadi said.</p>
<p>Meanwhile, other genomics-focused startups, including <a href="http://gigaom.com/2011/10/12/dnanexus-cloudant-biotech-deals/">DNAnexus</a> and <a href="http://gigaom.com/2012/03/22/appistry-structure-data-2012/">Appistry</a>, are eschewing hardware and relying exclusively on cloud resources. </p>
<p>Whether hardware is involved or not, as my colleague Derrick Harris <a href="http://gigaom.com/2012/04/30/straight-outta-stanford-bina-wants-to-remake-genome-analysis/">mentioned</a> when he wrote about Bina last year, it’s clear that the rise of big genomics inherently equates to a rise in data. </p>
<p>The practice of merging life sciences and other industries with big data will come up in conversation when Ayasdi CEO Gurjeet Singh hits the stage at <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=611236+bina-launches-box-to-analyze-genomes-cloud-on-the-way&amp;utm_content=gigajordan">GigaOM’s Structure:Data conference</a> on March 20 in New York.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/18/bina-launches-box-to-analyze-genomes-cloud-on-the-way/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/02/bina-111.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/02/bina-111.jpg?w=150" medium="image">
			<media:title type="html">Bina 1[1][1]</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
		<item>
		<title>With new pricing scheme, ParAccel lets users analyze unlimited big data</title>
		<link>http://gigaom.com/2013/02/12/with-new-pricing-scheme-paraccel-lets-users-analyze-unlimited-big-data/</link>
		<comments>http://gigaom.com/2013/02/12/with-new-pricing-scheme-paraccel-lets-users-analyze-unlimited-big-data/#comments</comments>
		<pubDate>Tue, 12 Feb 2013 14:00:01 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[big data analytics]]></category>
		<category><![CDATA[paraccel]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=609680</guid>
		<description><![CDATA[To enable big-data analytics, ParAccel’s Right to Deploy model charges users a flat rate regardless of how many nodes or terabytes they use.]]></description>
				<content:encoded><![CDATA[<p>Analytic database company <a href="http://www.paraccel.com/">ParAccel</a> is addressing the realities of big data by letting customers pay a flat rate for storing as much data as they want, instead of paying by the terabyte.</p>
<p>The new licensing model, officially called Right to Deploy, supports what the company calls “unconstrained analytics.” The idea is that data scientists can stop worrying about how many nodes they’re running or how many terabytes they’re storing in ParAccel and instead focus on drawing insights from their data, said John Santaferraro, ParAccel’s vice president of solutions and product marketing.</p>
<p>Santaferraro declined to describe the formula that determines what a company pays for Right to Deploy, saying only that the price will depend on the company and what it wants to do with its data. Still, he believes it’s a good choice for clients. “It’s even more cost-effective than the per-node pricing (model),” Santaferraro said. “There’s no reason why customers wouldn’t want to do it.”</p>
<p>Right to Deploy seems to make sense as more companies are embracing big data and storing more data than ever before in order to run more thorough and complex analytics. GigaOM Research analyst Lynn Langit expects other companies to try the model “as (clients) want to keep more of their data,” she wrote in an email.</p>
<p>Amazon <a href="http://gigaom.com/2011/07/07/amazon-invests-big-in-big-data-startup/">became a ParAccel investor</a> in 2011, and ParAccel <a href="http://gigaom.com/2012/11/28/amazons-new-data-warehousing-service-takes-aim-at-old-guard-it-giants/">gained more exposure</a> in November when Amazon Web Services announced its Redshift data warehouse service, which incorporates technology licensed from ParAccel.</p>
<p>Big data luminaries will discuss use cases, challenges and achievements at the <a href="http://event.gigaom.com/structuredata/?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=609680+with-new-pricing-scheme-paraccel-lets-users-analyze-unlimited-big-data&amp;utm_content=gigajordan">GigaOM Structure:Data conference</a> in New York on March 20-21.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/02/12/with-new-pricing-scheme-paraccel-lets-users-analyze-unlimited-big-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/02/john-santaferraro.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/02/john-santaferraro.jpg?w=150" medium="image">
			<media:title type="html">John Santaferraro</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>
	</item>
	</channel>
</rss>
