<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; parallel processing</title>
	<atom:link href="http://gigaom.com/tag/parallel-processing/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Tue, 21 May 2013 13:44:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; parallel processing</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>The role of converged infrastructure in the data center</title>
		<link>http://pro.gigaom.com/2012/12/why-converged-infrastructure-is-crucial-to-the-data-center/</link>
		<comments>http://pro.gigaom.com/2012/12/why-converged-infrastructure-is-crucial-to-the-data-center/#comments</comments>
		<pubDate>Fri, 28 Dec 2012 13:55:35 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/benwoo/" rel="author">Benjamin Woo</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[amd]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[cloud security]]></category>
		<category><![CDATA[Cloud Storage]]></category>
		<category><![CDATA[cloud-infrastructure]]></category>
		<category><![CDATA[converged infrastructure]]></category>
		<category><![CDATA[convergence]]></category>
		<category><![CDATA[enterprise IT]]></category>
		<category><![CDATA[ethernet]]></category>
		<category><![CDATA[fibre-channel]]></category>
		<category><![CDATA[Flash storage]]></category>
		<category><![CDATA[iaas]]></category>
		<category><![CDATA[infrastructure as a service]]></category>
		<category><![CDATA[InfiniBand]]></category>
		<category><![CDATA[Moore's Law]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[PaaS]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[ProfitBricks]]></category>
		<category><![CDATA[Public Clouds]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[saas]]></category>
		<category><![CDATA[salesforce-com]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[service-level-agreements]]></category>
		<category><![CDATA[single network infrastructure]]></category>
		<category><![CDATA[SLAs]]></category>
		<category><![CDATA[software as a service]]></category>
		<category><![CDATA[Virtual machines]]></category>
		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=164371</guid>
		<description><![CDATA[Cloud computing's increased performance cannot be sustained if the corresponding cost to the service provider (SP) for delivering this performance also increases. What service providers need is a way of delivering low latency, fast response, and increasing performance while minimizing the cost of the network.]]></description>
				<content:encoded><![CDATA[<p>GigaOM Research projects that the cloud computing market will grow from $70.1 billion in 2012 to $158.8 billion in 2014. This adoption brings a commensurate need for sustainable performance from cloud service providers. However, this increased performance cannot be sustained if the corresponding cost to the service provider (SP) for delivering this performance also increases. What service providers need is a way of delivering low latency, fast response, and increasing performance while minimizing the cost of the network.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2012/12/why-converged-infrastructure-is-crucial-to-the-data-center/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://pro.gigaom.com/files/2011/03/datacenter.jpg?w=150" />
		<media:content url="http://pro.gigaom.com/files/2011/03/datacenter.jpg?w=150" medium="image">
			<media:title type="html">datacenter</media:title>
		</media:content>

		<media:content url="http://2.gravatar.com/avatar/b58d6d331ef3725b56f64faa3d21be12?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">benwoony</media:title>
		</media:content>
	</item>
		<item>
		<title>RainStor raises $12M to make your big data small</title>
		<link>http://gigaom.com/2012/10/04/rainstor-raises-12m-to-turn-your-big-data-small/</link>
		<comments>http://gigaom.com/2012/10/04/rainstor-raises-12m-to-turn-your-big-data-small/#comments</comments>
		<pubDate>Thu, 04 Oct 2012 07:01:23 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[RainStor]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=569739</guid>
		<description><![CDATA[Big data company RainStor has raised $12 million in Series C funding for its database that's designed to shrink data footprints by at least 95 percent. It also plays nice with Hadoop, meaning a system can handle ad hoc SQL queries as well as MapReduce jobs.]]></description>
				<content:encoded><![CDATA[<p><a href="http://rainstor.com/">RainStor</a>, a database vendor that focuses on extreme compression of large data sets, has raised a $12 million Series C round from Credit Suisse and Rogers Venture Partners as well as existing investors Doughty Hanson Technology Ventures, Storm Ventures and The Dow Chemical Company. RainStor is riding quite a wave of momentum right now, no doubt thanks to claims it can reduce data volumes by at least 95 percent using its unique compression and de-duplication technology.</p>
<p>The company focuses on historical data that might need to be stored for long periods of time and isn’t likely to change. In some cases that might be data stored for regulatory compliance, while in others it might be machine data such as server logs that would never be written over in the first place. RainStor has also been beating the drum pretty loudly around big data, where it certainly has a compelling proposition.</p>
<p>Because RainStor uses massively parallel processing, it can ingest and query data in a hurry. The company claims ingest speeds of 30,000 to 50,000 records per second per core. Query results also return faster because the analyses run over a much smaller volume of files.</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/10/diagram-1-scale.jpeg"><img title="Diagram-1-Scale" src="http://gigaom2.files.wordpress.com/2012/10/diagram-1-scale.jpeg?w=708" alt=""   class="aligncenter size-full wp-image-569746"></a></p>
<p>RainStor also can sit atop Hadoop for users that want to churn through unstructured data via MapReduce or Pig jobs as well as run more-traditional SQL queries. This type of hybrid system is becoming a hot topic in big data circles, and is the premise of a handful of other products, including <a href="http://gigaom.com/cloud/hadapt-raises-9-5m-for-hadoop-data-warehouse/">Boston-based startup Hadapt</a>. I’ll actually be speaking about the intersection of Hadoop and SQL at our <a href="http://event.gigaom.com/structureeurope?utm_source=data&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=569739+rainstor-raises-12m-to-turn-your-big-data-small&amp;utm_content=dharrisstructure">Structure: Europe conference</a> on Oct. 16 with Cloudera’s Amr Awadallah and NuoDB’s Barry Morris.</p>
<p>RainStor says it can provide 50 to 80 percent smaller Hadoop clusters and increase performance by up to 100 times. These are both big concerns as data volumes continue to explode. Quantcast <a href="http://gigaom.com/data/quantcast-releases-bigger-faster-stronger-hadoop-file-system/">created its newly open sourced distributed file system</a> so the company could bring operational expenses under control while also lowering the time it takes to process its ever-expanding data set.</p>
<p>You can watch RainStor Chief Architect Mark Cusack discussing big data at our Structure: Data conference below.<br><iframe style="border: 0; outline: 0;" src="http://cdn.livestream.com/embed/gigaombigdata?layout=4&amp;clip=pla_afb3ecbe-ca33-4a17-81eb-02a1a14ec2fe&amp;height=340&amp;width=560&amp;autoplay=false" frameborder="0" scrolling="no" width="560" height="340"></iframe></p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-511585p1.html">Shutterstock user z0w</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/10/04/rainstor-raises-12m-to-turn-your-big-data-small/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_113600470.jpg?w=150" medium="image">
			<media:title type="html">Shiny database</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/10/diagram-1-scale.jpeg" medium="image">
			<media:title type="html">Diagram-1-Scale</media:title>
		</media:content>
	</item>
		<item>
		<title>Researchers using AI to build robotic bees</title>
		<link>http://gigaom.com/2012/10/01/researchers-using-ai-to-build-robotic-bees/</link>
		<comments>http://gigaom.com/2012/10/01/researchers-using-ai-to-build-robotic-bees/#comments</comments>
		<pubDate>Mon, 01 Oct 2012 19:54:25 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[algorithms]]></category>
		<category><![CDATA[artificial intelligence]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[distributed computing]]></category>
		<category><![CDATA[GPUs]]></category>
		<category><![CDATA[internet of things]]></category>
		<category><![CDATA[parallel processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=568438</guid>
		<description><![CDATA[Building a robotic bee that acts like a real bee is a lot more complicated than programming a robot to fly around from flower to flower. A project called Green Brain aims to build an artificial intelligence system that can actually mimic a bee's brain.]]></description>
				<content:encoded><![CDATA[<p>In the near future, we might have to be a little more careful about swatting pesky bees while we&#8217;re trying to enjoy some time outdoors. British researchers at the Universities of Sussex and Sheffield are <a href="http://www.sheffield.ac.uk/news/nr/green-brain-honey-bee-model-sheffield-university-1.212235">developing a computer model of a bee&#8217;s brain</a> that they hope can help scientists better understand the brains of more-complex animals, such as humans, and perhaps power artificial intelligence systems for bee-like robots.</p>
<p>Called &#8220;Green Brain,&#8221; the project is trying to advance the science of AI beyond systems that just follow a predetermined set of rules, and into an area where AI systems can actually act autonomously and respond to sensory signals. The researchers are focusing on the parts of a bee&#8217;s brain responsible for vision and sense of smell, and expect the system to be able to find the &#8220;source of particular odours or gases in the same way that a bee can identify particular flowers,&#8221; among other things.</p>
<p>Although a very difficult mission to accomplish, the relatively narrow focus of this project should make it easier to pull off than <a href="http://gigaom.com/2012/08/21/vicarious-gets-15m-to-search-for-the-key-to-artificial-intelligence/">other AI efforts that focus on more-complex human brains</a>. Scientists have tried modeling human decision-making for decades, but humans&#8217; irrationality and seemingly random choices make it difficult to do so outside of specific situations or controlled experiments.</p>
<p>The Green Brain team suggests its AI system could be used to power robotic bees that can help pollinate plants in the face of declining bee populations worldwide, and also could be beneficial in search-and-rescue missions. In order to carry out any of these tasks, researchers have to design systems that are capable of adapting to the world around them. Especially when a robotic bee serves as a research tool for understanding why bees react to sensory stimuli the way they do, too strong a reliance on fixed rules and instructions about how it should act might limit its effectiveness.</p>
<div id="attachment_568523" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/10/image_large.jpg"><img  title="image_large" src="http://gigaom2.files.wordpress.com/2012/10/image_large.jpg?w=300&#038;h=162" alt="" width="300" height="162" class="size-medium wp-image-568523" /></a><p class="wp-caption-text">The RoboBees prototype</p></div>
<p>The researchers working on Green Brain think their work on AI might physically manifest itself in a project like <a href="http://robobees.seas.harvard.edu/">RoboBees</a>, which a group at Harvard University is currently developing. Aside from pollination and search-and-rescue, the RoboBees team suggests its robotic bees could be used for weather mapping, traffic monitoring and even military surveillance. That project also focuses heavily on bees&#8217; colony behavior to coordinate group decision-making and action.</p>
<p>Of course, robotic bees are as much hardware as they are artificial intelligence &#8212; how they consume and process data will affect the decisions they ultimately make &#8212; which is why various research projects might want to combine their forces to some degree. Whereas Green Brain has partnered with GPU manufacturer Nvidia to ensure fast modeling and fast calculations within the bees&#8217; brains, RoboBees is working on the whole package. It&#8217;s building sensors, wings and everything else necessary to make a robotic bee fly and sense the world like an actual insect.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-495565p1.html">Shutterstock user Andrej Vodolazhskyi</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/10/01/researchers-using-ai-to-build-robotic-bees/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_67885621.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/10/shutterstock_67885621.jpg?w=150" medium="image">
			<media:title type="html">Robotic bee</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/10/image_large.jpg?w=300" medium="image">
			<media:title type="html">image_large</media:title>
		</media:content>
	</item>
		<item>
		<title>Supercomputing&#8217;s problem isn&#8217;t power, it&#8217;s software</title>
		<link>http://gigaom.com/2011/09/02/supercomputings-problem-isnt-power-its-software/</link>
		<comments>http://gigaom.com/2011/09/02/supercomputings-problem-isnt-power-its-software/#comments</comments>
		<pubDate>Fri, 02 Sep 2011 21:45:58 +0000</pubDate>
		<dc:creator>Stacey Higginbotham</dc:creator>
				<category><![CDATA[energy efficiency]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[multicore]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[webscale]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=400829</guid>
		<description><![CDATA[Getting to next-generation systems in high-performance computing has inspired technologies that we now use every day in data centers, but as the drive for exascale computing continues, it seems ingenuity is coming to an end. But is power consumption the real hurdle for bigger systems?]]></description>
				<content:encoded><![CDATA[<div id="attachment_247758" class="wp-caption alignleft" style="width: 260px"><a href="http://gigaom.files.wordpress.com/2008/07/roadrunner.jpg"><img  title="roadrunner" src="http://gigaom.files.wordpress.com/2008/07/roadrunner.jpg?w=708" alt=""   class="size-full wp-image-247758" /></a><p class="wp-caption-text">The first petaflop supercomputer, IBM&#39;s Roadrunner.</p></div>
<p>The quest to develop <a href="http://gigaom.com/2010/06/23/structure-2010-the-quest-for-exascale-computing-power/">next-generation systems in high-performance computing</a> has inspired technologies such as InfiniBand and parallel processing that have made their way into data centers, but as the drive for exascale computing continues, it seems ingenuity is coming to an end. The government sees power consumption as the <a href="http://gigaom.com/cleantech/biggest-problem-for-exascale-computing-power/">biggest problem</a> and cost associated with exascale HPC (that&#8217;s a billion billion calculations per second), but Andrew Jones, <a href="http://www.hpcwire.com/hpcwire/2011-08-29/exascale:_power_is_not_the_problem_.html">writing at HPCwire</a>, argues that power isn&#8217;t the primary problem; programming is.</p>
<blockquote><p>Power<em> is</em> a problem for exascale computing, and with current budget expectations is probably the biggest technical challenge for the hardware. Demonstrating the value of increased investment in supercomputing to funders and the public/media is probably an urgent challenge, too. But the top roadblock for achieving the hugely beneficial potential output from exascale computing is software. There are many challenges to do with the software ecosystem that will take years, lots of skilled workers, and sustained/predictable investment to solve.</p></blockquote>
<p>I&#8217;ve seen this debate play out in the comments here at GigaOM on stories <a href="http://gigaom.com/cloud/question-everything-a-new-processor-for-big-data/">like this one</a>, and find myself wondering if we have indeed relied on the &#8220;easy&#8221; fix of Moore&#8217;s Law to carry us forward in terms of performance. But now, as we&#8217;re reaching the end of that road in terms of manufacturing chips as well as power consumption, the hardware industry is trying to deliver new forms of silicon such as those based on <a href="http://gigaom.com/cloud/question-everything-a-new-processor-for-big-data/">memristors</a> or some <a href="http://gigaom.com/2011/08/17/for-our-sensor-heavy-future-ibm-cooks-up-a-new-silicon-brain/">designed after the brain</a>.</p>
<p>But before we talk about a wholesale shift in hardware platforms, Jones, from Numerical Algorithms Group, <a href="http://www.hpcwire.com/hpcwire/2011-08-29/exascale:_power_is_not_the_problem_.html">asks us to consider software</a>. Parallel programming is still in its early days in terms of harnessing the massive compute available in a supercomputer, and Jones argues that figuring out solutions to just-identified problems associated with exascale computing will take large teams of experts and long-term investment.</p>
<p>I&#8217;d also argue that the HPC industry needs to make itself attractive to the folks who are excited by solving these types of problems, but who might currently be creating startups or working for webscale companies wrestling with similar problems in different areas. Perhaps bringing some of these new, software-savvy minds into the HPC space might help spark the programming innovation that Jones thinks we need.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/09/02/supercomputings-problem-isnt-power-its-software/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:thumbnail url="http://gigaom.files.wordpress.com/2008/07/roadrunner.jpg?w=150" />
		<media:content url="http://gigaom.files.wordpress.com/2008/07/roadrunner.jpg?w=150" medium="image">
			<media:title type="html">roadrunner</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/aee37121e18bf76bb9fee4494bab237a?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">shigginbotham</media:title>
		</media:content>

		<media:content url="http://gigaom.files.wordpress.com/2008/07/roadrunner.jpg" medium="image">
			<media:title type="html">roadrunner</media:title>
		</media:content>
	</item>
		<item>
		<title>Meet the new breed of HPC vendor</title>
		<link>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/</link>
		<comments>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/#comments</comments>
		<pubDate>Wed, 03 Aug 2011 22:00:10 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[clusters]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[hpc]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[platform-computing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=387976</guid>
		<description><![CDATA[The face of high-performance computing is changing. That means new technologies and new names, but also familiar names in new places. Anyone who doesn't have a cloud computing story to tell, and possibly a big data one too, might start looking really old really quickly.]]></description>
				<content:encoded><![CDATA[<div id="attachment_388171" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer.jpg"><img  title="Columbia_Supercomputer" src="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="size-medium wp-image-388171" /></a><p class="wp-caption-text">These things are expensive.</p></div>
<p>The face of high-performance computing is changing. That means new technologies and new names, but also familiar names in new places. Sure, cluster management is still important, but anyone who doesn&#8217;t have a cloud computing story to tell, and possibly a big data one too, might start looking really old really quickly.</p>
<p>We&#8217;ve been seeing the change happen over the past couple of years, as Amazon Web Services and Hadoop, in particular, have changed the nature of HPC by democratizing access to resources and technologies. AWS did it by making lots of cores available on demand, freeing scientists from the need to buy expensive clusters or wait for time on their organization&#8217;s system. That story clearly caught on, and even large pharmaceutical companies and space agencies <a href="http://gigaom.com/2010/03/22/to-space-and-beyond-the-rise-of-research-driven-cloud-computing/">began running certain research tasks</a> on AWS.</p>
<p>Amazon <a href="http://gigaom.com/2010/07/13/amazons-cloud-gets-a-supercomputing-cluster/">took things a step further</a> by supplementing its virtual machines with physical speed in the form of Cluster Compute Instances. With a 10 GbE backbone, Intel Nehalem processors and the <a href="http://gigaom.com/cloud/amazon-gets-graphic-with-cloud-gpu-instances/">option of Nvidia Tesla GPUs</a>, users can literally have a Top500 supercomputer available on demand for a fraction of the cost of buying one. Cycle Computing, a startup that helps customers configure AWS-based HPC clusters, recently <a href="http://blog.cyclecomputing.com/2011/04/single-click-starts-a-10000-core-cyclecloud-cluster-for-1060-hr.html">launched a 10,000-core offering</a> that costs only $1,060 per hour.</p>
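<p>As a rough sanity check on that price, the per-core rate can be worked out directly. The short Python sketch below uses only the two figures quoted above; the helper name is made up for illustration, and it assumes the hourly price is spread evenly across every core.</p>

```python
# Figures quoted in the article for Cycle Computing's offering.
CLUSTER_PRICE_PER_HOUR_USD = 1060
CLUSTER_CORES = 10000


def cost_per_core_hour(price_per_hour, cores):
    """Hypothetical helper: spread the hourly cluster price evenly over its cores."""
    return price_per_hour / cores


rate = cost_per_core_hour(CLUSTER_PRICE_PER_HOUR_USD, CLUSTER_CORES)
print(f"about ${rate:.3f} per core per hour")
```

<p>That comes to roughly 10.6 cents per core per hour.</p>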
<p><a href="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg"><img  title="hadoop-logo" src="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg?w=708" alt=""   class="alignright size-full wp-image-388178" /></a>Hadoop, for its part, made Google- or Yahoo-style parallel data-processing available to anyone with the ambition to learn how to do it &#8212; and a few commodity servers. It&#8217;s not the be all, end all of the big data movement, but Hadoop&#8217;s certainly driving the ship and has opened mainstream businesses to the promise of advanced analytics. Most organizations have lots of data, some of it not suitable for a database or data warehouse, and tools like Hadoop let them get real value from it if they&#8217;re willing to put in the effort.</p>
<p><strong>New blood</strong></p>
<p>This change in the way organizations think about obtaining advanced computing capabilities has opened the door for new players that operate at the intersection of HPC, cloud computing and big data.</p>
<p>One relative newcomer to HPC &#8212; and someone that should give Appistry and everyone else a run for their money &#8212; is Microsoft. It only got into the space in the late &#8217;00s, so it didn&#8217;t have much of a legacy business to disrupt when the cloud took over. In a <a href="http://www.hpcwire.com/hpcwire/2011-07-27/microsoft_reshuffles_hpc_organization,_azure_cloud_looms_large.html">recent interview with <em>HPCwire</em></a>, Microsoft HPC boss Ryan Waite details, among other things, an increasingly HPC-capable Windows Azure offering and &#8220;the emergence of a new HPC workload, the data intensive or &#8216;big data&#8217; workload.&#8221;</p>
<p>Indeed, Microsoft has been busy trying to accommodate big data workloads. It just <a href="http://research.microsoft.com/en-us/projects/azure/daytona.aspx">launched an Azure-based MapReduce service</a> called Project Daytona, and has been <a href="http://gigaom.com/cloud/with-dryad-microsoft-is-trying-to-democratize-big-data/">developing its on-premise Hadoop alternative</a> called Dryad for quite some time.</p>
<p><a href="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg"><img  title="daytona" src="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg?w=708" alt=""   class="aligncenter size-full wp-image-388190" /></a></p>
<p>The latest company to get into the game is Appistry. As I noted when <a href="http://gigaom.com/cloud/appistry-raises-12m-realigns-around-big-data/">covering its $12 million funding round</a> yesterday, Appistry actually made a natural shift from positioning itself as a cloud software vendor to positioning itself as an HPC vendor. Sultan Meghji, Appistry&#8217;s vice president of analytics applications, explained to me just how far down the HPC path the company already has gone.</p>
<p>Probably the most extreme change is that Appistry is now offering its own cloud service for running HPC computational or analytic workloads. It&#8217;s based on a per-pipeline pricing model, and today is targeted at the life sciences community. Meghji said the scope will expand, but the cloud service just &#8220;soft launched&#8221; in May, and life sciences is a new field of particular interest to Appistry.</p>
<p>The new cloud service is built using Appistry&#8217;s existing CloudIQ software suite, which already is tuned for HPC on commodity gear thanks to parallel-processing capabilities, &#8220;computational storage&#8221; (i.e., co-locating processors and relevant data to speed throughput) and Hadoop compatibility.</p>
<p>Appistry is also <a href="http://www.appistry.com/solutions/life-sciences">tuning its software</a> to work with common HPC and data-processing algorithms, as well as some it&#8217;s writing itself, and is bringing in expertise in fields like life sciences to help the company better serve those markets.</p>
<p>&#8220;Cloud has become, frankly, meaningless,&#8221; Meghji explained. Appistry had a choice between trying to get heard above the noise of countless other private cloud offerings or trying to add distinct value in areas where its software was always best suited. It chose the latter, in part because Appistry&#8217;s products are best taken as a whole. If you need just cloud, HPC or analytics, Meghji said, Appistry might not be the right choice.</p>
<p>One would be remiss to ignore AWS as a potential HPC heavyweight, too, although it seems content to simply provide the infrastructure and let specialists handle the management. However, its Cluster Compute Instances and Elastic MapReduce service do open the doors for other companies, such as Cycle Computing, to make their mark on the HPC space by leveraging that readily available computing power.</p>
<p><strong>The old guard gets it</strong></p>
<p>But the emergence of new vendors isn&#8217;t to say that mainstay HPC vendors were oblivious to the sea change. Many, including <a href="http://www.adaptivecomputing.com/news/2011moab-hpcsa.php">Adaptive Computing</a> and <a href="http://www.univa.com/products">Univa UD</a>, have been particularly willing to embrace the cloud movement.</p>
<p>Platform Computing has really been making a name for itself in this new HPC world. It recently <a href="http://gigaom.com/cloud/forrester-on-private-clouds-platform-looks-the-best-for-now/">outperformed the competition</a> in Forrester Research&#8217;s comparison of private-cloud software offerings, and its ISF software powers SingTel&#8217;s nationwide cloud service. Spotting an opportunity to cash in on the hype around Hadoop, Platform also has <a href="http://gigaom.com/cloud/hadoop-may-be-hot-but-it-needs-to-be-useful/">turned its attention to big data</a> with a management product that&#8217;s compatible with a number of other data-processing frameworks and storage engines.</p>
<p>Whoever the vendor, though, there&#8217;s lots of opportunity. That&#8217;s because the new HPC opens the doors to an endless pipeline of new customers and new business ideas that could never justify buying a supercomputer or developing a MapReduce implementation, but that can enter a credit-card number or buy a handful of commodity servers with the best of them.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=150" medium="image">
			<media:title type="html">Columbia_Supercomputer</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=300" medium="image">
			<media:title type="html">Columbia_Supercomputer</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg" medium="image">
			<media:title type="html">hadoop-logo</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg" medium="image">
			<media:title type="html">daytona</media:title>
		</media:content>
	</item>
		<item>
		<title>Concurrent raises $900K to make Hadoop easier</title>
		<link>http://gigaom.com/2011/07/26/concurrent-raises-900k-to-make-hadoop-easier/</link>
		<comments>http://gigaom.com/2011/07/26/concurrent-raises-900k-to-make-hadoop-easier/#comments</comments>
		<pubDate>Tue, 26 Jul 2011 23:00:47 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@CNN]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cascading]]></category>
		<category><![CDATA[Concurrent]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[parallel processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=383747</guid>
		<description><![CDATA[Concurrent, the company providing the Cascading data workflow API, has raised a $900,000 seed round to capitalize on the newfound excitement around Hadoop. Cascading is an open-source API for creating and running data workflows atop Hadoop clusters.]]></description>
				<content:encoded><![CDATA[<p><a href="http://concurrentinc.com">Concurrent</a>, the company providing the Cascading data workflow API, has raised a $900,000 seed round to capitalize on the newfound excitement around Hadoop. The funding came from Rembrandt Venture Partners, True Ventures (see disclosure below) and several angel investors.</p>
<p>Cascading, which Concurrent Founder and CEO Chris Wensel created, is an open-source API for creating and running data workflows atop Hadoop clusters. It’s an alternative to MapReduce, the standard framework for writing Hadoop applications, as well as Hive, the Facebook-created Apache project that provides data warehouse features for Hadoop environments. The Concurrent web site describes Cascading like this:</p>
<blockquote><p>The processing API lets the developer quickly assemble complex distributed processes without having to “think” in MapReduce. And to efficiently schedule them based on their dependencies and other available meta-data.</p></blockquote>
<p><a href="http://gigaom2.files.wordpress.com/2011/07/cluster.png"><img title="cluster" src="http://gigaom2.files.wordpress.com/2011/07/cluster.png?w=708" alt=""   class="aligncenter size-full wp-image-384176"></a></p>
<p>Concurrent has been around since 2007, but only now is there enough activity around Hadoop and big data to justify putting much effort into building new products and hiring a team of engineers, said Wensel.</p>
<p>Certainly, Hadoop is at its pinnacle right now, with <a href="http://gigaom.com/cloud/emc-hadoop/">EMC</a>, <a href="http://gigaom.com/cloud/battle-on-mapr-cloudera-pimp-their-version-of-hadoop/">MapR</a> and <a href="http://gigaom.com/cloud/what-it-means-if-yahoo-hadoop-spinoff-doesnt-do-distribution/">Hortonworks</a> all making very public entrances into the distribution space lately to join incumbents such as Cloudera, IBM and Amazon Web Services (with Elastic MapReduce). Now that companies are comfortable with the prospect of Hadoop, and possibly using it to some degree, Wensel thinks they’re ready to start hearing about MapReduce alternatives.</p>
<p>Looking forward, Wensel thinks there’s an opportunity to expand Cascading support beyond Hadoop distributions (it’s currently certified for Apache Hadoop, MapR, EMC and Elastic MapReduce) and into new Hadoop-based “forks, derivatives and re-imaginings” that gain enough traction. Longer term, he sees an opportunity for a common API to support analytic workflows across a variety of distributed systems, Hadoop-based or not.</p>
<p>In the near future, though, Cascading users can look forward to version 2.0 in the fall, which includes a number of significant improvements, including the ability to use system memory for faster analysis of small datasets. He also said Concurrent plans to create products complementary to the Cascading framework that will help monitor workflows and let users make better decisions by giving them more insights.</p>
<p>Although Concurrent’s seed funding is relatively small compared with some of the other big data investments we’ve seen lately, it’s significant. I predicted in my <a href="http://pro.gigaom.com/2011/07/infrastructure-q2-big-data-and-paas-gain-more-momentum/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=383747+concurrent-raises-900k-to-make-hadoop-easier&amp;utm_content=dharrisstructure">second-quarter wrap-up for GigaOM Pro</a> that we’ll start seeing more investment in higher-level Hadoop tools, and Cascading is one of them.</p>
<p>With the distribution layer locked down, there’s plenty of room for alternative data-processing frameworks such as Cascading and turnkey analytics products such as Zettaset, which just <a href="http://gigaom.com/cloud/zettaset-raises-3m-for-the-consumerization-of-big-data/">raised $3 million</a> itself, to steal some of the spotlight and make it easier to take advantage of Hadoop’s parallel-processing prowess.</p>
<p><em><strong>Disclosure</strong>: Concurrent is backed by True Ventures, a venture capital firm that is an investor in the parent company of this blog, Giga Omni Media. Om Malik, the founder of Giga Omni Media, is also a venture partner at True.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/07/26/concurrent-raises-900k-to-make-hadoop-easier/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/07/handing-over-money-e1309964553912.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/07/handing-over-money-e1309964553912.jpg?w=150" medium="image">
			<media:title type="html">handing over money</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/07/cluster.png" medium="image">
			<media:title type="html">cluster</media:title>
		</media:content>
	</item>
		<item>
		<title>Need cash? Forget plasma, and donate CPU time instead.</title>
		<link>http://gigaom.com/2011/07/15/need-cash-forget-plasma-and-donate-cpu-time-instead/</link>
		<comments>http://gigaom.com/2011/07/15/need-cash-forget-plasma-and-donate-cpu-time-instead/#comments</comments>
		<pubDate>Fri, 15 Jul 2011 20:07:29 +0000</pubDate>
		<dc:creator>Stacey Higginbotham</dc:creator>
				<category><![CDATA[@NYT]]></category>
		<category><![CDATA[80legs]]></category>
		<category><![CDATA[Broadband]]></category>
		<category><![CDATA[CPUs]]></category>
		<category><![CDATA[cpusage]]></category>
		<category><![CDATA[Grid Computing]]></category>
		<category><![CDATA[hardware]]></category>
		<category><![CDATA[innovation]]></category>
		<category><![CDATA[jeff martens]]></category>
		<category><![CDATA[parallel computing]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[plura processing]]></category>
		<category><![CDATA[process node]]></category>
		<category><![CDATA[setihome]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=377027</guid>
		<description><![CDATA[Do you sleep? Have a laptop or desktop that sits idle during those eight hours? Need an extra $10 a month? If so, startup CPUsage has a proposition that you should hear.]]></description>
				<content:encoded><![CDATA[<div id="attachment_377169" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2011/07/martens_avatar_wide-e1310756246112.jpg"><img  title="martens_avatar_wide" src="http://gigaom2.files.wordpress.com/2011/07/martens_avatar_wide-e1310756246112.jpg?w=300&#038;h=199" alt="" width="300" height="199" class="size-medium wp-image-377169" /></a><p class="wp-caption-text">Jeff Martens, CEO of CPUsage</p></div>
<p>Do you sleep? Have a laptop or desktop that sits idle during those eight hours? Need an extra $10 a month? If so, startup <a href="http://www.cpusage.com/">CPUsage</a> has a proposition that you should hear. The eight-month-old startup wants to pay folks so it can harness their idle compute time to sell to corporations. CEO and Co-Founder Jeff Martens estimates an average user donating four hours of compute time every day could score about $10 a month.</p>
<p>Martens and his two other co-founders want to turn their Portland, Ore.-based startup into the Folding@home or <a href="http://setiathome.berkeley.edu/">SETI@home</a> of the for-profit world. The goal is to enroll users and use their computers to help corporate customers (the startup already has two) speed up their analysis jobs. The company&#8217;s software breaks up a job into bits and sends those bits to users&#8217; computers for parallel processing. One customer uses the service for decoding agricultural DNA. Martens knows it&#8217;s not right for all jobs, as latency is high and there might be security concerns.</p>
<p><a href="http://gigaom2.files.wordpress.com/2011/07/cpusage_provider_map.png"><img  title="CPUsage_provider_map" src="http://gigaom2.files.wordpress.com/2011/07/cpusage_provider_map.png?w=708" alt=""   class="aligncenter size-full wp-image-377168" /></a></p>
<p>However, he stressed that each node only gets 1/500 of the data to process, which makes it harder to reassemble the job. The map above shows where the company has nodes today. Also, unlike <a href="http://gigaom.com/2008/10/27/more-money-i-game-developers-with-grid-computing/">Plura Processing</a>, another company doing this sort of CPU harvesting for profit, CPUsage works directly with its members as opposed to through a game or other intermediary, so its software resides on the user&#8217;s hardware as opposed to harnessing CPU cycles through a browser. This allows for extra security over how information is treated, asserts Martens.</p>
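<p>The split-and-distribute idea can be sketched in a few lines of Python. This is only an illustration, not CPUsage&#8217;s software: the function names and the sum-of-squares workload are hypothetical, local threads stand in for remote member machines, and the only figure taken from the article is that each node sees 1/500 of the data.</p>

```python
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 500  # from the article: each node gets 1/500 of the data


def split_job(records, num_nodes=NUM_NODES):
    """Deal records out round-robin so no node holds a contiguous run of the data."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def process_chunk(chunk):
    """Hypothetical stand-in for the real analysis a node would perform."""
    return sum(x * x for x in chunk)


def run_job(records, num_nodes=NUM_NODES):
    """Split the job, process the chunks in parallel, then combine partial results."""
    chunks = split_job(records, num_nodes)
    # Threads play the role of remote nodes in this local sketch.
    with ThreadPoolExecutor(max_workers=8) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(10000))
    print(run_job(data))
```

<p>Dealing records out round-robin means a single node never holds a contiguous slice of the input, which is one simple way to picture why a lone node would struggle to reconstruct the whole job.</p>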
<p>The idea of harnessing idle compute time stretches all the way back to 1999 when the SETI@home project was created to help listen for extraterrestrial life. Other non-profit projects followed, including <a href="http://folding.stanford.edu/">Folding@home</a>, which studies proteins to find cures for diseases. But Martens believes there is a market for folks who would be more interested in giving up their idle compute time if they were directly compensated.</p>
<p>Under his planned model, for every dollar CPUsage earns, about 45 cents goes back to the person who owns the computer CPUsage is harnessing. The company charges customers about 15 cents per CPU per hour in line with the pricing for Amazon&#8217;s medium-sized instances. Obviously, this isn&#8217;t a solution for everyone, but it might be useful for enough people to make a viable business.</p>
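<p>Those figures can be checked against the $10-a-month estimate with some quick arithmetic. The Python sketch below is a back-of-the-envelope calculation, not CPUsage&#8217;s actual payout formula: it assumes a 30-day month and a single billable CPU per machine, neither of which the company has specified, and the function name is made up.</p>

```python
PRICE_PER_CPU_HOUR_USD = 0.15  # what CPUsage charges customers, per the article
PROVIDER_SHARE = 0.45          # about 45 cents of every dollar goes to the computer owner


def monthly_payout(hours_per_day, cpus=1, days=30):
    """Estimate a member's monthly earnings (assumes 30 days and `cpus` billable CPUs)."""
    cpu_hours = hours_per_day * days * cpus
    revenue = cpu_hours * PRICE_PER_CPU_HOUR_USD
    return revenue * PROVIDER_SHARE


print(f"${monthly_payout(4):.2f} per month for four hours a day")
```

<p>Under those assumptions, four hours a day yields 120 CPU-hours, about $18 in revenue and roughly $8 for the member, which is in the same range as Martens&#8217; estimate of about $10.</p>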
<p>And Martens thinks it could do some social good as well. He says the company is talking to the Portland School District to take on some of the idle computers in the district&#8217;s schools &#8212; which Martens thinks could generate $1 million for the district in a year. Another customer of CPUsage recently hit it big in the media for its ability to <a href="http://gadgetwise.blogs.nytimes.com/2011/06/12/a-free-site-helps-find-stolen-cameras/">harness compute power to find lost gadgets.</a></p>
<p>For now, Martens&#8217; goal is to raise in the range of $750,000 for a seed round to help expand the number of computers CPUsage will have in its system. He would also like to move from the current architecture that requires all jobs and traffic to flow through the company&#8217;s servers to more of a peer-to-peer Skype-like design to help cut down on bandwidth costs.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/07/15/need-cash-forget-plasma-and-donate-cpu-time-instead/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/07/martens_avatar_wide-e1310756246112.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/07/martens_avatar_wide-e1310756246112.jpg?w=150" medium="image">
			<media:title type="html">martens_avatar_wide</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/aee37121e18bf76bb9fee4494bab237a?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">shigginbotham</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/07/martens_avatar_wide-e1310756246112.jpg?w=300" medium="image">
			<media:title type="html">martens_avatar_wide</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/07/cpusage_provider_map.png" medium="image">
			<media:title type="html">CPUsage_provider_map</media:title>
		</media:content>
	</item>
		<item>
		<title>LexisNexis open-sources its Hadoop killer</title>
		<link>http://gigaom.com/2011/06/15/lexisnexis-open-sources-its-hadoop-killer/</link>
		<comments>http://gigaom.com/2011/06/15/lexisnexis-open-sources-its-hadoop-killer/#comments</comments>
		<pubDate>Wed, 15 Jun 2011 15:03:25 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@NYT]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Hadoop. big data]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[parallel processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=361801</guid>
		<description><![CDATA[LexisNexis is releasing a set of open-source, data-processing tools it says outperforms Hadoop and even handles workloads that Hadoop presently cannot. There have been calls for a legitimate alternative to Hadoop, and this certainly looks like one.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2011/03/fighting-elephants.jpg"><img  title="fighting elephants" src="http://gigaom2.files.wordpress.com/2011/03/fighting-elephants.jpg?w=300&#038;h=202" alt="" width="300" height="202" class="alignleft size-medium wp-image-322010" /></a>LexisNexis is releasing a set of open-source, data-processing tools that it says outperforms Hadoop and even handles workloads Hadoop presently can&#8217;t. The technology (and new business line) is called <a href="http://hpccsystems.com">HPCC Systems</a>, and was created 10 years ago within the LexisNexis Risk Solutions division that analyzes huge amounts of data for its customers in intelligence, financial services and other high-profile industries. There have been calls for a legitimate alternative to Hadoop, and this certainly looks like one.</p>
<p>According to Armando Escalante, CTO of LexisNexis Risk Solutions, the company decided to release HPCC now because it wanted to get the technology into the community before Hadoop became the de facto option for big data processing. Escalante told me during a phone call that he thinks of Hadoop as &#8220;a guy with a machete in front of a jungle &#8212; they made a trail,&#8221; but that he thinks HPCC is superior.</p>
<p>But in order to compete for mindshare and developers, he said, the company felt it had to open-source the technology. One big thing Hadoop has going for it is its open-source model, Escalante explained, which attracts a lot of developers and a lot of innovation. If his company wanted HPCC to &#8220;remain relevant&#8221; and keep improving through new use cases and ideas from a new community, the time for release was now and open source had to be the model.</p>
<p>Hadoop, of course, is the <a href="http://hadoop.apache.org">Apache Software Foundation project</a> created several years ago by then-Yahoo employee Doug Cutting. It has become a critical tool for web companies &#8212; including Yahoo and Facebook &#8212; to process their ever-growing volumes of unstructured data, and is fast making its way into organizations of all types and sizes. Hadoop has <a href="http://gigaom.com/cloud/can-a-yahoo-cloudera-and-ibm-split-the-hadoop-pot/">spawned a number of commercial distributions</a> and <a href="http://gigaom.com/cloud/as-big-data-takes-off-the-hadoop-wars-begin/">products</a>, too, including from Cloudera, <a href="http://gigaom.com/cloud/emc-hadoop/">EMC</a>  and IBM.</p>
<h2>How HPCC works</h2>
<p>Hadoop relies on two core components to store and process huge amounts of data: the Hadoop Distributed File System and Hadoop MapReduce. However, as Cloudant CEO Mike Miller <a href="http://gigaom.com/cloud/democratizing-big-data-is-hadoop-our-only-hope/">explained in a post</a> over the weekend, MapReduce is still a relatively complex framework for writing parallel-processing workflows. HPCC seeks to remedy this with its Enterprise Control Language.</p>
<p>Escalante says ECL is a declarative, data-centric language that abstracts a lot of the work necessary within MapReduce. For certain tasks that take a thousand lines of code in MapReduce, he said, ECL only requires 99 lines. Furthermore, he explained, ECL doesn&#8217;t care how many nodes are in the cluster because the system automatically distributes data across however many nodes are present. Technically, though, HPCC could run on just a single virtual machine. And, says Escalante, HPCC is written in C++ &#8212; like the original Google MapReduce  on which Hadoop MapReduce is based &#8212; which he says makes it inherently faster than the Java-based Hadoop version.</p>
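<p>To give a sense of what &#8220;thinking in MapReduce&#8221; involves, here is a minimal single-process sketch of the map, shuffle and reduce phases in plain Python, using the customary word-count example. It is neither Hadoop nor ECL code, the function names are illustrative, and a real cluster would spread each phase across many nodes.</p>

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence on a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data big clusters", "data everywhere"]))
```

<p>Even for a task this small, the programmer has to spell out how records become key-value pairs and how those pairs are combined, which is the kind of plumbing a declarative language aims to hide.</p>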
<p>HPCC offers two options for processing and serving data: the Thor Data Refinery Cluster and the Roxie Rapid Data Delivery Cluster. Escalante said Thor &#8212; so named for its hammer-like approach to solving the problem &#8212; crunches, analyzes and indexes huge amounts of data a la Hadoop. Roxie, on the other hand, is more like a traditional relational database or data warehouse that can even serve transactions to a web front end.</p>
<p>We didn&#8217;t go into detail on HPCC&#8217;s storage component, but Escalante noted that it does utilize a distributed file system, although it can support a variety of off-node storage architectures and/or local solid-state drives.</p>
<p>He added that in order to ensure LexisNexis wasn&#8217;t blinded by &#8220;eating its own dogfood,&#8221; his team hired a Hadoop expert to kick the tires on HPCC. The consultant was impressed, Escalante said, but did note some shortcomings that the team addressed as it readied the technology for release. It also built a converter for migrating Hadoop applications written in the Pig language to ECL.</p>
<h2>Can HPCC Systems actually compete?</h2>
<p>The million-dollar question is whether HPCC Systems can actually attract an ecosystem of contributors and users that will help it rise above the status of big data also-ran. Escalante thinks it can, in large part because HPCC already has been proven in production dealing with LexisNexis Risk Solutions&#8217; 35,000 data sources, 5,000 transactions per second and large, paying customers. He added that the company also will provide enterprise licenses and proprietary applications in addition to the open-source code. Plus, it already has potential customers lined up.</p>
<p>It&#8217;s often said that competition means validation. Hadoop has moved from a niche set of tools to the core of a potentially huge business that&#8217;s growing every day, and even <a href="http://gigaom.com/cloud/with-dryad-microsoft-is-trying-to-democratize-big-data/">Microsoft has a horse in this race</a> with its Dryad set of big data tools. Hadoop has already proven itself, but the companies and organizations relying on it for their big data strategies can&#8217;t rest on their laurels.</p>
<p><em>Image courtesy of <a href="http://www.flickr.com/photos/nileguide/3153221708/">Flickr user NileGuide.com</a>.</em></p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=361801&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=699017"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=699017" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=361801+lexisnexis-open-sources-its-hadoop-killer&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=361801+lexisnexis-open-sources-its-hadoop-killer&utm_content=dharrisstructure">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2009/06/why-the-hoopla-about-hadoop/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=361801+lexisnexis-open-sources-its-hadoop-killer&utm_content=dharrisstructure">Why the Hoopla About Hadoop?</a></li><li><a href="http://pro.gigaom.com/2009/04/as-devices-converge-chip-vendors-girding-for-a-fight/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=361801+lexisnexis-open-sources-its-hadoop-killer&utm_content=dharrisstructure">As Devices Converge, Chip Vendors Girding For a Fight</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/06/15/lexisnexis-open-sources-its-hadoop-killer/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/03/fighting-elephants.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/03/fighting-elephants.jpg?w=150" medium="image">
			<media:title type="html">fighting elephants</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/03/fighting-elephants.jpg?w=300" medium="image">
			<media:title type="html">fighting elephants</media:title>
		</media:content>
	</item>
		<item>
		<title>Big data on micro servers? You bet.</title>
		<link>http://gigaom.com/2011/06/13/big-data-on-micro-servers-you-bet/</link>
		<comments>http://gigaom.com/2011/06/13/big-data-on-micro-servers-you-bet/#comments</comments>
		<pubDate>Mon, 13 Jun 2011 21:00:22 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@NYT]]></category>
		<category><![CDATA[arm-based-servers]]></category>
		<category><![CDATA[Atom]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Calxeda]]></category>
		<category><![CDATA[energy efficiency]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[Intel Atom]]></category>
		<category><![CDATA[low-power servers]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[processors]]></category>
		<category><![CDATA[scale-out]]></category>
		<category><![CDATA[SeaMicro]]></category>
		<category><![CDATA[servers]]></category>
		<category><![CDATA[Web Infrastructure]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=360069</guid>
		<description><![CDATA[Online dating service eHarmony is using SeaMicro's specialized Intel Atom-powered servers as the foundation of its Hadoop infrastructure, demonstrating that big data applications such as Hadoop might be a killer app for low-powered micro servers.]]></description>
				<content:encoded><![CDATA[<div id="attachment_360393" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2011/06/seamicro.jpg"><img title="seamicro" src="http://gigaom2.files.wordpress.com/2011/06/seamicro-e1307997406535.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="size-medium wp-image-360393"></a><p class="wp-caption-text">SeaMicro's SM10000-64 server.</p></div>
<p>Online dating service eHarmony is <a href="http://www.marketwire.com/press-release/eharmony-selects-seamicro-to-streamline-hadoop-capabilities-1525698.htm">using SeaMicro’s specialized Intel Atom-powered servers</a> as the foundation of its Hadoop infrastructure, demonstrating that big data might be a killer app for low-powered micro servers. The general consensus is that specialized gear from startups such as <a href="http://seamicro.com">SeaMicro</a> and Calxeda — which can save money and power by using processors initially designed for netbooks and smartphones instead of servers — will need to attract both applications and big-name users before it really catches on. Big data looks like it might bring both.</p>
<p><a href="http://calxeda.com">Calxeda</a> CEO Barry Evans explained to me via e-mail why big data and micro servers are such a great match. “Big data is a great fit for us and ARM servers for three key reasons,” he wrote. “First,  it is an inherently scale-out application, requiring a lot of efficient processors. Second, it is a fast-growing market place without a lot of requirements for legacy baggage. Third, the application software is widely available to run on ARM today.”</p>
<p>There arguably is a big difference between the ARM-based Calxeda and x86 (Atom)-based SeaMicro in terms of availability of software designed to run atop their respective architectures, but Evans’ first two factors are applicable across the micro server ecosystem.</p>
<p>Because Hadoop (and big data, in general) is a new undertaking for many organizations, most likely don’t have any infrastructure set aside for it, and it does require a scale-out architecture to best leverage the performance benefits of parallel processing. Hadoop specifically also doesn’t require the types of high-end, high-powered gear that typically power enterprise data warehouse offerings. In this situation, micro servers present a compelling argument because they provide lots of cores and high efficiency in a small footprint.</p>
<p>SeaMicro <a href="http://gigaom.com/cloud/under-competitive-pressure-intel-builds-low-power-server-chip-for-a-startup/">packs 512 Intel Atom cores into a 10U-sized appliance</a> that acts like a 1.28-terabit-per-second fabric and boasts a 75-percent reduction in energy usage compared with traditional servers. Calxeda, for its part, is <a href="http://gigaom.com/cloud/a-sneak-peek-at-calxedas-arm-based-servers/">putting 120 quad-core ARM Cortex A9 processors</a> into a 2U box that it claims consumes only 5 Watts per node. As Stacey Higginbotham pointed out when discussing Calxeda’s plans, “Intel and AMD boxes using the x86 architecture can consume about 80 to 130 watts for a quad-core machine, while low-power versions of x86 chips can consume 30 watts.” For users wanting to stick with traditional server chips, Dell <a href="http://gigaom.com/cloud/with-sales-booming-dell-sees-a-micro-server-future/">sells a line of cloudscale micro servers</a> featuring Intel’s 30-Watt Xeon processors.</p>
<p>eHarmony’s story aligns with Evans’ theory and is a prime example of why micro servers are so well suited to big data applications such as Hadoop. eHarmony began its Hadoop foray by running batch-processing jobs in the cloud, but soon found out that cloud computing can get very expensive when users are running many instances and have to transfer huge amounts of data to and from the cloud. Having never invested in hardware for its Hadoop cluster, eHarmony was free to look at a brand-new architecture like the one SeaMicro provides. As <a href="http://www.datacenterknowledge.com/archives/2011/06/10/eharmony-switches-from-cloud-to-atom-servers/">Rich Miller of Data Center Knowledge reported</a>, “the switch reduced its operating expenses by ‘tens of thousands of dollars a month,’ and its total cost of ownership (TCO) by 74 percent.”</p>
<p>Given the now-public success at eHarmony, it’s possible we’ll actually start seeing OEM deals pop up between server makers like SeaMicro and Calxeda and big-data software vendors such as Cloudera. As both big data and micro servers catch on among mainstream organizations, it would make some sense for vendors to ride the wave together by pushing pre-integrated systems in which the software and hardware have been specifically tuned to work together. Hadoop in a box, if you will.</p>
<p>We’ll be tackling the subject of next-generation distributed architecture in depth at <a href="http://event.gigaom.com/structure/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_campaign=intext&amp;utm_term=360069+big-data-on-micro-servers-you-bet&amp;utm_content=dharrisstructure">Structure 2011</a> next week, including during a panel featuring Anant Agarwal, co-founder and CTO of <a href="http://gigaom.com/cloud/tilera-scores-45m-for-specialized-cloud-chips/">fellow micro-server maker Tilera</a>, and HP Labs’ Partha Ranganathan. The concept of packing lots of low-power cores into a small form factor has applications outside of big data, with <a href="http://www.seamicro.com/sites/default/files/MozillaCaseStudy.pdf">powering web applications</a> probably chief among them. It should be very interesting to hear which new use cases turn out to be well suited to this architecture, and how vendors such as Tilera, SeaMicro, Calxeda and even traditional server vendors must evolve to address this broader ecosystem of apps.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=360069&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=423302"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=423302" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=360069+big-data-on-micro-servers-you-bet&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/06/cloud-computing-infrastructure-2012-and-beyond/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=360069+big-data-on-micro-servers-you-bet&utm_content=dharrisstructure">Cloud computing infrastructure: 2012 and beyond</a></li><li><a href="http://pro.gigaom.com/2013/01/cleantech-fourth-quarter-2012-analysis/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=360069+big-data-on-micro-servers-you-bet&utm_content=dharrisstructure">The fourth quarter of 2012 in cleantech</a></li><li><a href="http://pro.gigaom.com/2012/12/how-the-mobile-first-world-will-transform-the-data-center/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=360069+big-data-on-micro-servers-you-bet&utm_content=dharrisstructure">How tomorrow&#8217;s mobile-centric data centers will look</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/06/13/big-data-on-micro-servers-you-bet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/06/seamicro-e1307997406535.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/06/seamicro-e1307997406535.jpg?w=150" medium="image">
			<media:title type="html">seamicro</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/06/seamicro-e1307997406535.jpg?w=300" medium="image">
			<media:title type="html">seamicro</media:title>
		</media:content>
	</item>
		<item>
		<title>Democratizing big data &#8212; is Hadoop our only hope?</title>
		<link>http://gigaom.com/2011/06/11/democratizing-big-data-is-hadoop-our-only-hope/</link>
		<comments>http://gigaom.com/2011/06/11/democratizing-big-data-is-hadoop-our-only-hope/#comments</comments>
		<pubDate>Sat, 11 Jun 2011 19:00:02 +0000</pubDate>
		<dc:creator>Mike Miller</dc:creator>
				<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[parallel processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=359512</guid>
		<description><![CDATA[Is Hadoop our only hope for solving big data challenges? From scalability to fault tolerance, Hadoop does myriad things very well. Yet, Hadoop is not the solution to all big data problems and use cases. Several key issues remain, including investment, complexity and batch-only processing.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2011/04/freedom-of-choice-a22077920.jpg"><img  title="Freedom-of-choice-a22077920" src="http://gigaom2.files.wordpress.com/2011/04/freedom-of-choice-a22077920.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="alignleft size-medium wp-image-336156" /></a>There has been an absolute explosion in the data space recently. Devices, consumers and companies are not only producing data at a stellar pace, but also introducing amazing complexity into the data itself. It is a rare day without a freshly funded company, a new product launch or a front-page acquisition around data to help address these issues. The level of required scale and sophistication of even a small startup is accelerating along with increasing expectations for “real-time” answers.</p>
<p>The good news is that along with these great challenges, we are also seeing an explosion in innovative tools and techniques. One of the most renowned is Hadoop, Apache’s implementation of Google’s MapReduce software that makes data-parallel processing efficient, affordable and approachable even for users without enormous engineering teams.</p>
<p>But is Hadoop our only hope? From scalability to fault tolerance, Hadoop does myriad things very well. Yet, Hadoop is not the solution to all big data problems and use cases. Several key issues remain:</p>
<p><strong>Investment. </strong> Even with the explosion of tools, support and services surrounding Hadoop, getting Hadoop deployed in production requires a significant investment in terms of training users and tuning clusters and workflows for optimal usage. Hadoop’s complexity opens the door for support and services organizations such as Cloudera, as well as interface plays like Datameer and its spreadsheet-based user interface. Also, it&#8217;s surprising that the industry hasn&#8217;t seen far more growth around hosted Hadoop solutions like Amazon’s Elastic MapReduce, which directly targets these capex and opex issues.</p>
<p><strong>Data Complexity.</strong> MapReduce is an excellent paradigm for very efficient parallel execution of many analytic algorithms, but it isn&#8217;t one-size-fits-all. As the issue of data complexity continues to grow, some problems are simply ill-suited for MapReduce. In particular, the very same social graphs that are driving a large sector of the big data problem are <a href="http://www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2009.120">poorly matched to the underlying design choices of MapReduce</a>. Although brute force techniques can be applied, they ultimately fail by leading to inefficiencies that can significantly impact a business’s bottom line.</p>
<p><strong>Batch.</strong> I find this issue the most compelling. Big data, for better or worse, is still generally confined to the data warehouse. That largely means “offline” data that is subject to the classic extract-transform-load (ETL) workflow. Hadoop helps minimize the turnaround time for ETL, but batch processing still means something more akin to “tomorrow” than “real-time.” In contrast, Google’s <a href="http://labs.google.com/papers/mapreduce.html">original MapReduce paper</a> gave an inspiring example of inserting MapReduce directly into a sequential C++ program for efficient real-time computation. Unfortunately, we find far too few real-time examples in production.</p>
<p>As a big data solution provider myself, I think we should set our sights much, much higher than simply handling more data in the warehouse. Instead, we should integrate the concepts of scalability, fault tolerance and efficient parallel computation into the very systems that drive the end-user experience. Many of the players in the nascent NoSQL database market actually aim to do precisely that, but those tools come with their own issues and effects that could fill an entirely separate post.</p>
<p>The emergence of Hadoop has gone a very long way toward democratizing big data, but big data spans important issues that Hadoop alone doesn’t address. While large engineering shops such as Facebook are <a href="http://hadoopblog.blogspot.com/2011/05/realtime-hadoop-usage-at-facebook-part.html">able to deal with these challenges</a>, the intellectual and capital expenditures required for success reinforce that our revolution is still far from complete. As the market matures, I expect big data’s near future will include technological and market responses to address these increasingly important issues.</p>
<p><em>Mike Miller is co-founder and chief scientist at Cloudant.</em></p>
<p><em>Image courtesy of <a href="http://www.fotocommunity.com/pc/account/myprofile/1448272">Krzysztof Poltorak</a></em>.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=359512&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=790077"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=790077" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=359512+democratizing-big-data-is-hadoop-our-only-hope&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=359512+democratizing-big-data-is-hadoop-our-only-hope&utm_content=dharrisstructure">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2012/05/the-importance-of-putting-the-u-and-i-in-visualization/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=359512+democratizing-big-data-is-hadoop-our-only-hope&utm_content=dharrisstructure">The importance of putting the U and I in visualization</a></li><li><a href="http://pro.gigaom.com/2012/04/infrastructure-q1-cloud-and-big-data-woo-the-enterprise/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=359512+democratizing-big-data-is-hadoop-our-only-hope&utm_content=dharrisstructure">Infrastructure Q1: Cloud and big data woo enterprises</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/06/11/democratizing-big-data-is-hadoop-our-only-hope/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/04/freedom-of-choice-a22077920.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/04/freedom-of-choice-a22077920.jpg?w=150" medium="image">
			<media:title type="html">Freedom-of-choice-a22077920</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/04/freedom-of-choice-a22077920.jpg?w=300" medium="image">
			<media:title type="html">Freedom-of-choice-a22077920</media:title>
		</media:content>
	</item>
	</channel>
</rss>
