<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; appistry</title>
	<atom:link href="http://gigaom.com/tag/appistry/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Wed, 22 May 2013 20:58:22 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; appistry</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>Better medicine, brought to you by big data</title>
		<link>http://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/</link>
		<comments>http://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/#comments</comments>
		<pubDate>Sun, 15 Jul 2012 13:00:53 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[DNAnexus]]></category>
		<category><![CDATA[Genomics]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[IBM]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=542509</guid>
		<description><![CDATA[Slowly but surely, health care is becoming a killer app for big data. Whether it's Hadoop, machine learning or natural-language processing, folks in the worlds of medicine and hospital administration understand that data is the key to helping them take their fields to the next level.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/07/stethoscope.jpg"><img  title="stethoscope" src="http://gigaom2.files.wordpress.com/2012/07/stethoscope.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="alignleft size-medium wp-image-542631" /></a>Slowly but surely, <a href="http://gigaom.com/2012/03/09/healthcare-needs-a-big-data-infusion/">health care is becoming a killer app for big data</a>. Whether it&#8217;s Hadoop, machine learning, natural-language processing or some other technique, folks in the worlds of medicine and hospital administration understand that new types of data analysis are the key to helping them take their fields to the next level.</p>
<p>Here are some of the interesting use cases we&#8217;ve written about over the past year or so, and a few others I&#8217;ve just come across recently. If you have a cool one &#8212; or a suggestion for a new use of big data within the healthcare space &#8212; share it in the comments:</p>
<ul>
<li><strong>Genomics.</strong> This is the epitomic case for big data and health care. Genome sequencing is <a href="http://gigaom.com/cloud/as-genomics-pushes-big-data-limits-cloud-could-save-the-day/">getting cheaper by the day</a> and produces mountains of data. Companies such as <a href="http://gigaom.com/2012/03/22/dnanexus-structure-data-2012/">DNAnexus</a>, <a href="http://gigaom.com/cloud/straight-outta-stanford-bina-wants-to-remake-genome-analysis/">Bina Technologies</a>, <a href="http://gigaom.com/2012/03/22/appistry-structure-data-2012/">Appistry</a> and <a href="http://www.nextbio.com/b/nextbioCorp.nb">NextBio</a> want to make analyzing that data to discover cures for diseases faster, easier and cheaper than ever using lots of cutting-edge algorithms and lots of cloud computing cores. Dell is <a href="http://content.dell.com/us/en/gen/d/corp-comm/pediatric-cancer">providing computing power for two research centers</a> to try to treat a particular form of pediatric cancer based on each child&#8217;s specific genetic profile.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/pediatric-cancer-infographi.jpeg"><img  title="pediatric-cancer-infographi" src="http://gigaom2.files.wordpress.com/2012/07/pediatric-cancer-infographi.jpeg?w=708" alt=""   class="aligncenter size-full wp-image-542674" /></a></div>
<ul>
<li><strong>BI for doctors. </strong>Doctors and staff at Seattle Children&#8217;s Hospital are <a href="http://gigaom.com/cloud/data-for-doctors-big-data-meets-a-big-business/">using Tableau to analyze and visualize terabytes of data</a> dispersed across the institution&#8217;s servers and databases. Not only does visualizing the data help reduce medical errors and help the hospital plan trials but, as of this time last year, its focus on data had saved the hospital $3 million on supply chain costs.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/health-costs.jpg"><img  title="health costs" src="http://gigaom2.files.wordpress.com/2012/07/health-costs.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542634" /></a></div>
<ul>
<li><strong>Semantic search. </strong>Imagine you&#8217;re a doctor trying to learn about a new patient or figure out who among your patients might benefit from a new technique. But patient records have been scattered throughout departments, vary in format and, perhaps worst of all, all use the ontologies of the department that created the record. A startup called Apixio is trying to fix this by <a href="http://gigaom.com/cloud/apixio-is-bringing-big-data-to-medical-records-in-the-cloud/">centralizing records in the cloud and applying semantic analysis</a> to uncover everything doctors need, regardless of who wrote it.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/mine3-officenotes-semantic-smaller.jpg"><img  title="mine3-officenotes-semantic-smaller" src="http://gigaom2.files.wordpress.com/2012/07/mine3-officenotes-semantic-smaller.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542675" /></a></div>
<ul>
<li><strong>Hadoop for everything.</strong> Cloudera is <a href="http://gigaom.com/cloud/hadoop-meets-healthcare-in-new-partnership/">partnering with the Mount Sinai School of Medicine</a> to help it develop new methods and systems for analyzing biological data. But that&#8217;s just the latest of Cloudera&#8217;s medical efforts, which also include working with the Food and Drug Administration to detect unsuspected adverse side effects from multi-drug combinations, and with Emory University on helping pathologists more accurately analyze medical images. One of Cloudera&#8217;s customers, <a href="https://www.explorys.com">Explorys</a>, built a business around aggregating and analyzing medical records, and Intel and NextBio are <a href="http://www.genomeweb.com/blog/intel%E2%80%94nextbio-collaboration-aims-perfect-hadoop-genomics">teaming to tune Hadoop for processing genomic datasets</a>.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/datagrid.jpg"><img  title="datagrid" src="http://gigaom2.files.wordpress.com/2012/07/datagrid.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542676" /></a></div>
<ul>
<li><strong>Watson. </strong>IBM has dozens of irons in the healthcare fire, but its coolest might well be <a href="http://www-03.ibm.com/press/us/en/pressrelease/35402.wss">a partnership with WellPoint</a> to put the <em>Jeopardy!</em> champion question-answering system in doctors&#8217; offices. Watson could help doctors answer questions posed in natural language by analyzing them against mountains of medical research data that no individual doctor could possibly read and digest.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/watsonpower7.jpeg"><img  title="WatsonPower7" src="http://gigaom2.files.wordpress.com/2012/07/watsonpower7.jpeg?w=708" alt=""   class="aligncenter size-full wp-image-542635" /></a></div>
<ul>
<li><strong>Getting ahead of disease. </strong>It&#8217;s always good if you figure out how to diagnose diseases early without expensive tests, and that&#8217;s <a href="http://gigaom.com/cloud/the-biggest-obstacle-to-embracing-big-data-you/">just what Seton Healthcare was able to do</a> thanks to its big data efforts. Trying to find better ways to detect congestive heart failure early in order to save the exorbitant costs of treatment as the disease progresses, a team found that a distended jugular vein, something that can be spotted during any routine physical exam, is a particularly strong risk factor.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/shutterstock_68642137-1.jpg"><img  title="shutterstock_68642137 (1)" src="http://gigaom2.files.wordpress.com/2012/07/shutterstock_68642137-1.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542678" /></a></div>
<ul>
<li><strong>Data scientist in residence. </strong>Here&#8217;s a new title for a healthcare organization &#8212; chief data scientist. Yet, that&#8217;s <a href="http://www.marketwatch.com/story/alliance-health-networks-hires-data-analytics-expert-deep-dhillon-as-new-chief-data-scientist-2012-05-30">exactly the position Alliance Health Networks just added in May</a>. The company, which provides social networks focused on specific medical conditions, acquired medical research database <a href="https://www.medify.com/">Medify</a> and decided it needed someone to lead the effort of analyzing all that data and providing valuable feedback to network users.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/medify.jpg"><img  title="medify" src="http://gigaom2.files.wordpress.com/2012/07/medify.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542679" /></a></div>
<ul>
<li><strong>Crowdsourced science. </strong>In a field where controlled experiments can be expensive and sometimes ineffective, it&#8217;s turning out there might be no substitute for real-world data. Probably the most widely known company in this space is <a href="http://www.patientslikeme.com/">PatientsLikeMe</a>, a social network designed to let individuals share their medical conditions so they can learn from others like themselves what treatments might work best in their particular circumstances. As a side effect, the company is able to conduct observational trials based on data users willingly volunteer.</li>
</ul>
<div><a href="http://gigaom2.files.wordpress.com/2012/07/patients.jpg"><img  title="patients" src="http://gigaom2.files.wordpress.com/2012/07/patients.jpg?w=708" alt=""   class="aligncenter size-full wp-image-542633" /></a></div>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-437830p1.html">Shutterstock user lenetstan</a>; Tableau graph <a href="http://www.perceptualedge.com/articles/visual_business_intelligence/information_visualization_and_art.pdf">courtesy of Perceptual Edge</a>; exam image courtesy of <a href="http://www.shutterstock.com/gallery-185902p1.html">Shutterstock user Blaj Gabriel</a></em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/07/15/better-medicine-brought-to-you-by-big-data/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/07/stethoscope.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/07/stethoscope.jpg?w=150" medium="image">
			<media:title type="html">stethoscope</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/stethoscope.jpg?w=300" medium="image">
			<media:title type="html">stethoscope</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/pediatric-cancer-infographi.jpeg" medium="image">
			<media:title type="html">pediatric-cancer-infographi</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/health-costs.jpg" medium="image">
			<media:title type="html">health costs</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/mine3-officenotes-semantic-smaller.jpg" medium="image">
			<media:title type="html">mine3-officenotes-semantic-smaller</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/datagrid.jpg" medium="image">
			<media:title type="html">datagrid</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/watsonpower7.jpeg" medium="image">
			<media:title type="html">WatsonPower7</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/shutterstock_68642137-1.jpg" medium="image">
			<media:title type="html">shutterstock_68642137 (1)</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/medify.jpg" medium="image">
			<media:title type="html">medify</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/patients.jpg" medium="image">
			<media:title type="html">patients</media:title>
		</media:content>
	</item>
		<item>
		<title>Because Hadoop isn&#8217;t perfect: 8 ways to replace HDFS</title>
		<link>http://gigaom.com/2012/07/11/because-hadoop-isnt-perfect-8-ways-to-replace-hdfs/</link>
		<comments>http://gigaom.com/2012/07/11/because-hadoop-isnt-perfect-8-ways-to-replace-hdfs/#comments</comments>
		<pubDate>Wed, 11 Jul 2012 21:50:13 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Ceph]]></category>
		<category><![CDATA[CleverSafe]]></category>
		<category><![CDATA[DataStax]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[file systems]]></category>
		<category><![CDATA[GPFS]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HDFS]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Isilon]]></category>
		<category><![CDATA[Lustre]]></category>
		<category><![CDATA[Mapr]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[NetApp]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=541225</guid>
		<description><![CDATA[Hadoop is on its way to becoming the de facto platform for the next generation of data-based applications, but it's not without some flaws. Ironically, one of Hadoop's biggest shortcomings right now is also one of its biggest strengths going forward -- the Hadoop Distributed File System.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/07/achilles_heel.jpg"><img  title="achilles heel" src="http://gigaom2.files.wordpress.com/2012/07/shutterstock_16533076.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="alignleft size-medium wp-image-541764" /></a>Hadoop is <a href="http://gigaom.com/cloud/the-state-of-hadoop-strong-and-poised-to-explode/">on its way to becoming the de facto platform</a> for the next generation of data-based applications, but it&#8217;s not without flaws. Ironically, one of Hadoop&#8217;s biggest shortcomings now is also one of its biggest strengths going forward &#8212; the Hadoop Distributed File System.</p>
<p>Within the Apache Software Foundation, HDFS is always improving in terms of performance and availability. Honestly, it&#8217;s probably fine for the majority of Hadoop workloads that are running in pilot projects, skunkworks projects or generally non-demanding environments. And technologies such as HBase that are built atop HDFS speak to its versatility <a href="http://gigaom.com/cloud/drawn-to-scale-raises-money-to-make-sql-big-data-ready/">as a storage system even for non-MapReduce applications</a>.</p>
<p>But if the growing number of options for replacing HDFS signifies anything, it&#8217;s that HDFS isn&#8217;t quite where it needs to be. Some Hadoop users have strict demands around performance, availability and enterprise-grade features, while others aren&#8217;t keen on its direct-attached storage (DAS) architecture. Concerns around availability might be especially valid for anyone (read &#8220;almost everyone&#8221;) who&#8217;s using an older version of Hadoop without the <a href="http://www.cloudera.com/blog/2012/03/high-availability-for-the-hadoop-distributed-file-system-hdfs/">High Availability NameNode</a>. Here are eight products and projects whose proprietors argue they can deliver what HDFS can&#8217;t:</p>
<p><strong>Cassandra (DataStax)<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/datastax_marketecture_a1-copy.jpg"><img  title="datastax_marketecture_A1 copy" src="http://gigaom2.files.wordpress.com/2012/07/datastax_marketecture_a1-copy.jpg?w=300&#038;h=263" alt="" width="300" height="263" class="alignright size-medium wp-image-541752" /></a>Not a file system at all but an open source, NoSQL key-value store, Cassandra has become a viable alternative to HDFS for web applications that rely on fast data access. <a href="http://www.datastax.com">DataStax</a>, a startup commercializing the Cassandra database, has <a href="http://gigaom.com/cloud/datastax-gets-11m-fuses-nosql-and-hadoop/">fused Hadoop atop Cassandra</a> to provide web applications fast access to data processed by Hadoop, and Hadoop fast access to data streaming into Cassandra from web users.</p>
<p><strong>Ceph<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/stack-copy.jpg"><img  title="stack copy" src="http://gigaom2.files.wordpress.com/2012/07/stack-copy.jpg?w=300&#038;h=279" alt="" width="300" height="279" class="alignright size-medium wp-image-541758" /></a>Ceph is an open source, multi-pronged storage system that was recently <a href="http://gigaom.com/cloud/inktank-launches-to-change-the-face-of-open-source-storage/">commercialized by a startup called Inktank</a>. Among its features is a high-performance parallel file system that <a href="http://www.itworld.com/big-datahadoop/262612/ceph-extends-storage-open-scalability">some think makes it a candidate for replacing HDFS</a> (and then some) in Hadoop environments. Indeed, some researchers started <a href="http://www.soe.ucsc.edu/~carlosm/Papers/eestolan-nsdi10-abstract.pdf">looking at this possibility as far back as 2010</a>.</p>
<p><strong>Dispersed Storage Network (Cleversafe)<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/object-based-access-methods.gif"><img  title="object-based-access-methods" src="http://gigaom2.files.wordpress.com/2012/07/object-based-access-methods.gif?w=300&#038;h=208" alt="" width="300" height="208" class="alignright size-medium wp-image-541757" /></a>Cleversafe <a href="http://www.cleversafe.com/press-releases/cleversafe-first-to-deliver-breakthrough-capabilities-for-combined-storage-and-massive-computation">got into the HDFS-replacement business on Monday</a>, announcing a product that will fuse Hadoop MapReduce with the company&#8217;s Dispersed Storage Network system. By fully distributing metadata across the cluster (instead of relying on a single NameNode) and not relying on replication, Cleversafe says it&#8217;s much faster, more reliable and scalable than HDFS.</p>
<p><strong>GPFS (IBM)<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/gpfs.jpg"><img  title="gpfs" src="http://gigaom2.files.wordpress.com/2012/07/gpfs.jpg?w=300&#038;h=135" alt="" width="300" height="135" class="alignright size-medium wp-image-541756" /></a>IBM has been selling its General Parallel File System to high-performance computing customers for years (including within some of the world&#8217;s fastest supercomputers), and in 2010 it <a href="http://database-diary.com/2011/11/30/comparing-hdfs-and-gpfs-for-hadoop/">tuned GPFS for Hadoop</a>. IBM claims the GPFS-SNC (Shared Nothing Cluster) edition is much faster than HDFS in part because it runs at the kernel level, whereas HDFS runs atop the OS.</p>
<p><strong>Isilon (EMC)<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/isilon-hadoop.jpg"><img  title="isilon hadoop" src="http://gigaom2.files.wordpress.com/2012/07/isilon-hadoop.jpg?w=300&#038;h=199" alt="" width="300" height="199" class="alignright size-medium wp-image-541753" /></a>EMC has offered its own Hadoop distributions for more than a year, but in January 2012 it unveiled a new method for making HDFS enterprise-class &#8212; <a href="http://gigaom.com/cloud/emc-delivers-on-isilon-hadoop-bundle/">replace it with EMC Isilon&#8217;s OneFS file system</a>. Technically, as EMC&#8217;s Chuck Hollis <a href="http://chucksblog.emc.com/chucks_blog/2012/01/hdfs-coming-to-an-array-near-you.html">explained at the time</a>, because Isilon can read NFS, CIFS and HDFS protocols, a single Isilon NAS system can serve to intake, process and analyze data.</p>
<p><strong>Lustre</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/lustre.jpg"><img  title="lustre" src="http://gigaom2.files.wordpress.com/2012/07/lustre.jpg?w=300&#038;h=205" alt="" width="300" height="205" class="alignright size-medium wp-image-541761" /></a><a href="http://wiki.lustre.org/index.php/Main_Page">Lustre</a> is an open source high-performance file system that some claim can make for an HDFS alternative where performance is a major concern. Truth be told, I haven&#8217;t heard of this combination running anywhere in the wild, but HPC storage provider Xyratex <a href="http://www.xyratex.com/pdfs/whitepapers/Xyratex_white_paper_MapReduce_1-4.pdf">wrote a paper on the combination in 2011</a>, claiming a Lustre-based cluster (even with InfiniBand) will be faster and cheaper than an HDFS-based cluster.</p>
<p><strong>MapR File System<br />
</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/compsol-diag3-1.jpg"><img  title="compsol-diag3-1" src="http://gigaom2.files.wordpress.com/2012/07/compsol-diag3-1.jpg?w=300&#038;h=266" alt="" width="300" height="266" class="alignright size-medium wp-image-541754" /></a>The MapR File System is probably the best-known HDFS alternative, as it&#8217;s the basis of MapR&#8217;s increasingly popular &#8212; <a href="http://gigaom.com/cloud/investors-make-20m-bet-on-mapr-to-win-hadoop-war/">and well-funded</a> &#8212; Hadoop distribution. Not only does MapR claim its file system is two to five times faster than HDFS on average (although, <a href="http://www.mapr.com/products/only-with-mapr/scalable">really, up to 20 times faster</a>), but it has features such as mirroring, snapshots and high availability that enterprise customers love.</p>
<p><strong>NetApp Open Solution for Hadoop</strong></p>
<p><a href="http://gigaom2.files.wordpress.com/2012/07/netapp.jpg"><img  title="netapp" src="http://gigaom2.files.wordpress.com/2012/07/netapp.jpg?w=300&#038;h=279" alt="" width="300" height="279" class="alignright size-medium wp-image-541755" /></a>OK, the <a href="http://www.netapp.com/us/solutions/infrastructure/hadoop.html">NetApp Open Solution for Hadoop</a> isn&#8217;t so much an HDFS replacement as it is an HDFS <em>improvement</em>, <a href="http://gigaom.com/cloud/netapp-does-network-attached-hadoop/">according to NetApp and early partner Cloudera</a>. The offering still relies on HDFS, but it reenvisions the physical Hadoop architecture by putting HDFS on a RAID array. This, NetApp claims, means faster, more reliable and more secure Hadoop jobs.</p>
<p>This might be a good place to say rest in peace to two other HDFS alternatives that are effectively no longer with us &#8212; <a href="http://code.google.com/p/kosmosfs/">KosmosFS</a> (aka CloudStore) and <a href="http://gigaom.com/2010/03/15/appistry-joins-cloudscale-storage-fray-and-brings-hadoop-with-it/">Appistry CloudIQ Storage</a>. The former was created by Kosmix (<a href="http://gigaom.com/2011/09/14/what-media-companies-can-learn-from-walmart/">since bought by @WalmartLabs</a>) and released to the open source world in 2007, but no longer has an active community. The latter was an attempt by Appistry in 2010 to get a piece of the Hadoop pie with its computational storage technology, but the company has since switched its focus from selling the technology to <a href="http://gigaom.com/2012/03/22/appistry-structure-data-2012/">providing high-performance computing services based on it</a>.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-177808p1.html">Shutterstock user Panos Karapanagiotis</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/07/11/because-hadoop-isnt-perfect-8-ways-to-replace-hdfs/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/07/shutterstock_16533076.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/07/shutterstock_16533076.jpg?w=150" medium="image">
			<media:title type="html">achilles heel</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/shutterstock_16533076.jpg?w=300" medium="image">
			<media:title type="html">achilles heel</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/datastax_marketecture_a1-copy.jpg?w=300" medium="image">
			<media:title type="html">datastax_marketecture_A1 copy</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/stack-copy.jpg?w=300" medium="image">
			<media:title type="html">stack copy</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/object-based-access-methods.gif?w=300" medium="image">
			<media:title type="html">object-based-access-methods</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/gpfs.jpg?w=300" medium="image">
			<media:title type="html">gpfs</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/isilon-hadoop.jpg?w=300" medium="image">
			<media:title type="html">isilon hadoop</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/lustre.jpg?w=300" medium="image">
			<media:title type="html">lustre</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/compsol-diag3-1.jpg?w=300" medium="image">
			<media:title type="html">compsol-diag3-1</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/07/netapp.jpg?w=300" medium="image">
			<media:title type="html">netapp</media:title>
		</media:content>
	</item>
		<item>
		<title>Straight outta Stanford, Bina wants to remake genome analysis</title>
		<link>http://gigaom.com/2012/04/30/straight-outta-stanford-bina-wants-to-remake-genome-analysis/</link>
		<comments>http://gigaom.com/2012/04/30/straight-outta-stanford-bina-wants-to-remake-genome-analysis/#comments</comments>
		<pubDate>Tue, 01 May 2012 00:45:44 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[appistry]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Bina Technologies]]></category>
		<category><![CDATA[DNAnexus]]></category>
		<category><![CDATA[Genomics]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[humane genome]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=515950</guid>
		<description><![CDATA[Bina Technologies emerged from stealth mode last week and is bringing an Apple-like business model to genomics. The company relies on its Bina Box to make genome analysis faster than ever before possible without the benefit of having a supercomputer and a research network on hand. <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=515950&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/04/dna-sculpture.jpg"><img  title="dna sculpture" src="http://gigaom2.files.wordpress.com/2012/04/dna-sculpture.jpg?w=300&#038;h=269" alt="" width="300" height="269" class="alignleft size-medium wp-image-516125" /></a>The advent of the $1,000 genome is bound to revolutionize researchers&#8217; understanding of human health, but ever-lower prices on DNA sequencing are only half the battle. Researchers <a href="http://gigaom.com/cloud/as-genomics-pushes-big-data-limits-cloud-could-save-the-day/">also need to analyze the raw data that comes off sequencing machines</a>, which can range from many gigabytes to terabytes and can cost well more than the sequencing itself. That&#8217;s why a collection of startups are trying to stake their claims as essential parts of the genomics ecosystem by ensuring that analysis doesn&#8217;t become the bottleneck that slows progress.</p>
<h2>Domain expertise, statistics and HPC, unite!</h2>
<p>The latest is <a href="http://www.binatechnologies.com/">Bina Technologies</a>, which just emerged from stealth mode last week and is bringing an Apple-like business model to genomics. The company, which grew out of a research project at Stanford University, relies on its Bina Box appliance to make genome analysis faster than typically possible <a href="http://gigaom.com/cloud/fighting-cancer-at-100-gigabits-per-second/">without the benefit of having a supercomputer and a research network on hand</a>.</p>
<p>According to Bina CEO Narges Bani Asadi, who co-founded the company while completing her Ph.D. at Stanford, the appliance came about as part of a mission to solve a disconnect among the stakeholders in cancer research. Improving the analysis of cancer data required input from medical researchers, statisticians and high-performance computing experts, &#8220;but people are not speaking even the same language,&#8221; she said. While they&#8217;re all headed in the same direction, their paths rarely converge to harness peak velocity.</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/04/bina_box_01.jpg"><img  title="bina_box_01" src="http://gigaom2.files.wordpress.com/2012/04/bina_box_01.jpg?w=300&#038;h=170" alt="" width="300" height="170" class="alignright size-medium wp-image-516123" /></a>Bani Asadi and her team solved that problem by developing a system that merged the three areas into one. With Bina, researchers can develop analysis pipelines that are optimized at both the algorithmic and silicon levels to run optimally across a mix of CPUs, GPUs and FPGAs, all of which are present within the purpose-built box. Applications are getting what they need in order to perform their best, and Bina says results can be processed 10 to 100 times faster (hours instead of days) than running jobs on the Amazon Web Services cloud, which has proven very popular for genomics workloads <a href="http://gigaom.com/cloud/amazon-gets-graphic-with-cloud-gpu-instances/">thanks to its supercomputer-like performance</a>.</p>
<p>That being said, a chart Bina uses to illustrate the performance difference compares the Bina Box to a single eight-core AWS instance rather than a cluster of those high-performance instances.</p>
<h2>The Apple analogy</h2>
<p>Bani Asadi answers the inevitable question of whether research centers will want to buy special appliances instead of using the cloud or generic servers by pointing to Apple. That company&#8217;s devices and computers can be a little more expensive and more difficult to tinker with than alternatives, but they&#8217;re also designed specifically with Apple&#8217;s operating system and applications in mind. It&#8217;s an analogy other companies, <a href="http://gigaom.com/cloud/ex-nasa-cto-builds-cloud-dream-team-launches-nebula/">such as cloud computing startup Nebula</a>, also use to justify their appliance-based businesses.</p>
<p>Not that Bina dismisses the cloud. If Bina&#8217;s software is the Mac OS to the Bina Box&#8217;s iMac, the Bina Cloud is the company&#8217;s iCloud. Once the box processes the raw sequencing data and compresses it into a smaller volume (up to 1,000 times smaller), the data is shipped to the Bina Cloud where it&#8217;s stored and can be easily accessed and shared. Actually, Bani Asadi said that&#8217;s where the most-innovative research likely will take place. While Bina&#8217;s appliance handles the necessary first steps of genome analysis (e.g., determining how it&#8217;s unique), it&#8217;s the resulting data sets that are accessible to doctors, specialists and others to really make sense of it all.</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/04/bina_process_01.jpg"><img  title="bina_process_01" src="http://gigaom2.files.wordpress.com/2012/04/bina_process_01.jpg?w=708" alt=""   class="aligncenter size-full wp-image-516124" /></a></p>
<p>Presumably, Bina is referring to companies such as DNAnexus when it compares its solution to entirely cloud-based approaches. DNAnexus is another Silicon Valley startup trying to democratize genome analysis, <a href="http://gigaom.com/cloud/dnanexus-cloudant-biotech-deals/">relying on the processing power and centralized nature of the cloud</a> to serve as a platform for analyzing and collaborating on DNA data. Another startup, St. Louis-based Appistry, has taken a somewhat different approach, <a href="http://gigaom.com/2012/03/22/appistry-structure-data-2012/">building its own high-powered cloud service</a> and developing its own algorithms specially designed for genome analysis.</p>
<h2>In the end, it&#8217;s all about the data</h2>
<p>Regardless of which approach a researcher takes to solving the problem of sequenced genome data (they all have unique benefits), the underlying trend driving innovation is the deluge of genome data itself. Bani Asadi said the biggest difference now compared with past efforts to analyze health data is that we have so much available. There are 30,000 fully sequenced genomes available right now, and some predict there will be 10 million in five years.</p>
<p>That means researchers can study DNA at a much more-granular level than previously possible, Bani Asadi said, and they can analyze findings across huge data sets to identify previously undetectable patterns. Especially with cancer, she said, each case is relatively unique, shaped by many conditions and factors. If we&#8217;re going to make significant progress on treating it, we&#8217;ll need to know exactly what&#8217;s going on in any given case and how similar cases have played out. The more data, the more accurate the diagnosis and treatment.</p>
<p><em>Feature image <a href="http://www.geograph.org.uk/photo/2848513">courtesy of Keith Edkins</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/04/30/straight-outta-stanford-bina-wants-to-remake-genome-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/04/dna-sculpture.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/04/dna-sculpture.jpg?w=150" medium="image">
			<media:title type="html">dna sculpture</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/04/dna-sculpture.jpg?w=300" medium="image">
			<media:title type="html">dna sculpture</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/04/bina_box_01.jpg?w=300" medium="image">
			<media:title type="html">bina_box_01</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/04/bina_process_01.jpg" medium="image">
			<media:title type="html">bina_process_01</media:title>
		</media:content>
	</item>
		<item>
		<title>How federal money will spur a new breed of big data</title>
		<link>http://gigaom.com/2012/03/29/how-federal-money-will-change-the-face-of-big-data/</link>
		<comments>http://gigaom.com/2012/03/29/how-federal-money-will-change-the-face-of-big-data/#comments</comments>
		<pubDate>Thu, 29 Mar 2012 22:16:55 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Academia]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[DARPA]]></category>
		<category><![CDATA[Department of Defense]]></category>
		<category><![CDATA[DoE]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[federal government]]></category>
		<category><![CDATA[genetics]]></category>
		<category><![CDATA[Genome]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[supercomputers]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=505263</guid>
		<description><![CDATA[By pumping hundreds of millions of dollars into big data research and development, the Obama administration thinks it can push the current state of the art well beyond what's possible today, and into entirely new research areas. It's a noble goal, but also a necessary one. <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=505263&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/03/istock_000001007494xsmall1.jpg"><img  title="istock_000001007494xsmall" src="http://gigaom2.files.wordpress.com/2012/03/istock_000001007494xsmall1.jpg?w=708" alt=""   class="alignleft size-full wp-image-505339" /></a>If you think Hadoop and the current ecosystem of big data tools are great, &#8220;you ain&#8217;t seen nothing yet,&#8221; to quote Bachman Turner Overdrive. By <a href="http://gigaom.com/cloud/obamas-big-data-plans-lots-of-cash-and-lots-of-open-data/">pumping hundreds of millions of dollars a year into big data research and development</a>, the Obama administration thinks it can push the current state of the art well beyond what&#8217;s possible today, and into entirely new research areas.</p>
<p>It&#8217;s a noble goal, but also a necessary one. Big data does have the potential to change our lives, but to get there it&#8217;s going to take more than <a href="http://gigaom.com/cloud/heres-another-big-data-startup-from-team-yahoo/">startups created to feed us better advertisements</a>.</p>
<h2>Consumer data is easy to get, and profitable</h2>
<p>It&#8217;s not fair to call the current state of big data problematic, but it is largely focused on profit-centric technologies and techniques. That&#8217;s because as companies &#8212; especially those in the web world &#8212; realized the value they could derive from advanced data analytics, they began investing huge amounts of money in developing cutting-edge techniques for doing so. For the first time in a long time, <a href="http://gigaom.com/cloud/how-business-taught-scientists-about-big-data/">industry is now leading the academic and scientific research communities</a> when it comes to technological advances.</p>
<p>As Brenda Dietrich, IBM Fellow and vice president for business analytics for IBM Software (and former VP of IBM&#8217;s mathematical sciences division), explained to me, universities are still doing good research, but students are leaving to work at companies like Google and Facebook as soon as their graduate or Ph.D. studies are complete, oftentimes beforehand. Research begun in universities is <a href="http://googleresearch.blogspot.com/2012/03/excellent-papers-for-2011.html">continued in commercial settings</a>, generally with commercial interests guiding its direction.</p>
<p>And this commercial focus isn&#8217;t ideal for everyone. For example, Sultan Meghji, vice president of product strategy at Appistry, told me that many of his company&#8217;s government- and intelligence-sector customers aren&#8217;t getting what they expected out of Hadoop, and they&#8217;re looking for alternative platforms. Hadoop might well be the platform of choice for large web and commercial applications &#8212; indeed, it&#8217;s where most of those companies&#8217; big data investments are going &#8212; but it has its limitations.</p>
<h2>Enter federal dollars for big data</h2>
<p>However, as John Holdren, assistant to the president and director of the White House Office of Science and Technology Policy, noted <a href="http://live.science360.gov/bigdata/">during a White House press conference</a> on Thursday afternoon, the Obama administration realized several months ago that it was seriously under-investing in big data as a strategic differentiator for the United States. He was followed by leaders from six government agencies explaining how they intend to invest their considerable resources to remedy this under-investment. That means everything from the Department of Defense, DARPA and the Department of Energy developing new techniques for storage and management, to the U.S. Geological Survey and the National Science Foundation using big data to change the way we research everything from climate science to educational techniques.</p>
<p>How&#8217;s it going to do all this, apart from agencies simply ramping up their own efforts? Doling out money to researchers. As Zach Lemnios, Assistant Secretary of Defense for Research &amp; Engineering for the Department of Defense, put it, &#8220;We need your ideas.&#8221;</p>
<p>IBM&#8217;s Dietrich thinks increased availability of government grants can play a major role in keeping researchers in academic and scientific settings rather than bolting for big companies and big paychecks. Grants can help steer research away from targeted advertising and toward areas that will &#8220;be good … for mankind at large,&#8221; she said.</p>
<div id="attachment_505340" class="wp-caption alignright" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2012/03/genomes.jpg"><img  title="genomes" src="http://gigaom2.files.wordpress.com/2012/03/genomes.jpg?w=300&#038;h=199" alt="" width="300" height="199" class="size-medium wp-image-505340" /></a><p class="wp-caption-text">The 1,000 Genomes Project data is now freely available to researchers on Amazon's cloud.</p></div>
<p>Additionally, she said, academic researchers have been somewhat limited in what they can do because they haven&#8217;t always had easy access to meaningful data sets. With the government now pushing to open its own data sets, as well as for collaborative research among different scientific disciplines, she thinks there&#8217;s a real opportunity for researchers to conduct better experiments.</p>
<p>During the press conference, Department of Energy Office of Science Director William Brinkman expressed his agency&#8217;s need for better personnel to program its fleet of supercomputers. &#8220;Our challenge is not high-performance computing,&#8221; he said, &#8220;it&#8217;s high-performance people.&#8221; As my colleague Stacey Higginbotham has noted in the past, the ranks of Silicon Valley companies are deep with people <a href="http://gigaom.com/cloud/supercomputings-problem-isnt-power-its-software/">who might be able to bring their parallel-programming prowess to supercomputing centers</a> if the right incentives were in place.</p>
<h2>Self-learning systems, a storage revolution and a cure for cancer?</h2>
<p>As anyone who follows the history of technology knows, government agencies have been responsible for a large percentage of innovation over the past half century, taking credit for no less than the Internet itself. &#8220;You can track every interesting technology in the last 25 years to government spending over the past 50 years,&#8221; Appistry&#8217;s Meghji said.</p>
<p>Now, the government wants to turn its brainpower and money to big data. As part of its new, roughly $100-million XDATA program, DARPA Deputy Director Kaigham &#8220;Ken&#8221; Gabriel said his agency &#8220;seek[s] the equivalent of radar and overhead imagery for big data&#8221; so it can locate a single byte among an ocean of data. The DOE&#8217;s Brinkman talked about the importance of being able to store and visualize the staggering amounts of data generated daily by supercomputers, or by the second from CERN&#8217;s Large Hadron Collider.</p>
<p>IBM&#8217;s Dietrich also has an idea for how DARPA and the DOE might spend their big data allocations. &#8220;When one is doing certain types of analytics,&#8221; she explained, &#8220;you&#8217;re not looking at single threads of data, you tend to be pulling in multiple threads.&#8221; This makes previous storage technologies designed to make the most-accessed data the easiest to access somewhat obsolete. Instead, she said, researchers should be looking into how to store data in a manner that takes into account the other data sets typically accessed and analyzed along with any given set. &#8220;To my knowledge,&#8221; she said, &#8220;no one is looking seriously at that.&#8221;</p>
<p>Not surprisingly given his company&#8217;s large focus on genetic analysis, Appistry&#8217;s Meghji is particularly excited about the government promising more money and resources in that field. For one, he said, the Chinese government&#8217;s <a href="http://gigaom.com/cloud/supercomputings-problem-isnt-power-its-software/">Beijing Genomics Institute</a> probably accounts for anywhere between 25 and 50 percent of the genetics innovation right now,  and &#8220;to see the U.S. compete directly with the Chinese government is very gratifying.&#8221;</p>
<p>But he&#8217;s also excited about the possibility of seeing big data turned to areas in genetics other than cancer research &#8212; which <a href="http://gigaom.com/cloud/fighting-cancer-at-100-gigabits-per-second/">is presently a very popular pastime</a> &#8212; and generally toward advances in real-time data processing. He said the DoD and intelligence agencies are typically two to four years ahead of the rest of the world in terms of big data, and increased spending across government and science will help everyone else catch up. &#8220;It&#8217;s all about not just reacting to things you see,&#8221; he said, &#8220;but being proactive.&#8221;</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/03/obama.jpg"><img  title="obama" src="http://gigaom2.files.wordpress.com/2012/03/obama.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="size-medium wp-image-505336 alignright" /></a>Indeed, the DoD has some seriously ambitious plans in place. Assistant Secretary Lemnios explained during the press conference how previous defense research has led to technologies such as IBM&#8217;s Watson system and Apple&#8217;s Siri that are becoming part of our everyday lives. Its latest quest: utilize big data techniques to create autonomous systems that can adapt to and act on new data inputs in real time, but that know enough to know when they need to invite human input on decision-making. Scary, but cool.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/03/29/how-federal-money-will-change-the-face-of-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/03/obama.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/03/obama.jpg?w=150" medium="image">
			<media:title type="html">obama</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/istock_000001007494xsmall1.jpg" medium="image">
			<media:title type="html">istock_000001007494xsmall</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/genomes.jpg?w=300" medium="image">
			<media:title type="html">genomes</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/03/obama.jpg?w=300" medium="image">
			<media:title type="html">obama</media:title>
		</media:content>
	</item>
		<item>
		<title>Meet the new breed of HPC vendor</title>
		<link>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/</link>
		<comments>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/#comments</comments>
		<pubDate>Wed, 03 Aug 2011 22:00:10 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[clusters]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[hpc]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[parallel processing]]></category>
		<category><![CDATA[platform-computing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=387976</guid>
		<description><![CDATA[The face of high-performance computing is changing. That means new technologies and new names, but also familiar names in new places. Anyone that doesn't have a cloud computing story to tell, possibly a big data one too, might start looking really old really quickly.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=387976&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<div id="attachment_388171" class="wp-caption alignleft" style="width: 310px"><a href="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer.jpg"><img  title="Columbia_Supercomputer" src="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=300&#038;h=200" alt="" width="300" height="200" class="size-medium wp-image-388171" /></a><p class="wp-caption-text">These things are expensive.</p></div>
<p>The face of high-performance computing is changing. That means new technologies and new names, but also familiar names in new places. Sure, cluster management is still important, but anyone that doesn&#8217;t have a cloud computing story to tell, possibly a big data one too, might start looking really old really quickly.</p>
<p>We&#8217;ve been seeing the change happening over the past couple years, as Amazon Web Services and Hadoop, in particular, have changed the nature of HPC by democratizing access to resources and technologies. AWS did it by making lots of cores available on demand, freeing scientists from the need to buy expensive clusters or wait for time on their organization&#8217;s system. That story clearly caught on, and even large pharmaceutical companies and space agencies <a href="http://gigaom.com/2010/03/22/to-space-and-beyond-the-rise-of-research-driven-cloud-computing/">began running certain research tasks</a> on AWS.</p>
<p>Amazon <a href="http://gigaom.com/2010/07/13/amazons-cloud-gets-a-supercomputing-cluster/">took things a step further</a> by supplementing its virtual machines with physical speed in the form of Cluster Compute Instances. With a 10 GbE backbone, Intel Nehalem processors and the <a href="http://gigaom.com/cloud/amazon-gets-graphic-with-cloud-gpu-instances/">option of Nvidia Tesla GPUs</a>, users can literally have a Top500 supercomputer available on demand for a fraction of the cost of buying one. Cycle Computing, a startup that helps customers configure AWS-based HPC clusters, recently <a href="http://blog.cyclecomputing.com/2011/04/single-click-starts-a-10000-core-cyclecloud-cluster-for-1060-hr.html">launched a 10,000-core offering</a> that costs only $1,060 per hour.</p>
<p><a href="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg"><img  title="hadoop-logo" src="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg?w=708" alt=""   class="alignright size-full wp-image-388178" /></a>Hadoop, for its part, made Google- or Yahoo-style parallel data-processing available to anyone with the ambition to learn how to do it &#8212; and a few commodity servers. It&#8217;s not the be all, end all of the big data movement, but Hadoop&#8217;s certainly driving the ship and has opened mainstream businesses to the promise of advanced analytics. Most organizations have lots of data, some of it not suitable for a database or data warehouse, and tools like Hadoop let them get real value from it if they&#8217;re willing to put in the effort.</p>
<p><strong>New blood</strong></p>
<p>This change in the way organizations think about obtaining advanced computing capabilities has opened the door for new players that operate at the intersection of HPC, cloud computing and big data.</p>
<p>One relative newcomer to HPC &#8212; and someone that should give Appistry and everyone else a run for their money &#8212; is Microsoft. It only got into the space in the late &#8217;00s, so it didn&#8217;t have much of a legacy business to disrupt when the cloud took over. In a <a href="http://www.hpcwire.com/hpcwire/2011-07-27/microsoft_reshuffles_hpc_organization,_azure_cloud_looms_large.html">recent interview with <em>HPCwire</em></a>, Microsoft HPC boss Ryan Waite details, among other things, an increasingly HPC-capable Windows Azure offering and &#8220;the emergence of a new HPC workload, the data intensive or &#8216;big data&#8217; workload.&#8221;</p>
<p>Indeed, Microsoft has been busy trying to accommodate big data workloads. It just <a href="http://research.microsoft.com/en-us/projects/azure/daytona.aspx">launched an Azure-based MapReduce service</a> called Project Daytona, and has been <a href="http://gigaom.com/cloud/with-dryad-microsoft-is-trying-to-democratize-big-data/">developing its on-premise Hadoop alternative</a> called Dryad for quite some time.</p>
<p><a href="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg"><img  title="daytona" src="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg?w=708" alt=""   class="aligncenter size-full wp-image-388190" /></a></p>
<p>The latest company to get into the game is Appistry. As I noted when <a href="http://gigaom.com/cloud/appistry-raises-12m-realigns-around-big-data/">covering its $12 million funding round</a> yesterday, Appistry actually made a natural shift from positioning itself as a cloud software vendor to positioning itself as an HPC vendor. Sultan Meghji, Appistry&#8217;s vice president of analytics applications, explained to me just how far down the HPC path the company already has gone.</p>
<p>Probably the most extreme change is that Appistry is now offering its own cloud service for running HPC computational or analytic workloads. It&#8217;s based on a per-pipeline pricing model, and today is targeted at the life sciences community. Meghji said the scope will expand, but the cloud service just &#8220;soft launched&#8221; in May, and life sciences is a new field of particular interest to Appistry.</p>
<p>The new cloud service is built using Appistry&#8217;s existing CloudIQ software suite, which already is tuned for HPC on commodity gear thanks to parallel-processing capabilities, &#8220;computational storage&#8221; (i.e., co-locating processors and relevant data to speed throughput) and Hadoop compatibility.</p>
<p>Appistry is also <a href="http://www.appistry.com/solutions/life-sciences">tuning its software</a> to work with common HPC and data-processing algorithms, as well as some it&#8217;s writing itself, and is bringing in expertise in fields like life sciences to help the company better serve those markets.</p>
<p>&#8220;Cloud has become, frankly, meaningless,&#8221; Meghji explained. Appistry had a choice between trying to get heard over the noise of countless other private cloud offerings and trying to add distinct value in areas where its software was always best suited. It chose the latter, in part because Appistry&#8217;s products are best taken as a whole. If you need just cloud, HPC or analytics, Meghji said, Appistry might not be the right choice.</p>
<p>One would be remiss to ignore AWS as a potential HPC heavyweight, too, although it seems content to simply provide the infrastructure and let specialists handle the management. However, its Cluster Compute Instances and Elastic MapReduce service do open the doors for other companies, such as Cycle Computing, to make their mark on the HPC space by leveraging that readily available computing power.</p>
<p><strong>The old guard gets it</strong></p>
<p>But the emergence of new vendors isn&#8217;t to say that mainstay HPC vendors were oblivious to the sea change. Many, including <a href="http://www.adaptivecomputing.com/news/2011moab-hpcsa.php">Adaptive Computing</a> and <a href="http://www.univa.com/products">Univa UD</a>, have been particularly willing to embrace the cloud movement.</p>
<p>Platform Computing has really been making a name for itself in this new HPC world. It recently <a href="http://gigaom.com/cloud/forrester-on-private-clouds-platform-looks-the-best-for-now/">outperformed the competition</a> in Forrester Research&#8217;s comparison of private-cloud software offerings, and its ISF software powers SingTel&#8217;s nationwide cloud service. Spotting an opportunity to cash in on the hype around Hadoop, Platform also has <a href="http://gigaom.com/cloud/hadoop-may-be-hot-but-it-needs-to-be-useful/">turned its attention to big data</a> with a management product that&#8217;s compatible with a number of other data-processing frameworks and storage engines.</p>
<p>Whoever the vendor, though, there&#8217;s lots of opportunity. That&#8217;s because the new HPC opens the doors to an endless pipeline of new customers and new business ideas that could never justify buying a supercomputer or developing a MapReduce implementation, but that can enter a credit-card number or buy a handful of commodity servers with the best of them.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=387976&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=513958"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=513958" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=387976+meet-the-new-breed-of-hpc-vendor&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2010/12/9-companies-that-pushed-the-infrastructure-discussion-in-2010/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=387976+meet-the-new-breed-of-hpc-vendor&utm_content=dharrisstructure">9 Companies that Pushed the Infrastructure Discussion in 2010</a></li><li><a href="http://pro.gigaom.com/2010/07/infrastructure-overview-q2-2010/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=387976+meet-the-new-breed-of-hpc-vendor&utm_content=dharrisstructure">Infrastructure Overview, Q2 2010</a></li><li><a href="http://pro.gigaom.com/2012/12/cloud-computing-2013-how-to-navigate-without-a-map/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=387976+meet-the-new-breed-of-hpc-vendor&utm_content=dharrisstructure">Cloud computing 2013: how to navigate without a map</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/08/03/meet-the-new-breed-of-hpc-vendor/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=150" medium="image">
			<media:title type="html">Columbia_Supercomputer</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/columbia_supercomputer-e1312404275483.jpg?w=300" medium="image">
			<media:title type="html">Columbia_Supercomputer</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/hadoop-logo.jpg" medium="image">
			<media:title type="html">hadoop-logo</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/daytona.jpg" medium="image">
			<media:title type="html">daytona</media:title>
		</media:content>
	</item>
		<item>
		<title>Appistry raises $12M, realigns around big data</title>
		<link>http://gigaom.com/2011/08/02/appistry-raises-12m-realigns-around-big-data/</link>
		<comments>http://gigaom.com/2011/08/02/appistry-raises-12m-realigns-around-big-data/#comments</comments>
		<pubDate>Tue, 02 Aug 2011 15:56:58 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[high-performance computing]]></category>
		<category><![CDATA[private clouds]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=386787</guid>
		<description><![CDATA[Appistry, a St. Louis–based software company, has closed a $12 million Series D round for its family of distributed computing products. The company also appears to have changed its corporate messaging -- from that of a cloud-computing vendor to that of a big-data vendor.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=386787&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Appistry, a St. Louis–based software company, has <a href="http://www.prnewswire.com/news-releases/appistry-closes-oversubscribed-12m-series-d-financing-to-fuel-growth-126581658.html">closed a $12 million Series D round</a> for its family of distributed computing products. Along with the new money, which came from St. Louis VC firm eXome Capital, the company also appears to have changed its corporate messaging &#8212; from that of a cloud-computing vendor to that of a big-data vendor.</p>
<p><a href="http://gigaom2.files.wordpress.com/2011/08/cloudiq_platform_architecture_large.jpg"><img  title="cloudiq_platform_architecture_large" src="http://gigaom2.files.wordpress.com/2011/08/cloudiq_platform_architecture_large.jpg?w=300&#038;h=250" alt="" width="300" height="250" class="alignright size-medium wp-image-386847" /></a>One could accuse Appistry of simply jumping onto the latest IT bandwagon, but big data arguably was its forte all along. I&#8217;ve <a href="http://gigaom.com/2009/03/09/appistry-opens-the-cloud-to-almost-all-apps/">covered its CloudIQ family of products</a> before &#8212; since it was called the Enterprise Application Fabric, in fact &#8212; and it fit nicely into the cloud picture. However, its customers have traditionally been defense, intelligence, financial services and other organizations using the distributed platform to run compute- and data-intensive workloads.</p>
<p>Appistry has embraced this customer base even more in the past 18 months. It <a href="http://gigaom.com/2010/03/15/appistry-joins-cloudscale-storage-fray-and-brings-hadoop-with-it/">launched a storage engine</a> that colocates data on the compute servers, and that&#8217;s a plug-and-play alternative for the Hadoop Distributed File System. Late last year, along with partner Accenture, it <a href="http://gigaom.com/cloud/appistry-and-accenture-create-real-time-cloud-mapreduce/">developed Cloud MapReduce</a>, a stream-processing engine based on the MapReduce framework.</p>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=386787&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=156176"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=156176" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=386787+appistry-raises-12m-realigns-around-big-data&utm_content=dharrisstructure">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=386787+appistry-raises-12m-realigns-around-big-data&utm_content=dharrisstructure">A near-term outlook for big data</a></li><li><a href="http://pro.gigaom.com/2012/12/cloud-computing-2013-how-to-navigate-without-a-map/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=386787+appistry-raises-12m-realigns-around-big-data&utm_content=dharrisstructure">Cloud computing 2013: how to navigate without a map</a></li><li><a href="http://pro.gigaom.com/2012/04/deploying-big-data-2012-strategies-for-it-departments/?utm_source=cloud&utm_medium=editorial&utm_campaign=auto3&utm_term=386787+appistry-raises-12m-realigns-around-big-data&utm_content=dharrisstructure">Deploying big data: 2012 strategies for IT departments</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/08/02/appistry-raises-12m-realigns-around-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2010/11/american_cash-e1312300604714.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2010/11/american_cash-e1312300604714.jpg?w=150" medium="image">
			<media:title type="html">American_Cash</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/08/cloudiq_platform_architecture_large.jpg?w=300" medium="image">
			<media:title type="html">cloudiq_platform_architecture_large</media:title>
		</media:content>
	</item>
		<item>
		<title>Defining Hadoop: the Players, Technologies and Challenges of 2011</title>
		<link>http://pro.gigaom.com/2011/03/defining-hadoop-the-players-technologies-and-challenges-of-2011/</link>
		<comments>http://pro.gigaom.com/2011/03/defining-hadoop-the-players-technologies-and-challenges-of-2011/#comments</comments>
		<pubDate>Wed, 30 Mar 2011 15:46:45 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/derrickharris/" rel="author">Derrick Harris</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Adobe]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[amazon-elastic-mapreduce]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[apache-hadoop]]></category>
		<category><![CDATA[Apollo]]></category>
		<category><![CDATA[apollo-group]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[Aster Data Systems]]></category>
		<category><![CDATA[AT&T]]></category>
		<category><![CDATA[bank-of-america]]></category>
		<category><![CDATA[banking]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Concurrent]]></category>
		<category><![CDATA[concurrent-cascading]]></category>
		<category><![CDATA[consumer electronics manufacturers]]></category>
		<category><![CDATA[Couchbase]]></category>
		<category><![CDATA[data management]]></category>
		<category><![CDATA[data storage]]></category>
		<category><![CDATA[data-processing workloads]]></category>
		<category><![CDATA[Datameer]]></category>
		<category><![CDATA[datameer-analytics]]></category>
		<category><![CDATA[ebay]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[goto-metrics-data-analytics]]></category>
		<category><![CDATA[Groupon]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Hbase]]></category>
		<category><![CDATA[Hewlett-Packard]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[Hulu]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[ibm-netezza]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[informatica]]></category>
		<category><![CDATA[infosphere-biginsights]]></category>
		<category><![CDATA[ingres]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[jaspersoft]]></category>
		<category><![CDATA[karmasphere]]></category>
		<category><![CDATA[kitenga]]></category>
		<category><![CDATA[large web]]></category>
		<category><![CDATA[LinkedIn]]></category>
		<category><![CDATA[Loggly]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[microstrategy]]></category>
		<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Netflix]]></category>
		<category><![CDATA[Nokia]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[open-sources]]></category>
		<category><![CDATA[openlogic-exchange]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Orbitz]]></category>
		<category><![CDATA[pentaho]]></category>
		<category><![CDATA[Pervasive Software]]></category>
		<category><![CDATA[pig]]></category>
		<category><![CDATA[Quantcast]]></category>
		<category><![CDATA[quest-software]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[Revolution Analytics]]></category>
		<category><![CDATA[Samsung]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[talend]]></category>
		<category><![CDATA[talend-cloud]]></category>
		<category><![CDATA[tennessee-valley-authority]]></category>
		<category><![CDATA[Teradata]]></category>
		<category><![CDATA[Trend Micro]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Vertica]]></category>
		<category><![CDATA[vertica-systems]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[yelp]]></category>
		<category><![CDATA[zettaset]]></category>
		<category><![CDATA[zettavox]]></category>
		<category><![CDATA[zookeeper]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=63077</guid>
		<description><![CDATA[ Hadoop has been used by large web companies for applications such as search engines, but the reality is that the project is so much more. This report takes a closer look, examining what Hadoop is (and isn’t), who’s doing what to productize it and why we can expect to see the market pick up serious steam in 2011. We profile the growing number of companies — from startups like MapR to Cloudera, the arguable leader in the space — using Hadoop, the challenges still hindering widespread adoption and where potential users can expect the market to go as we move through 2011 and beyond. Companies mentioned in this report include Yahoo, Facebook, EMC, Teradata and Appistry. For a full list of companies, and to read the full report, sign up for a free trial.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=323891&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=323891&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=481105"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=481105" /></a></p><p><strong>Related research and analysis from GigaOM Pro:</strong><br />Subscriber content. <a href="http://pro.gigaom.com/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=323891+defining-hadoop-the-players-technologies-and-challenges-of-2011&utm_content=gigaedit">Sign up for a free trial</a>.</p><ul><li><a href="http://pro.gigaom.com/2011/07/infrastructure-q2-big-data-and-paas-gain-more-momentum/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=323891+defining-hadoop-the-players-technologies-and-challenges-of-2011&utm_content=gigaedit">Infrastructure Q2: Big data and PaaS gain more momentum</a></li><li><a href="http://pro.gigaom.com/2011/04/infrastructure-q1-iaas-comes-down-to-earth-big-data-takes-flight/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=323891+defining-hadoop-the-players-technologies-and-challenges-of-2011&utm_content=gigaedit">Infrastructure Q1: IaaS Comes Down to Earth; Big Data Takes Flight</a></li><li><a href="http://pro.gigaom.com/2012/03/a-near-term-outlook-for-big-data/?utm_source=pro&utm_medium=editorial&utm_campaign=auto3&utm_term=323891+defining-hadoop-the-players-technologies-and-challenges-of-2011&utm_content=gigaedit">A near-term outlook for big data</a></li></ul>]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2011/03/defining-hadoop-the-players-technologies-and-challenges-of-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://pro.gigaom.com/files/2010/07/bronze-elephant.jpg?w=150" />
		<media:content url="http://pro.gigaom.com/files/2010/07/bronze-elephant.jpg?w=150" medium="image">
			<media:title type="html">bronze elephant</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>5 Cloud Software Vendors Dell Should Buy</title>
		<link>http://gigaom.com/2011/01/28/5-cloud-software-vendors-that-dell-should-buy/</link>
		<comments>http://gigaom.com/2011/01/28/5-cloud-software-vendors-that-dell-should-buy/#comments</comments>
		<pubDate>Fri, 28 Jan 2011 17:00:58 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@NYT]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[Aster Data Systems]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[DynamicOps]]></category>
		<category><![CDATA[Joyent]]></category>
		<category><![CDATA[Univa]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=291271</guid>
		<description><![CDATA[Michael Dell is talking this week about having acquisition plans in "software, data centers, cloud computing, storage and virtualization," which raises questions about which companies Dell might be eyeing. There are five vendors, in particular, that could give high value for a relatively low price.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=291271&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2011/01/shopping_list_notepad_1407433_l.jpg"><img title="shopping_list_notepad_1407433_l" src="http://gigaom2.files.wordpress.com/2011/01/shopping_list_notepad_1407433_l.jpg?w=300&#038;h=225" alt="" width="300" height="225" class="alignleft size-medium wp-image-291503"></a>Michael Dell is at The World Economic Forum this week <a href="http://www.bloomberg.com/news/2011-01-26/michael-dell-says-hp-paid-way-too-much-for-3par-seeks-more-acquisitions.html">talking about Dell having acquisition plans</a> in “software, data centers, cloud computing, storage and virtualization,” which has speculators venturing guesses as to what’s on its shopping list. Timothy Prickett Morgan <a href="http://www.channelregister.co.uk/2011/01/26/dell_acquisitions_davos/">gave his thoughts in <em>The Register</em></a>, dropping companies from Brocade to Cray to Rackspace as possibilities, but I don’t think Dell will make a system-centric play this time around. There are two trends right now – cloud computing and big data – that are dependent on software and services, and I think Dell gets this, if only because the company knows it doesn’t want to go toe-to-toe with IBM, HP and Cisco on high-end systems. It has already shown as much with its recent purchases of Scalent and Boomi.</p>
<p>Here are the companies I think Dell should consider buying this time around. They’re not huge companies by any stretch of the imagination, but they would provide very relevant software products for advancing Dell’s mission of adding value to the <a href="http://www.serverwatch.com/news/article.php/3899651/Dell-Servers-Services-Sales-Soar.htm">growing number of servers it’s selling</a>:</p>
<p><strong>Aster Data Systems</strong></p>
<p>Thus far, Dell has about the same in-house big data prowess as does HP, which is to say none at all. But Dell does resell Aster Data Systems’ <em>n</em>Cluster massively parallel analytic database as a part of the Dell Cloud Solution for Data Analytics. That’s why I think <a href="http://gigaom.com/2009/03/06/aster-data-making-the-most-of-2009/">Aster Data</a> would be a natural fit for Dell: It already knows the product and the business, and it lets Dell keep selling commodity boxes while letting the software do the work. Dell pushes openness in terms of hardware choice, so if it wants to get into the database space, buying a company with an appliance business might not make too much sense. Aster Data won’t come cheap, with a <a href="http://gigaom.com/cloud/cloud-startup-values-are-getting-insane-2/">rumored valuation easily north of $100 million</a>, but it should cost less than the <a href="http://gigaom.com/2010/07/06/emc-buys-greenplum/">$300 million-plus EMC reportedly paid for Greenplum</a>, and certainly less than the <a href="http://gigaom.com/cloud/ibm-to-buy-netezza-for-1-7-billion/">$1.7 billion IBM paid for Netezza</a>.</p>
<p><strong>Joyent</strong></p>
<p>Joyent would let Dell kill three birds with one stone, as it encompasses software, cloud computing and data centers. Furthermore, as with Aster Data, Dell already has an OEM deal with Joyent through which it <a href="http://gigaom.com/2010/03/24/joyent-dell-cloud/">resells Joyent’s SmartDataCenter software</a> as the Dell Cloud Solution for Web Applications. As I’ve written before, Dell has <a href="http://gigaom.com/cloud/dells-cloud-strategy-is-shaping-up-and-looking-good/">formed a fairly holistic portfolio of cloud offerings</a>, of which Joyent is a key part, so closing the loop and bringing that software in-house makes sense. It also would be good for Joyent, which would have a larger channel and sales team through which to sell its software. Of course, Joyent’s business also extends into cloud hosting, which would get Dell into the service-provider business, as some have speculated it wants to do, without buying Rackspace (which could be a complex integration) or <a href="http://gigaom.com/2010/07/12/microsoft-azure-appliance/">relying on the Windows Azure Appliance</a>.</p>
<p><strong>DynamicOps</strong></p>
<p>DynamicOps presents a situation similar to both Aster Data and Joyent, because Dell also has an OEM deal with it, although DynamicOps’ deal with Dell definitely is more limited in scope. Presently, <a href="http://gigaom.com/cloud/credit-suisse-spawn-dynamicops-enters-private-cloud-game/">its cloud-management software provides the self-service capability</a> for Dell’s Virtual Integrated System software package, which is Dell’s attempt to give customers the converged infrastructure experience of managing computing, storage and networking from one place without forcing them to <a href="http://gigaom.com/cloud/can-open-converged-infrastructure-compete-2/">buy expensive vertically integrated systems</a> such as Cisco’s UCS or HP’s BladeSystem Matrix. DynamicOps also sells virtualization management software, which would give Dell customers that aren’t ready for the cloud a more down-to-earth option.</p>
<p><strong>Univa</strong></p>
<p><a href="http://univa.com">Univa</a> could be a good choice, especially if Dell wants to provide its <a href="http://www.dell.com/content/topics/global.aspx/sitelets/solutions/cluster_grid/dcs_landingpage?c=us&amp;l=en">Data Center Solutions</a> customers, who buy large quantities of customized hyperscale servers from Dell, with tools to manage their scale-out data centers and clusters. Univa is a newly technology-rich company thanks to its <a href="http://www.businesswire.com/news/home/20110118005484/en/Univa-Acquires-Grid-Engine-Expertise">forking of the Sun Grid Engine software</a>, and it already has an Austin, Texas office as a result of its purchase of United Devices a few years ago. There are other options in this space – <a href="http://gigaom.com/2009/06/22/platform-brings-big-business-grid-rep-to-the-cloud/">Platform Computing</a> (which Morgan suggested) and Adaptive Computing come to mind – but I think Univa’s Austin roots and relatively low price will make it the most appealing choice of the three HPC vendors that have expanded into the cloud-data-center-management space.</p>
<p><strong>Appistry</strong></p>
<p>As with the other four suggestions, Appistry is another software company that’s a perfect complement for Dell’s scale-out-focused Data Center Solutions group. <a href="http://gigaom.com/2009/03/09/appistry-opens-the-cloud-to-almost-all-apps/">Appistry’s CloudIQ Platform</a> is all about achieving high application performance across a distributed set of commodity servers, and it already has established a fairly strong customer base across the intelligence and defense industries. The companies already have <a href="http://www.appistry.com/blog/2010/11/appistry-and-dell-showcase-private-storage-cloud-solution/">partnered, in fact, on a petabyte-scale Private Storage Cloud</a> that combines Appistry’s CloudIQ Storage software with Dell hardware. CloudIQ Storage would give Dell a differentiating story for customers, as it focuses not just on scaling out, but also <a href="http://gigaom.com/2010/03/15/appistry-joins-cloudscale-storage-fray-and-brings-hadoop-with-it/">on placing data near computing logic</a> to ensure that storage doesn’t slow application performance as the number of servers grows.</p>
<p><em>Image courtesy of Flickr user <a href="http://www.flickr.com/photos/28481088@N00/349049527/in/photostream/">tanakawho</a>.</em></p>
<p><strong>Related content from GigaOM Pro (sub req’d):</strong></p>
<ul><li><a href="http://pro.gigaom.com/2010/11/the-data-center-is-the-new-box-are-you-ready/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=291271+5-cloud-software-vendors-that-dell-should-buy">The Data Center Is the New Box. Are You Ready?</a></li>
<li><a href="http://pro.gigaom.com/2010/11/why-dells-cloud-computing-prospects-are-strong/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=291271+5-cloud-software-vendors-that-dell-should-buy">Why Dell’s Cloud Computing Prospects Are Strong</a></li>
<li><a href="http://pro.gigaom.com/2010/10/think-converged-infrastructure-means-lock-in-think-again/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=291271+5-cloud-software-vendors-that-dell-should-buy">Think Converged Infrastructure Means Lock-in? Think Again.</a></li>
</ul>
<br />  <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=gigaom.com&#038;blog=14960843&#038;post=291271&#038;subd=gigaom2&#038;ref=&#038;feed=1" width="1" height="1" /><p><a href="http://pubads.g.doubleclick.net/gampad/jump?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=36814"><img src="http://pubads.g.doubleclick.net/gampad/ad?iu=/1008864/GigaOM_RSS_300x250&#038;sz=300x250&#038;c=36814" /></a></p>]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2011/01/28/5-cloud-software-vendors-that-dell-should-buy/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2011/01/shopping_list_notepad_1407433_l.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2011/01/shopping_list_notepad_1407433_l.jpg?w=150" medium="image">
			<media:title type="html">shopping_list_notepad_1407433_l</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2011/01/shopping_list_notepad_1407433_l.jpg?w=300" medium="image">
			<media:title type="html">shopping_list_notepad_1407433_l</media:title>
		</media:content>
	</item>
		<item>
		<title>Big Data, ARM and Legal Troubles Transformed Infrastructure in Q4</title>
		<link>http://pro.gigaom.com/2011/01/big-data-arm-and-legal-troubles-transformed-infrastructure-in-q4/</link>
		<comments>http://pro.gigaom.com/2011/01/big-data-arm-and-legal-troubles-transformed-infrastructure-in-q4/#comments</comments>
		<pubDate>Tue, 18 Jan 2011 08:00:26 +0000</pubDate>
		<dc:creator><a href="http://pro.gigaom.com/members/derrickharris/" rel="author">Derrick Harris</a></dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Cisco]]></category>
		<category><![CDATA[Citrix]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Comcast]]></category>
		<category><![CDATA[Data Centers]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[ebay]]></category>
		<category><![CDATA[EMC]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Fujitsu]]></category>
		<category><![CDATA[Fusion-io]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hewlett-Packard]]></category>
		<category><![CDATA[Hitachi]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[KT]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[NEC]]></category>
		<category><![CDATA[NetApp]]></category>
		<category><![CDATA[Netflix]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Red Hat]]></category>
		<category><![CDATA[SAP]]></category>
		<category><![CDATA[Smooth Stone]]></category>
		<category><![CDATA[Sun Microsystems]]></category>
		<category><![CDATA[toyota]]></category>
		<category><![CDATA[Twitter]]></category>
		<category><![CDATA[Verizon]]></category>
		<category><![CDATA[VMWare]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[Foursquare]]></category>
		<category><![CDATA[LinkedIn]]></category>
		<category><![CDATA[Nvidia]]></category>
		<category><![CDATA[AOL]]></category>
		<category><![CDATA[Parallels]]></category>
		<category><![CDATA[salesforce]]></category>
		<category><![CDATA[ShareThis]]></category>
		<category><![CDATA[Tumblr]]></category>
		<category><![CDATA[yelp]]></category>
		<category><![CDATA[3Tera]]></category>
		<category><![CDATA[Acadia]]></category>
		<category><![CDATA[Adaptive Computing]]></category>
		<category><![CDATA[Akamai]]></category>
		<category><![CDATA[Amazon Web Services]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[Azul Systems]]></category>
		<category><![CDATA[blackwave]]></category>
		<category><![CDATA[BMC]]></category>
		<category><![CDATA[Canonical]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[Cloud.com]]></category>
		<category><![CDATA[CloudSwitch]]></category>
		<category><![CDATA[Clustrix]]></category>
		<category><![CDATA[Cotendo]]></category>
		<category><![CDATA[CSC]]></category>
		<category><![CDATA[Engine Yard]]></category>
		<category><![CDATA[Enomaly]]></category>
		<category><![CDATA[Equinix]]></category>
		<category><![CDATA[eucalyptus]]></category>
		<category><![CDATA[Eucalyptus Systems]]></category>
		<category><![CDATA[F5 Networks]]></category>
		<category><![CDATA[Gear6]]></category>
		<category><![CDATA[GoGrid]]></category>
		<category><![CDATA[Heroku]]></category>
		<category><![CDATA[infochimps]]></category>
		<category><![CDATA[Isilon]]></category>
		<category><![CDATA[Joyent]]></category>
		<category><![CDATA[juniper]]></category>
		<category><![CDATA[Juniper Networks]]></category>
		<category><![CDATA[Level3]]></category>
		<category><![CDATA[Loggly]]></category>
		<category><![CDATA[Makara]]></category>
		<category><![CDATA[Marvell]]></category>
		<category><![CDATA[Mellanox]]></category>
		<category><![CDATA[Membase]]></category>
		<category><![CDATA[MorphLabs]]></category>
		<category><![CDATA[MP3Tunes]]></category>
		<category><![CDATA[Netezza]]></category>
		<category><![CDATA[New Relic]]></category>
		<category><![CDATA[Nimsoft]]></category>
		<category><![CDATA[Nirvanix]]></category>
		<category><![CDATA[Novell]]></category>
		<category><![CDATA[ooVoo]]></category>
		<category><![CDATA[OpSource]]></category>
		<category><![CDATA[Pivot3]]></category>
		<category><![CDATA[Rackspace]]></category>
		<category><![CDATA[rpath]]></category>
		<category><![CDATA[Savvis]]></category>
		<category><![CDATA[Scale Computing]]></category>
		<category><![CDATA[SGI]]></category>
		<category><![CDATA[splunk]]></category>
		<category><![CDATA[Terremark]]></category>
		<category><![CDATA[twilio]]></category>
		<category><![CDATA[Voltaire]]></category>
		<category><![CDATA[Wikileaks]]></category>
		<category><![CDATA[Zuora]]></category>
		<category><![CDATA[Zynga]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[amd]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[AT&T]]></category>
		<category><![CDATA[Nimbula]]></category>
		<category><![CDATA[StorSimple]]></category>
		<category><![CDATA[Egenera]]></category>
		<category><![CDATA[newscale]]></category>
		<category><![CDATA[Xeround]]></category>
		<category><![CDATA[PiCloud]]></category>
		<category><![CDATA[Calxeda]]></category>
		<category><![CDATA[cloudbees]]></category>
		<category><![CDATA[abiquo]]></category>
		<category><![CDATA[Racktivity]]></category>
		<category><![CDATA[compellent]]></category>
		<category><![CDATA[SimpleCDN]]></category>
		<category><![CDATA[aprimo]]></category>
		<category><![CDATA[gluster]]></category>
		<category><![CDATA[Smoothstone]]></category>
		<category><![CDATA[DynamicOps]]></category>
		<category><![CDATA[Violin Memory]]></category>
		<category><![CDATA[CouchOne]]></category>
		<category><![CDATA[ca-technologies]]></category>
		<category><![CDATA[apptis]]></category>
		<category><![CDATA[attachmate]]></category>
		<category><![CDATA[big-blue]]></category>
		<category><![CDATA[callidus]]></category>
		<category><![CDATA[citras]]></category>
		<category><![CDATA[cloudpronto]]></category>
		<category><![CDATA[compellent-technologies]]></category>
		<category><![CDATA[coraid]]></category>
		<category><![CDATA[cryogel]]></category>
		<category><![CDATA[dcim]]></category>
		<category><![CDATA[dicom-grid]]></category>
		<category><![CDATA[hexagrid-computing]]></category>
		<category><![CDATA[hostway]]></category>
		<category><![CDATA[isilon-systems]]></category>
		<category><![CDATA[logicalis]]></category>
		<category><![CDATA[maxiscale]]></category>
		<category><![CDATA[nexenta]]></category>
		<category><![CDATA[nlyte]]></category>
		<category><![CDATA[nuvio]]></category>
		<category><![CDATA[overland-storage]]></category>
		<category><![CDATA[proferi-software]]></category>
		<category><![CDATA[sentrigo]]></category>
		<category><![CDATA[servercentral]]></category>
		<category><![CDATA[stax-networks]]></category>
		<category><![CDATA[tenzing]]></category>
		<category><![CDATA[tianhe-1a]]></category>
		<category><![CDATA[translattice]]></category>
		<category><![CDATA[trapeze-networks]]></category>
		<category><![CDATA[verecloud]]></category>
		<category><![CDATA[wso]]></category>
		<category><![CDATA[zt-systems]]></category>
		<category><![CDATA[mp3s]]></category>
		<category><![CDATA[marvell-technology]]></category>
		<category><![CDATA[advanced-micro-devices]]></category>
		<category><![CDATA[data-center-infrastructure-management]]></category>
		<category><![CDATA[digital-imaging-and-communications-in-medicine-grids]]></category>
		<category><![CDATA[newscales]]></category>
		<category><![CDATA[watsco]]></category>
		<category><![CDATA[consumer electronics manufacturers]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=56285</guid>
		<description><![CDATA[Some might call this past quarter in the infrastructure space transformative. The rise of ARM-based processing suggests the days of x86 dominance might be coming to an end, while the Amazon Web Services-WikiLeaks controversy cast new light on the legal aspects of cloud computing. Big data got bigger, meanwhile, as the Hadoop ecosystem expanded, and amid all these cutting-edge technologies, two archaic topics — Novell and Java — proved they aren't going anywhere soon. Companies mentioned in this report include Intel, AMD, Amazon Web Services, IBM, Yahoo, Appistry, VMware, Joyent and Microsoft. For a full list of companies, and to read the full report, sign up for a free trial.]]></description>
				<content:encoded><![CDATA[<p>Some might call this past quarter in the infrastructure space transformative. The rise of ARM-based processing suggests the days of x86 dominance might be coming to an end, while the Amazon Web Services-WikiLeaks controversy cast new light on the legal aspects of cloud computing. Big data got bigger, meanwhile, as the Hadoop ecosystem expanded, and amid all these cutting-edge technologies, two archaic topics — Novell and Java — proved they aren&#8217;t going anywhere soon. Companies mentioned in this report include Intel, AMD, Amazon Web Services, IBM, Yahoo, Appistry, VMware, Joyent and Microsoft. For a full list of companies, and to read the full report, sign up for a free trial.</p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2011/01/big-data-arm-and-legal-troubles-transformed-infrastructure-in-q4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://pro.gigaom.com/files/2009/04/gigaompromasterimagecloud.jpg?w=150" />
		<media:content url="http://pro.gigaom.com/files/2009/04/gigaompromasterimagecloud.jpg?w=150" medium="image">
			<media:title type="html">gigaompromasterimagecloud</media:title>
		</media:content>

		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>Cloud MapReduce Targets Big Data in Real Time</title>
		<link>http://gigaom.com/2010/11/11/appistry-and-accenture-create-real-time-cloud-mapreduce/</link>
		<comments>http://gigaom.com/2010/11/11/appistry-and-accenture-create-real-time-cloud-mapreduce/#comments</comments>
		<pubDate>Thu, 11 Nov 2010 20:44:45 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[@CNN]]></category>
		<category><![CDATA[@NYT]]></category>
		<category><![CDATA[@SYN]]></category>
		<category><![CDATA[@TheStreet]]></category>
		<category><![CDATA[CNN Big Tech]]></category>
		<category><![CDATA[NYT Company News]]></category>
		<category><![CDATA[NYT Enterprise]]></category>
		<category><![CDATA[NYT Internet]]></category>
		<category><![CDATA[SYN Analysis]]></category>
		<category><![CDATA[SYN Feature Enterprise]]></category>
		<category><![CDATA[SYN Straight News]]></category>
		<category><![CDATA[appistry]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[mapreduce]]></category>
		<category><![CDATA[real-time web]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=258450</guid>
		<description><![CDATA[Cloud application-platform provider Appistry has teamed with Accenture to develop an on-premise version of Accenture's Cloud MapReduce product. Cloud MapReduce is focused on real-time analysis of streaming data, and it complements Appistry's distributed file system to form a Hadoop alternative for certain applications.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2010/11/speed.jpg"><img title="speed" src="http://gigaom2.files.wordpress.com/2010/11/speed.jpg?w=300&#038;h=225" alt="" width="300" height="225" class="size-medium wp-image-258459 alignright"></a>Cloud application-platform provider Appistry has teamed with Accenture  to develop an on-premise implementation of Accenture’s existing Amazon EC2-focused <a href="http://www.appistry.com/go/cmr">Cloud MapReduce</a> product. There are two particularly noteworthy aspects of this product:</p>
<ol><li>Cloud MapReduce is focused on real-time analysis of streaming data.</li>
<li>Appistry customers now have access to an entirely distributed Hadoop alternative. Earlier this year, the company released its Hadoop Distributed File System (HDFS) alternative called CloudIQ Storage Hadoop Edition.</li>
</ol><p>As I <a href="http://gigaom.com/2010/03/15/appistry-joins-cloudscale-storage-fray-and-brings-hadoop-with-it/" target="_blank">explained in a post on that product</a>, Appistry’s primary goal in developing these products is to improve performance and reliability by eliminating single points of failure. In HDFS, that’s the NameNode; in the Hadoop MapReduce engine, that’s the JobTracker. Running atop Appistry’s CloudIQ platform, Cloud MapReduce takes advantage of a peer-to-peer architecture in which these concerns are largely ameliorated.</p>
<p>Appistry’s Sam Charrington said those were the same issues that led Accenture to develop Cloud MapReduce in the first place, as its customers wanted higher reliability and performance for mission-critical jobs. As it turns out, however, certain users in intelligence and defense weren’t too keen on the cloud-based model. Jointly developed and distributed by both companies, on-premise Cloud MapReduce keeps the same focus on bleeding-edge customers in intelligence, defense and financial services.</p>
<p>Then there are the real-time capabilities. Cloud MapReduce utilizes a streaming API that frees it from the batch-processing boundaries typically associated with MapReduce. As Charrington explained, Hadoop’s popularity has shaped many connotations of MapReduce, but “the algorithm can be applied much more broadly.” Cloud MapReduce also leverages existing CloudIQ capabilities, such as Fabric Accessible Memory, a form of in-memory caching to speed data processing. “It’s not a competitor to Hadoop,” he added, “so much as an alternative to other approaches for processing data streams [such as IBM InfoSphere Streams].” In fact, Appistry retains partnerships within the Hadoop ecosystem so that customers have a choice of options depending on their applications.</p>
<p>In terms of scope, Cloud MapReduce appears to be in the same vein as the S4 project that Yahoo <a href="http://gigaom.com/cloud/is-yahoo-set-to-open-source-real-time-mapreduce/" target="_blank">open-sourced last week</a>. S4 was once described as “real-time MapReduce,” but the <a href="http://s4.io/">project website</a> now describes it as a “distributed stream computing platform” that “fills the gap between complex proprietary systems and batch-oriented open source computing platforms.” According to a <a href="http://labs.yahoo.com/files/KDCloud%202010%20S4.pdf">research paper</a> (PDF), S4 was inspired by MapReduce but more closely resembles the <a href="http://en.wikipedia.org/wiki/Actor_model" target="_blank">Actors model</a>. Like Cloud MapReduce, S4 is wholly decentralized to improve reliability and performance.</p>
<p>Both Cloud MapReduce and S4 should catch on (S4 likely sooner because it’s an open source project, not a paid product), but it might take time. In the case of Cloud MapReduce, many organizations with Big Data problems are still experimenting with Hadoop for batch processing, and might not be ready to take on writing parallel-processing applications for real-time data. Even Charrington acknowledges that Appistry’s two products might be unnecessary for Hadoop experimentation or R&amp;D projects; they are designed for mission-critical production applications that require real-time analysis. And there aren’t too many of those around right now.</p>
<p>Aside from relatively low-hanging fruit like fraud detection and instant search, it will be fascinating to see the applications for these types of technologies once organizations are able to wrap their minds around the full scope of their data situations. You can bet social-media analysis will be an early priority, but that’s just the tip of the iceberg. Just <a href="http://gigaom.com/2010/10/11/jeff-jonas-big-data/" target="_blank">ask IBM</a>, which is beating the real-time drum with its Smarter Planet initiative.</p>
<p><em>Image courtesy of Flickr user <a href="http://www.flickr.com/photos/laserstars/908946494/in/photostream/" target="_blank">jpctalbot</a>.</em></p>
<p><strong>Related content from GigaOM Pro (sub req’d):</strong></p>
<ul><li><a href="http://pro.gigaom.com/2010/04/what-ibm-does-with-big-data/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=258450+appistry-and-accenture-create-real-time-cloud-mapreduce" target="_blank">What IBM Does With Big Data</a></li>
<li><a href="http://pro.gigaom.com/2010/07/the-incredible-growing-commercial-hadoop-market/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=258450+appistry-and-accenture-create-real-time-cloud-mapreduce" target="_blank">The Incredible, Growing, Commercial Hadoop Market</a></li>
<li><a href="http://pro.gigaom.com/2009/12/will-the-real-time-web-bring-high-performance-to-a-system-near-you/?utm_source=cloud&amp;utm_medium=editorial&amp;utm_content=dharrisstructure&amp;utm_campaign=intext&amp;utm_term=258450+appistry-and-accenture-create-real-time-cloud-mapreduce" target="_blank">Will the Real-Time Web Bring High Performance to a System Near You?</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2010/11/11/appistry-and-accenture-create-real-time-cloud-mapreduce/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2010/11/speed-e1318892183924.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2010/11/speed-e1318892183924.jpg?w=150" medium="image">
			<media:title type="html">speed</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2010/11/speed.jpg?w=300" medium="image">
			<media:title type="html">speed</media:title>
		</media:content>
	</item>
	</channel>
</rss>
