<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>GigaOM &#187; ancestry-com</title>
	<atom:link href="http://gigaom.com/tag/ancestry-com/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com</link>
	<description></description>
	<lastBuildDate>Wed, 19 Jun 2013 18:11:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='gigaom.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/0db8f6557d022075dbbf010c54d46d93?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>GigaOM &#187; ancestry-com</title>
		<link>http://gigaom.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://gigaom.com/osd.xml" title="GigaOM" />
	<atom:link rel='hub' href='http://gigaom.com/?pushpress=hub'/>
		<item>
		<title>How Ancestry.com transforms mounds of data into legible digital records</title>
		<link>http://gigaom.com/2013/05/27/how-ancestry-com-transforms-mounds-of-data-into-legible-digital-records/</link>
		<comments>http://gigaom.com/2013/05/27/how-ancestry-com-transforms-mounds-of-data-into-legible-digital-records/#comments</comments>
		<pubDate>Mon, 27 May 2013 17:30:03 +0000</pubDate>
		<dc:creator>Jordan Novet</dc:creator>
				<category><![CDATA[ancestry-com]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[natural language processing]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=649022</guid>
		<description><![CDATA[Ancestry.com wants to help users of the family-history site share their discoveries. Now it's devised methods for turning isolated facts into full-on stories.]]></description>
				<content:encoded><![CDATA[<p>Sure, genealogy nerds might have fun poking through U.S. Census records, birth certificates and other documents in pursuit of information about their relatives on Ancestry.com. When it comes to showing off individual records to friends and relatives, though, the presentation can lack punch, and telling the whole story of an ancestor&#8217;s life isn&#8217;t straightforward.</p>
<p>The people behind the Ancestry.com service have realized this. Now they&#8217;re making the most of their <a href="http://gigaom.com/2012/06/12/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/">4PB storehouse</a> of official personal records, user-submitted information and other data with a new feature delivering sleek computer-generated but customizable summaries of information available on users&#8217; ancestors. </p>
<p>Ancestry started rolling out the feature, known as Story View, earlier this quarter to a tiny share of its customers, and now it&#8217;s active for 10 percent of them. The plan is to analyze the use of Ancestry with and without Story View and round out the feature before making it generally available, probably later this year, said Eric Shoup, the company&#8217;s executive vice president of product, in a recent interview. Already Ancestry has made the feature more interactive by letting users move the images of documents around a single page and edit the associated bodies of text derived from those documents.</p>
<h2 id="how-it-works">How it works</h2>
<p>Story View builds on top of Ancestry&#8217;s already highly evolved tools for mining data about relatives, including some handwritten records. But sometimes only critical fields, such as name and place of residence, have been processed for inclusion in Story View. A customer can access a handwritten record, scroll down to the row in which a relative is described and toggle across columns to see data that hasn&#8217;t been processed, such as that person&#8217;s occupation.</p>
<p>Ancestry is working on getting more out of its handwritten records by gradually directing armies of &#8220;keyers&#8221; to decipher handwriting and turn it into searchable text. Street addresses have been added in this way, and other fields will come later. And since Ancestry continues to add records to its repository, life stories will gain more sources to draw from as well.</p>
<div id="attachment_649027" class="wp-caption alignleft" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-summary.jpg"><img src="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-summary.jpg?w=708&#038;h=218" alt="The Story View life summary for one of Shoup&#039;s relatives" width="708" height="218"  class="size-full wp-image-649027" /></a><p class="wp-caption-text">The Story View life summary for one of Shoup&#8217;s relatives</p></div>
<p>To generate one-paragraph summaries drawing on information from multiple documents (check out the top paragraph in the picture above), Ancestry looked to Narrative Science, a company founded in 2010 to <a href="http://gigaom.com/2012/04/25/are-robots-and-content-farms-the-future-of-the-news/">make machines turn out readable copy</a>. Early use cases came in the production of coverage of sports events and public companies&#8217; earnings reports, but now Narrative Science technology is handling much more personal information.</p>
<p>When Ancestry first got involved with Narrative Science, it was only possible to produce data in big batches, said Reed McGrew, lead developer on Ancestry&#8217;s narrative and context services team. &#8220;They&#8217;ll produce huge numbers of financial reports, and that&#8217;s not really the experience we&#8217;re trying to deliver,&#8221; McGrew said. &#8220;Because it was meant for batches, it was pretty slow.&#8221;</p>
<p>Within a few months, Narrative Science came out with a new API that could work on a more granular level. &#8220;On kind of a user-by-user basis, they generate our life stories,&#8221; McGrew said.</p>
<p>Ancestry knows a thing or two about serving up genealogy information. The company&#8217;s editors provided editorial standards, or &#8220;rules,&#8221; for how the data should inform the narratives and how the narratives should sound, McGrew said. One Ancestry standard? &#8220;We don&#8217;t talk about births that happen to mothers less than 10 years old,&#8221; he said. &#8220;They&#8217;re more likely keystroke errors. They do happen in reality sometimes, for sure, but more often than not, when we find them, they&#8217;re errors.&#8221;</p>
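<p>As a rough illustration of how an editorial rule like the one McGrew describes could be expressed in code, here is a minimal Python sketch. The names <code>BirthFact</code>, <code>is_plausible</code> and <code>narratable</code> are hypothetical, and nothing here reflects Ancestry&#8217;s or Narrative Science&#8217;s actual implementation; only the under-10 threshold comes from the quote above.</p>

```python
from dataclasses import dataclass

# Assumed threshold, taken from McGrew's quote: births to mothers
# younger than 10 are treated as probable keying errors.
MIN_MATERNAL_AGE = 10


@dataclass
class BirthFact:
    """Hypothetical record of one birth, reduced to the fields the rule needs."""
    child: str
    child_birth_year: int
    mother_birth_year: int


def is_plausible(fact: BirthFact) -> bool:
    """Return True when the mother's age at the birth meets the minimum."""
    maternal_age = fact.child_birth_year - fact.mother_birth_year
    return maternal_age >= MIN_MATERNAL_AGE


def narratable(facts: list[BirthFact]) -> list[BirthFact]:
    """Keep only the facts a generated narrative would be allowed to mention."""
    return [fact for fact in facts if is_plausible(fact)]
```

<p>A filter of this kind only keeps a fact out of the story. The underlying record stays untouched, which matters because, as McGrew notes, such births do sometimes happen.</p>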
<div id="attachment_649028" class="wp-caption alignleft" style="width: 718px"><a href="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-record.jpg"><img src="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-record.jpg?w=708&#038;h=447" alt="One of several records containing information on a relative of Shoup&#039;s" width="708" height="447"  class="size-full wp-image-649028" /></a><p class="wp-caption-text">One of several records containing information on a relative of Shoup&#8217;s</p></div>
<p>Underneath a picture and life summary of an ancestor in Story View are zoomed-out pictures of documents, instead of discrete fields of structured text. Next to the images, Ancestry can plug in blurbs generated from information in the document. Those come from a system that Ancestry engineers built in house. Once Ancestry has found all the records associated with a person, it selects specific facts to pull out of them based on Ancestry editors&#8217; rules, and assembles them into full sentences. Once the document-based blurbs are displayed in the browser, customers can edit and save them before sharing.</p>
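<p>The select-then-assemble step can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the <code>RULES</code> table, the record types, the field names and the sentence templates are invented, and Ancestry&#8217;s in-house system is not public.</p>

```python
from typing import Optional

# Hypothetical editorial rules: for each record type, the fields to pull
# out and the sentence template used to present them.
RULES = {
    "census": (
        ("name", "year", "residence"),
        "{name} was living in {residence} in {year}.",
    ),
    "birth": (
        ("name", "year", "place"),
        "{name} was born in {place} in {year}.",
    ),
}


def blurb(record: dict) -> Optional[str]:
    """Build one sentence from a record, or None if no rule can be applied."""
    rule = RULES.get(record.get("type"))
    if rule is None:
        return None  # no editorial rule covers this record type
    fields, template = rule
    if any(field not in record for field in fields):
        return None  # a required field was never keyed for this record
    return template.format(**{field: record[field] for field in fields})
```

<p>Returning <code>None</code> instead of a half-filled sentence mirrors the situation described earlier, where only some fields of a handwritten record have been processed.</p>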
<h2 id="sharing-aint-easy">Sharing ain&#8217;t easy</h2>
<p>The challenge is not the creation and storage of new data and websites that users create, said Scott Sorensen, Ancestry&#8217;s chief technology officer. Storage has gotten cheaper and cheaper, and that trend should continue. Accurate processing of handwritten records generally is not an issue, either. Often the keyers are in China, Sorensen said. &#8220;The Chinese character set is much larger than our alphabet,&#8221; he said. &#8220;They&#8217;re actually very skilled at keying these records.&#8221;</p>
<p>The real hard part is to make sure the service is highly available, to serve up all the right documents and text for millions of users and keep the site from crashing when traffic peaks. But since one goal of Story View is to get more people checking out content on the site and eventually signing up, that would be a good problem to have.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2013/05/27/how-ancestry-com-transforms-mounds-of-data-into-legible-digital-records/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2013/05/ancestry-screen-shot-2.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2013/05/ancestry-screen-shot-2.jpg?w=150" medium="image">
			<media:title type="html">Ancestry screen shot 2</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/c00ab753df107b639e76ed4c3ab07ba7?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigajordan</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-summary.jpg" medium="image">
			<media:title type="html">The Story View life summary for one of Shoup&#039;s relatives</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2013/05/ancestry-story-view-record.jpg" medium="image">
			<media:title type="html">One of several records containing information on a relative of Shoup&#039;s</media:title>
		</media:content>
	</item>
		<item>
		<title>Continuous delivery and the world of devops</title>
		<link>http://pro.gigaom.com/2012/10/continuous-delivery-and-the-world-of-devops/</link>
		<comments>http://pro.gigaom.com/2012/10/continuous-delivery-and-the-world-of-devops/#comments</comments>
		<pubDate>Tue, 02 Oct 2012 06:55:53 +0000</pubDate>
		<dc:creator>Dave Ohara</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[ancestry-com]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[CFEngine]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[configuration management]]></category>
		<category><![CDATA[continuous delivery]]></category>
		<category><![CDATA[continuous integration]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[devops]]></category>
		<category><![CDATA[Enterprise Collaboration]]></category>
		<category><![CDATA[enterprise IT]]></category>
		<category><![CDATA[Etsy]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[infrastructure as a service]]></category>
		<category><![CDATA[Joyent]]></category>
		<category><![CDATA[metrics]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[operations]]></category>
		<category><![CDATA[Opscode]]></category>
		<category><![CDATA[Puppet]]></category>
		<category><![CDATA[Puppet Enterprise]]></category>
		<category><![CDATA[software as a service]]></category>
		<category><![CDATA[software development]]></category>
		<category><![CDATA[UrbanCode]]></category>
		<category><![CDATA[virtualization]]></category>
		<category><![CDATA[Wildfire]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=154940</guid>
		<description><![CDATA[Thanks to the rise of online business, companies must now get their products and services to market as fast as they can, and releasing software now means small releases that occur very frequently. Enter devops, which is disrupting traditional assumptions about the roles of development and operations.]]></description>
				<content:encoded><![CDATA[<p>Thanks to the rise of online business, companies must now get their products and services to market as fast as they can, and releases that occur in periods of months or years are no longer competitive. As a result, the pattern of how to release software is changing from large, infrequent releases of new software to small, frequent releases. This paper explains the world of continuous delivery and its underlying philosophy, devops. It is intended for executives who determine their organization’s business strategies. If you are looking for ways to reduce time to market and are considering a realignment of traditional assumptions about the roles of development and operations, you require knowledge of new tools and new approaches. </p>
]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2012/10/continuous-delivery-and-the-world-of-devops/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/4f3860069d181dbeeb398304f5940a9e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">gigaedit</media:title>
		</media:content>
	</item>
		<item>
		<title>MyHeritage automates record-matching as genealogy wars heat up</title>
		<link>http://gigaom.com/2012/09/19/myheritage-automates-record-matching-as-genealogy-wars-heat-up/</link>
		<comments>http://gigaom.com/2012/09/19/myheritage-automates-record-matching-as-genealogy-wars-heat-up/#comments</comments>
		<pubDate>Wed, 19 Sep 2012 12:00:57 +0000</pubDate>
		<dc:creator>David Meyer</dc:creator>
				<category><![CDATA[ancestry-com]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Family tree]]></category>
		<category><![CDATA[Genealogy]]></category>
		<category><![CDATA[Gilad Japhet]]></category>
		<category><![CDATA[Israel]]></category>
		<category><![CDATA[Myheritage]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=564393</guid>
		<description><![CDATA[The Israel-based firm has set up a server farm to automatically match its social family trees with billions of historical records ranging from newspaper articles to tombstone images, and will offer its users free snippets of the matches it's found.]]></description>
				<content:encoded><![CDATA[<p>When it comes to social networks, few are more important – and harder to pin down – than the family tree. So it&#8217;s no surprise that the fierce competition between the two leading platforms, <a href="http://www.ancestry.com/">Ancestry.com</a> and <a href="http://www.myheritage.com/">MyHeritage</a>, is getting ever more technologically advanced.</p>
<p>Derrick <a href="http://gigaom.com/cloud/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/">covered</a> some of the techniques being used by Ancestry.com back in June, and today we can reveal the latest weapon in MyHeritage&#8217;s arsenal: automated record matching.</p>
<p>Both platforms lean heavily on records as a way of augmenting the drier names and dates that make up family trees, but the Israel-based MyHeritage – which already has its own angle by explicitly treating the service like a social network – reckons it now has the edge. </p>
<p>According to CEO Gilad Japhet, MyHeritage has had its Record Matching tech ready for some time, but needed to set up a server farm, then clear a backlog of four billion historical records (including the world&#8217;s largest historical newspaper collection, acquired through the company&#8217;s FamilyLink buy last year), before launching it today.</p>
<blockquote><p>&#8220;They come from original documents, birth records, marriage certificates, passenger lists going through Ellis Island, tombstones &#8211; in a few cases user contributed, as some people take snapshots of gravestones and upload them – public information, census records, newspaper articles and books. Record Matching covers both text-based and structured records, those that can be filled into a regular database,&#8221; he told me.</p></blockquote>
<p>As an example, let&#8217;s say you don&#8217;t know the date of birth or death for your grandfather, but you do know his name. MyHeritage has a big database of wills, but again, you&#8217;re lacking dates. So the service would use its already-existing Smart Matching technology to compare the known information with that on other family trees, perhaps pinning down dates through other relatives&#8217; connections. </p>
<p>Then, armed with that, it would find what it can in those historical records, using semantic analysis to deal with the free-text newspaper cuttings for example.</p>
<p>The smart thing, and one that Japhet hopes will pull in more subscribers and pay-as-you-go credit users, is that Record Matching works automatically and provides snippets of information for free. If you&#8217;re a user, you&#8217;ll just get an email telling you what&#8217;s been found. If you want to see the full record, you pay, but it doesn&#8217;t require that step to prove its worth.</p>
<p><a href="http://gigaom.com/?attachment_id=564395" rel="attachment wp-att-564395"><img src="http://gigaom2.files.wordpress.com/2012/09/myheritage-logo.jpg?w=300&#038;h=200" alt="MyHeritage logo" title="MyHeritage logo" width="300" height="200"  class="alignright size-medium wp-image-564395" /></a>So why did MyHeritage decide to shun the cloud for all this?</p>
<p>&#8220;We found it wasn&#8217;t very efficient to run this in the cloud because the CPU power you get is typically smaller, as a lot of these servers are virtual,&#8221; Japhet said. &#8220;We wanted serious number-crunching capabilities, and found it more efficient for us to purchase high-end servers, put together a large farm, run it all and accumulate the matches. It&#8217;s an ongoing real-time system.&#8221;</p>
<p>Japhet also claims other advantages over Ancestry.com, an older and larger service (38 million family trees to MyHeritage&#8217;s 23 million). For one thing, he points out that MyHeritage is available in 38 languages and its rival in just half a dozen – that makes a difference when you consider the international aspect of genealogical research.</p>
<p>What&#8217;s more, MyHeritage intends to &#8220;launch a massive crowdsourcing based transcription system&#8221; for its users within the next year, he added. And so the battle for family history continues.</p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/09/19/myheritage-automates-record-matching-as-genealogy-wars-heat-up/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/09/gilad-japhet-myheritage_com-ceo.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/09/gilad-japhet-myheritage_com-ceo.jpg?w=150" medium="image">
			<media:title type="html">Gilad Japhet-MyHeritage_com-CEO</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/6599daccfd7e897e68744fe0065e5a2e?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">superglaze</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/09/myheritage-logo.jpg?w=300" medium="image">
			<media:title type="html">MyHeritage logo</media:title>
		</media:content>
	</item>
		<item>
		<title>How big data helps Ancestry.com map people, places and time</title>
		<link>http://gigaom.com/2012/06/12/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/</link>
		<comments>http://gigaom.com/2012/06/12/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/#comments</comments>
		<pubDate>Tue, 12 Jun 2012 22:33:23 +0000</pubDate>
		<dc:creator>Derrick Harris</dc:creator>
				<category><![CDATA[ancestry-com]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[DNA]]></category>
		<category><![CDATA[genome sequencing]]></category>
		<category><![CDATA[genomic data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://gigaom.com/?p=531644</guid>
		<description><![CDATA[Online genealogy service Ancestry.com is trying to become like the Amazon or Netflix of family trees. Much like those companies use customer data to recommend products or movies customers might like, Ancestry.com is using machine learning to make learning about ancestors a lot less work.]]></description>
				<content:encoded><![CDATA[<p><a href="http://gigaom2.files.wordpress.com/2012/06/family-tree.jpg"><img  title="family tree" src="http://gigaom2.files.wordpress.com/2012/06/family-tree.jpg?w=300&#038;h=217" alt="" width="300" height="217" class="alignleft size-medium wp-image-531795" /></a>Online genealogy service <a href="http://ancestry.com">Ancestry.com</a>  is trying to become like the Amazon or Netflix of family trees. Much like those companies use customer data to recommend products or movies customers might like, Ancestry.com wants to feed its users relevant historical records and other information on ancestors without making them search through its database. And it&#8217;s taking in everything from newspaper clippings to your DNA to make this happen.</p>
<p>If you&#8217;ve used Ancestry.com recently, you&#8217;re probably thankful for its efforts. According to Head of Engineering Scott Sorenson, Ancestry.com has more than 10 billion records that are part of a 4-petabyte (or 4-million gigabyte) data store. If you&#8217;re searching for &#8220;John Smith,&#8221; he explained, it probably has about 60 million records for &#8220;Smith&#8221; and about 4 million for &#8220;John Smith,&#8221; but you&#8217;re only interested in the relative handful that are relevant to <em>your</em> John Smith.</p>
<h2>Making models smarter</h2>
<p>That&#8217;s why Ancestry.com is using machine learning to make sorting through those records a lot less like finding a needle in a haystack and a lot more like having that needle &#8212; and any others made from the same batch of steel &#8212; delivered right to your door. Here&#8217;s how the process works, in a nutshell:</p>
<ol>
<li>Crawl digital records (e.g., newspapers, birth records, death records, census data, ship manifests, etc.) online and extract relevant data</li>
<li>(Or 1(a)) Scan, upload and index physical records (via a partner in China)</li>
<li>Stitch together new records with user data to add more context</li>
<li>And this is key, constantly analyze user behavior in order to make its algorithms smarter</li>
</ol>
<p>As users make judgments about the records they&#8217;re presented, Sorenson said, Ancestry.com&#8217;s algorithms get better at performing their particular tasks. So, a system for extracting data from newspaper pages might be able to better recognize the various sections of the page (so as to ignore the ads, for example) and then be able to adjust for mistakes in the section it is analyzing. And as with Google&#8217;s search algorithms, the more that users interact with records, the better Ancestry.com&#8217;s sorting algorithms are able to determine those records&#8217; relevance to any given user.</p>
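<p>To make the feedback loop concrete, here is a deliberately simplified Python sketch of a ranker whose ordering shifts as users accept or reject suggested records. The <code>FeedbackRanker</code> class, its per-record-type weights and the learning rate are all assumptions for illustration. Ancestry.com has not published its algorithms, and a production system would be far more sophisticated.</p>

```python
from collections import defaultdict


class FeedbackRanker:
    """Toy ranker: each record type carries a weight nudged by user feedback."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate
        # Every record type starts at a neutral weight of 1.0.
        self.weights = defaultdict(lambda: 1.0)

    def record_feedback(self, record_type: str, accepted: bool) -> None:
        """Raise the weight for accepted suggestions, lower it for rejected ones."""
        step = self.learning_rate if accepted else -self.learning_rate
        self.weights[record_type] = max(0.0, self.weights[record_type] + step)

    def rank(self, candidates: list[dict]) -> list[dict]:
        """Order candidates by base match score times the learned weight."""
        return sorted(
            candidates,
            key=lambda c: c["score"] * self.weights[c["type"]],
            reverse=True,
        )
```

<p>In this sketch, one accepted newspaper match is enough to lift a newspaper candidate scored 0.75 above a census candidate scored 0.8, since 0.75 multiplied by 1.1 exceeds 0.8.</p>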
<h2>Spit in a tube, pay $99, learn your past</h2>
<p>Oh, but Ancestry.com has decided that merely storing and analyzing historical records is just the beginning with regard to providing accurate genealogy information. It <a href="http://dna.ancestry.com/">also will sequence your DNA</a>, focusing on 700,000 markers important to determining one&#8217;s race, lineage and other factors. That service, which simply requires users to swab their cheek or spit in a tube and send it to the lab, costs only $99 (a full genome sequence would cost at least 10 times that, by the way), but could revolutionize the accuracy of Ancestry.com&#8217;s models.</p>
<p>Right now, Sorenson said, the DNA service can tell users their race and what country they&#8217;re from, and also connect them with other relatives who share a DNA profile. (If your privacy red flag has gone up reading this, Sorenson did note the following: all communications with relatives are optional and initially anonymous; all DNA information is disassociated from personal information; and users get their sequence results via an encrypted key &#8220;that we treat with a higher level of security than we&#8217;d store your credit card information.&#8221;)</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/06/homemaps.jpg"><img  title="homemaps" src="http://gigaom2.files.wordpress.com/2012/06/homemaps.jpg?w=708" alt=""   class="aligncenter size-full wp-image-531796" /></a></p>
<p>Connecting with distant relatives can be valuable, though. A third cousin, for example, might have ancestral information that you don&#8217;t, which will help make your family tree that much more accurate. But Sorenson said it really gets interesting when Ancestry.com can combine DNA data with record data in family trees. Someone&#8217;s DNA might indicate he&#8217;s from France, Sorenson explained, but cross-checking that against that person&#8217;s family data will let the service discover he&#8217;s actually from the Normandy region.</p>
<p>Going forward, Sorenson said Ancestry.com expects its DNA service to take off like a rocket. The company is investing between $10 million and $15 million into that service over the next couple years, and has bioinformatic scientists on staff trying to scale algorithms designed to handle hundreds of samples to work with hundreds of thousands or even millions of samples. In that regard, though, Ancestry.com isn&#8217;t alone &#8212; the steady drop in the price of genome sequencing has <a href="http://gigaom.com/cloud/as-genomics-pushes-big-data-limits-cloud-could-save-the-day/">everyone in the sector anticipating skyrocketing data volumes</a>.</p>
<h2>What&#8217;s next: Telling stories and making genealogy real-time</h2>
<p>OK, so it has billions of records and our DNA. What more could Ancestry.com possibly want or need in order to provide us information on our ancestors? Nothing, actually.</p>
<p><a href="http://gigaom2.files.wordpress.com/2012/06/key_art_who_do_you_think_you_are.jpg"><img  title="key_art_who_do_you_think_you_are" src="http://gigaom2.files.wordpress.com/2012/06/key_art_who_do_you_think_you_are.jpg?w=300&#038;h=116" alt="" width="300" height="116" class="alignright size-medium wp-image-531800" /></a>It just needs to make better use of what it does have and the new technologies available for working with that information. Genealogy has traditionally been &#8220;dusty,&#8221; Sorenson explained, but Ancestry.com is trying to tell the stories behind those dusty records. If you&#8217;ve seen the NBC program <a href="http://www.nbc.com/who-do-you-think-you-are/">&#8220;Who Do You Think You Are?&#8221;</a>, on which Ancestry.com traces celebrities&#8217; ancestral roots, you have an idea of what Sorenson is talking about.</p>
<p>For example, by improving its image-processing capabilities, Ancestry.com could extract more information than just name, date and location from old records that it already knows how to process. It could tell someone that his grandfather was the only person on the block to own a radio, or whether he owned his home. Combined with socioeconomic and other external data, Sorenson said, Ancestry.com could &#8220;create a really vivid picture&#8221; of what it was like to live during a specific time.</p>
<p>By using location data from cell phones, Sorenson said Ancestry.com could deliver a mobile experience that&#8217;s far more than a translation of the web on a smaller screen by making genealogy a geospatial pursuit. For example, Sorenson explained, if a user takes a picture of a gravestone, Ancestry.com would like to provide him with relevant historical data related to that place, and maybe even some nearby points of interest.</p>
<p>Some might think Ancestry.com&#8217;s practices and plans toe the privacy line, but if someone has to toe that line, this might be the company to do it. In a fast-paced world it&#8217;s easy to get tied up in the moment and in our own little worlds &#8212; especially with big data being used elsewhere on the web <a href="http://gigaom.com/2012/05/24/hey-startups-is-your-service-a-healer-or-a-drug-dealer/">to keep our attention firmly on one site or another</a>. Using personal data to let users dig decades into their family histories ends up looking very refreshing.</p>
<p><em>Feature image courtesy of <a href="http://www.shutterstock.com/gallery-686161p1.html">Shutterstock user tovovan</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://gigaom.com/2012/06/12/how-ancestry-com-is-using-big-data-to-map-time-place-and-people/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:thumbnail url="http://gigaom2.files.wordpress.com/2012/06/family-tree.jpg?w=150" />
		<media:content url="http://gigaom2.files.wordpress.com/2012/06/family-tree.jpg?w=150" medium="image">
			<media:title type="html">family tree</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9e48ffa0913f65c577727457dd63023f?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">dharrisstructure</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/06/family-tree.jpg?w=300" medium="image">
			<media:title type="html">family tree</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/06/homemaps.jpg" medium="image">
			<media:title type="html">homemaps</media:title>
		</media:content>

		<media:content url="http://gigaom2.files.wordpress.com/2012/06/key_art_who_do_you_think_you_are.jpg?w=300" medium="image">
			<media:title type="html">key_art_who_do_you_think_you_are</media:title>
		</media:content>
	</item>
		<item>
		<title>Report: Monetizing Digital Content</title>
		<link>http://pro.gigaom.com/2010/03/paid-content/</link>
		<comments>http://pro.gigaom.com/2010/03/paid-content/#comments</comments>
		<pubDate>Tue, 30 Mar 2010 07:00:16 +0000</pubDate>
		<dc:creator>paulzagaeski</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[aapl]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[AMZN]]></category>
		<category><![CDATA[ancestry-com]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[App Stores]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[application-stores]]></category>
		<category><![CDATA[applications]]></category>
		<category><![CDATA[Barnes & Noble]]></category>
		<category><![CDATA[Boku]]></category>
		<category><![CDATA[bottle-rocket]]></category>
		<category><![CDATA[bundled-content]]></category>
		<category><![CDATA[bundled-contents]]></category>
		<category><![CDATA[CBS]]></category>
		<category><![CDATA[Comcast]]></category>
		<category><![CDATA[consumer electronics manufacturers]]></category>
		<category><![CDATA[Digital Content]]></category>
		<category><![CDATA[Disney]]></category>
		<category><![CDATA[dow]]></category>
		<category><![CDATA[Dow Jones]]></category>
		<category><![CDATA[EMI]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[Financial Times]]></category>
		<category><![CDATA[Gaia Online]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Hearst]]></category>
		<category><![CDATA[Hulu]]></category>
		<category><![CDATA[irex]]></category>
		<category><![CDATA[itunes]]></category>
		<category><![CDATA[lexisnexis]]></category>
		<category><![CDATA[Lexus]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[mpayy]]></category>
		<category><![CDATA[MSFT]]></category>
		<category><![CDATA[Myspace]]></category>
		<category><![CDATA[National Public Radio]]></category>
		<category><![CDATA[NBC]]></category>
		<category><![CDATA[New York]]></category>
		<category><![CDATA[New York Times]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[News Corp]]></category>
		<category><![CDATA[newspapers]]></category>
		<category><![CDATA[Nook]]></category>
		<category><![CDATA[npr]]></category>
		<category><![CDATA[paid content]]></category>
		<category><![CDATA[paypal]]></category>
		<category><![CDATA[PinchMedia]]></category>
		<category><![CDATA[Plastic Logic]]></category>
		<category><![CDATA[Playdom]]></category>
		<category><![CDATA[playspan]]></category>
		<category><![CDATA[sne]]></category>
		<category><![CDATA[Sony]]></category>
		<category><![CDATA[television-everywhere]]></category>
		<category><![CDATA[Time Warner]]></category>
		<category><![CDATA[trialpay]]></category>
		<category><![CDATA[tv everywhere]]></category>
		<category><![CDATA[Universal]]></category>
		<category><![CDATA[Viacom]]></category>
		<category><![CDATA[Wall Street Journal]]></category>
		<category><![CDATA[Warner]]></category>
		<category><![CDATA[Zong]]></category>
		<category><![CDATA[Zune]]></category>
		<category><![CDATA[Zynga]]></category>

		<guid isPermaLink="false">http://pro.gigaom.com/?p=28928</guid>
		<description><![CDATA[The worldwide online market for digital goods will grow amid a state of continuous disruption across all forms of content markets. Fueled by an ever-growing user base, migration from physical formats to digital distribution, and a proliferation of new connected devices, the overall market for digital goods will grow to $36 billion by 2014, up from $16.7 billion in 2009. This report examines the state of paid content and the monetization and payment models across each of the digital goods markets. It examines key players and market dynamics in the film and video, newspaper, online game, music and social network spaces relative to their paid content strategies, and includes a revenue forecast for each of these segments relative to the overall paid content market.]]></description>
				<content:encoded><![CDATA[]]></content:encoded>
			<wfw:commentRss>http://pro.gigaom.com/2010/03/paid-content/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:thumbnail url="http://pro.gigaom.com/files/2010/03/onlineshopping.jpg?w=150" />
		<media:content url="http://pro.gigaom.com/files/2010/03/onlineshopping.jpg?w=150" medium="image">
			<media:title type="html">onlineshopping</media:title>
		</media:content>

		<media:content url="http://0.gravatar.com/avatar/9ef878516977eae56cd94d92f1f3e02c?s=96&#38;d=retro&#38;r=PG" medium="image">
			<media:title type="html">paulzagaeski</media:title>
		</media:content>
	</item>
	</channel>
</rss>
