Summary:

The massive amount of data emerging from connected digital systems is fundamentally changing everything from Internet search to entertainment, disease management and energy consumption. Here are 10 case studies that highlight the power of big data.

The future of Foursquare is data-fueled recommendations

By Ryan Kim

When Foursquare first appeared on the scene, it looked more like a real-world game, with people checking in to locations to try to secure points and “mayorships.” But from the beginning, co-founder Dennis Crowley also had a deeper vision that hinged on tapping into the wealth of data. That vision became clear with the launch of Foursquare’s Explore feature last year.

Suddenly all of that fun check-in data was put to use as the fuel driving Explore’s very capable recommendation and search engine. Foursquare, it appeared, was a powerful big data company using a catchy front end to feed in more information.

The development, Crowley explained to me in an interview, was a bit like how Mr. Miyagi taught Daniel karate in The Karate Kid: “We asked people to check in, which is like painting the fence. Now we’re teaching karate,” Crowley said, adding, “It all goes into a recommendation engine that knows what you like and what else you’ll like.”

Next-gen recommendations

Explore debuted in March 2011 with Foursquare 3.0, fundamentally changing the usage model behind the location-based service. Instead of charting a user’s movements, Foursquare was spitting out recommendations and answers about the best places it thought users would want to visit.

Explore bases its recommendations on the places a user, a user’s friends and similar types of users visit. The platform factors in types of places and time of day when people search for recommendations and tailors responses for each location and time. Increasingly, it is also being fed by other signals like the tips people leave, the lists they construct and the places they are saving to Foursquare to visit later.
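
To make the idea concrete, here is a toy sketch of how signals like these might be blended into a ranking. It is not Foursquare's engine: the weights, the `score_venues` function and the tuple layout of the check-in data are all assumptions made up for this illustration.

```python
from collections import Counter

# Hypothetical weights, chosen only for illustration (not Foursquare's values).
WEIGHTS = {"self": 3.0, "friends": 2.0, "similar": 1.0}


def score_venues(checkins, user, friends, similar_users, hour, window=2):
    """Rank venues for `user` at a given hour of day (0-23).

    `checkins` is a list of (user_id, venue, hour_of_day) tuples.
    Only check-ins within `window` hours of the query hour are counted,
    and each one is weighted by who made it.
    """
    scores = Counter()
    for who, venue, checkin_hour in checkins:
        # Circular distance between hours, so 23:00 and 01:00 are 2 hours apart.
        diff = abs(checkin_hour - hour)
        gap = min(diff, 24 - diff)
        if gap > window:
            continue
        if who == user:
            scores[venue] += WEIGHTS["self"]
        elif who in friends:
            scores[venue] += WEIGHTS["friends"]
        elif who in similar_users:
            scores[venue] += WEIGHTS["similar"]
    # Highest-scoring venues first.
    return [venue for venue, _ in scores.most_common()]
```

Called with a morning hour, a sketch like this would surface the places a user and their circle tend to visit in the morning and ignore late-night check-ins. A real system would learn its weights from data rather than hard-coding them.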

Foursquare takes all of this data and returns suggestions within 200 milliseconds. This is all done using the data Foursquare users have entered themselves, without prompting, so it has an air of authenticity that reflects their real tastes.

That unprompted data alone is huge. Foursquare has had well over 1.5 billion check-ins, including 5 million per day from more than 15 million users. There are more than 35 million venues on Foursquare, 750,000 of which have been claimed by business owners.

You and your peers

But what Foursquare is essentially doing with Explore is creating profiles of people based on what it can gather from users and their friends. The company applies machine learning algorithms to the collective movements of its users to figure out what people are doing and how one user fits into the larger group. If it understands a user prefers independent coffee shops, for example, it doesn’t suggest Starbucks or Dunkin’ Donuts. Wherever that person goes, Explore understands what that person and similar people are looking for. And it suggests coffee shops around the time of day a user has previously gone for coffee.

Explore then matches those tastes to the geo-spatial and temporal data it has collected on all the venues in its universe. Over time, a place can develop a profile or fingerprint, just like a user, attracting certain types of people at specific times of the day. Some places prompt even more engagement from users who capture pictures and leave tips and share the venue online or on a Foursquare list.
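
One simple way to picture such a venue "fingerprint" is as the share of check-ins a place receives in each hour of the day, with two places compared by cosine similarity. The sketch below rests on that assumption only; the article does not say how Foursquare actually represents venues, and the function names are hypothetical.

```python
import math


def fingerprint(checkin_hours):
    """Return a 24-slot vector: the share of check-ins falling in each hour."""
    counts = [0] * 24
    for hour in checkin_hours:
        counts[hour % 24] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 24
    return [count / total for count in counts]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Under this toy model, two coffee shops that are both busy around 8 a.m. score as similar, while a coffee shop and a late-night bar score near zero.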

That data helps Foursquare understand more about that place and similar venues, and it helps the engine decide whether to recommend the venue to users who match that profile. It is a complex engine and one that is being tweaked and improved constantly. But it’s an example of how social data, both structured and unstructured, can be put to smart use for real-world recommendations.

Data growth

Foursquare has been steadily building up its data team. It hired Justin Moore, a former quantitative analyst and Bear Stearns VP and director of technology, in May of 2010 (though it recently lost him to Facebook). The company now has about a tenth of its 100-person staff on its data team, including Andrew Hogue, the new head of search, an engineer who spent seven years working on search projects at Google. The company has built a big data stack using MongoDB, Hadoop, Amazon S3, Elastic MapReduce and other tools.

Hogue said even as Foursquare becomes known as more of a recommendation and planning tool, it is important to keep getting people to check in. The more people put in, the better and more refined the recommendations that come back. And that also helps support the majority of users, who are less engaged and don’t input as much data.

“We don’t want check-in data to go away because it’s the most direct, visceral signal we have,” Hogue said. “It tells us you’re there, interacting, one to one. It captures the experience in a raw kind of way.”

In the future, Foursquare will look to incorporate and balance more data to find even better recommendations for users. Hogue said, for example, that Foursquare may look at giving more weight to knowledgeable people who are creating lists and leaving tips. The service will also look to get more direct input from users about their tastes, so it doesn’t have to infer them only from their actions. The future of Foursquare is closely entwined with its big data operation, as it learns to pull more value out of its growing pile of information.

  1. Reblogged this on Dots Of Color and commented:
    Big data big money!

    1. I don’t get it, what does Big Data have to do with a video card…or is this some lamesauce ad post?

    2. The emergence of this so-called big data phenomenon is also fundamentally changing everything from the way companies operate

      1. Yes, it does. Who controls the most data wins. At least Facebook would like to think so. ;-)

  2. Is gigabytes bytes more then a gigabyte?

    1. Katie Fehrenbacher gil Monday, March 12, 2012

      nope just a typo, fixed that, thanks!

      1. Typos happen.
        Gil’s “more thEn a gigabyte” is just plain ignorant.

    2. Grammar Police gil Thursday, March 15, 2012

      If you’re going to complain about a typo, make sure you don’t have any in your immature comment. When you have full mastery of the language, then you’ll be allowed to comment.

  3. infotech ideas Monday, March 12, 2012

    Great info! Bring the expo to SFO as well!

  4. remedy2020@gmail.com Monday, March 12, 2012

    Advertorial ! Advertorial! so fast you sold your soul!

  5. DataStax, more specifically Cassandra, can solve all big data problems.
    http://cassandra.apache.org/
    And its open sourced.

  6. SAP HANA to the rescue!

  7. Reblogged this on cu Lì! and commented:
    great info :)

  8. why do we have to click through so many pages. can you at least provide a way to read it in a single page? (like businessinsider) there is not even a print option and it doesn’t work with readability. i thought more of gigaom. disappointed.

  9. idiots…
    «“We want to unlock the black box of how an artist becomes a star,” White said»
    what makes the charts is good music, not $$$$$ pumped into it 8-X
    just like m$$$$$ can keep wasting billion$ on WP trying to make it a success, it won’t work. its crap, it doesn’t sell
    period

  10. Steven Brown Monday, March 19, 2012

    Big Data is a tactical problem. Content and Business Analytics and Intelligence is the logistical problem. One must pay particular attention to Business Process Models, Entity-Relationship Models, and Data Modeling to be able to use ETL and Data Integration Technologies for developing your data storage organization, retrieval, formatting, clustering, Web Caching; and backup, recovery, and archival retention strategies.

  11. Edwin Ritter Wednesday, May 9, 2012

    Reblogged this on Ritter's Ruminations & Ramblings and commented:
    As 2012 reaches the half way mark, here is a quick view on how this hot topic. This is the first of three. Posts on the other trends will follow.

    So ‘big data’ is a hot topic. What is it? Simply stated, everything you do on the web is tracked and creates data. So much data is collected that 90% of the online data was created in just the last two years. This data is stored, sliced, diced and analyzed. The growth in data is due to several things such as proliferation of smart phones and tablets, lower storage costs and improved analytical tools. This article reveals 10 ways in which big data will have an impact.

