The massive amount of data emerging from connected, digital systems is fundamentally changing everything from Internet search to entertainment, disease management and energy consumption. Here are 10 case studies that highlight the power of big data.


Can gigabytes predict the next Lady Gaga?

By Stacey Higginbotham

Want to know how playing on Jimmy Kimmel Live will boost the sales of an artist’s album? Or how about figuring out where fans go to find artists after they hit the evening news? What about the effect Whitney Houston’s death had on her YouTube and Vevo plays? They shot up 4,525 percent, by the way.

If you want to know this and other music industry data gleaned from the Internet, then you want to turn to Next Big Sound, which exists to find the connection between social activity and music sales.

The service, which recently raised $6.5 million, began two years ago because its founders thought the influx of data — from social networks like MySpace and Twitter, online music services such as Rdio, and sales sites — might help them understand how someone transitions from being a member of a band to being a full-fledged rock star.

The site pulls in 5–10 GB per day with peaks of about 100 GB per day on the heaviest days from the usual suspects such as Facebook, Last.fm, Rdio, iTunes and more. Some of this data is structured and accessed via an API — such as data from the music services and sales sites — and thus is easy to deal with. Other data, like that gleaned from blogs, Facebook pages or Twitter, is based on scraping the pages and sites and needs some formatting before the data geeks at Next Big Sound can make sense of it.

Next Big Sound uses Cassandra for larger, time-oriented data sets; MongoDB for medium-sized, semi-structured data sets; MySQL for small, well-structured data; and Apache Hadoop + Pig for offline analytics. Alex White, the CEO and founder of Next Big Sound, didn’t go into more tech specifics, but he did get excited about the new sources of data and how they can change the industry. “We want to unlock the black box of how an artist becomes a star,” White said. “We want to reverse engineer the Billboard charts and understand the key actions and moments that can turn a garage band into superstars.”

Beat of big data

The music industry is ripe for a data infusion. Major labels have their own Hadoop clusters and attempt to track how their artists perform on fan pages and, of course, in record sales. For decades, music sales were tracked by managers calling up record stores, but in 1991 Nielsen SoundScan entered the scene with accurate CD sales information. Billboard, the industry’s trade magazine, realized that accuracy was the way to go and used the SoundScan data in its charts. The next time Billboard added charts, the data came from Next Big Sound.

“Billboard realized it was missing the ways that people were listening to music now,” said White. The music industry had to understand how the explosion of social media affected its core metrics — sales of songs and albums — and nothing was out there. That’s where Next Big Sound comes into play, but fundamentally its goal is larger: It is to learn via scads of data how to make a star.

Next Big Sound has two undisclosed major record labels as customers so far, and it generates two charts for Billboard: the Social 50 and the Next Big Sound Up and Coming Artist list. Last year it also published “The state of online music in 2011,” an infographic and report chock-full of stats, including that almost 65 billion songs were played across the sites that Next Big Sound tracked in 2011 and that video plays on YouTube peak on Thursday. Also, Lady Gaga is big. Everywhere.

But the big money isn’t in trivia; it’s in the insights. And Next Big Sound delivers value above and beyond the record labels’ own data-tracking efforts by looking across the entire music industry spectrum to help music professionals allocate their resources around a particular artist. For example, if a manager notices a singer who shares many of the same characteristics as one of her own clients, she might investigate how having a YouTube page has affected that artist. This way the manager can spend her promotional dollars and time more efficiently.

Labels can also run an artist’s songs on YouTube to see which one the label should promote for radio. Tracking YouTube plays or comments made there and on music-focused networks might indicate if the label has a potential hit on its hands. White’s hope is to help provide that comparison.

What Next Big Sound can’t do yet, however, is provide context. For example, White says he can show an executive that Chris Brown has 20,000 new likes on Facebook, which might seem good until you realize that is flat compared to all the other Grammy-affiliated artists in the week heading into the awards. But for now, he can’t say why that is. White points out his job isn’t to assess popularity exactly but to say what that popularity means for the record industry.

“We don’t speculate why fans aren’t liking Chris Brown. We want to know what the business impact is. If they aren’t liking, are they still buying?” asks White. That’s what the record industry wants to know, and that’s what the data shows. For White, this isn’t a social experiment but a hunt for actions that an artist or label can take to produce someone who can sell albums and fill concert halls. On how and why we are drawn to a particular artist, the data and Next Big Sound will stay silent.

  1. Reblogged this on Dots Of Color and commented:
    Big data big money!

    1. I don’t get it, what does Big Data have to do with a video card…or is this some lamesauce ad post?

    2. The emergence of this so-called big data phenomenon is also fundamentally changing everything from the way companies operate

      1. Yes, it does. Who controls the most data wins. At least Facebook would like to think so. ;-)

  2. Is gigabytes bytes more then a gigabyte?

    1. Katie Fehrenbacher gil Monday, March 12, 2012

      nope just a typo, fixed that, thanks!

      1. Typos happen.
        Gil’s “more thEn a gigabyte” is just plain ignorant.

    2. Grammar Police gil Thursday, March 15, 2012

      If you’re going to complain about a typo, make sure you don’t have any in your immature comment. When you have full mastery of the language, then you’ll be allowed to comment.

  3. infotech ideas Monday, March 12, 2012

    Great info! Bring the expo to SFO as well!

  4. remedy2020@gmail.com Monday, March 12, 2012

    Advertorial ! Advertorial! so fast you sold your soul!

  5. DataStax, more specifically Cassandra, can solve all big data problems.
    And its open sourced.

  6. SAP HANA to the rescue!

  7. Reblogged this on cu Lì! and commented:
    great info :)

  8. why do we have to click through so many pages. can you at least provide a way to read it in a single page? (like businessinsider) there is not even a print option and it doesn’t work with readability. i thought more of gigaom. disappointed.

  9. idiots…
    «“We want to unlock the black box of how an artist becomes a star,” White said»
    what makes the charts is good music, not $$$$$ pumped into it 8-X
    just like m$$$$$ can keep wasting billion$ on WP trying to make it a success, it won’t work. its crap, it doesn’t sell

  10. Steven Brown Monday, March 19, 2012

    Big Data is a tactical problem. Content and Business Analytics and Intelligence is the logistical problem. One must pay particular attention to Business Process Models, Entity-Relationship Models, and Data Modeling to be able to use ETL and Data Integration Technologies for developing your data storage organization, retrieval, formatting, clustering, Web Caching; and backup, recovery, and archival retention strategies.

  11. Edwin Ritter Wednesday, May 9, 2012

    Reblogged this on Ritter's Ruminations & Ramblings and commented:
    As 2012 reaches the half way mark, here is a quick view on how this hot topic. This is the first of three. Posts on the other trends will follow.

    So ‘big data’ is a hot topic. What is it? Simply stated, everything you do on the web is tracked and creates data. So much data is collected that 90% of the online data was created in just the last two years. This data is stored, sliced, diced and analyzed. The growth in data is due to several things such as proliferation of smart phones and tablets, lower storage costs and improved analytical tools. This article reveals 10 ways in which big data will have an impact.

  12. Mission Impossible Wednesday, May 30, 2012

    Thank you for the thoughts on this so far.
    The kate broadwell

  13. Mission Impossible Thursday, May 31, 2012

    Thanks instead of the article. Blogging is replacing main onslaught news for various black people.
    The vpn port
