Using Hadoop to process data for targeted web advertising is nothing new: Yahoo (s yhoo), Razorfish and others have been doing it for a while. But this week, two companies in the video advertising space also stepped forward to highlight how Hadoop is helping them deliver the right ads to the right viewers for their clients. In reality, the goals of targeted video advertising are exactly the same as for any other form of targeted advertising, but the practice may only now be gaining significant popularity as consumers watch a greater share of content online. Video also presents some new data points for organizations to track beyond what might be necessary for HTML banners and other forms of web advertising.
The two Hadoop converts that went public this week are TidalTV, which is using Karmasphere’s application-development tools atop Amazon (s amzn) Elastic MapReduce, and LiveRail, which is using Apache’s Hadoop distribution in conjunction with Infobright’s open-source analytic database. For TidalTV, the decision to turn to Hadoop is the result of rapid growth in network activity (“a rate of 20-40 million data ‘pings’ every 24 hours”), which makes it far more challenging to store and analyze the data, especially in light of its desire to accurately track revenue against specific targets and deliver detailed reports back to advertiser clients.
LiveRail turned to its Hadoop-plus-Infobright setup to handle an increase in customers needing near real-time access and the ability to run ad hoc queries against the “millions of rows of data produced every day.” According to the announcement, LiveRail monitors more than a dozen metrics for video content, “including percentage viewed/completed, pause/resume and muting.” In its new analytics environment, data goes through an ETL process before being processed by the Hadoop cluster and ultimately being loaded into the Infobright database that underlies LiveRail’s customer-facing reporting platform, although customers also can access some of the raw data stored in the Hadoop cluster.
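To make the idea concrete, here is a minimal, single-machine sketch in Python of the kind of map-and-reduce aggregation such a pipeline might run over video events before loading summary rows into a reporting database. It is not LiveRail's actual code: the record fields (`video_id`, `percent_viewed`, `paused`, `muted`), the sample data and the function names are all assumptions made for illustration, and a real job would run distributed across a Hadoop cluster rather than in one process.

```python
from collections import defaultdict

# Hypothetical, already-cleaned (post-ETL) event records.
# Field names and values are assumptions for illustration only.
EVENTS = [
    {"video_id": "v1", "percent_viewed": 100, "paused": 1, "muted": False},
    {"video_id": "v1", "percent_viewed": 50, "paused": 0, "muted": True},
    {"video_id": "v2", "percent_viewed": 25, "paused": 2, "muted": False},
]


def map_event(event):
    """Map step: emit one (video_id, partial metrics) pair per view."""
    yield event["video_id"], {
        "views": 1,
        "completed": 1 if event["percent_viewed"] >= 100 else 0,
        "percent_viewed_sum": event["percent_viewed"],
        "pauses": event["paused"],
        "muted_views": 1 if event["muted"] else 0,
    }


def reduce_metrics(partials):
    """Reduce step: sum the partial metrics emitted for one video."""
    totals = defaultdict(int)
    for partial in partials:
        for name, value in partial.items():
            totals[name] += value
    return dict(totals)


def run_job(events):
    """Simulate map, shuffle and reduce, returning one summary row per video."""
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for event in events:
        for key, value in map_event(event):
            grouped[key].append(value)

    rows = []
    for video_id in sorted(grouped):
        totals = reduce_metrics(grouped[video_id])
        rows.append({
            "video_id": video_id,
            "views": totals["views"],
            "completion_rate": totals["completed"] / totals["views"],
            "avg_percent_viewed": totals["percent_viewed_sum"] / totals["views"],
            "pauses": totals["pauses"],
            "muted_views": totals["muted_views"],
        })
    return rows


if __name__ == "__main__":
    for row in run_job(EVENTS):
        print(row)
```

The summary rows returned by `run_job` are the sort of compact, per-video output that would then be bulk-loaded into an analytic database for ad hoc queries, while the raw events stay behind in the cluster.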
From YouTube (s goog) to Hulu to CNN’s new multi-platform video-news service (s twx), as well as just about every corporate website presenting video content, there is a vast new body of data to collect, analyze and monetize, and it seems very likely that Hadoop or similar technologies will become a big part of those efforts across the board, if they aren’t already.
Image courtesy of Flickr user drumm.