The BBC’s R&D department this week unveiled an interesting take on the traditional electronic program guide (EPG), one that lets viewers search for people, places and things across tens of thousands of movies and TV show episodes. Channelography is based on the captions of close to 170,000 pieces of programming shown across the BBC’s nine U.K.-wide TV networks, and those captions can be searched for close to 100,000 data entities.
Viewers can, for example, find which shows have mentioned San Francisco in recent weeks, or how often Barack Obama has been mentioned since data gathering began in the fall of 2009 (1,423 times). Channelography also lets users browse through individual shows, making it possible to quickly learn which people or places were mentioned on a specific episode of BBC Newsnight or the children’s show Arthur.
Channelography is based on semantic analysis of closed captions, which is performed by cross-referencing the data with Wikipedia, MusicBrainz and various other openly available data collections. This type of analysis is performed by Muddy, a semantic indexing and categorization tool developed by Rattle Labs in cooperation with the BBC’s now-defunct Backstage R&D initiative.
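To make the idea of cross-referencing captions against an open data collection a bit more concrete, here is a minimal sketch in Python. It is not how Muddy works internally, which the article does not describe; the tiny `GAZETTEER` dictionary, the `find_entities` function and the sample caption are all hypothetical stand-ins for a much larger lookup built from sources like Wikipedia.

```python
import re

# Hypothetical mini-gazetteer standing in for names drawn from open data
# collections. A real system would hold many thousands of entries.
GAZETTEER = {
    "barack obama": ("Barack Obama", "person"),
    "san francisco": ("San Francisco", "place"),
    "afghanistan": ("Afghanistan", "place"),
    "northern ireland": ("Northern Ireland", "place"),
}


def find_entities(caption_text):
    """Return a sorted list of (label, kind) pairs named in the caption text."""
    # Lowercase the text and reduce it to words separated by single spaces,
    # so punctuation and spacing do not get in the way of matching.
    words = re.findall(r"[a-z0-9']+", caption_text.lower())
    padded = " " + " ".join(words) + " "
    found = set()
    for name, entity in GAZETTEER.items():
        # Surrounding spaces ensure we only match whole words.
        if f" {name} " in padded:
            found.add(entity)
    return sorted(found)


# Invented sample caption, for illustration only.
caption = "President Barack Obama spoke in San Francisco today."
entities = find_entities(caption)
print(entities)
```

Simple string matching like this cannot tell apart two entities that share a name, which is presumably where the heavier semantic analysis comes in.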
Granted, Channelography may not be the most convenient EPG for everyday use. But one of the things that’s really fascinating about it is the amount of additional aggregate information that can be gathered from it. For example, who would have guessed that Afghanistan gets more mentions these days on British TV than Northern Ireland?
The makers of Channelography clearly recognized this potential for data analysis as well, which is why they also created a companion dashboard to reveal trends across the BBC’s networks. The Channelography dashboard not only reveals how much of the BBC’s programming consists of repeats, but also how often companies like Facebook and Microsoft have been mentioned on the programs, and even which clichés are the most common amongst BBC journalists.
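The kind of aggregate view the dashboard offers can be approximated with a simple tally once each program has a list of detected entities. The following Python sketch assumes exactly that; the `programs` records and the `mention_counts` function are invented for illustration and do not reflect the dashboard’s actual data or code.

```python
from collections import Counter

# Invented sample records: each program with the entities found in its captions.
programs = [
    {"title": "Programme A", "entities": ["Afghanistan", "Facebook"]},
    {"title": "Programme B", "entities": ["Afghanistan", "Northern Ireland"]},
    {"title": "Programme C", "entities": ["Afghanistan", "Afghanistan", "Microsoft"]},
]


def mention_counts(programs):
    """Count how many programs mention each entity at least once."""
    counts = Counter()
    for program in programs:
        # Using a set means repeated mentions within one program count once.
        counts.update(set(program["entities"]))
    return counts


counts = mention_counts(programs)
print(counts["Afghanistan"], counts["Northern Ireland"])
```

Counting programs rather than raw mentions is a design choice made here for simplicity; the article does not say which of the two the dashboard reports.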
Channelography was developed by Rattle and commissioned by the BBC. The project was only available internally until this week’s official unveiling, and Rattle also produced a paper guide to help the broadcaster’s staff make sense of the BBC’s 2010 programming. (Check it out here; it contains some beautiful visualizations.) Rattle director James Boardwell wrote on his blog this week that the company wants to build a similar semantic guide for radio next.
He also said that using captions for semantic analysis of TV content could help broadcasters improve search engine optimization (SEO) on their online platforms, and even offer a new kind of cultural analysis. From his blog post:
“How perhaps different people appear together or cluster and how over time the data could become a proxy for British culture more generally and the things that pre-occupy us, for example how Victorian drama is replaced by Edwardian or how Shakespeare’s influence ebbs and flows, all hugely interesting and only do-able when you have data available on this scale by a media organisation as central to the culture of a nation as the BBC.”