VentureBeat reports Pluggd, a Seattle-based audio and video search startup, has raised $1.65 million from Intel and a group of angels. Pluggd, which we first met at DEMO this year, is trying to use speech recognition tools to help make sense of audio and video podcasts. VentureBeat quotes the company’s claim that it has “‘perfected the user experience’ for audio and visual search.” While we like what Pluggd is doing, that’s a bit of an overstatement.
First, “visual” is probably a typo for “video”; to our knowledge Pluggd does not have an image recognition product. It’s possible that visual search is in the works, but it’s not even at the demo level yet.
Video search is often attempted by analysis of the soundtrack, rather than the picture, and we expect that’s what’s going on here. But mainly, we take issue with the claim that anyone has “perfected the user experience” in this area, because a big part of user experience is having a product that works.
Speech recognition is something we take a personal interest in; we studied it academically. At GigaOM, we recently ran a story about speech recognition’s potential to unlock automated online video advertising, and gave Pluggd a plug for its efforts in the area. However, we admit that thinking is wishful.
To wit, Pluggd currently offers a demo site of 97 episodes of the same show on the same topic: ESPN Radio SportsCenter. HearHere, the company’s technology, uses semantic clustering (so you can search by topic, rather than by keyword) and visual heat-mapping, both of which are quite nice. But speech recognition as a whole is a long way from being able to take on any voice, in any conditions, on any topic. In other words, it’s nowhere near perfect.
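To give a rough sense of the idea behind topic search versus keyword search: cluster the transcript snippets by vocabulary overlap, then match a query against a whole cluster instead of a literal word. The sketch below is purely illustrative, not Pluggd’s actual HearHere implementation, and uses a toy greedy clustering over bag-of-words vectors; real semantic clustering uses far richer features.

```python
from collections import Counter
from math import sqrt

# Toy stopword list; real systems use larger lists or weighting (e.g. tf-idf).
STOPWORDS = {"the", "a", "for", "from", "of", "and", "to", "in"}

def vectorize(text):
    # Bag-of-words term frequencies for one transcript snippet.
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine(a, b):
    # Cosine similarity between two term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(snippets, threshold=0.15):
    # Greedy single-pass clustering: each snippet joins the first existing
    # cluster whose centroid is similar enough, otherwise starts a new one.
    clusters = []  # list of (centroid Counter, [snippet indices])
    for i, snip in enumerate(snippets):
        vec = vectorize(snip)
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # fold the snippet into the centroid
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return [members for _, members in clusters]

snippets = [
    "the quarterback threw for three hundred yards",
    "a late touchdown pass won the game for the quarterback",
    "the pitcher struck out ten batters tonight",
    "a complete game shutout from the pitcher",
]
print(cluster(snippets))  # → [[0, 1], [2, 3]]
```

Here the football snippets group together without sharing an exact query keyword like “touchdown,” which is the payoff of topic-level grouping over literal keyword matching.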