Thanks to the rating systems in place on such popular websites as Netflix, Amazon and eBay, many people have become comfortable evaluating things in absolute terms: a two-star restaurant, a B movie and so on. But new research out of the Massachusetts Institute of Technology says that this approach to ranking things is fundamentally flawed.
Recommendation systems should instead ask users to compare products in pairs rather than rate them as stand-alone items, says Devavrat Shah, a professor at MIT’s Laboratory for Information and Decision Systems.
According to Shah, the star rating systems that are the status quo on the web today are flawed because, well, humans are flawed. “If my mood is bad today, I might give four stars, but tomorrow I’d give five stars. But if you ask me to compare two movies, most likely I will remain true to that for a while,” Shah says in an article published this week on MIT’s news site. “Your three stars might be my five stars, or vice versa. For that reason, I strongly believe that comparison is the right way to capture this.”
In a series of recently published academic papers, Shah, along with students Ammar Ammar and Srikanth Jagabathula and MIT Sloan School of Management professor Vivek Farias, demonstrated that stitching “pairwise rankings” together into a master list represents customer sentiment more accurately than asking customers to rate items one at a time on a typical five-star scale. The researchers say their algorithms predict shoppers’ preferences with 20 percent greater accuracy than the formulas most often in use today. They have built a website, Celect.com, to show their theories in practice.
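To make the idea of stitching pairwise choices into a master list concrete, here is a minimal Python sketch. It is not the MIT team's algorithm, which the article does not describe; it simply orders items by the share of head-to-head comparisons each one won. The function name `rank_from_pairs` and the sample votes are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def rank_from_pairs(comparisons):
    """Order items by the fraction of their pairwise comparisons they won.

    `comparisons` is an iterable of (winner, loser) tuples. This is a simple
    win-rate heuristic, assumed here for illustration only.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    # Highest win fraction first; ties are broken alphabetically by item name.
    return sorted(
        appearances,
        key=lambda item: (-wins[item] / appearances[item], item),
    )


# Hypothetical votes: each tuple means "the first movie was preferred to the second".
votes = [
    ("Alien", "Brazil"),
    ("Alien", "Clue"),
    ("Brazil", "Clue"),
    ("Alien", "Brazil"),
    ("Clue", "Brazil"),
]
master_list = rank_from_pairs(votes)
print(master_list)
```

With these sample votes, Alien wins all three of its matchups and lands at the top of the list. A production system would likely use a statistical model that accounts for how strong each opponent was, but the input is the same: a pile of "I prefer this one" choices rather than star counts.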
The value of a more sophisticated algorithmic approach to recommendation engines is suggested by Apple’s iTunes software, and in particular its Genius song selection feature. An Apple engineer disclosed last year that Genius uses much more than star ratings to power its personalized song recommendations. In fact, iTunes uses a complex combination of big data analytics and aggregated personal information to customize content for users.
Of course, finding programmers is tough as it is, and not all web companies can afford to hire Apple-caliber software engineers or MIT Ph.D.s to formulate their recommendation engines. Also, users on sites such as Yelp have become very confident in their roles as armchair critics, adding and subtracting stars from reviews for highly specific reasons. But then again, classic websites such as HotorNot.com — and even Facebook predecessor FaceMash.com — have shown that one-to-one comparisons can be fun, too. According to the folks at MIT, if comparison engines can make the leap from fun pastime into the big leagues of e-commerce, recommendation systems could get even more spookily accurate.
Image from the cover art of The Complete Works of The Critic DVD set, found on Amazon.com