Can 32,000 Data Points Yield The Perfect Book Recommendation?

Reading recommendation websites have focused on social recommendations up to now: Sites like Goodreads and Shelfari suggest books to users based on what their friends are reading. But the books that turn up tend to be the most popular anyway, often those with marketing efforts already in place. A new site, BookLamp, is delving deeper–all the way into the book’s “DNA”–to help readers find new titles. It is launching with 20,000 titles, and is better than any other book recommendation tool I have used.

BookLamp is the public face of the Book Genome Project, which was founded by University of Idaho students in 2003 and aims to identify, track, measure, and study the features that make up a book using computational tools. Other book recommendation sites exist, but they tend to rely on user-submitted data and social recommendations. BookLamp is different because it actually analyzes the books’ text. Its algorithm breaks books down into 32,160 elements: “StoryDNA” (“setting” and “actors”), language and character DNA. (Here is some more detailed information on how that works.)

“The analogy I use the most is that if you’ve eaten a chocolate cake and you wanted to find other cake that tasted the same, you’d need to know not just the ingredients, but the percentage and the preparation,” Aaron Stanton, the site’s founder (who originally gained some internet fame for his project Can Google Hear Me), said. “From that perspective, thematic ingredients balance the book, and the writing style is the preparation: How is language prepared to deliver that storyline to the reader?” Each book featured on the site has a “BookDNA” overall profile including all of those elements.
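Stanton's "ingredients and percentages" analogy can be made concrete with a small example. What follows is only a minimal sketch, not BookLamp's actual algorithm: it assumes each book is reduced to a handful of theme percentages and that books are compared with cosine similarity, a standard technique swapped in here for illustration. The book titles, theme names, weights, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two theme profiles (dict: theme -> weight)."""
    # Themes missing from one profile count as weight 0.
    dot = sum(weight * b.get(theme, 0.0) for theme, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_title, profiles, top_n=3):
    """Rank every other book by how closely its profile matches the query's."""
    query = profiles[query_title]
    scored = [
        (cosine_similarity(query, profile), title)
        for title, profile in profiles.items()
        if title != query_title
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:top_n]]


# Hypothetical profiles: each value is the share of the book given to a theme.
PROFILES = {
    "Book A": {"family relationships": 0.5, "expressions of emotion": 0.3, "suburban living": 0.2},
    "Book B": {"family relationships": 0.4, "expressions of emotion": 0.4, "suburban living": 0.2},
    "Book C": {"space travel": 0.7, "military conflict": 0.3},
    "Book D": {"family relationships": 0.2, "military conflict": 0.8},
}

print(recommend("Book A", PROFILES, top_n=2))
```

Nothing in this ranking looks at sales, ratings, or who a reader's friends are. Only the contents of the profiles matter, which is the property Stanton emphasizes.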

“Since the suggestions are based on what is inside a book alone, it is not influenced by things like marketing budgets or author popularity, which drives social suggestion engines,” Stanton said. “We are an equal friend to the front, mid, and backlist author. Or as we like to say, ‘We’re as likely to find you Richard Bachman as Stephen King.’”

Who is Richard Bachman? In early tests, Book Genome Project researchers noted that he kept popping up as a match for readers who liked Stephen King. It turns out that Richard Bachman is the pen name that Stephen King used to publish several novels, including The Running Man, between 1977 and 1982. “He wanted to see if he could recreate breaking into the mainstream,” Stanton said. “He sold maybe 30,000 copies as Richard Bachman. When he became public as Stephen King, he sold millions. From our perspective, if you’re looking for a perfect Stephen King-like book, Richard Bachman would be the best possible match. But a social network would never have recommended it. That is an ideal use case.”

Publishers, Partnerships and Business Models

Right now, BookLamp gets the texts of the books it analyzes directly from publishers. “We didn’t want to be in a situation where we were using people’s data without a clear thumbs-up from them,” Stanton said. The company’s biggest partners so far are Random House and independent publisher Kensington. Stanton would not comment publicly on ongoing negotiations with other publishers, but said that publishers want to know “what kinds of things BookLamp can do to help them make publishing decisions and editorial and sales staff tools to know more about the books they’re working on.”

In addition, Stanton said, BookLamp may eventually be used as a tool to match writers with agents and publishers. Writers have used the site to see how their writing compares to that of bestselling authors and to find editors and agents who specialize in the types of books they are writing. “Assuming things go well,” he said, “we might like to move in that direction.”

For now, BookLamp is not intended to be a commercial site. “It’s more of a way to show off something we think is cool to the publishing industry as a whole, and ultimately ask them to help support the project with content,” Stanton said. “The reason we’re doing this is because part of our business is in offering business-to-business tools to publishers, using these sorts of data analytics tools to improve market targeting, find better comparables, and that sort of thing. There’s a range of areas where we can work with publishers to help improve processes, and that’s traditionally been our life blood over the last two years. So we’re not looking to the website to earn us money, so much as get feedback from readers, demonstrate what the tools can do, and then hope that facilitates the right introductions throughout the publishing industry to empower the rest of what we do.”

A possible wrinkle for BookLamp is that at least a few publishers are working on some kind of book recommendation system of their own. The idea of a “Pandora for books” is not new (and may not really be relevant at this point–when was the last time you discovered a new artist on Pandora anyway?). The most recent time that comparison was used (well, besides in this post, which focused on a Spanish e-book site whose delivery system, streaming, is similar to Pandora’s) was for a joint venture called Bookish, backed by Hachette, Simon & Schuster and Penguin. That site also aims to deliver book recommendations, along with news, reviews and a bookstore. It was supposed to launch this summer, but there has been no news in recent weeks and no details have been released about how its recommendations will work.

But BookLamp is a tool that publishers should not ignore. Publishers have struggled to promote their backlist titles (books more than a year old), especially as bookstores close and more space and attention are taken up by hot new books. Since BookLamp makes recommendations based on content–not on newness, marketing campaigns or other outside factors–it can call readers’ attention to books they might never hear about otherwise.

So How Well Does It Work?

The site is obviously limited by the fact that most publishers aren’t yet participating, but it helps that the largest English-language trade publisher is. I searched for Hateship, Friendship, Loveship, Courtship, Marriage, a short-story collection by one of my favorite authors, Alice Munro (published by Random House). Reading suggestions that popped up included, not surprisingly, other short-story collections by Munro–at the top, interestingly, was Runaway, my favorite of her collections after Hateship, Friendship. I’d never heard of some of the other results, most of which appear to be the quiet type of literary fiction that I am a sucker for–and, indeed, these are titles I probably wouldn’t have found on Goodreads. (One of the top results, however, was Ten Days In The Hills by Jane Smiley, a book I despised.) Many of the books recommended for me were published at least a few years ago and probably had small to nonexistent marketing budgets–another reason they might never have shown up on Goodreads, or under my Amazon recommendations.

When I searched for Hateship, Friendship, Loveship, Courtship, Marriage on Goodreads, the recommended titles for me–in “Readers Also Enjoyed,” a small and not immediately obvious sidebar–were primarily other short-story collections by writers of literary fiction. In other words, what those titles had in common was their format, not their subject matter. While BookLamp found other titles for me that shared common elements with Munro’s book–“expressions of emotion,” “family relationships/marital dynamics,” “suburban living/neighbors,” indeed the themes I’m consistently attracted to in what I read–Goodreads simply assumed that I must like short stories of any kind, which is not true (I prefer novels, I just love Alice Munro). And while Goodreads is indeed a social recommendation site, it focuses much more heavily on the social aspect–reader reviews and stars, friending etc.–than on the recommendation aspect.

To be sure, BookLamp’s recommendations aren’t always perfect and are sometimes downright odd, which the company acknowledges (one section of its FAQ is titled “When BookLamp’s engine goes insane”). The site lists a few of the zanier examples, but sometimes recommendations are just off: Danielle Steel was recommended to me because I like Jhumpa Lahiri, and the authors are not at all similar.

Most of the time, though, the recommendations seemed really apt and serendipitous, books I wouldn’t have stumbled across on my own. The bottom line is that BookLamp is the best book recommendation site I have used so far, in that it actually turned up books I would like to read and had not previously heard about. That’s huge for me as a reader.