Summary:

A new site called Digg In The Future, created by 17-year-old high-school student Raj Vir as a research project, says its algorithm can predict with 63-percent accuracy which shared links will make it to the front page of the Digg website.

Hard-core Digg users and fans of the link-sharing site, the so-called “Digg Nation,” spend a lot of their time trying to push their favorite links to the front page, competing with each other for the number of “diggs” their links get, and debating why certain links made it and others didn’t. Now, a 17-year-old programmer has come up with an algorithm he says can predict which links will make it to the front page of Digg with a high degree of accuracy. The algorithm, which developer Raj Vir is still tweaking, powers a site called Digg In The Future.

Vir, who is still in high school, says that his algorithm, which he’s been working on in his spare time for the past several months, has so far proven to be 63-percent accurate at predicting what will make the Digg front page. (He also says it will still work with the new version of Digg, which is expected to launch soon.) So how does the software do it? Vir says he has built into the algorithm some intelligence about the site and its most popular users: both the users whose links get dugg the most, and those who digg a lot of other people’s links. As he described it in an email:

The method used takes advantage of several tendencies of Digg users. There are several “power users” on Digg who have a heavy influence on Digg’s frontpage. Two main factors taken into consideration are what I like to call “power submitters” (users who frequently submit future frontpage stories) and “power diggers” (users who frequently digg future frontpage stories). Digginthefuture [sic] keeps track of stories that have been dugg or submitted by successful users.

The algorithm also relies on other factors, Vir says, including the time of day (since stories submitted in the early morning hours are unlikely to reach the front page) and whether the link comes from “preferred” sites that appeal to Digg users: a list that includes Cracked, Wired, The Huffington Post, The Daily Mail and The Telegraph. One interesting element of the algorithm is that it doesn’t just look at Digg or its users: Vir says that since many of the links that make it to the front page of the site have already been shared on other social networks, the Digg In The Future software looks at frequently shared URLs from Twitter and gives those added weight.
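Vir has not published his algorithm, so the following is only a rough sketch of how the signals he describes (power submitters, power diggers, time of day, preferred sites and Twitter shares) might be combined into a single score. Every weight, threshold, field name and domain spelling here is a hypothetical chosen for illustration, not a value taken from Digg In The Future.

```python
from dataclasses import dataclass, field
from typing import List, Set
from urllib.parse import urlparse

# Assumed domain spellings for the "preferred" sites named in the article.
PREFERRED_DOMAINS = {
    "cracked.com", "wired.com", "huffingtonpost.com",
    "dailymail.co.uk", "telegraph.co.uk",
}


@dataclass
class Story:
    """Hypothetical record for one submitted link."""
    url: str
    submitter: str
    diggers: List[str] = field(default_factory=list)
    submit_hour: int = 12       # hour of day, 0-23
    twitter_shares: int = 0     # times the URL was shared on Twitter


def domain_of(url: str) -> str:
    """Return the host name of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def score(story: Story, power_submitters: Set[str], power_diggers: Set[str]) -> float:
    """Combine the signals into one number. All weights are made up."""
    total = 0.0
    if story.submitter in power_submitters:
        total += 3.0                                   # submitted by a power submitter
    total += 1.0 * sum(1 for d in story.diggers if d in power_diggers)
    if domain_of(story.url) in PREFERRED_DOMAINS:
        total += 1.5                                   # link from a preferred site
    if 2 <= story.submit_hour < 6:
        total -= 1.0                                   # early-morning penalty
    total += min(story.twitter_shares / 100, 2.0)      # capped Twitter boost
    return total


def predict_front_page(story: Story, power_submitters: Set[str],
                       power_diggers: Set[str], threshold: float = 4.0) -> bool:
    """Predict 'front page' when the score reaches an assumed threshold."""
    return score(story, power_submitters, power_diggers) >= threshold
```

In a sketch like this, the sets of power submitters and power diggers would have to be built from past front-page stories, and the weights tuned against that history; the article does not say how Vir does either.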

Vir says he isn’t looking to build a business or sell his algorithm at this point (unless of course Digg or Google makes him a multimillion-dollar offer, we’re assuming), although he says he is considering selling advertising on the site. But if his algorithm is good at predicting trending topics on Digg, it might also be good at predicting which links and content will become popular elsewhere, and a lot of companies are very interested in doing that. Vir may wind up getting an offer he can’t refuse.

Post and thumbnail photos courtesy of Flickr user Sean McGrath

  1. Good idea, and a relevant application of it. How about applying similar algorithms to recommendation engines of various kinds? Not just recommendations based on behavioral profiles, but for trends, content, stuff that will become popular?

  2. Do we really need an algo for that? I can predict who’s gonna go on the FP with high accuracy…it’s power user + Pic = that’s it. Now if he could predict which NON POWER user would go on the FP that would be already something…

  3. It would be useful to have a statistician say whether 63% is significant. It doesn’t seem to be much over 50/50, especially given that it’s already focused on power users; i.e., if you simply chose all the stories that power users submitted, you might get close to 63% anyway.

    1. Red Mcstevens Tuesday, August 24, 2010

      It’s safe to assume that he checked the accuracy of choosing only the stories power users submitted. He has been working on it for months, so he most likely tested several different methods.

    2. Given that any given story may have an infinitesimal shot at making the home page, I’d say 63% accuracy is remarkable. Now, if any given story had a 50-50 shot, it would be insignificant.

  4. Good deal.

  5. Taft Baumgold Tuesday, August 24, 2010

    this kid is going places. great idea. pat yourself on the back, kid

  6. Seriously? It’s awesome that a 17 year old cares enough to try it out AND have something working to show for it. Who cares if it’s useful or not?

  7. I congratulate anyone who works hard to achieve something they believe in.

    But you don’t need an algorithm to predict which link is going to be on the digg front page. :D if there’s boobs or drugs or both it’s going to be on the front page of digg.

  8. I can do this with 100% accuracy, pull headlines from reddit’s front page then wait a day or so. :)

    1. I have tried the reddit approach, and it is not nearly as successful as the current method. Yes, stories frequently hit reddit first, but a much larger number of stories reach reddit’s homepage and never even see a glimpse of Digg’s. The current algorithm takes into account popular URLs from Twitter (which is an even better predictor than reddit), but more importantly the effect of power users. Simply displaying reddit posts would not yield results even close to the current 63%.

  9. Hey kid, if you want to make some money, use the same principle and make this for stocks :)

  10. Yeah, he’ll have to go back to the drawing board once the new Digg is released.

    The new Digg is highly focused on “news from friends” and it works like Facebook “like”, which means that if you’re a power user who diggs 100 articles per day, you’re going to lose most of your followers for “spamming” them. In the long term, the new Digg will get every user to follow only people that share relevant news to them.

    The new “power users” will be sites like GigaOM or TechCrunch, because a lot of people will be interested in getting news from them, so they will follow them, and some will digg the stories, too. So the more Digg followers a news site has, the more likely it is to end up on the front page. This makes a lot more sense, and it’s how it’s supposed to be.
