How to Hack RSS to Reduce Information Overload


Last week, I held a session at SXSW Interactive titled Hacking RSS: Filtering & Processing Obscene Amounts of Information, where I talked about creative ways to use RSS to manage information overload without using any programming skills.

There is more information available in the world than any one person could hope to consume (hundreds of exabytes of data), but most of that information is uninteresting, out of date, inaccurate, or not relevant for you. The key to reducing information overload is to more efficiently find the data you want among the information that you don’t care about. The tools that I talked about in my SXSW session are focused on discarding or de-emphasizing the data you don’t need, while highlighting the data that’s relevant for you. I wanted to share some of what I talked about during my presentation.

Individual RSS feeds from blogs, news and other sources are a great starting point for your information overload reduction efforts. Some individual RSS feeds from friends’ blogs or the top people in your field might almost always be relevant and won’t need any other work. But what about the blogs where one in five or one in 10 posts are relevant for you? How do you narrow them down to a manageable flow of information that allows you to keep up with at least the most important content?

While there are some simple ways to make better use of your RSS reader to manage information overload, the real magic is in filtering. My favorite filtering tool is Yahoo Pipes, which lets me filter an RSS feed using various criteria: URL, author, date, content and more. Some examples of filtered feeds in my reader right now include industry analyst blogs filtered to only find posts about online community; searches across social websites where my projects are mentioned; and some blogs filtered for just the best posts using PostRank. The image on the right contains a simple Yahoo Pipes filtering example from my SXSW presentation.
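Yahoo Pipes does this without any code, but for readers who prefer a script, the same keyword filter can be sketched in a few lines of Python. This is only a rough illustration, not how Pipes works internally: the sample feed, its URLs, and the function name filter_items are all made up for the example, and a real script would fetch the feed from the blog's RSS URL instead of using an inline string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (assumption); a real feed would be downloaded
# from a blog's RSS URL.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Analyst Blog</title>
    <item>
      <title>Building an online community from scratch</title>
      <link>http://example.com/community</link>
      <description>Lessons learned from launching a forum.</description>
    </item>
    <item>
      <title>Quarterly earnings roundup</title>
      <link>http://example.com/earnings</link>
      <description>Numbers from the last quarter.</description>
    </item>
  </channel>
</rss>"""


def filter_items(rss_text, keywords):
    """Return (title, link) pairs for items that mention any keyword."""
    root = ET.fromstring(rss_text)
    wanted = [k.lower() for k in keywords]
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        body = item.findtext("description", default="")
        # Case-insensitive match against the title and description together.
        haystack = (title + " " + body).lower()
        if any(k in haystack for k in wanted):
            matches.append((title, link))
    return matches


if __name__ == "__main__":
    for title, link in filter_items(SAMPLE_RSS, ["online community"]):
        print(title, link)
```

Matching on plain substrings keeps the sketch short; filtering by author or date would work the same way on the corresponding elements of each item.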

PostRank is a great service that allows you to get the best posts from any feed based on an “engagement” ranking score that incorporates measures like comments, Diggs, sharing on social sites like Twitter, and more. The best thing about PostRank is that you can get an RSS feed of just the best posts from a particular publisher, and that feed includes the PostRank score, which means that you can do even more hacking on the PostRank RSS feed using Yahoo Pipes. One useful combination is to take several PostRank feeds containing only the best posts from a few of your favorite blogs, and use Yahoo Pipes to filter those top posts down to the articles mentioning a specific group of keywords. Because the PostRank feed includes the rank, you can even sort the results so that the highest ranked posts appear at the top of your feed. The image to the right shows an example of how you might do this.
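The merge, filter and sort steps described above could look roughly like this in Python. Treat it as a sketch under stated assumptions: the items are plain dictionaries that have already been parsed out of the feeds, the "score" key is a stand-in for wherever the PostRank score actually lives in the feed, and the sample titles and numbers are invented.

```python
# Hypothetical, already-parsed feed items (assumption). The "score" field
# stands in for the PostRank score; the real feed's element name may differ.
BLOG_A = [
    {"title": "Community managers and RSS", "score": 7.2},
    {"title": "Office party photos", "score": 9.1},
]
BLOG_B = [
    {"title": "Why RSS filtering still matters", "score": 8.5},
    {"title": "Community metrics that count", "score": 6.0},
]


def best_matching_posts(feeds, keywords, score_key="score"):
    """Merge feeds, keep keyword matches, and sort by score, highest first."""
    wanted = [k.lower() for k in keywords]
    # Flatten the list of feeds into a single list of items.
    merged = [item for feed in feeds for item in feed]
    hits = [
        item for item in merged
        if any(k in item["title"].lower() for k in wanted)
    ]
    return sorted(hits, key=lambda item: item[score_key], reverse=True)


if __name__ == "__main__":
    for post in best_matching_posts([BLOG_A, BLOG_B], ["rss", "community"]):
        print(post["score"], post["title"])
```

Note that the highest scoring post in the sample data is dropped because it matches none of the keywords, which is the point of combining the two filters.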

Another technique that helps me to consume information more efficiently is to modify the format of many of my RSS feeds: I bring relevant information into the headlines of the feed so that I can quickly scan them and decide which posts are important enough to click on for more details. By bringing more details into the title, I can avoid spending time clicking to get more information. There’s an example of reformatting a Twitter RSS feed in the image to the right.
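As a rough idea of what this kind of title rewriting involves, here is a small Python sketch. It is not the pipe from my slides: the field names (author, description), the function name rewrite_title, and the 80-character cutoff are all assumptions chosen for illustration.

```python
def rewrite_title(item, max_len=80):
    """Return a copy of the item whose title carries the author and a summary."""
    # Collapse runs of whitespace and newlines so the headline stays on one line.
    summary = " ".join(item["description"].split())
    if len(summary) > max_len:
        summary = summary[:max_len].rstrip() + "..."
    # Copy the item so the original entry is left unchanged.
    return {**item, "title": f'{item["author"]}: {summary}'}


if __name__ == "__main__":
    # Hypothetical feed entry (assumption), already parsed into a dictionary.
    tweet = {
        "title": "Tweet from alice",
        "author": "alice",
        "description": "Great post  on RSS\nfiltering",
        "link": "http://example.com/status/1",
    }
    print(rewrite_title(tweet)["title"])
```

Everything else in the item, including the link, is passed through untouched, so the reformatted feed still points at the original posts.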

The final trick is to use Web APIs to gather additional data that can’t be found in an RSS feed. I’ve written about using APIs before, so I won’t go into much detail here, but you can see an example of how I’ve used several APIs together with Yahoo Pipes to build an RSS feed of people posting links from Twitter to my blog posts on slides 17 – 23 of my presentation.

You can listen to the audio from my session and download the slides here.

What are your favorite RSS hacking tools and techniques to manage information overload?

Photo courtesy Flickr user SparkCBC.


Mark Jackson

Does anybody know of (or have created) a “bayesian” type approach to this? Like using bayesian spam filters (thumb up/thumb down type stuff) based on keywords. Seems like this approach would be highly customized to the individual and would be better able to learn/adapt to future articles/trends.


Thanks for a very informative post!

I also like your past posts on APIs and would like to request more posts on this subject.

Alan Ralph

I’ve used both PostRank and Yahoo! Pipes in the past to try and manage the amount of information I receive, and they are both good products, but I’ve found that they are only a partial solution for more general information gathering. These days I tend to skim through new items in Google Reader, pick out stuff that is interesting, tag it using ReadItLater, then hit Mark All As Read. I also do this with stuff that comes to my attention via friends on Facebook.


Hm… yahoo piper seems like a huge hassle. I just use the filtering that provides me.

I subscribe to the feeds I like and then say: “filter for this and that in all my feeds”

I can create as many filters as I want and the new stuff is indicated with a number in the filter name – convenient and snappy.

Alan Ralph

Hi Martin, I just had a look at the Favit website, and will be investigating further. Thanks for the tip!


Very nice on pipes, I was looking for a solution before I had heard of pipes and ended up coding up my own a year or so ago.

Bit of back story: I was using Netvibes and noticed it was at times over-refreshing, and that some of the feeds would date back months and had been read ages ago, so I just coded it together in a weekend hack session. Guess that’s the programmer’s way of doing it =]

Carl Natale

I like what you outline here. This could work for a few things I want to do. But I have a silly question.

What’s the long-term viability of Yahoo Pipes?

I keep thinking about what almost happened to Delicious. Which may still happen.


You can do much of the same by simply using FeedDemon. It has search feeds and it can filter feeds to show or not show feeds with specific words.

Alan Ralph

That is true, but the downside is that the filtering is only on the system where FeedDemon is installed. The stuff that Dawn is talking about will be usable on multiple systems and services.


Dawn, thanks for posting this for those who could not attend your session. My comment centers on how this problem impacts people in a business context, where creative “hacks” like the ones you describe are laying the groundwork for enterprise solutions to this problem. In the enterprise, the problem of overload is compounded by “email overload” since many of the systems in use generate a ton of email alerts. The productivity losses, inefficiency and possible missed opportunities are very significant. Where users can “subscribe” or “follow” as you describe, it opens up many opportunities for solving the bigger issue of information overload within business, especially when combined with analytic and social intelligence capabilities. Even so, as you note, it’s clear that the ability to easily produce information is outstripping the capability of tools to capture, filter and deliver what is personally or professionally relevant.

As is so often the case, creative solutions get pieced together out of necessity, but they are not enterprise friendly, so they do not make the solution widely available to everyone who needs it. We built the Attensa StreamServer to do that. It is a server application combining massively scalable aggregation with tools to combine, filter, curate and deliver topical streams in the manner that you describe. Your readers can find out more about how it works and how it combats information overload at

Melanie Baker

A bit of an asterisk to that — we actually blocked Pipes some time ago. Mostly because they were way too slow for the system to process, and were resulting in a lot of duplicate data.

So bottom line, Pipes + PostRank may work, may kinda work, may time out, or may fail completely because of it. YMMV.

Simon Mackie

thanks for the update, Melanie, although I believe what Dawn is talking about is taking the output from PostRank as an RSS feed and feeding it into Pipes, which should still work. Maybe Dawn can clarify for us.

Dawn Foster

Melanie / Simon – Yes, I’m talking about using PostRank feeds within the Yahoo Pipes fetch feed module, which works just fine. In other words, we’re using the PostRank feed just like any other feed.

Putting a Yahoo Pipe feed into PostRank never really worked right, so I’m happy to see you’ve blocked it to avoid user frustration :)
