RSS in the Mainstream, Coping with Rising Bandwidth

RSS in Web Browsers

Support for RSS in Firefox, and now Safari, is a mostly encouraging development that brings RSS further into the mainstream. Both browsers implement the concept of “Live Bookmarks”: bookmarking a site now lets us not only save its location, but also become aware of its changes and store its updates on our own computer for future consumption. The question remains whether or not said content will actually be consumed by the end-user:

On one hand, dedicated news aggregators such as NetNewsWire take important steps to notify users that content they’ve subscribed to has been updated: the Dock icon reflects how many new articles are available. As with e-mail, this new content is difficult to ignore; we’re inevitably compelled to go see what has changed.

On the other hand, the web browser is an application I typically use to perform specific, timely browsing tasks. In the current implementation of Safari’s RSS Bookmarks, I don’t quite feel drawn to the updated content. Many articles are updating from sources that were added through the Tiger installation process, taking up space on my hard drive and sucking bandwidth from websites I typically don’t even visit. The New York Times? Puh-lease!

Are we wasting bits?

Bandwidth and Load

Om Malik touched on a subject fairly dear to my heart in one of his recent blog articles on RSS, Tiger Safari and the Bandwidth Bottleneck:

Most RSS readers are set to poll for updates every hour, and imagine when half-a-million Tiger Safari users who start hitting a server at the same time, pulling down RSS updates, because they have not changed the default settings. Server meltdown? [Read More]

HTTP 304 and Caching

As end-users are likely to become more passive consumers of web sites’ bandwidth, highly-trafficked sites and authors of RSS-consuming applications (be they aggregators or client software) will likely find it in their best interest to become intimately familiar with Caching in HTTP/1.1. Beyond the alarming question, Om Malik’s article offers a good layman’s introduction to the wonders of the HTTP 304 response code, drawing from seasoned geeks in the field. One such geek, Charles Miller, gives the rest of us a great practical overview of HTTP Conditional GET.
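To make the conditional GET idea concrete, here is a minimal sketch of the server-side decision: compare the client’s If-Modified-Since header against the feed’s last modification time, and answer 304 Not Modified (headers only, no body) when nothing has changed. The class and method names here are my own invention, for illustration only.

```java
import java.time.ZonedDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ConditionalGet {
    // RFC 1123 date format used by HTTP headers, e.g. "Sun, 1 May 2005 12:00:00 GMT"
    static final DateTimeFormatter HTTP_DATE = DateTimeFormatter.RFC_1123_DATE_TIME;

    // Decide whether to answer 304 Not Modified: if the client's
    // If-Modified-Since date is at or after the feed's last change,
    // there is nothing new worth sending.
    static int statusFor(String ifModifiedSince, ZonedDateTime lastModified) {
        if (ifModifiedSince != null) {
            try {
                ZonedDateTime since = ZonedDateTime.parse(ifModifiedSince, HTTP_DATE);
                if (!lastModified.isAfter(since)) {
                    return 304; // Not Modified: send headers only, no body
                }
            } catch (DateTimeParseException e) {
                // Malformed date: ignore the header and serve the full feed
            }
        }
        return 200; // full response, with a fresh Last-Modified header
    }

    public static void main(String[] args) {
        ZonedDateTime changed = ZonedDateTime.of(2005, 5, 1, 12, 0, 0, 0, ZoneOffset.UTC);
        System.out.println(statusFor(HTTP_DATE.format(changed), changed));       // 304
        System.out.println(statusFor("Sun, 1 May 2005 11:00:00 GMT", changed)); // 200: feed changed since
        System.out.println(statusFor(null, changed));                            // 200: unconditional request
    }
}
```

An aggregator polling every hour then costs the server a handful of header bytes per poll, rather than the full feed, whenever nothing has been published.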

Accept-Encoding: gzip

Another tool in our shed to dramatically reduce bandwidth usage is GZIP encoding of content sent over HTTP, upon acknowledging the “Accept-Encoding: gzip” header sent by all decent browsers today. This standard has been around for quite some time. From what I recall, a few issues led some of the most trafficked sites to shy away from it in the past: 1) certain versions of Netscape would crash when retrieving a .css or .js document over a gzipped stream. Depending on who your audience is, this is very unlikely to be an issue today. 2) GZIP yields the greatest benefits to the most trafficked sites, yet bandwidth savings could at times be offset by greater CPU consumption on highly strained servers. Depending on what your web application does, and how large the documents being served are, this too is unlikely to be much of an issue with fairly recent server hardware.
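To get a feel for the savings, here’s a small stand-alone sketch that gzips a repetitive, feed-like chunk of XML using java.util.zip.GZIPOutputStream. The feed content and the example.com URLs are made up for illustration; the point is that markup-heavy text like RSS compresses very well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipSavings {
    // Compress a byte array with GZIP and return the compressed bytes.
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(input);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Fabricate a repetitive, RSS-like document: lots of identical tags.
        StringBuilder feed = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            feed.append("<item><title>Article ").append(i)
                .append("</title><link>http://example.com/").append(i)
                .append("</link></item>\n");
        }
        byte[] raw = feed.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(raw);
        System.out.println("raw: " + raw.length
                + " bytes, gzipped: " + compressed.length + " bytes");
    }
}
```

Multiply that ratio by a feed fetched hourly by thousands of subscribers and the appeal becomes obvious.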

Google gzips its output … provided you give it a User-Agent it recognizes.

If you run Apache’s httpd, you might consider trying out mod_gzip.

If you’re in a Java Servlet container, there are quite a few ways of gzipping your output. A servlet filter will behave very much like mod_gzip. If you’re working on an individual servlet, you just might “Writer out = new BufferedWriter(new OutputStreamWriter(new GZIPOutputStream(response.getOutputStream()), "UTF-8"));” your way out of it, after making sure the requester did send gzip as a value of the Accept-Encoding request header, and setting the Content-Encoding: gzip header in your response.
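Pulling those pieces together, a sketch of that negotiation might look like the following. The helper name is my own, it takes a plain OutputStream so it stays free of the Servlet API, and the naive contains("gzip") check is a deliberate simplification: a production version would parse q-values and reject, say, "gzip;q=0".

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.util.zip.GZIPOutputStream;

public class GzipNegotiation {
    // Wrap the response stream in a GZIPOutputStream only when the client's
    // Accept-Encoding header actually lists gzip. In a servlet you would also
    // call response.setHeader("Content-Encoding", "gzip") before writing,
    // and pass response.getOutputStream() as the second argument.
    static Writer responseWriter(String acceptEncoding, OutputStream out)
            throws IOException {
        if (acceptEncoding != null && acceptEncoding.toLowerCase().contains("gzip")) {
            out = new GZIPOutputStream(out); // simplification: ignores q-values
        }
        return new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
    }
}
```

Closing the returned Writer flushes the buffers and, when gzip was negotiated, writes the GZIP trailer, so always close it when the response is complete.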