I used to think of APIs as some kind of magical, amazing constructs that could only be used by people willing to write big chunks of code. But when I started using Yahoo Pipes, I noticed that other people were using APIs to gather useful data using what looked a lot like regular, ordinary URLs. Now, this isn’t to say that every API can be used without any programming skills, but using many of the APIs made available by services on the web really isn’t as hard as people seem to think. In this post, I’m going to show you how you can make use of APIs (perhaps to gather data, or to carry out some automation) with minimal programming.
Construct the URL
The first thing you need to know is how to construct the URL. Each API is slightly different, so you need to review the documentation for the API you’re interested in. Most of the time, the documentation will include examples that you can tweak to get the output that you’re after.
For example, here is the documentation for the MediaWiki API’s backlink query. It can return a list of pages that link back to a specific page that you are interested in. It includes an example URL that looks for all the pages that link to Wikipedia’s home page. I’m going to tweak that so that it finds up to 10 pages that link to Wikipedia’s Star Wars page.
Let’s break down that URL and see how it’s made up:
- http://en.wikipedia.org/w/api.php — This is the place on the web where the API lives. Here, we’re using Wikipedia as an example, but you can replace this first part with the URL of any MediaWiki installation.
- action=query — fetch the data requested in the rest of the URL.
- list=backlinks — list pages that link to a specific page.
- bltitle=Star%20Wars — the specific page with the title “Star Wars” (note that %20 represents a space).
- bllimit=10 — limit to 10 results.
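Pieced together, that request can be built in a few lines of Python. The parameters are exactly the ones listed above; the only extra is format=json, another parameter from the MediaWiki docs that asks for machine-readable output instead of the default HTML view:

```python
from urllib.parse import urlencode, quote

# Build the backlinks query URL piece by piece.
base = "https://en.wikipedia.org/w/api.php"
params = {
    "action": "query",       # fetch the data requested below
    "list": "backlinks",     # pages that link to a specific page
    "bltitle": "Star Wars",  # the page we're interested in
    "bllimit": 10,           # limit to 10 results
    "format": "json",        # machine-readable output
}
# quote_via=quote encodes the space as %20, as in the example above
url = base + "?" + urlencode(params, quote_via=quote)
print(url)
# https://en.wikipedia.org/w/api.php?action=query&list=backlinks&bltitle=Star%20Wars&bllimit=10&format=json
```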
I simply took one of the default examples listed in the API documentation and made only minor changes to gather the information that I was interested in seeing.
API calls can return a variety of formats depending on the service. I usually start out by returning the data in HTML or XML that I can easily view in a web browser, to make sure that my query is correct and I’m getting the results from the API that I expect. I’ll then sometimes switch to another format, like JSON, if I want to download the data to use for some other purpose later.
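Once you switch to JSON, pulling out the bits you care about takes only a few more lines. The overall shape of the response below matches what a backlinks query returns, but the two page titles are made-up sample data for illustration:

```python
import json

# A trimmed, illustrative example of a backlinks JSON response
# (the structure follows the MediaWiki docs; the titles are invented).
sample = """
{"query": {"backlinks": [
    {"pageid": 1, "ns": 0, "title": "George Lucas"},
    {"pageid": 2, "ns": 0, "title": "Lightsaber"}
]}}
"""
data = json.loads(sample)
for page in data["query"]["backlinks"]:
    print(page["title"])
```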
Let’s take a look at another example. Here is the documentation for Twitter’s GET users/show function, which returns all of the available information about a user, including description, URL, link to profile image, last tweet, count of friends / followers, and much more. Again, I can simply modify the provided example query to return the information I’m interested in:
- http://api.twitter.com/1 — Version 1 of the Twitter API.
- users — users section of the API to gather information on a user.
- show.xml — display the output as XML.
- screen_name=geekygirldawn — the user that you want information about (geekygirldawn).
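Assembled from those pieces, the full call looks like this (using version 1 of the Twitter API, as documented at the time of writing):

```python
from urllib.parse import urlencode

# users/show.xml returns the user's profile information as XML
base = "http://api.twitter.com/1/users/show.xml"
url = base + "?" + urlencode({"screen_name": "geekygirldawn"})
print(url)
# http://api.twitter.com/1/users/show.xml?screen_name=geekygirldawn
```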
Keep in mind that most APIs have some kind of rate limiting, which means that you can only make so many calls to the API from a given IP address or account in a given amount of time before they cut you off. For Twitter this limit is 150 API calls per hour. This is to prevent people from abusing the API and putting too heavy a load on the servers, but it also means that you might be playing with some API calls to get the parameters just right when you are suddenly cut off. Don’t panic. Go have a drink or a snack, and come back in an hour or so to try again.
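If you do want your script to wait politely instead of you doing it yourself, a small retry loop works. This is a minimal sketch: services signal rate limiting in different ways (commonly HTTP status 429, or 420 in Twitter’s case back then), so check your API’s docs for the exact behavior:

```python
import time
import urllib.error
import urllib.request

def fetch_with_backoff(url, retries=3, wait=60):
    """Fetch a URL, sleeping and retrying if the service rate-limits us."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code in (420, 429):  # rate limited: wait, then try again
                time.sleep(wait)
            else:
                raise  # some other problem; don't mask it
    raise RuntimeError("still rate limited after %d attempts" % retries)
```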
Some APIs require that you sign up for an API key. This is usually to keep track of your requests, and you should think of it as a little like a password that shouldn’t be shared with other people. In many cases, an API key is what the API uses to rate limit your requests.
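In the simplest case, the key is just one more query parameter on the URL. The parameter name varies from service to service, and everything in this sketch (the endpoint, the "api_key" parameter name, and the placeholder key) is made up for illustration:

```python
from urllib.parse import urlencode

# The key rides along with the rest of the query parameters;
# treat the real value like a password and keep it out of shared code.
params = {"q": "star wars", "api_key": "YOUR-KEY-HERE"}
url = "https://api.example.com/search?" + urlencode(params)
print(url)
```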
Combining APIs for More Power
With some additional programming expertise, or using a tool like Yahoo Pipes, you can start to combine these APIs to get some really interesting information that can’t easily be gathered in other ways. For example, I have one Yahoo Pipe that:
- starts with an RSS feed of all of my WebWorkerDaily posts
- takes each link and runs it through the BackType API to see who has posted the link to Twitter
- then uses the Twitter API to see how many followers each person who tweeted the link has
- and finally it formats all of this information into a new RSS feed in the form of “username (num of followers): Tweet text” that I can view in my RSS reader
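The steps above can be sketched as plain glue code, too. Everything here is schematic: the three helper functions return canned sample data, where a real script would call the RSS feed, the BackType API, and the Twitter API described earlier:

```python
def post_links():
    # step 1: links from the blog's RSS feed (canned sample data)
    return ["http://example.com/post-1"]

def tweets_for(link):
    # step 2: BackType lookup -- who tweeted this link? (canned sample data)
    return [{"user": "geekygirldawn", "text": "Nice post: " + link}]

def followers_of(user):
    # step 3: Twitter users/show lookup (canned sample count)
    return 1500

def build_feed_items():
    # step 4: format each result as "username (num of followers): Tweet text"
    items = []
    for link in post_links():
        for tweet in tweets_for(link):
            items.append("%s (%d): %s"
                         % (tweet["user"], followers_of(tweet["user"]), tweet["text"]))
    return items

for item in build_feed_items():
    print(item)
```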
What are your favorite tricks for using APIs to gather interesting data?