Is There Money in Voice APIs?
I’ve been covering the VoIP space since 2004, and lately it seems like every other press release sent my way is from a company announcing the addition of an application programming interface (API) to its telephony platform. The promise of APIs is that they make it easy to integrate different services — even those provided by different vendors — into a single application. The press release from one carrier even went so far as to claim that its API would “boost innovation and development of new apps exponentially.”
But is simply providing an API to your telephony infrastructure enough to prompt the world to beat a path to your door? Don’t count on it.
To be sure, these APIs are necessary, particularly in the world of voice mashups. Voice mashups combine voice with data and applications across multiple systems to create a new, useful service.
One example of a voice mashup is Twitterfone, a free service that takes your voice, converts it to text and sends it to Twitter. MAXroam provides the overall infrastructure and inbound telephone numbers, Dial2Do does the speech-to-text part, and Zong provides some inbound SIP termination. APIs are needed all around — including on the voice side — to make this happen seamlessly.
Voice mashups can be useful in the business space. They can save a ton of money, and can help to enforce both business process quality and consistency. Imagine calling an airline and speaking to an interactive voice response (IVR) system. A certain percentage of calls could easily be handled by the IVR, which can ask all the correct questions to ensure customers have the right information.
There are, of course, times when speaking with a live human being is necessary. So imagine that all the data collected by the IVR about your call is then sent to a customer service representative so that by the time the two of you are connected, they already know exactly why you’re calling. The call could even be routed to a particular rep based on the reason you’re calling.
This is the power of a voice mashup — the ability to treat voice and data interchangeably. While large companies have been able to afford the cost of developing these custom voice mashups, tools and services are now becoming available that let you make your own.
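The airline scenario above can be sketched in a few lines of Python. This is only an illustration of the idea, not any vendor's API: the queue names, the `IvrSession` fields and the `route_call` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical queue names, keyed by the call reason the IVR collected.
QUEUES = {
    "rebooking": "rebooking_desk",
    "baggage": "baggage_claims",
}
DEFAULT_QUEUE = "general_support"


@dataclass
class IvrSession:
    """Data gathered by the IVR before a human picks up."""
    caller_id: str
    reason: str
    answers: dict = field(default_factory=dict)


def route_call(session: IvrSession) -> dict:
    """Pick a queue from the call reason and build the rep's screen pop."""
    queue = QUEUES.get(session.reason, DEFAULT_QUEUE)
    return {
        "queue": queue,
        # Everything the caller already told the IVR travels with the call.
        "screen_pop": {
            "caller_id": session.caller_id,
            "reason": session.reason,
            **session.answers,
        },
    }


if __name__ == "__main__":
    session = IvrSession("+15550100", "rebooking", {"flight": "XY123"})
    handoff = route_call(session)
    print(handoff["queue"])
    print(handoff["screen_pop"])
```

The point of the sketch is that the voice leg and the data leg share one record, so the rep sees the caller's answers the moment the call connects.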
Jaduka started out by providing a voice API to their telephony infrastructure, which is maintained by their parent company, NetworkIP. But Jaduka quickly discovered that while developers signed up for the API, few were actually using it to launch services. The company now offers customized voice-enabled applications to enterprise customers.
Jaduka’s customers currently use over 4 million minutes a month, a number that is trending upward. But that’s a drop in the bucket compared to the more than half a billion minutes a month their parent company serves.
Ifbyphone provides a number of voice-related small business services as well. They also offer a voice API, but it’s essentially driven by web forms, which makes it easy to integrate their telephony services into any web site without needing to be a programmer.
And while not everyone agrees that what Ifbyphone provides would qualify as a proper API, it does offer a range of useful services to small businesses, such as interactive voice response, intelligent call routing and voice broadcast. They are all designed to help small businesses interact directly with their customers in the most efficient manner possible.
Indeed, APIs enable some great solutions. But APIs aren’t solutions in and of themselves. Nor do they necessarily make money.
Consider Ribbit, a company whose business model is to make telephony available through APIs. The thinking is that they’ll make their money on revenue shares as developers create interesting applications.
If Jaduka’s experience is any indication, however, I don’t expect Ribbit will last too much longer without a complete change of strategy. Ribbit might have 4,000 developers, but how many of them are actually making applications on which Ribbit is able to share revenue? I don’t put a lot of stock in the rumor that BT has purchased Ribbit for $55 million.
Even where you’ve got more than just an API, as is the case with Jaduka and Ifbyphone, the prospects for making a pot of money just don’t seem that great. The combined revenue of Jaduka and parent company NetworkIP is thought to be north of $150 million a year. Assuming Jaduka’s share of minutes per month also translates into share of revenue, that suggests Jaduka is responsible for roughly $1.2 million of that revenue. Ifbyphone would not disclose customer numbers or revenues.
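For readers who want to check the Jaduka estimate, here is the back-of-the-envelope arithmetic as a short Python sketch. It uses the round figures quoted above (4 million minutes, half a billion minutes, $150 million) and assumes revenue is spread evenly across minutes, which is a simplification; the function name is made up for illustration.

```python
# Round figures quoted in the article; both totals are lower bounds.
JADUKA_MINUTES_PER_MONTH = 4_000_000
PARENT_MINUTES_PER_MONTH = 500_000_000
COMBINED_ANNUAL_REVENUE = 150_000_000  # US dollars


def revenue_by_minute_share(minutes: int, total_minutes: int, total_revenue: int) -> float:
    """Estimate revenue, assuming it is proportional to minutes carried."""
    return minutes * total_revenue / total_minutes


if __name__ == "__main__":
    share = JADUKA_MINUTES_PER_MONTH / PARENT_MINUTES_PER_MONTH
    estimate = revenue_by_minute_share(
        JADUKA_MINUTES_PER_MONTH,
        PARENT_MINUTES_PER_MONTH,
        COMBINED_ANNUAL_REVENUE,
    )
    print(f"Share of minutes: {share:.1%}")
    print(f"Estimated annual revenue: ${estimate:,.0f}")
```

In other words, 0.8 percent of the minutes maps to about $1.2 million a year, and since the parent's actual totals are higher than the round numbers used here, the real share may differ.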
I think the market has a lot of potential, but so far, that’s about it. Go ahead and make those telephony APIs available, but don’t expect the world to beat a path to your door, and don’t expect to make any money just by publishing APIs. Figure out who your customers are, find out what problems they have, and develop solutions to meet their needs. APIs can certainly be a part of the overall strategy, but relying on APIs alone to generate revenue is a pipe dream.

Nice review of the space, Dameon. I think the value of an API is determined by A) what functionality it exposes and B) the power of the underlying infrastructure.
In particular, I think carriers have a lot to gain today from the right APIs. They need innovative services to help differentiate what has become a real commodity (plain old voice). They also need ways to drive consumer loyalty to their brand. But they are not organizations that are conducive to the kind of creativity this needs.
I know that isn’t entirely on point with your article, but it was fresh in my mind having just read this interesting piece from Alan Quayle: http://tinyurl.com/6fmrko
This is exactly what I have been preaching for a year, and I have met with much snickering and eye rolling. Voice APIs (in and of themselves) cannot make money; there are no startups (that I am aware of) that have pitched and closed funding for a business concept based on using someone else’s network or API. Conversely, customers of network providers are willing to pay for minutes, but they are NOT willing to pay for the use of an API. APIs are expensive to develop and maintain, and unless your end goal is to provide a toolkit for others to quickly grow a business and have the ability to make a serious margin, they will remain unused.
Credit where credit is due: Ifbyphone seems to be the closest to “getting it” so far. Jaduka, although backed by a major network, is still very much Voice 1.0 in their thinking. Skype, even with its developer network and thousands of applications, has not made anyone any real money.
There will be a major shakeout in this space in the coming 18 months. Voice mashups are great for proof of concept, but when it comes down to brass tacks, you always end up building your own telephony infrastructure rather than relying on someone else’s, as no one will fund a startup that is 100% reliant upon another startup.
@Shai APIs at the carrier level make a lot of sense. Telcos are built on interoperable standards. APIs are simply a logical extension of that. However, I think the APIs will be geared more at other carriers, not at enterprise and small businesses like Jaduka and IfByPhone are doing.
@Andrew: I don’t see Jaduka being “voice 1.0 in their thinking” as entirely a bad thing. They do have telco-grade equipment with telco-grade reliability at their disposal. Also, while we spent a few years thinking about Voice 2.0 and beyond, the vast majority of enterprises and businesses are just now starting to come to terms with Voice 1.5–the fact voice can go over IP at all.
I think you hit the nail on the head about building your own infrastructure versus depending on someone else’s. That being said, APIs come in real handy when you can wrap your hands around the infrastructure, e.g. a soft switch or PBX.
Great post, Phoneboy. Good to see you here. I am already waiting for a comment from Thomas Howe, as he is the pope of voice mashups but never could really convince me with his takeup theories.
Nice write-up, Dameon. There is a lot of buzz around Web Services for voice. It’s great to see voice join the development mainstream from this standpoint. However, I completely agree that having an API is just a third of the three-pronged challenge: developing innovative applications, maintaining a robust voice network, and attracting a critical mass of end users in a win-win value chain that should be measured by usage and revenue generated for the developer.
Most players today are doing the easy stuff – APIs + handful of developers + toy mashups – and declaring victory. Getting to a critical mass (and I mean millions) of end-users almost inevitably means working with an incumbent Telco today, working through their constraints of operationalizing innovative applications at scale, and promoting them to an end user base that only knows of a vanilla POTS service. Only a brave few are focusing on the tremendous potential here. By the way, this is not just a technical problem. It is a huge business and marketing challenge as well.
Before you get too carried away judging this new category, you might want to check out Ribbit for Salesforce, which was built using the Ribbit API. While it’s only been out since May, it’s getting great traction among Salesforce users for the way it integrates voice directly into the application to increase productivity. Intelligently integrating voice into the actual workflow effectively unlocks a new layer of value for the users, in this case on the order of $30.00+ a month.
We believe in the future, voice largely becomes a feature layer inside applications (“dumb” voice goes away over time). So it’s not as much about voice, as it is about intelligent integration and workflow automation. If you think of voice as an under-utilized data object, interesting things start to happen. We are convinced this is just the first example – we know because we have visibility into “work in progress” by other developers. Applications without voice integration will eventually become the exception. Check out Ribbit for Salesforce:
http://www.ribbit.com/salesforce/
Yes, I work for Ribbit :)
The voice API space you are talking about already exists as CTI – computer telephony integration.
Unless you were being sarcastic (since it seems to work so rarely), the ability to grab that info from the IVR – as well as other info such as your phone number via Caller ID and the number you called in on – and then pass it on to the contact center agent is well established and has been around for a number of years.
Businesses are already making call routing decisions based on this kind of data – whether you like it or not. For example, a credit card company that I know of will have you enter your account number and, if you are delinquent on your payment, will always route you to the “why are you late on your payment” queue, regardless of what you were actually calling to do.
The two biggest players in the field are Cisco with the ICM/IPCC product and Alcatel with Genesys.
r.
Yes Markus, I couldn’t let this one go without comment…
http://thethomashowecompany.com/410/is-there-money-in-voice-apis
I wish I was the Pope; I’d tool around town in my Pope-mobile and my Prada shoes. That would be awesome. As it is, I have to hang around and make money from Voice APIs. Seriously, I’m no Pope. Think of me as a slightly smaller Friar Tuck.
Here’s the thing (and you imply it in your comment): the question isn’t about the existence of money in Voice APIs. All the money is in APIs. The real question is who gets it and why, and how your company can get its share of it. Certainly there is money to be made providing APIs, as they are a basic enabler for everything that goes downstream of them, and they certainly, certainly, certainly are going to make some serious cheese. The question is how the market matures, not whether there’s a market. Rob is quite right – the market exists. Voice APIs make it grow.
Dameon,
Nice piece – and congrats on posting on GigaOm. I agree with you and the others that have commented here that APIs *alone* will not bring the masses to your door. In fact, I’ve become increasingly disheartened by the sheer number of new voice APIs that are appearing… most of which are NOT based on open standards and are NOT interoperable. So developers are forced to learn Yet-Another-API in order to communicate with a given platform… and a voice application built for that platform can’t run on another platform or even communicate with that other platform.
I am a huge fan of voice mashups (in fact, I’ll be speaking about Voice Mashups using Open Standards next week at O’Reilly’s Open Source Convention(OSCON)) and I believe that we need more openness and more usage of APIs… but I also believe that for the overall industry to be successful, we need to ensure that those APIs provide interoperability and avoid vendor lock-in. Yes, a company will undoubtedly want to provide access through an API to their “special sauce”, whatever it is that makes their platform/service special… but that does, in my opinion, need to be balanced with using open standards so that developers don’t need to re-learn everything just to program for that service. The building blocks are already out there… the Session Initiation Protocol (SIP) for call signaling, RTP/SRTP for voice media, VoiceXML for voice applications (including IVR systems) and Call Control XML (CCXML) for an XML-based API to control signaling on top of something like SIP. These blocks just need to be used in the new APIs.
I’ve been writing about issues like this over on my Disruptive Telephony blog ( http://www.disruptivetelephony.com/ ) for some time, but since last fall I’ve also been employed by Voxeo ( http://www.voxeo.com/ ), a company providing a voice application platform since 2001. We’ve had over 30,000 developers create over 66,000 applications on our hosted platform (http://evolution.voxeo.com/ – which runs on our own computing cloud/infrastructure), and thousands more on our premise platform, almost all based on the open standards of VoiceXML, CCXML and SIP.
We’ve seen customers build some amazing voice applications that integrate deep into other business systems using these protocols and APIs. But you’re right, it’s not solely about the APIs… the APIs are just part of the overall tool set used to help customers solve problems.
Dan
P.S. As to your overall question about whether there is money in voice APIs, I can say that there *is* money in providing a platform for executing apps built on those open APIs. Voxeo’s been profitable (and growing) for over 4 years now…
Any historical discussion of this space should not fail to mention Angel.com. They were founded around the same time as Voxeo and seem to have been growing steadily since 1999. I’m not affiliated with them but I remember looking at what they were doing 8 years ago and thinking they were way ahead of their time.