IDNA and its impact on the global web

Get ready for a fracturing of the Internet domain name space, with the global Internet looking far less homogeneous and more like the ancient town of Babel. A lot of it has to do with Internationalizing Domain Names in Applications, or IDNA.

The Internet and its user applications have long been dominated by the ASCII character set that is familiar to English speakers, or at least by combinations of ASCII characters that are designed to be spoken and understood.

With ICANN set to implement international domain names (officially Internationalizing Domain Names in Applications, or IDNA) in the root of the DNS in the near future, it appears that your favorite applications (email, browser, IM, etc.) are about to become multilingual. That could fracture the Internet domain name space and leave the global Internet looking far less homogeneous and more like the ancient town of Babel.

I first learned of international domain names from a recent conversation with David Conrad, the General Manager of the Internet Assigned Numbers Authority (IANA) at the Internet Corporation for Assigned Names and Numbers (ICANN). As David explained, IDNA is a mechanism that lets domain names appear to contain non-ASCII characters.

This means that a business that operates in a geography (or wants to reach a community from a geography) that does not use ASCII characters would be able to represent itself on the Internet using its native character set. Without diving into the social and political issues that are far beyond my areas of expertise, this intuitively makes sense to me, but it has global significance for how we use domain names.

For example, if a local Chinese bakery wants to build a website, why should it have to come up with an ASCII character representation of its domain name, especially when most of its customers may speak Chinese and have a Chinese character keyboard? Users with a Chinese input device would type the proper characters into their browser and be directed to the bakery’s website.

For users without a Chinese input device who want to go to the same bakery, things get more complex. IDNA has these users enter the name into their application as Punycode, an ASCII-only encoding of non-ASCII strings defined by RFC 3492. This mechanism is beneficial because the DNS itself still handles only ASCII names, which preserves backward compatibility with existing Internet applications and services.

However, this means that the ASCII character user has to type a string that begins with “xn--” into a browser or email client to reach the Chinese bakery (Wikipedia gives a good example: the Swiss domain bücher.ch is encoded as xn--bcher-kva.ch). Today, both IE and Firefox support Punycode, as do other browsers and some email servers.
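To make the translation concrete, here is a minimal Python sketch of the round trip for the bücher.ch example above. It relies on Python's built-in "idna" codec, which follows the older IDNA 2003 rules and may not match what every browser does today; the helper names to_ascii and to_unicode are my own, chosen only for illustration.

```python
# Minimal sketch: convert a domain name to and from its Punycode ("xn--") form.
# Assumption: Python's built-in "idna" codec (IDNA 2003 rules) is good enough
# for illustration. The helper names below are hypothetical, not a standard API.

def to_ascii(domain: str) -> str:
    """Return the ASCII form of a domain, label by label."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Return the native-character form of an ASCII ("xn--") domain."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("bücher.ch"))                          # xn--bcher-kva.ch
    print(to_unicode("xn--bcher-kva.ch") == "bücher.ch")  # True
```

Note that only the label containing non-ASCII characters gets the “xn--” prefix; a plain ASCII label such as “ch” passes through unchanged.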

And here is where the Internet starts to resemble Babel. If IDNA takes root (no pun intended) and every country starts to use domain names with non-ASCII characters, how will those of us with ASCII input devices find these domain names and their associated Punycodes to enter into our applications?

How will we know that the Chinese bakery even exists? Better yet, how will someone with a Chinese input device be able to reach a website or email address in a different non-ASCII domain (such as Cyrillic, for example) if they cannot enter a Punycode string with the “xn--” characters?

In the end, I think that international domains make intuitive and practical sense even if they will force a change in user behavior. IDNA is leading us toward a fractured domain name space based on native-language character sets, and while I am not a sociologist, this feels like the globally correct thing to do.

Anyone have a business plan for a Punycode Babel fish that I can fund?

Allan Leinwand is a venture partner with Panorama Capital and founder of Vyatta. He was also the CTO of Digital Island.
