Weekly Update

Customers to Cloud Providers: ’Fess Up, and Fix It

A couple of times over the past two months, I brushed aside concerns over failures and outages at Google and Amazon Web Services. But that was then, and this is now. This week, failures at Rackspace and again at Google, this time affecting App Engine, have even me a little worried about the future of cloud computing. What strikes me hardest, however, are the reactions to Rackspace’s and Google’s respective explanations of their failures. These are new vendors selling a new paradigm, and users expect a new type of customer experience.

Rackspace received praise for tweeting throughout its outage (which took down numerous popular web sites), but it also received criticism for being disingenuous. At least one customer was appalled that Rackspace tried to pass the outage off as a freak occurrence without offering plans to ensure it never happens again. Of course, this is not the first Rackspace outage: one in 2007 also knocked out many popular sites. (Rackspace’s eponymous cloud platform was not affected by this week’s outage, but the line between hosting and cloud computing is blurring.)

As for Google, well, it has been a tough year availability-wise. When App Engine went down on Thursday, Google was criticized for the impersonal manner in which it dealt with customer concerns. This has some, including Stacey Higginbotham at GigaOM, wondering whether Google has what it takes to attract and keep enterprise customers.

The problem with the responses is this: Cloud computing is about revolting against old IT practices, which include customer service and problem resolution as well as provisioning machines. Letting customers spin up machines on demand using only a credit card, and charging them by the drink, has freed users from the slowness, inefficiency and bureaucracy that define many traditional procurement models. Apparently, this freedom has catalyzed a change in expectations across the board.

When problems arise in the cloud, users expect transparency from their providers, and they expect solutions. They want to hear, “We messed up, and this is how. It won’t happen again, and this is why.” This might be a far cry from the traditional problem-resolution methodology of vague explanations and ad hoc bug fixes, but, done right, cloud computing is a far cry from traditional IT.

Question of the week

How does the recent spate of outages affect your opinion of cloud reliability?