
Summary:

Microsoft Azure, which started the week with kudos as the best cloud storage service, is down and out on Friday.

That’s life, as Frank Sinatra once sang. Microsoft Azure Storage was named the world’s best public cloud storage service on Tuesday, then crashed and burned on Friday.

Here are a few of the posts to the Windows Azure status dashboard: 

22-Feb-13  ·  9:45 PM UTC

Access Control v2, Service Bus, WindowsAzure.com and WebSites services are impacted by Storage service degradation worldwide. We are actively validating the recovery steps to resolve it as soon as possible. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

22-Feb-13  ·  8:44 PM UTC

We are experiencing an issue with Storage Worldwide and this is impacting all dependent services. We are actively investigating this issue and working to resolve it as soon as possible. Further updates will be published to keep you apprised of the situation. We apologize for any inconvenience this causes our customers.

Folks on Twitter and elsewhere attributed the snafu to the lack of a new SSL certificate. If such a certificate does expire, users cannot authenticate against their various services: No authentication, no access.

Update: As of Saturday morning, this message was posted to the Azure status page. There was no timestamp, so it is unclear when it was posted. All of the storage areas affected on Friday still showed “service interruption” status.

On Friday, February 22 at 12:44 PM PST, Storage experienced a worldwide outage impacting HTTPS traffic due to an expired SSL certificate. This did not impact HTTP traffic. We have executed repair steps to update SSL certificate on the impacted clusters and have recovered to over 99% availability across all sub-regions. We will continue monitoring the health of the Storage service and SSL traffic for the next 24 hrs. Customers may experience intermittent failures during this period. We apologize for any inconvenience this causes our customers
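For readers who want to watch for this kind of lapse on their own endpoints, here is a minimal sketch of a certificate expiry check using only the Python standard library. It is an illustration, not Microsoft's tooling: the function names and the 30-day warning window are assumptions made for this example.

```python
import socket
import ssl
import time


def seconds_until_expiry(not_after, now=None):
    """Seconds left before a certificate's notAfter time (negative if expired).

    not_after uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    if now is None:
        now = time.time()
    return ssl.cert_time_to_seconds(not_after) - now


def needs_renewal(not_after, warn_days=30, now=None):
    """True when the certificate expires within warn_days (hypothetical threshold)."""
    return seconds_until_expiry(not_after, now) < warn_days * 86400


def fetch_not_after(host, port=443, timeout=5.0):
    """Return the notAfter string of the certificate served by host:port.

    The default context verifies the certificate, so an already expired
    certificate raises ssl.SSLCertVerificationError during the handshake
    instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Run on a schedule against each HTTPS endpoint, something like `needs_renewal(fetch_not_after("example.com"))` would flag a certificate weeks before it lapses rather than after customers notice.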

I’ve asked Microsoft for comment and will update this when the company responds. Whatever the cause of the problem, it’s been an up-and-down week for Windows Azure. On Tuesday, Nasuni, a company that manages cloud storage for business customers, said Windows Azure storage outperformed all four other cloud services, including Amazon S3, in rigorous performance testing. Despite Azure’s performance, Nasuni said it would stick to S3 as its primary supplier, citing its maturity. Looks like that may have been the right call.

Well, as Sinatra sang: “Riding high in April, shot down in May.” Web time just accelerates the process.

This story was updated February 23 at 6:25 a.m. PST with a newer statement from the Microsoft Azure status page.

  1. Most of our services are down coz of this funny problem..

  2. Why would anyone want to trust Microsoft with their cloud infrastructure when their OS has been notorious for bugs and security holes for decades?

    I don’t understand how Microsoft can consistently put out such buggy products.

    People used to say that Windows only had so many problems because it was so popular. Yet here it is in 2013 and I don’t see Apple and Google having near as many issues like that as Microsoft has had historically. The smartphone and tablet market is gigantic, and those products “just work”.

    It looks to me like Microsoft has a cultural problem that allows buggy code to be shipped instead of doing things the right way the first time.

    1. You should check who the boss of Windows Azure is! And what his career achievements at Microsoft are!

    2. Apple and Google’s products “just work”? Apparently you have never owned an Android or iPhone or iPad. I’ve owned four over the last three years, and troubleshoot for end users in our company, and software glitches are just as prevalent on both Android and Apple, and do require rebooting, reinstalling, resetting, etc. Please do not bash one side without giving due bashing to what actually has the problem: Anything that is programmed by a human, i.e., all electronics.

  3. Reblogged this on Enterprise Computing Speedbumps and commented:
    Ouch!

  4. human error. bet they will be tracking cert expiration dates from now on..

