Parts of Amazon Web Services suffer an outage

Updated: Amazon (s AMZN) Web Services is having trouble this evening and, in the process, is taking down some major sites and services. Among the sites affected are Quora and HipChat. The outage has also hit Heroku, a division of Salesforce (s CRM).

Amazon is a key infrastructure provider to some of the biggest and best-known startups, such as Pinterest and Dropbox. The outages were related to Amazon’s EC2 and RDS services, and the problems appeared to be localized to Amazon’s Northern Virginia data center. Other services hosted there, such as ElastiCache and Elastic Beanstalk, were also affected. The problem appears to be rooted in a power outage.

On its status website, Amazon notes regarding EC2:

We continue to investigate this issue. We can confirm that there is both impact to volumes and instances in a single AZ in US-EAST-1 Region. We are also experiencing increased error rates and latencies on the EC2 APIs in the US-EAST-1 Region.

9:55 PM PDT We have identified the issue and are currently working to bring effected instances and volumes in the impacted Availability Zone back online. We continue to see increased API error rates and latencies in the US-East-1 Region.

On the issue of RDS problems, AWS notes:

9:33 PM PDT Some RDS DB Instances in a single AZ are currently unavailable. We are also experiencing increased error rates and latencies on the RDS APIs in the US-EAST-1 Region. We are investigating the issue.
10:05 PM PDT We have identified the issue and are currently working to bring the Availability Zone back online. At this time no Multi-AZ instances are unavailable.

00:11 AM PDT As a result of the power outage tonight in the US-EAST-1 region, some EBS volumes may have inconsistent data.
01:38 AM PDT Almost all affected EBS volumes have been brought back online. Customers should check the status of their volumes in the console. We are still seeing increased latencies and errors in registering instances with ELBs.

AWS has suffered outages in the past. A widespread problem affected major websites in April 2011. In July 2008, Amazon’s S3 service went offline and caused major problems for many of its customers. I have been in touch with folks from Amazon and Heroku to get a better idea of what is going on. In the interim, enjoy some of the tweets about the outage.

Image courtesy of Shutterstock user michaket.

14 Responses to “Parts of Amazon Web Services suffer an outage”

  1. Michel

    Thank you, Ruben, for your suggestion. I have tested Lunacloud and it’s great (so far). I have run some tests and the performance is amazing :)

    Thank you once again!!

  2. James Ward

    Amazon has really started to annoy me these days. Apart from their immoral attempt to take over all things relating to buying and selling on the internet, they now cause havoc by taking half the internet down with them. They have also recently entered the B2B market and begun to compete with sites such as Thomasnet and Daily Sales Exchange, but as this news piece shows, not all is well in the Amazon camp. Well, I’m happy about that then :-)

  3. Amazon provided more detail about the outage early Saturday morning. Turns out startups that were deployed across multiple availability zones were not affected. Anyone have any idea what this kind of redundancy costs?

    • There’s no comparison to the EBS outage. This was an annoying glitch; something went blooey in a single AZ in US-EAST, but it was very, very minor compared to what happened with EBS. That was a systemic infrastructure failure, but even then, applications that were properly designed for the AWS environment did not go down. The lesson is still to architect for failure; sites that haven’t grokked AWS’s strengths and weaknesses at this point, and don’t do this, don’t have much to gripe about. Not to mention, this is a vocal, engaged, but tiny part of the internet we’re talking about. It’s not like anything really bad happened.

  4. We were affected too; the outage hit us in us-east-1b. Funny thing is that our paging service, PagerDuty, also went down temporarily.

    The outage lasted 15 minutes for us, and now we are able to bring up boxes.