Amazon reboot causes a tempest on Twitter
Amazon is planning to reboot an undetermined number of customer EC2 instances in the coming week or so, something that caused a stir among Amazon watchers today.
Randy Bias, CTO and founder of Cloudscaling and an authority on cloud services, first reported what appeared to be the scheduled reboot of “hundreds to thousands” of EC2 instances early Wednesday afternoon, then followed his initial Twitter post and blog post with live updates.
A check of the Amazon dashboard at 3:45 p.m. EST showed operations running normally except in the Northern Virginia data center, which reported “problems with ELB scaling and provisioning” for the eastern U.S. region. ELB, or Elastic Load Balancing, distributes a user’s application load across multiple EC2 instances.
An Amazon spokeswoman said via email that EC2 services are running normally and that the reboots are scheduled, with customers informed of them in advance. The company upgrades EC2 frequently, often in a way that is invisible to customers, but some updates require that instances be restarted, she said. “That’s what you’re seeing discussed today as we’ve sent customers the schedule.”
Customers can also restart their instances earlier than the scheduled time if they prefer.
“This rollout schedule matches pretty closely with the maintenance schedule you might see from traditional hosting providers or internal IT groups when they roll out software patches or updates. We are careful not to perform updates to multiple Availability Zones in the same Region on the same day so that customers won’t have instances in different Availability Zones update on the same day. We’re also giving notice a few days in advance for this maintenance window,” the spokeswoman added.
Bias published a redacted letter Amazon sent to a customer on December 6 with reboots slated to begin on December 15.
Amazon did not specify what the patches addressed, although some Amazon watchers think they may have to do with Xen hypervisor issues.
Photo courtesy of Flickr user liber.
Comments
Cloudscaling builds large IaaS clouds, chiefly for service providers. They are not “a big user of Amazon services.” And the basis of Randy’s post, and subsequent tweets, was that he’s impressed that AWS can manage such a massive undertaking with minimal disturbance to customers.
Agreed that Randy (and others) later gave Amazon high marks for handling this … Amazon got its due in the post, I think, given the spokeswoman’s comments. I will double-check the characterization of Cloudscaling. Thanks for your comment.
We have been affected by the reboot, and have worked with Amazon to minimize the impact. I am not sure why this is even worthy of a post. It’s par for the course in large-install systems operations.
Thanks for the comment. When you say affected, I assume you mean that they did the reboot already and it was a non-issue? How do you work with Amazon to minimize impact? I’m curious, not trying to be unfair. People with their own datacenters have similar issues, I know.
Instead of rebooting, we cloned our instances and just moved the IPs over, resulting in virtually no downtime.
Do you know what the underlying issue was? Was it Xen-related?
If you are interested in cloud issues with the Patriot Act or Amazon downtime, you can take a look at this blog: http://iwgcr.wordpress.com/
I will provide some interesting studies about cloud resilience within the next year!
We too were (and continue to be) affected by the reboot schedule. Amazon seems to be working the reboot schedule from the west coast eastward. However, because of the way our database platform is globally distributed, our customers don’t have to lose access to their databases during times like these – or even during failures. While these events take time to think through (at least on larger systems), what fabianschonholz said is correct – it’s par for the course in cloud computing.
You say it’s par for the course, but does this appear to be a bigger reboot than has been typical for Amazon in the past? That’s the perception I came away with. Thanks for your comment.