<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Why Amazon Went Down, and Why It Matters</title>
	<atom:link href="http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/</link>
	<description>Trusted Insights and Conversations on the Next Wave of Technology</description>
	<lastBuildDate>Thu, 26 Nov 2009 15:23:00 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Amazon Was Out For The Count &#171; Web News and Practical websites</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-963872</link>
		<dc:creator>Amazon Was Out For The Count &#171; Web News and Practical websites</dc:creator>
		<pubDate>Sun, 02 Aug 2009 14:57:55 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-963872</guid>
		<description>&lt;p&gt;[...] So what exactly happened. Well here are the facts from Gigaom [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] So what exactly happened. Well here are the facts from Gigaom [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Can Today&#8217;s Hardware Handle the Cloud? - GigaOM</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-886050</link>
		<dc:creator>Can Today&#8217;s Hardware Handle the Cloud? - GigaOM</dc:creator>
		<pubDate>Fri, 27 Jun 2008 15:48:32 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-886050</guid>
		<description>&lt;p&gt;[...] isn’t the first time load balancers have been implicated in an outage at Amazon. At O’Reilly’s Velocity conference, conference co-chair Jesse Robbins talked about a [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] isn’t the first time load balancers have been implicated in an outage at Amazon. At O’Reilly’s Velocity conference, conference co-chair Jesse Robbins talked about a [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Milly Dawson</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-884514</link>
		<dc:creator>Milly Dawson</dc:creator>
		<pubDate>Thu, 19 Jun 2008 19:03:04 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-884514</guid>
		<description>&lt;p&gt;It&#039;s too bad about Amazon going down for a spell but maybe it inspired someone or several someones to check out their local library. I hope so.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>It&#8217;s too bad about Amazon going down for a spell but maybe it inspired someone or several someones to check out their local library. I hope so.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: William Nortman</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883622</link>
		<dc:creator>William Nortman</dc:creator>
		<pubDate>Fri, 13 Jun 2008 18:14:23 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883622</guid>
		<description>&lt;p&gt;So how do you manage and diagnose such complex systems? You are probably talking about 1,000s of devices that all could be the root cause of the issue. How do you isolate the root cause? Classically, you’d have some type of monitor with rules to detect when certain issues occurred.  This MIGHT point you to the right location. However, with IT being the main differentiator to the end customer, new changes are constantly being rolled out. The system is always growing and changing, and customer usage patterns are always shifting over time. So the rules originally written tend to get loosened so that you don’t have alert storms. Now with the rules loosened up, you might not detect the failure and, even if you do, the events will not be as helpful in detecting the root cause. You could throw more and more smart people at the problem and constantly update and maintain your set of rules. However, as the system gets more and more complex this human cost will grow at an enormous rate. You need something different, a tool that automatically detects issues and adjusts to the changing system and usage patterns. You need a tool that uses statistical analytics to weed through the noise of the system and determines the relationship between the business information and the IT information in order to allow you to quickly get to the root cause of issues like this. If you got a single alert that told you that the load balancers were getting a higher than average reconnect rate, your number of sales was dropping below normal, your average load on all web servers was way below normal, the number of search transactions was way below normal, the number of connected users was way below normal, etc., you would have a quick idea of where to start.  No need to experience all your lines of defense being down again.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>So how do you manage and diagnose such complex systems? You are probably talking about 1,000s of devices that all could be the root cause of the issue. How do you isolate the root cause? Classically, you’d have some type of monitor with rules to detect when certain issues occurred.  This MIGHT point you to the right location. However, with IT being the main differentiator to the end customer, new changes are constantly being rolled out. The system is always growing and changing, and customer usage patterns are always shifting over time. So the rules originally written tend to get loosened so that you don’t have alert storms. Now with the rules loosened up, you might not detect the failure and, even if you do, the events will not be as helpful in detecting the root cause. You could throw more and more smart people at the problem and constantly update and maintain your set of rules. However, as the system gets more and more complex this human cost will grow at an enormous rate. You need something different, a tool that automatically detects issues and adjusts to the changing system and usage patterns. You need a tool that uses statistical analytics to weed through the noise of the system and determines the relationship between the business information and the IT information in order to allow you to quickly get to the root cause of issues like this. If you got a single alert that told you that the load balancers were getting a higher than average reconnect rate, your number of sales was dropping below normal, your average load on all web servers was way below normal, the number of search transactions was way below normal, the number of connected users was way below normal, etc., you would have a quick idea of where to start.  No need to experience all your lines of defense being down again.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Links List 6.13.08 &#124; IT's About Uptime - The StackSafe Blog</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883583</link>
		<dc:creator>Links List 6.13.08 &#124; IT's About Uptime - The StackSafe Blog</dc:creator>
		<pubDate>Fri, 13 Jun 2008 14:49:46 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883583</guid>
		<description>&lt;p&gt;[...] light of Amazon’s latest downtime issues, Gigaom explains why Amazon went down and why it matters. In a thorough explanation, Gigaom bets the problem to be with the CDN or AFE. The moral of their [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] light of Amazon’s latest downtime issues, Gigaom explains why Amazon went down and why it matters. In a thorough explanation, Gigaom bets the problem to be with the CDN or AFE. The moral of their [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: A Tale of Two Outages &#124; IT's About Uptime - The StackSafe Blog</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883246</link>
		<dc:creator>A Tale of Two Outages &#124; IT's About Uptime - The StackSafe Blog</dc:creator>
		<pubDate>Wed, 11 Jun 2008 16:20:39 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883246</guid>
		<description>&lt;p&gt;[...] downtime incidents crossed our paths recently that we thought deserved comment. You probably have read about the first (Amazon), but maybe missed the second (Southern Company Nuclear Power [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] downtime incidents crossed our paths recently that we thought deserved comment. You probably have read about the first (Amazon), but maybe missed the second (Southern Company Nuclear Power [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Vito</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883231</link>
		<dc:creator>Vito</dc:creator>
		<pubDate>Wed, 11 Jun 2008 14:37:41 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883231</guid>
		<description>&lt;p&gt;I still have the headers in my scroll back buffer:&lt;/p&gt;

&lt;p&gt;$ wget -S http://www.amazon.com -O /dev/null
--14:22:58--  http://www.amazon.com/
           =&gt; `/dev/null&#039;
Resolving www.amazon.com... 72.21.210.11
Connecting to www.amazon.com&#124;72.21.210.11&#124;:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 503 Service Unavailable
  Server: NS_6.1
  Content-Length:62
  Connection: close
14:22:58 ERROR 503: Service Unavailable.&lt;/p&gt;

&lt;p&gt;It appears that they are indeed running Citrix Netscalers (Server: NS_6.1) which is what returned the 503 error you see above.&lt;/p&gt;

&lt;p&gt;&quot;Sorry&quot; pages only work if configured; they are not a default.  Maybe Amazon hasn&#039;t gotten around to that.  ;)&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>I still have the headers in my scroll back buffer:</p>

<p>$ wget -S <a href="http://www.amazon.com" rel="nofollow">http://www.amazon.com</a> -O /dev/null
&#8211;14:22:58&#8211;  <a href="http://www.amazon.com/" rel="nofollow">http://www.amazon.com/</a>
           =&gt; `/dev/null&#8217;
Resolving www.amazon.com... 72.21.210.11
Connecting to www.amazon.com|72.21.210.11|:80&#8230; connected.
HTTP request sent, awaiting response&#8230;
  HTTP/1.1 503 Service Unavailable
  Server: NS_6.1
  Content-Length:62
  Connection: close
14:22:58 ERROR 503: Service Unavailable.</p>

<p>It appears that they are indeed running Citrix Netscalers (Server: NS_6.1) which is what returned the 503 error you see above.</p>

<p>&#8220;Sorry&#8221; pages only work if configured; they are not a default.  Maybe Amazon hasn&#8217;t gotten around to that.  ;)</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Fortresses in the Clouds: On-demand Platforms Had Better Build Moats - GigaOM</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883103</link>
		<dc:creator>Fortresses in the Clouds: On-demand Platforms Had Better Build Moats - GigaOM</dc:creator>
		<pubDate>Tue, 10 Jun 2008 23:18:03 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883103</guid>
		<description>&lt;p&gt;[...] Friday, Amazon’s U.S. site went off the air, and later some of its other properties were unavailable. Lots of folks who wouldn’t let me quote [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] Friday, Amazon’s U.S. site went off the air, and later some of its other properties were unavailable. Lots of folks who wouldn’t let me quote [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: michaelportent</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-883086</link>
		<dc:creator>michaelportent</dc:creator>
		<pubDate>Tue, 10 Jun 2008 21:02:47 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-883086</guid>
		<description>&lt;p&gt;@Mel: I think that rumor is hilarious too. It wouldn&#039;t surprise me that a bunch of gamers writing scripts to auto-buy available items could crash Amazon. That would be awesome.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>@Mel: I think that rumor is hilarious too. It wouldn&#8217;t surprise me that a bunch of gamers writing scripts to auto-buy available items could crash Amazon. That would be awesome.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Janke</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882997</link>
		<dc:creator>Michael Janke</dc:creator>
		<pubDate>Tue, 10 Jun 2008 12:41:35 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882997</guid>
		<description>&lt;p&gt;@john adams:  Netscaler load balancers (as Amazon is rumored to use) can re-direct to a &#039;Sorry Page&#039; when all backend services are down. It&#039;s a simple config, and we require it on all our load balanced services.&lt;/p&gt;

&lt;p&gt;That, and the fact that the HTTP error page that was presented looks just like the one generated by Netscalers when operating in proxy mode, indicates to me that the load balancer layer was up &amp; functional, but there was nothing behind it to which to send traffic, and that there was no redirect enabled.&lt;/p&gt;

&lt;p&gt;Having said that, re-directing a high volume site to a sorry page is a challenge itself. We maintain a load balanced pool of servers on a separate pair of load balancers, just to handle the sorry page from the load balanced applications.&lt;/p&gt;

&lt;p&gt;--Mike&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>@john adams:  Netscaler load balancers (as Amazon is rumored to use) can re-direct to a &#8216;Sorry Page&#8217; when all backend services are down. It&#8217;s a simple config, and we require it on all our load balanced services.</p>

<p>That, and the fact that the HTTP error page that was presented looks just like the one generated by Netscalers when operating in proxy mode, indicates to me that the load balancer layer was up &amp; functional, but there was nothing behind it to which to send traffic, and that there was no redirect enabled.</p>

<p>Having said that, re-directing a high volume site to a sorry page is a challenge itself. We maintain a load balanced pool of servers on a separate pair of load balancers, just to handle the sorry page from the load balanced applications.</p>

<p>&#8211;Mike</p>]]></content:encoded>
	</item>
	<item>
		<title>By: the application delivery network &#187; Load balancers don't have what???</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882903</link>
		<dc:creator>the application delivery network &#187; Load balancers don't have what???</dc:creator>
		<pubDate>Mon, 09 Jun 2008 19:45:37 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882903</guid>
		<description>&lt;p&gt;[...] have what??? Posted by: The ADC in Events, Opinion   Alistair Croll over at Gigaom had an interesting dissection of Amazon&#8217;s recent outage and an interesting deduction based on the facts regarding what went [...]&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>[...] have what??? Posted by: The ADC in Events, Opinion   Alistair Croll over at Gigaom had an interesting dissection of Amazon&#8217;s recent outage and an interesting deduction based on the facts regarding what went [...]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: BloggerBen</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882897</link>
		<dc:creator>BloggerBen</dc:creator>
		<pubDate>Mon, 09 Jun 2008 19:31:36 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882897</guid>
		<description>&lt;p&gt;This may be a dumb question, but did the Amazon cloud go down too?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>This may be a dumb question, but did the Amazon cloud go down too?</p>]]></content:encoded>
	</item>
	<item>
		<title>By: John Franks</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882893</link>
		<dc:creator>John Franks</dc:creator>
		<pubDate>Mon, 09 Jun 2008 19:05:44 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882893</guid>
		<description>&lt;p&gt;Check out David Scott&#039;s interview at the Business Forum:  http://www.businessforum.com/DScott_02.html.  It seems Amazon (and a lot of other organizations suffering data thefts, outages, bad projects, etc.) needs his book!&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Check out David Scott&#8217;s interview at the Business Forum:  <a href="http://www.businessforum.com/DScott_02.html" rel="nofollow">http://www.businessforum.com/DScott_02.html</a>.  It seems Amazon (and a lot of other organizations suffering data thefts, outages, bad projects, etc.) needs his book!</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Lori MacVittie</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882892</link>
		<dc:creator>Lori MacVittie</dc:creator>
		<pubDate>Mon, 09 Jun 2008 19:01:33 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882892</guid>
		<description>&lt;p&gt;@John Adams&lt;/p&gt;

&lt;p&gt;I think you haven&#039;t been looking around very much. Modern load balancers (application delivery controllers) a la F5 BIG-IP, have configurable &quot;apology&quot; pages when all nodes are down. This technology has been around for quite a while, it&#039;s not something new or unknown in the industry at all.&lt;/p&gt;

&lt;p&gt;Lori&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>@John Adams</p>

<p>I think you haven&#8217;t been looking around very much. Modern load balancers (application delivery controllers) a la F5 BIG-IP, have configurable &#8220;apology&#8221; pages when all nodes are down. This technology has been around for quite a while, it&#8217;s not something new or unknown in the industry at all.</p>

<p>Lori</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Avneesh Balyan</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882879</link>
		<dc:creator>Avneesh Balyan</dc:creator>
		<pubDate>Mon, 09 Jun 2008 17:21:37 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882879</guid>
		<description>&lt;p&gt;Seems like the site is down again....&lt;/p&gt;

&lt;p&gt;I was thinking of Amazon as the Google of the shopping domain (item search, reviews, etc.).
Time to re-think???&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Seems like the site is down again&#8230;.</p>

<p>I was thinking of Amazon as the Google of the shopping domain (item search, reviews, etc.).
Time to re-think???</p>]]></content:encoded>
	</item>
	<item>
		<title>By: robotthink</title>
		<link>http://gigaom.com/2008/06/06/why-amazon-went-down-and-what-it-means-to-you/#comment-882737</link>
		<dc:creator>robotthink</dc:creator>
		<pubDate>Sun, 08 Jun 2008 21:28:44 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=13706#comment-882737</guid>
		<description>&lt;p&gt;Alistair and Geo,&lt;/p&gt;

&lt;p&gt;It most certainly was a DOS attack, I assure you.&lt;/p&gt;

&lt;p&gt;And Seattle-ite is right about everything he says in his post, and that was in part why the DOS was possible.&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Alistair and Geo,</p>

<p>It most certainly was a DOS attack, I assure you.</p>

<p>And Seattle-ite is right about everything he says in his post, and that was in part why the DOS was possible.</p>]]></content:encoded>
	</item>
</channel>
</rss>
