<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Why are mobile networks dropping like flies?</title>
	<atom:link href="http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/feed/" rel="self" type="application/rss+xml" />
	<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/</link>
	<description></description>
	<lastBuildDate>Sat, 25 May 2013 16:05:44 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
	<item>
		<title>By: Jacinta</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-870064</link>
		<dc:creator><![CDATA[Jacinta]]></dc:creator>
		<pubDate>Sat, 28 Jul 2012 15:51:30 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-870064</guid>
		<description><![CDATA[I completely agree with those who say more background is required to write such an article. Otherwise nothing but random pieces of information are put together, which is of little use, since technically the whole thing doesn&#039;t make any sense!
What you&#039;re saying here is that there was an increase in signalling messages that led to a network crash. Hmm, that sounds familiar to me, because in 2001 the Vodafone Spain network crashed for a similar reason, and ta-da! There were no smartphones on the scene.
Whenever a new architectural design is introduced for core nodes, the whole network is at stake. HLR architecture has remained the same for decades. Even though HLRs have always been handled with care, they were old friends of network designers and engineers. Now that the different subscriber profiles (LTE, IMS, GSM, FNR...) are doomed to be merged into a single database (the so-called next-generation HLR, which splits the FE from the DDBB), the unstable behaviours of new designs, long forgotten for the HLR, have arisen again.
Now, resilience has to be strengthened in all vendors&#039; next-generation HLRs. Meanwhile, let&#039;s hope no other similar case arises.
If it were only a matter of increasing signalling capacity, I must say all operators are already skilled at it and used to it.
The future evolution to IP is a different topic for discussion, but it has nothing to do with the latter.]]></description>
		<content:encoded><![CDATA[<p>I completely agree with those who say more background is required to write such an article. Otherwise nothing but random pieces of information are put together, which is of little use, since technically the whole thing doesn&#8217;t make any sense!<br />
What you&#8217;re saying here is that there was an increase in signalling messages that led to a network crash. Hmm, that sounds familiar to me, because in 2001 the Vodafone Spain network crashed for a similar reason, and ta-da! There were no smartphones on the scene.<br />
Whenever a new architectural design is introduced for core nodes, the whole network is at stake. HLR architecture has remained the same for decades. Even though HLRs have always been handled with care, they were old friends of network designers and engineers. Now that the different subscriber profiles (LTE, IMS, GSM, FNR&#8230;) are doomed to be merged into a single database (the so-called next-generation HLR, which splits the FE from the DDBB), the unstable behaviours of new designs, long forgotten for the HLR, have arisen again.<br />
Now, resilience has to be strengthened in all vendors&#8217; next-generation HLRs. Meanwhile, let&#8217;s hope no other similar case arises.<br />
If it were only a matter of increasing signalling capacity, I must say all operators are already skilled at it and used to it.<br />
The future evolution to IP is a different topic for discussion, but it has nothing to do with the latter.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dennis Foster</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-868251</link>
		<dc:creator><![CDATA[Dennis Foster]]></dc:creator>
		<pubDate>Tue, 24 Jul 2012 15:57:16 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-868251</guid>
		<description><![CDATA[Kevin,
It would be interesting to see which of the carriers that have deployed Diameter core networks have tested their network architecture in the lab and which have not.]]></description>
		<content:encoded><![CDATA[<p>Kevin,<br />
It would be interesting to see which of the carriers that have deployed Diameter core networks have tested their network architecture in the lab and which have not.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-865408</link>
		<dc:creator><![CDATA[Dan]]></dc:creator>
		<pubDate>Wed, 18 Jul 2012 08:39:31 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-865408</guid>
		<description><![CDATA[So here&#039;s the thing.  Networks run and then they fall over, which suggests that something went wrong, whether that is a software upgrade to a critical node, a new technology being introduced that isn&#039;t quite as well understood at scale as it might be, or a load threshold being surpassed.  The thing about networks is that they are pretty good at resisting the minor issues - network nodes get deployed with redundancy built into the componentry and, when a node is critical, with geo-redundancy and warm standby.  So when something breaks, the network generally adapts and no one notices unless you happen to be sat in the NOC.

That equally means, when something goes wrong that people do notice, it is usually something huge.  Kevin might have got it wrong by blaming Diameter for everything, but he is right on the money for Verizon.  Diameter is a protocol for AAA, but it is being used in LTE for something slightly skewed from that, and Verizon&#039;s LTE network is the biggest, shiniest LTE network there is right now.  If anyone was going to find the bugs that come with stress testing Diameter as a protocol and the scalability of Diameter interfaces, it was them.  They were bitten by it, and they have deployed a solution.  No one thinks Verizon&#039;s LTE subscriber numbers have flatlined, so we can only assume that whatever they have done to rectify the issue has solved the problem.

For other outages, there is no Diameter in play.  That means something else has gone wrong.  Because the issues have been non-geographic, it suggests they are not related to access networks and so they are more likely core nodes that have fallen over.  A duff HLR upgrade has been reported as being the culprit at Orange, which would explain why it impacted pretty much everyone.  O2&#039;s outage affected only some customers, so it could be something more to do with core network signalling interfaces.  The problem is that when a core node dies in a big and spectacular way, there can be a domino effect where other nodes try to take up the traffic load but get swamped by signalling, which then causes those nodes either to back traffic off or to close up shop themselves.

However, I do think Kevin&#039;s point on traffic load is a valid one.  Networks are still engineered in many cases on the basis of some old world thinking - phones make phone calls, send and receive text messages and attach to the internet when the customer wants them to.  This isn&#039;t true anymore and whilst all of these use models still exist, network signalling has massively increased because smartphones have &#039;a mind of their own&#039;, attach to and detach from networks at the behest of applications and often do this multiple times for each individual application the device has running.  It is not the VLR that is creating the signalling load but the SGSN and GGSN, both between themselves and towards the HLR.  There has been work done to try and offset some of this signalling, but that doesn&#039;t change the fact that more devices are being used to attach to networks to do more things more often.

Whether it is Diameter, MAP, GTP-C or, in all likelihood in the future, SIP, there is going to be more signalling traffic and networks need to adjust their engineering principles to account for that.  Many have - the famous outages of AT&amp;T in Manhattan and O2 in London of a few years back have not been repeated for a little while - but throw in LTE, plus the potential step-change in connections that M2M is suggested to create, and it will be signalling as well as data that needs to be considered in terms of designing a network that scales for the purpose of supporting all traffic from all sources.]]></description>
		<content:encoded><![CDATA[<p>So here&#8217;s the thing.  Networks run and then they fall over, which suggests that something went wrong, whether that is a software upgrade to a critical node, a new technology being introduced that isn&#8217;t quite as well understood at scale as it might be, or a load threshold being surpassed.  The thing about networks is that they are pretty good at resisting the minor issues &#8211; network nodes get deployed with redundancy built into the componentry and, when a node is critical, with geo-redundancy and warm standby.  So when something breaks, the network generally adapts and no one notices unless you happen to be sat in the NOC.</p>
<p>That equally means, when something goes wrong that people do notice, it is usually something huge.  Kevin might have got it wrong by blaming Diameter for everything, but he is right on the money for Verizon.  Diameter is a protocol for AAA, but it is being used in LTE for something slightly skewed from that, and Verizon&#8217;s LTE network is the biggest, shiniest LTE network there is right now.  If anyone was going to find the bugs that come with stress testing Diameter as a protocol and the scalability of Diameter interfaces, it was them.  They were bitten by it, and they have deployed a solution.  No one thinks Verizon&#8217;s LTE subscriber numbers have flatlined, so we can only assume that whatever they have done to rectify the issue has solved the problem.</p>
<p>For other outages, there is no Diameter in play.  That means something else has gone wrong.  Because the issues have been non-geographic, it suggests they are not related to access networks and so they are more likely core nodes that have fallen over.  A duff HLR upgrade has been reported as being the culprit at Orange, which would explain why it impacted pretty much everyone.  O2&#8217;s outage affected only some customers, so it could be something more to do with core network signalling interfaces.  The problem is that when a core node dies in a big and spectacular way, there can be a domino effect where other nodes try to take up the traffic load but get swamped by signalling, which then causes those nodes either to back traffic off or to close up shop themselves.</p>
<p>However, I do think Kevin&#8217;s point on traffic load is a valid one.  Networks are still engineered in many cases on the basis of some old world thinking &#8211; phones make phone calls, send and receive text messages and attach to the internet when the customer wants them to.  This isn&#8217;t true anymore and whilst all of these use models still exist, network signalling has massively increased because smartphones have &#8216;a mind of their own&#8217;, attach to and detach from networks at the behest of applications and often do this multiple times for each individual application the device has running.  It is not the VLR that is creating the signalling load but the SGSN and GGSN, both between themselves and towards the HLR.  There has been work done to try and offset some of this signalling, but that doesn&#8217;t change the fact that more devices are being used to attach to networks to do more things more often.</p>
<p>Whether it is Diameter, MAP, GTP-C or, in all likelihood in the future, SIP, there is going to be more signalling traffic and networks need to adjust their engineering principles to account for that.  Many have &#8211; the famous outages of AT&amp;T in Manhattan and O2 in London of a few years back have not been repeated for a little while &#8211; but throw in LTE, plus the potential step-change in connections that M2M is suggested to create, and it will be signalling as well as data that needs to be considered in terms of designing a network that scales for the purpose of supporting all traffic from all sources.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: spotsville</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-864824</link>
		<dc:creator><![CDATA[spotsville]]></dc:creator>
		<pubDate>Mon, 16 Jul 2012 16:43:37 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-864824</guid>
		<description><![CDATA[You are right: he doesn&#039;t know what he is talking about. The HSPA networks use HLR/VLRs, but the LTE networks use HSS systems as specified in IMS.  In fact, the VZ outage was caused by Diameter signalling overload in the IMS PCRF. Tekelec was almost certainly partially to blame, along with the vendor of the CSCF proxy.]]></description>
		<content:encoded><![CDATA[<p>You are right: he doesn&#8217;t know what he is talking about. The HSPA networks use HLR/VLRs, but the LTE networks use HSS systems as specified in IMS.  In fact, the VZ outage was caused by Diameter signalling overload in the IMS PCRF. Tekelec was almost certainly partially to blame, along with the vendor of the CSCF proxy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tsahi Levent-Levi</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-864464</link>
		<dc:creator><![CDATA[Tsahi Levent-Levi]]></dc:creator>
		<pubDate>Sun, 15 Jul 2012 06:39:19 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-864464</guid>
		<description><![CDATA[It is interesting to see how these networks cope better than their cloud counterparts: all major cloud providers had an outage in the past year or so - Amazon had one just last week.
The complexity of both is rather similar, but we tend to forget it and think of the carrier networks as things that must never fail - a bit more than we do for cloud providers.
This will probably change as we start relying on cloud providers more with each passing day.]]></description>
		<content:encoded><![CDATA[<p>It is interesting to see how these networks cope better than their cloud counterparts: all major cloud providers had an outage in the past year or so &#8211; Amazon had one just last week.<br />
The complexity of both is rather similar, but we tend to forget it and think of the carrier networks as things that must never fail &#8211; a bit more than we do for cloud providers.<br />
This will probably change as we start relying on cloud providers more with each passing day.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Telco Engineer</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-864222</link>
		<dc:creator><![CDATA[Telco Engineer]]></dc:creator>
		<pubDate>Sat, 14 Jul 2012 06:51:06 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-864222</guid>
		<description><![CDATA[You clearly don&#039;t understand what you are talking about and are just regurgitating some sales spin sold to you by a bunch of telco vendors looking to push their kit.

In these failures it was generally one of the HLRs that failed (most networks have several, and each user is stored on a specific node), which is why the effect was seen across the whole network but only for a subset of customers.

However, any claim that it is related to the amount of smartphone traffic &amp; apps is just jumping on the latest trend in order to have something to blame.
The HLR is involved in authentication and mobility management but not in traffic handling. A massive increase in data traffic through the packet core would add very little load to the HLR; the only thing that would increase HLR signalling would be users moving between VLRs, each of which covers roughly the area of a city.

In addition, the newer HLRs are orders of magnitude larger than their predecessors and are therefore scaled to handle the kind of subscriber volumes seen in modern networks. The latest generation of systems from the likes of Nokia &amp; Ericsson are in fact clustered systems with multiple front ends to handle the signalling from the network and multiple back-end databases to hold the records. These are usually then geographically distributed across the operator&#039;s sites. They are extremely resilient, but the downside is that if the entire cluster fails it can take longer to restore. They also hold a lot more users in a single cluster than the traditional single-platform solutions, so an outage of one node is rarer but affects a bigger section of your customer base.]]></description>
		<content:encoded><![CDATA[<p>You clearly don&#8217;t understand what you are talking about and are just regurgitating some sales spin sold to you by a bunch of telco vendors looking to push their kit.</p>
<p>In these failures it was generally one of the HLRs that failed (most networks have several, and each user is stored on a specific node), which is why the effect was seen across the whole network but only for a subset of customers.</p>
<p>However, any claim that it is related to the amount of smartphone traffic &amp; apps is just jumping on the latest trend in order to have something to blame.<br />
The HLR is involved in authentication and mobility management but not in traffic handling. A massive increase in data traffic through the packet core would add very little load to the HLR; the only thing that would increase HLR signalling would be users moving between VLRs, each of which covers roughly the area of a city.</p>
<p>In addition, the newer HLRs are orders of magnitude larger than their predecessors and are therefore scaled to handle the kind of subscriber volumes seen in modern networks. The latest generation of systems from the likes of Nokia &amp; Ericsson are in fact clustered systems with multiple front ends to handle the signalling from the network and multiple back-end databases to hold the records. These are usually then geographically distributed across the operator&#8217;s sites. They are extremely resilient, but the downside is that if the entire cluster fails it can take longer to restore. They also hold a lot more users in a single cluster than the traditional single-platform solutions, so an outage of one node is rarer but affects a bigger section of your customer base.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave Mackey</title>
		<link>http://gigaom.com/2012/07/13/why-are-mobile-networks-dropping-like-flies/#comment-864118</link>
		<dc:creator><![CDATA[Dave Mackey]]></dc:creator>
		<pubDate>Fri, 13 Jul 2012 23:10:35 +0000</pubDate>
		<guid isPermaLink="false">http://gigaom.com/?p=542576#comment-864118</guid>
		<description><![CDATA[I was just going to ask if there isn&#039;t some sort of anycast equivalent for cellular - sounds like Diameter may be it.]]></description>
		<content:encoded><![CDATA[<p>I was just going to ask if there isn&#8217;t some sort of anycast equivalent for cellular &#8211; sounds like Diameter may be it.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
