One of the biggest questions out of the NSA snooping controversy was how much nine tech vendors (Microsoft, Yahoo, Google, Facebook, PalTalk, AOL, Skype, YouTube and Apple) knew about a National Security Agency program for snooping on their users’ data.
They all denied, in carefully worded ways, that they provided direct access to customer information. The source of the original story, ostensibly leaked NSA slides obtained by the Washington Post and The Guardian, indicated that the National Security Agency tapped directly into these companies’ servers to get at customer metadata.
In a radio interview Friday on WBUR, Post reporter Barton Gellman said it’s more likely that the slide was poorly worded and that the NSA placed its own “black boxes” on vendor property next to the servers in question. Those black boxes could mirror the server and be queried as a proxy while giving those vendors plausible deniability if asked whether their own servers had been accessed.
Obviously, if that is the case, it’s hard for vendors to plead ignorance of what was going on. But there are ways the government could harvest people’s Google, Facebook and other data without those vendors knowing.
First, it could eavesdrop on the HTTP traffic flowing over the internet, which is not usually encrypted. Or there could be a covert back door into these services themselves, something that Jon Oltsik, senior principal analyst at Enterprise Strategy Group, finds hard to believe.
And there have been reports that government agencies are indeed collecting data from internet service providers and telcos. On Friday, The Wall Street Journal said that the NSA’s gathering of data on Verizon customers is just the tip of the iceberg. According to the Journal:
“… people familiar with the NSA’s operations said the initiative also encompasses phone-call data from AT&T Inc. and Sprint Nextel, records from Internet-service providers and purchase information from credit-card providers.”
Update: One security expert, who did not want to be named because he does work for government agencies, said the NSA is probably doing what it does best, which is traffic analysis. This is stuff like who is talking to whom and for how long and when. “It is a kind of social network mapping … questions arise if there is a radio in an uninhabited jungle [or] when I call Dzhokhar Tsarnaev at 1 a.m. or when one call comes in and one call goes out contiguously,” he said via email.
If this is the sort of traffic analysis the NSA is doing (the default assumption), then there is “no requirement for the cooperation of the endpoints, only the carriers,” he said. “In all internet-based TCP/IP situations, the communications are all multi-hop, and thus there is no need to surveil all possible paths, just the ‘must go through’ paths,” he continued. “If I can listen one hop outside your firewall, then there is nothing you can do about it, you won’t know I am doing it, and to the extent that traffic analysis is sufficient for the surveillance team, the job is done.”
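To make the idea of traffic analysis concrete, here is a minimal Python sketch of how call metadata alone, with no call content, can be turned into a contact map. It is purely illustrative and is not a description of any NSA system: the CallRecord fields, the function names, the 60-second window and the sample records are all hypothetical, chosen only to mirror the expert’s “who is talking to whom and for how long and when” and his “one call comes in and one call goes out” example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical metadata record: parties and timing only, no content."""
    caller: str
    callee: str
    start: int     # seconds since an arbitrary reference time
    duration: int  # seconds


def contact_graph(records):
    """Map each unordered pair of parties to (number of calls, total seconds)."""
    graph = defaultdict(lambda: [0, 0])
    for r in records:
        pair = tuple(sorted((r.caller, r.callee)))
        graph[pair][0] += 1
        graph[pair][1] += r.duration
    return {pair: tuple(stats) for pair, stats in graph.items()}


def relayed_calls(records, max_gap=60):
    """Find parties who receive a call and then place a call to someone
    else within max_gap seconds of the first call ending."""
    hits = []
    for incoming in records:
        end = incoming.start + incoming.duration
        for outgoing in records:
            if (outgoing.caller == incoming.callee
                    and outgoing.callee != incoming.caller
                    and 0 <= outgoing.start - end <= max_gap):
                hits.append((incoming.caller, incoming.callee, outgoing.callee))
    return hits


# Made-up sample data for illustration only.
records = [
    CallRecord("A", "B", start=0, duration=120),
    CallRecord("B", "C", start=130, duration=45),
    CallRecord("A", "B", start=4000, duration=30),
    CallRecord("D", "E", start=500, duration=10),
]

graph = contact_graph(records)
relays = relayed_calls(records)
print(graph)
print(relays)
```

In this toy data, the sketch shows that A and B spoke twice and that B called C ten seconds after hanging up with A, all without anyone listening to a word of the calls.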
This is really key stuff. When you post updates to your Facebook page or Google Drive, that data typically flows unencrypted over the web. That data-in-transit could, in theory, also be intercepted at the routers directing traffic or at Content Delivery Network (CDN) points that optimize traffic flow. We just don’t know, because security agencies won’t say. But the upshot is, if the government is collecting that traffic, it truly does have a ton of information about everything you do, or at least everything you say. That is truly a sobering use of big data.
Beyond that, it will probably be years before we know the truth of what Google, Microsoft, et al. knew and whether, or how much, they participated in the snooping.
This story was updated at 4 p.m. PDT to correct the name of the town where the NSA data center is located and again at 9:14 a.m. PDT on June 8 to add additional comment and context from a security expert.