Ever wonder what Apple does with your Siri data?


Whether you’re asking Siri about the weather, the score of last night’s ballgame or something a little more personal, you’re probably at least a little curious about how Apple handles all those voice-activated search requests. Well, you can thank the ACLU and Wired for getting to the bottom of things: it turns out that Apple, using anonymized user ID numbers, holds on to your Siri data for up to two years.

Here’s Wired’s explanation of what happens when you ask Siri to do something for you and the information goes off to an Apple data center:

Apple generates a random number to represent the user and it associates the voice files with that number. This number — not your Apple user ID or email address — represents you as far as Siri’s back-end voice analysis system is concerned.

Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.

The report includes a statement from Apple, which confirms the anonymized information may be kept “for up to two years.” But, according to the Apple spokeswoman, “If a user turns Siri off, both identifiers are deleted immediately along with any associated data.”

The two-year mark is six months longer than Yahoo, Microsoft and Google, all of which retain search data for 18 months.

Apple clarified what it does with this information because a lawyer for the ACLU started asking questions. The concern stemmed from Apple’s privacy policy for Siri users, which states in part, “Older voice input data that has been disassociated from you may be retained for a period of time to generally improve Siri and other Apple products and services.”

Machine learning and natural language processing, the technologies Siri is based on, need a lot of information to identify patterns in data and therefore become more helpful, both in understanding your speech and in providing correct answers.

Being connected to an anonymized set of digits may make some users uncomfortable, so the ACLU thinks Apple should make that clearer to users before they start using the service.


Sohin Shah

In this technology-driven era, the collection and maintenance of Big Data has become the norm. Not only are companies now capable of taking complete advantage of the reams of data at their disposal, but that data has allowed a large number of businesses to streamline processes and increase their profit margins.



We all know how well anonymized data works: so far it is easily traced back to the original person, making it not so anonymous. Hey Apple, just purge the data or don’t associate it to begin with.

Nick Murphy

If Apple disassociates your randomized ID from your data after 6 months, then how are they going to delete that data if a user turns Siri off? I would assume, based on the description, that only the 6 months of data associated with a randomized ID would be deleted.


For some reason, I don’t believe them. Remember that your “anonymized” voice data has to pass through your carrier (AT&T, Verizon, etc.) in order to reach Apple.

What do the carriers do with that same data? I’m betting that their data storage techniques are similar, but they are probably tap-wiggling their fingers together while making that evil laugh every time you speak to Siri.

“It’s easier to fool people than to convince them they have been fooled” -Mark Twain


Siri data is encrypted with SSL, so the carriers don’t know if it’s Siri or just you browsing some secure website.


Reblogged this on // Howie BM // and commented:
I bet whoever has the job of sifting through these gets quite a few laughs a day… Makes you think twice before you ask Siri questions like “Marry me” and other such desperate queries :P.
