Whether you’re asking Siri about the weather, the score of last night’s ballgame or something a little more personal, you’re probably at least a little curious about how Apple handles all those voice-activated search requests. Well, you can thank the ACLU and Wired for getting to the bottom of things: it turns out that Apple, using anonymized user ID numbers, holds on to your Siri data for up to two years.
Here’s Wired’s explanation of what happens when you ask Siri to do something for you and the information goes off to an Apple data center:
Apple generates a random number to represent the user and it associates the voice files with that number. This number, not your Apple user ID or email address, represents you as far as Siri’s back-end voice analysis system is concerned.
Once the voice recording is six months old, Apple “disassociates” your user number from the clip, deleting the number from the voice file. But it keeps these disassociated files for up to 18 more months for testing and product improvement purposes.
The report includes a statement from Apple, which confirms the anonymized information may be kept “for up to two years.” But, according to the Apple spokeswoman, “If a user turns Siri off, both identifiers are deleted immediately along with any associated data.”
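The retention timeline described above can be sketched in a few lines of code. This is only an illustration of the reported policy, not Apple's actual system: the names `VoiceClip`, `apply_retention` and `turn_off_siri` are hypothetical, and the day counts are rough stand-ins for "six months" and "two years."

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed approximations of the reported periods, not official figures.
DISASSOCIATE_AFTER = timedelta(days=183)  # roughly six months
DELETE_AFTER = timedelta(days=730)        # roughly two years


@dataclass
class VoiceClip:
    recorded_at: datetime
    user_number: Optional[str]  # random number, or None once disassociated


def new_user_number() -> str:
    """Random identifier unrelated to an Apple ID or email address."""
    return secrets.token_hex(16)


def apply_retention(clips: List[VoiceClip], now: datetime) -> List[VoiceClip]:
    """Strip the user number from old clips and drop clips past the limit."""
    kept = []
    for clip in clips:
        age = now - clip.recorded_at
        if age >= DELETE_AFTER:
            continue  # past two years: the clip is not kept
        if age >= DISASSOCIATE_AFTER:
            clip.user_number = None  # past six months: remove the link
        kept.append(clip)
    return kept


def turn_off_siri(clips: List[VoiceClip], user_number: str) -> List[VoiceClip]:
    """Drop every clip still tied to this user's number (hypothetical)."""
    return [c for c in clips if c.user_number != user_number]
```

In this sketch, a clip that has already been disassociated can no longer be matched to a user, so turning Siri off would only remove clips that still carry the number. That behavior is an assumption drawn from the statement above, not something the report spells out.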
The two-year mark is six months longer than the retention period at Yahoo, Microsoft and Google, all of which retain search data for 18 months.
Machine learning and natural language processing, the technologies Siri is based on, need a lot of information to identify patterns in data, which in turn makes the service better at both understanding your speech and providing correct answers.
Being connected to an anonymized set of digits may make some users uncomfortable, so the ACLU thinks Apple should make its retention practices clearer to users before they start using the service.