With companies looking to follow your data trail online, why not take possession of that information and find innovative uses for it yourself? That’s the question Jeremie Miller — the developer known for building the open-source protocol that powers many Instant Messaging programs — is trying to answer. Miller is building The Locker Project, an open-source effort that allows users to capture and archive their own online “data exhaust,” the term used for the crumbs of data we leave behind as we move around the web.
Using APIs and feeds, the Locker Project will pull in tweets, updates, pictures, check-ins, transactions, contacts and webpages, and will allow a user to store that data on his or her own server or as part of a hosted service similar to the blog platform WordPress. Miller’s company Sing.ly will provide the support for the open-source project. ReadWriteWeb has a good first look at the service. The idea is an interesting one, because it will give people a repository of their online behavior, where they can see at a glance what kind of trail they’ve left online, and look for patterns inside that data. The service is one of an emerging group of projects and companies aimed at helping users capture their personal data, including Statz, Greplin and Personal.com.
Where things could get really interesting, however, is that The Locker Project is looking to have developers build apps on top of the service. Those apps, with permission from users, will be able to analyze a user’s data and extract trends and other interesting information. A user could get more personalized recommendations or a better assessment of behavior or spending habits, or someone with a medical condition might get a pre-diagnosis based on the symptoms he or she has been searching for information on.
It’s a compelling idea that builds on the power of data — something we’ll be talking about at our Structure Big Data Conference in New York on March 23. People are creating huge amounts of information as they move around the web, but it’s not being leveraged very well, or if it is being leveraged, it’s big companies and marketing services like Rapleaf that are taking advantage of it.
Citibank spun out a project called Bundle that takes millions of anonymous user transactions and builds recommendations for customers based on them. But the future lies in taking in personal data and crafting highly customized services: BankSimple, a New York-based start-up, is poised to launch a next-generation banking service that takes in a user’s data and preferences and builds a personalized finance system for them. BillGuard, another start-up that just won $3 million in funding, also leverages personal banking data and uses it to build a fraud alert system.
The challenge for many of these services — including The Locker Project — is that it will take a certain amount of trust for a user to give up their data. But if users do believe that their privacy is protected, and the results will be beneficial, it could open up a lot of opportunities: When you apply big data analysis to personal data, you can surface unseen trends, correlations and patterns, and it can also bring consumers closer to marketers on their own terms. Kaliya Hamlin, the executive director of the Personal Data Ecosystem Collaborative Consortium, said recently that, rather than tracking or stalking users online, marketing and advertising companies should learn to empower users to hold on to their data, then share it with them willingly:
Giving individuals choice about where they store their personal data and who has access to it, and under what terms and conditions, grows trust. This trust is hugely valuable, because over time more and better services that combine and utilize valuable personal data can be offered. It supports new forms of advertising and marketing by enabling trusted relationships between customers and vendors that enable “relationship marketing” and opt-in, user controlled sharing of data, permissioned communications and offers, group buying, recommendations, social and viral marketing, more efficient commercial exchanges.
We’re still a ways off from something like this being mainstream, of course. Users have to get used to storing their own data, and companies will have to learn to work with consumers rather than take the easier route of tracking them. And given that Do Not Track proponents are pushing for more regulation of that kind of activity, working with consumers on a more level playing field might be the best resolution for all.