Blog Post

We need a data democracy, not a data dictatorship

The democratization of data is a real phenomenon, but building a sustainable data democracy means truly giving power to the people. The alternative is just a shift of power from traditional data analysts within IT departments to a new generation of data scientists and app developers. And this seems a lot more like a dictatorship than a democracy — a benevolent dictatorship, but a dictatorship nonetheless.

These individuals and companies aren’t entirely bad, of course, and they’re actually necessary. Apps that help predict what we want to read, where we’ll want to go next or what songs we’ll like are certainly cool and even beneficial in their ability to automate and optimize certain aspects of our lives and jobs. In the corporate world, there will always be data experts who are smarter and trained in advanced techniques and who should be called upon to answer the toughest questions or tackle the thorniest problems.

Last week, for example, Salesforce.com introduced a new feature of its Chatter intra-company social network that categorizes a variety of data sources so employees can easily find the people, documents and other information relevant to topics they’re interested in. As with similarly devised services (LinkedIn’s People You May Know, the gravitational search movement, or any type of service using an interest graph), the new feature’s beauty and utility lie in its abstraction of the underlying semantic algorithms and data processing.

The problem, however, comes when we’re forced to rely on these people, features and applications to decide how data can affect our lives or jobs, or what questions we can answer using the troves of data now available to us. In a true data democracy, citizens must be empowered to make use of their own data as they see fit, and they must only have to rely on apps and experts by choice or when the task really requires an expert hand. At any rate, citizens must be informed enough to have a meaningful voice in bigger decisions about data.

The democratic revolution is underway

The good news is that there’s a whole new breed of startups trying to empower the data citizenry, whatever their role. Companies such as 0xdata, Precog and BigML are trying to make data science more accessible to everyday business users. There are next-generation business intelligence startups such as SiSense, Platfora and ClearStory rethinking how business analytics are done in an era of HTML5 and big data. And then there are companies such as Statwing, Infogram and Datahero (which will be in beta mode soon, by the way) trying to bring data analysis to the unwashed non-data-savvy masses.

Combined with a growing number of publicly available data sets and data marketplaces, and more ways of collecting every possible kind of data —  personal fitness, web analytics, energy consumption, you name it — these self-service tools can provide an invaluable service. In January, I highlighted how a number of them can work by using my own dietary and activity data, as well as publicly available gun-ownership data and even web-page text. But as I explained then, they’re still not always easy for laypeople to use, much less perfect.

Statwing spells out statistics for laypeople.

Can Tableau be data’s George Washington?

This is why I’m so excited about Tableau’s forthcoming IPO. There are few companies that helped spur the democratization of data over the past few years more than Tableau. It has become the face of the next-generation business intelligence software thanks to its ease of use and focus on appealing visualization, and its free public software has found avid users even among relative data novices like myself. Tableau’s success and vision no doubt inspired a number of the companies I’ve already referenced.

Assuming it begins its publicly traded life flush with capital, Tableau will not just be financially sound; it will also be in a position to help the burgeoning data democracy evolve into something that can last. More money means being able to develop more features that Tableau can use to bolster sales (and further empower business users with data analysis), which should mean the company can also afford to continually improve its free service and perhaps put premium versions in the hands of more types of non-corporate professionals for free.

Tableau is already easy (I made this) — but not easy enough.

The bottom-up approach has already proven very effective in the worlds of cloud computing, software as a service and open-source software, and I have to assume it’s a win-win situation in analytics, too. Today’s free users will be tomorrow’s paying users once they get skilled enough to want to move on to bigger data sets and better features. But the base products have to be easy enough and useful enough to get started with, or companies will have a lot of registrations and downloads but very few avid users.

And if Tableau steps up its game around data democratization, I have to assume it will up the ante for the company’s fellow large analytics vendors and even startups. A race to empower the lower classes on the data ladder would certainly be in stark contrast to the historical strategy of building ever-bigger, ever-more-advanced products targeting only the already-powerful data elite. That’s the kind of revolution I think we all can get behind.

Feature image courtesy of Shutterstock user Tiago Jorge da Silva Estima.

6 Responses to “We need a data democracy, not a data dictatorship”

  1. Rachel M. Murray

    Great article Derrick – appreciating your work on the topic here on GigaOm.

    We’re seeing wider availability of reasonably priced BI and visualization software tools to help us understand that harnessing all this data is possible – and I think even consumers are beginning to understand the value of all the data, and the ability to make meaning from it. One part of the puzzle that’s missing from what I can see is the education – knowledge transfer of how individuals can use the tools, what good data science methods are, and how data citizens can actively contribute to the larger data analysis community. I see movements like the Open Data/Open Gov folks, and events like the NYC Big Apps hackathon as part of the solution – but as individuals, where do we go to take part? What is the role of an informed, curious citizen in this? More venues exist for learning some of the ‘how’ to make sense of big data as an individual taking a course online, but I’m not seeing a vision from anyone talking about how to connect all of the dots. To make sense of data, we need the tools, the practitioners, the analysis of the problems, but we also need a vision of how all of these will work. If anyone has ideas of who’s got that vision, I’d love to hear it.

  2. seanrwcrawford

    I feel one of the biggest impediments to the democratization of data is access. Most people know what they would like to answer, and how the data needs to be shaped to achieve that, but getting the data to do the actual analysis with can be one of the most difficult aspects.

    This is a bit of a plug, but we’re working on enabling data access that is easily attainable by everyone. Our platform is a “search engine for data” that is able to fetch time series data from a disparate set of sources, and provide it in a simple searchable form that allows users to extract, validate, format, merge, graph, and share it however they want.

    By providing the underlying data for analysis tools like Tableau, Statwing, and many others, we feel we can help to create the tool stack that empowers people to create a sustainable DIY data culture.

  3. Mark Janssen

    You know, I’ve been trying to do just that, but it’s been meeting a lot of resistance. But the overall project is pretty vast. It involves a unified data model to create a data “ecosystem”. You can find more at the wiki:


  4. In every company I’ve worked at, I’ve seen this major divide between IT analysts and Business users. Part of it was cultural, but a major reason was as you point out: “a historical strategy of building ever-bigger, ever-more-advanced products targeting only the already-powerful data elite”. The business user typically was left to use Excel to prepare and analyze data.

    It took 15+ years, but thanks to new players like Tableau, Spotfire and Qlikview which were sold primarily to the business user and focused on ease of use, the data democratization process has resulted in a power shift to the business user. Some IT departments have now come around and are trying to accommodate these “shadow IT” projects by providing IT support and giving Tableau users limited access to enterprise data stores.

    As for upping the ante for the traditional players, it has happened already. Over the last two years, the larger vendors have responded with products like Visual Insight (MicroStrategy), Visual Intelligence (SAP), PowerPivot (Microsoft), JMP (SAS) etc. taking aim at this segment of the market. The Big Data market is still new, but the trend to build user-friendly (or at the very least, SQL-aware) tools on top of Hadoop is also hitting its stride.

    One good thing coming out of this data democratization is the realization that it has to be supported by a Data Governance effort. Otherwise we’ll see the unfortunate return of a major problem with data democracy: data chaos. Previously it would have meant comparing and reconciling two Excel spreadsheets; now we may end up reconciling the findings from two Tableau workbooks.

    • Derrick Harris

      Thanks for the comment, and for making a really good point about data governance. Obviously, that’s not too big a concern for personal data use, but competing findings from lots of disparate data sets would be problematic.

  5. Steve Ardire

    hurrah for Tableau but massive data democratization will only happen with ‘methods’ that can convert data to insight fast with fewer dependencies on data scientists