Like your data big? How about 5 trillion records?

1010data, the analytics-as-a-service pioneer, says it now hosts more than 5 trillion records (yes, trillion with a “t”) for its customers. If 1010data’s growth over the past year is a microcosm of the greater market, it’s no wonder we’re seeing so much excitement around technologies such as Hadoop, NoSQL databases and massively parallel analytic databases. Organizations are gathering mountains of data and need some way to store and analyze it, if not with a service provider like 1010data, then on premises.

According to 1010data’s annual assessment, data volumes just keep climbing. Over the past year, the total volume of data it houses grew by 33 percent, while the number of records grew by 45 percent. The records are spread across thousands of tables, the largest of which holds around 500 billion rows. And although these numbers might seem small in comparison with the astronomical data growth predicted by reports such as IDC’s Digital Universe Index, it’s important to remember that 1010data deals only with business data, not with the entirety of data produced in any digital form.

Perhaps the most telling aspects of 1010data’s growth have to do with people, though. The company claims increases of about 50 percent in both the number of customers and the number of people consuming data produced using the service. The number of business analysts using 1010data grew by 25 percent. Data volumes have been growing forever, but with more and more people concerned with that data, we should expect the pace of analytics innovation only to pick up.

It’s noteworthy, however, that 1010data’s growth is primarily in structured data, the type generally stored in relational databases and data warehouses. For unstructured data, such as that stored in Hadoop, anecdotal evidence suggests organizations are seeing much faster growth. In July, Cloudera noted that average cluster size among its customers had more than tripled since October 2010, and that 22 Cloudera customers had clusters storing more than a petabyte of data apiece.