Lately, much of the discussion around storage has been about speeds and feeds of the latest flash arrays, and that’s valid. But Long’s position is that much of the value of what companies store is lost because that data goes into a black box, and companies have to deploy audit software and other extras to wring important information out of it. DataGravity integrates those tools, search and analytics, into its software.
Aggregating data about the data
What are some examples of that important information? For instance: Who at the company accessed a file and how often? Who is working together on shared files? Is there personally identifiable information (PII) or credit card information sitting in documents? Which files have not been touched in two years? All of that is really interesting data about that stored data — and it can be used for compliance and governance purposes, Long said in a recent interview.
The idea is to catalog and expose that data so it can be of use to admins or execs, and do all of that in the array without needing a lot of add-on software products.
“We’ve integrated data analytics into storage — as data is ingested we capture who’s reading and writing it using Active Directory or LDAP. We capture who’s interacting with the data on the front end, we provide audit and activity trail, and on the backend we index over 400 data types,” Long said.
That can lead to some interesting “aha!” moments. One beta tester found a termination letter addressed to him, Long said. His panic subsided when he realized that his agency automatically generates such letters when employees are reassigned to other departments.
Another beta tester, Mark Lamson, director of IT for the Westerly Public Schools in Westerly, Rhode Island, said DataGravity pulled attendance data from various files and the school system was able to find a student with a perfect attendance record. “Instead of it being a gotcha thing about truancy, we found something positive to celebrate,” he said.
Chris Berube, IT manager for the Law Office of Joe Bornstein, a Portland, Maine-based law firm, said that while moving some data, a file popped up with a company credit card number in it. That’s the kind of thing that can cause problems.
The data about the data can help admins designate unused files for archival. A planned release will offer the capability to move those files off to AWS Glacier or other inexpensive archives. The company may also add an OCR capability that will scan PDF documents and make them searchable as well.
The company adapted open-source search technology with its own secret sauce to provide the Google-like search capability. The data indexing takes place on the array’s secondary spindle. The new DataGravity Discovery arrays will be on display next week at VMworld 2014.
Long has lots of fans in the tech community. EqualLogic, which she co-founded, blazed the trail for iSCSI storage when it launched in 2001. It was headed for an IPO when Dell bought it in 2008 for about $1.4 billion. Nashua, New Hampshire–based DataGravity has raised about $42 million in venture funding from Andreessen Horowitz, Charles River Ventures and others.
To hear more from Paula Long about storage and other trends in tech infrastructure, check out this video of her panel at Structure 2014.