Gigaom

Weekly Update

Metadata: a game changer for big data storage?

Paul Miller Jul 7, 2011 - 9:53 AM CDT

Market intelligence firm IDC recently released its fifth annual Digital Universe study, which uses text and video to combine hard data and projections with analysis that highlights issues and opportunities. It also raises a question: in a world where the volume of data more than doubles every two years, should we just keep buying more storage, or do we need smarter approaches to data description, selection and retention, such as using metadata?

In 2010, the report claims, more than 1 trillion gigabytes (1 zettabyte) were stored on iPods, laptops, desktops, and servers worldwide. By the end of this year that figure will have exceeded 1.8 zettabytes, and there’s no sign of the pace slowing.

Meanwhile, since 2005, enterprise investment in IT has increased around 50 percent, reaching $4 trillion. This is despite the plummeting cost of an IT system’s building blocks, such as storage, which has dropped from almost $20 per gigabyte to less than $3 in the same period. IDC’s projections suggest that storage will cost mere cents in the next few years. And still the amount of data grows while the cost of IT to the enterprise rises.
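
The interplay between falling prices and growing volumes can be checked with some rough arithmetic. The sketch below uses only the article's rounded figures and assumes that "since 2005" means a six-year span ending in 2011; the function and variable names are illustrative, not anything from the IDC report.

```python
# Back-of-the-envelope arithmetic using the article's rounded figures.
# Assumption: "since 2005" is read as a six-year span ending in 2011.

def annual_factor(start: float, end: float, years: float) -> float:
    """Constant yearly multiplier that takes `start` to `end` over `years`."""
    return (end / start) ** (1.0 / years)

# Data volume that doubles every two years grows by sqrt(2) each year.
data_growth = annual_factor(1.0, 2.0, 2)

# Storage price per gigabyte: roughly $20 in 2005 to roughly $3 in 2011.
price_change = annual_factor(20.0, 3.0, 6)

# Yearly change in the bill for storing everything at the going price.
spend_change = data_growth * price_change

if __name__ == "__main__":
    print(f"data volume changes x{data_growth:.2f} per year")
    print(f"price per GB changes x{price_change:.2f} per year")
    print(f"keep-everything bill changes x{spend_change:.2f} per year")
```

Under these assumptions the combined factor comes out slightly above 1, which is consistent with the observation that the storage bill keeps rising even as the unit price falls. Because the article says data "more than" doubles, the real factor would be somewhat higher.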

Despite fresh enthusiasm for big data, there remains a real danger that enterprises (and individuals) simply keep data because doing so is easier than actively deciding what can be discarded. Sensor logs, email inboxes, customer transaction data and more continue to fill enterprise storage arrays, and without robust data management policies in place the only realistic solution is often to keep buying additional storage. IDC appears to recognize this problem, suggesting that the “ultimate value of a big data implementation will be judged” on three questions:

  • Does it provide more useful information [than the enterprise had access to previously]?
  • Does it improve the fidelity of the information?
  • Does it improve the timeliness of the response?

While there is clear value in collecting and analyzing more data than ever before, simply storing everything by default is a strategy that is increasingly hard to defend. Storage may be getting cheaper, but it’s not free, and valuable data could disappear under an unquantifiable mass of poorly structured dross.

First among a set of calls to action that includes “master virtualization” and “move what you can to the cloud,” IDC exhorts CIOs to “investigate the new tools for creating metadata.” The report continues, “Big data will be a fountain of big value only if it can speak to you through metadata.” Metadata, or data about data, records the context within which data was captured. It describes the characteristics of the process that generates data (for example, the resolution of a camera photographing queues outside a store, and the interval between pictures). Metadata defines the structures in which data is stored, and lends meaning to cryptic codes and measurements.
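
To make the camera example concrete, here is a minimal sketch of what such a metadata record might look like. The class, its fields and the sample code table are all hypothetical; the article does not describe any particular schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Tuple


@dataclass(frozen=True)
class CaptureMetadata:
    """Context recorded alongside a batch of images (hypothetical schema)."""
    source_id: str                  # which camera produced the data
    resolution: Tuple[int, int]     # image width and height in pixels
    interval: timedelta             # time between consecutive pictures
    started_at: datetime            # when capture began
    # Human-readable meanings for otherwise cryptic codes in the data.
    code_meanings: Dict[str, str] = field(default_factory=dict)

    def decode(self, code: str) -> str:
        """Translate a cryptic code, or report that it is unknown."""
        return self.code_meanings.get(code, f"unknown code {code!r}")

    def frames_per_day(self) -> int:
        """How many pictures one day of capture produces at this interval."""
        return int(timedelta(days=1) / self.interval)


if __name__ == "__main__":
    meta = CaptureMetadata(
        source_id="store-cam-01",
        resolution=(1920, 1080),
        interval=timedelta(seconds=30),
        started_at=datetime(2011, 7, 7, 9, 0),
        code_meanings={"Q3": "queue longer than ten people"},
    )
    print(meta.frames_per_day(), meta.decode("Q3"))
```

A record like this is tiny compared with the images it describes, yet it is what lets someone later work out how much data a source produces and what its codes mean.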

The use of metadata remains less prevalent than we might expect. But where it can be applied cheaply and automatically as part of the process of data creation, it offers information architects and storage managers the means to effectively curate the data for which they are responsible. In many industries, the hardware and software already in use are generating this information automatically; even cheap consumer cameras record data about themselves and the environment around them as they take pictures. Information managers simply need to keep that metadata and start using it as part of their management workflow.

As data volumes grow ever larger, and as truly valuable data starts to represent a smaller proportion of the whole, metadata is going to become increasingly important. It will help companies understand the data they need, and it will help them to more systematically discard the data that they do not. We are already seeing acquisitions that bring data analysis solutions and storage hardware together. EMC’s acquisition of Greenplum last year is just one example of this trend. It points toward solutions that enable complex data analysis as well as intelligent management of all the data flowing through an enterprise. Metadata will be the key that enables those intelligent decisions to be made, finally slowing the rate at which enterprise data centers fill with ever more storage arrays.
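
As an illustration of metadata-driven discard decisions, the sketch below flags datasets that nobody has touched for a long time and that are not under a legal hold. The record fields, the one-year idle threshold and the function name are assumptions made for the example; the article does not prescribe a specific retention policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class DatasetRecord:
    """Metadata kept about a stored dataset (hypothetical fields)."""
    name: str
    last_accessed: date
    has_legal_hold: bool
    size_gb: float


def discard_candidates(
    records: List[DatasetRecord],
    today: date,
    max_idle: timedelta = timedelta(days=365),  # assumed policy threshold
) -> List[DatasetRecord]:
    """Return datasets idle for longer than `max_idle` with no legal hold."""
    return [
        r for r in records
        if not r.has_legal_hold and (today - r.last_accessed) > max_idle
    ]


if __name__ == "__main__":
    catalogue = [
        DatasetRecord("sensor-logs-2008", date(2009, 1, 1), False, 4200.0),
        DatasetRecord("transactions-2011", date(2011, 6, 1), False, 900.0),
        DatasetRecord("email-archive-2007", date(2008, 3, 1), True, 1500.0),
    ]
    stale = discard_candidates(catalogue, today=date(2011, 7, 7))
    print([r.name for r in stale], sum(r.size_gb for r in stale))
```

The decision here relies entirely on the metadata, never on the contents of the datasets themselves, which is what makes this kind of curation cheap enough to run routinely.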

Question of the week

What role does metadata play in your business, and where else might you use it?
About The Author

Former Analyst

Paul Miller

Paul Miller is an Analyst for Gigaom Research and a consultant, based in the East Yorkshire (UK) market town of Beverley, but…