Table of Contents
- Data Catalog Primer
- Report Methodology
- Decision Criteria Analysis
- Evaluation Metrics
- Key Criteria: Impact Analysis
- Analyst’s Take
- About Andrew Brust
Data catalogs are metadata repositories that contain everything organizations need to define, understand, contextualize, and act on their data in a strategic manner. They’re foundational to efforts around data governance, data access security, timely analytics, and data management. Since these repositories are replete with business, technical, and operational metadata, they’re the ideal medium for users with roles in each of these functional areas to collaborate with one another, establish best practices, and democratize data culture. As such, they underlie the success of data-driven organizations.
Enterprise-wide data catalogs directly enable, or sit adjacent to, metadata management, data quality, data modeling, data reliability, lifecycle management, data provenance, data discovery, and data protection. The most effective ones connect to sources, automatically harvest their metadata, and automate both the tagging and classification of data assets, so organizations understand what data they have and, more specifically, what it contains.
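To make the harvest-and-classify step concrete, here is a minimal sketch in Python. It assumes a SQLite source and a purely name-based tagging scheme. The rule set (TAG_RULES), the function names, and the demo table are hypothetical and do not reflect any particular vendor's product; commercial catalogs typically combine such rules with data profiling and ML-based suggestions.

```python
import re
import sqlite3

# Hypothetical name-based classification rules (an assumption for illustration).
TAG_RULES = {
    "PII": re.compile(r"(email|phone|ssn|name)", re.IGNORECASE),
    "FINANCIAL": re.compile(r"(amount|price|salary)", re.IGNORECASE),
}


def harvest_metadata(conn):
    """Return one catalog entry per column of every table in the source."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        rows = conn.execute(f'PRAGMA table_info("{table}")')
        for _cid, column, col_type, *_rest in rows:
            # Automated tagging: attach every tag whose pattern matches the column name.
            tags = sorted(t for t, pattern in TAG_RULES.items() if pattern.search(column))
            entries.append(
                {"table": table, "column": column, "type": col_type, "tags": tags}
            )
    return entries


def build_demo_catalog():
    """Harvest a small in-memory database standing in for a real source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, salary REAL)")
    try:
        return harvest_metadata(conn)
    finally:
        conn.close()
```

Calling build_demo_catalog() returns one entry per column of the demo table, each carrying the technical metadata (table, column, type) alongside any tags the rules assigned.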
Data catalogs are also instrumental for expressing the meaning of data in business-friendly terms through taxonomies, schema particulars, business glossaries, and data classifications. They are the bridge between the technical, statistical analysis of data and the business semantics that make it comprehensible—and usable—for analytics and applications. Top data catalog solutions suggest this information via ML, import these metadata models from additional sources, and enable users to provide business logic via low-code techniques.
The importance of this information is multifold. For example, it’s intrinsically searchable, often according to the various metadata tags, classifications, and additional enrichments users apply to the data. When vendors offer data marketplaces, their catalogs can serve as primary resources with which users can determine what data is available for a specific deployment and how to access it. The catalog’s rich metadata can also underpin role-based or attribute-based access controls that institute and enforce data protection.
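As a rough illustration of how catalog tags might drive both search and attribute-based access control, consider the following Python sketch. The Asset and User types, the RESTRICTED_TAGS set, and the clearance model are all assumptions made for this example, not a description of how any specific catalog enforces policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    tags: frozenset  # metadata tags applied in the catalog


@dataclass(frozen=True)
class User:
    name: str
    clearances: frozenset  # attributes granted to the user


# Hypothetical policy: these tags mark data that requires a matching clearance.
RESTRICTED_TAGS = frozenset({"PII", "FINANCIAL"})


def search(assets, tag):
    """Return the names of assets carrying the given tag."""
    return [a.name for a in assets if tag in a.tags]


def can_access(user, asset):
    """Allow access only if the user holds every restricted tag on the asset."""
    required = asset.tags & RESTRICTED_TAGS
    return required <= user.clearances


ASSETS = [
    Asset("customers", frozenset({"PII", "sales"})),
    Asset("web_logs", frozenset({"operations"})),
    Asset("payroll", frozenset({"PII", "FINANCIAL"})),
]
ANALYST = User("analyst", frozenset({"PII"}))
```

Under these assumptions, the analyst can find every PII-tagged asset through search but can open only those whose restricted tags are all covered by the analyst's clearances; the same tags serve discovery and protection.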
As centralized hubs, data catalogs have a host of features for annotating, grading, recommending, sharing, and preventing inappropriate access to data across a wide range of objectives. They’re also starting points for implementing data quality measures, which can extend to data validation, data reliability, and data observability at the data pipeline level, and for boosting organizational trust in data. As a result, adoption of data-centric processes and tooling takes root, expanding ROI, and data becomes more readily governed and, ultimately, sustainable.
This is the third year that GigaOm has reported on the data catalog space in the context of our Key Criteria and Radar reports, and the need to holistically manage metadata and other facets of data intelligence has only continued to grow. This report builds on our previous analyses and considers how the market has evolved over the last year.
This GigaOm Key Criteria report details the capabilities (table stakes, key criteria, and emerging technologies) and non-functional requirements (evaluation metrics) for selecting an effective data catalog solution. The companion GigaOm Radar report identifies vendors and products that excel in those capabilities and metrics. Together, these reports provide an overview of the data catalog category and its underlying technology, identify leading data catalog offerings, and help decision-makers evaluate these solutions so they can make a more informed investment decision.
How to Read this Report
This GigaOm report is one of a series of documents that help IT organizations assess competing solutions in the context of well-defined features and criteria. For a fuller understanding, consider reviewing the following reports:
- Key Criteria report: A detailed market sector analysis that assesses the impact that key product features and criteria have on top-line solution characteristics (such as scalability, performance, and TCO) that drive purchase decisions.
- GigaOm Radar report: A forward-looking analysis that plots the relative value and progression of vendor solutions along multiple axes based on strategy and execution. The Radar report includes a breakdown of each vendor’s offering in the sector.