1. Executive Summary
Real-time analytics focuses on powering analysis that can deliver results and insights as events happen in the real world. To achieve this instant analysis, a new category of databases—real-time analytical databases—has emerged. These databases work to ensure the data used in analysis is as up to date as possible. They have their roots in traditional online analytical processing (OLAP) databases; however, they surpass these predecessors by providing the ability to connect to and ingest extremely large (up to petabyte-scale) volumes of data, often from streaming data sources and batch or change data capture (CDC) sources. In doing so, they offer functionality found in both specialized business intelligence (BI) platforms and streaming data platforms and combine it with the core capabilities found in operational databases and data warehouses.
To facilitate analytics over large volumes of data with minimal latency, the databases in this category make use of structural and architectural optimizations. Examples include columnar orientation; various types of indexing, partitioning, and segmentation; precomputed aggregations that accelerate queries; and vector processing. Scalability—the resilience of the system under the demands of increasing workloads—and high availability are also important in this category because of the time-critical nature of the analysis.
Real-time analytical databases allow organizations to see an up-to-the-minute view of the state of their data. This enables decisions to be made as events occur or as conditions change in the real world. This technology benefits any organization or user that needs data to be current and accurate. Use cases can take many practical forms, spanning different industries and target audiences. These can include healthcare, emergency response, cybersecurity, fraud detection, shipment and inventory tracking, personalized advertising, financial trading, and apps for food delivery or ridesharing. The time-sensitive criticality of some of these use cases is a primary factor driving the impact and urgency of adoption for these databases. The steadily decreasing tolerance for latency among everyday end users of modern apps, regardless of an app's intended function, drives urgency as well.
In terms of maturity, real-time analytical database technology is uniquely divergent. The online analytical processing (OLAP) methodology on which a number of these platforms are based (even if they use relational, rather than multidimensional, storage) is a longstanding one. On the other hand, the workloads and use cases powered by real-time analytical databases are fresh and cutting-edge. The vendor landscape consists largely of newer, up-and-coming commercial offerings, often built upon open source databases.
This is the first year that GigaOm has reported on the real-time analytical database space in the context of our Sonar reports. This GigaOm Sonar provides an overview of vendors of real-time analytical databases and their available offerings, outlines the key characteristics that prospective buyers should consider when evaluating the solutions, and equips IT decision-makers with the information they need to select the best one for their business and use case requirements.
ABOUT THE GIGAOM SONAR REPORT
This GigaOm report focuses on emerging technologies and market segments. It helps organizations of all sizes to understand a new technology, its strengths and its weaknesses, and how it can fit into the overall IT strategy. The report is organized into five sections:
- Overview: An overview of the technology, its major benefits, and possible use cases, as well as an exploration of product implementations already available in the market.
- Considerations for Adoption: An analysis of the potential risks and benefits of introducing products based on this technology in an enterprise IT scenario. We look at table stakes and key differentiating features, as well as considerations for how to integrate the new product into the existing environment.
- GigaOm Sonar Chart: A graphical representation of the market and its most important players, focused on their value proposition and their roadmap for the future.
- Vendor Insights: A breakdown of each vendor’s offering in the sector, scored across key characteristics for enterprise adoption.
- Near-Term Roadmap: A 12- to 18-month forecast of the future development of the technology, its ecosystem, and major players in this market segment.
2. Overview
The emergence of the big data era over a decade ago resulted in data flooding into organizations at significantly higher velocities and greater volumes than had been seen previously. The rise of big data accelerated the pace of business for today's organizations and drove demand for analytics that could be performed on data in real time. The proliferation of use cases requiring real-time insights—including emergency services, cybersecurity and fraud detection, advertising personalization, and explosively popular ride-share and food delivery apps—made it crucial to be able to analyze data within milliseconds of its hitting databases, to ensure decisions could be made as events occur in the real world.
Real-time analytical databases were created to meet these requirements. As proponents of the technology say, these databases are purpose-built to facilitate fast analytics of huge volumes of data with low latency and high concurrency. These databases are architected with multiple optimizations in storage and query processing, which enable them to ingest large amounts of real-time data, make it quickly available for querying, and deliver sub-second query performance on it. This allows organizations to derive insight into the current state of events in the real world, such as the number of drivers available in a certain area, or the current state of a processing system.
Components of real-time analytical databases include:
- Indexing: Indexing is fundamental to the way real-time analytical databases store data, process data, and make data available for querying. Indexing provides the efficiency of scanning and aggregating data that results in the query performance essential for real-time analytics use cases. Real-time analytical databases often make use of different types of indexing to achieve optimizations for a variety of query patterns.
- Storage optimizations: Columnar storage and database sharding are also used by a number of these databases to optimize the storage of data and facilitate more efficient processing and scanning. Columnar storage reorients traditional row-based storage so that different values for the same column are stored in proximity to each other, creating efficiencies in scanning, aggregation, and storage. Database sharding is a technique whereby single tables or datasets are split across partitions or "shards." Each partition, or shard, contains a subset of rows and is stored separately across multiple nodes in a database cluster, but all shards share a common schema. Since each shard contains fewer rows than the entire database, the intended benefit of this structure is to reduce the time it takes the system to search through and retrieve a specific result from the database (see the sketch following this list).
- Analytics optimizations: Some real-time analytical databases make use of analytics optimizations such as massively parallel processing (MPP) and vector processing. In MPP, if the workload increases, additional worker nodes can be added to perform additional work in parallel, giving the database the ability and resilience to scale to meet increasingly large workloads. Vector processing involves “single instruction, multiple data” (SIMD) operations, which also process data in parallel, handling multiple data values simultaneously, even within a single CPU core. This allows for parallelization to occur at multiple levels: within a CPU core via vector processing, within a single node via multiple CPUs, and within the MPP cluster across all worker nodes. Real-time analytical databases leverage these storage and analytics optimizations to facilitate the low-latency, high-concurrency queries over huge amounts of data that are typically characteristic of these databases.
- Batch and real-time ingestion: Real-time analytical databases are able to ingest and process streaming data, usually (not always) through dedicated, native integrations or partnerships with a streaming data source. The majority of real-time analytical databases can also ingest batch or historical data. This ability differentiates the databases in this category from some traditional frameworks for analyzing streaming data, which may not have the ability to analyze the batch and historical data component.
- Data type support: Real-time analytical databases support, often natively, many different data types and formats, including those typical of streaming data processing, such as JSON, Avro, Protobuf, and so on. Some real-time analytical databases can support geospatial data types, open table formats such as Iceberg, Hudi, and Delta Lake, and unstructured data as well. The overlap continues to grow between real-time analytical databases and other categories such as observability platforms, vector databases, and time-series databases, along with data warehouses and data lakes.
- Materializations and pre-aggregations: Not all real-time analytical databases make use of analytics preprocessing techniques, such as materialized views or storing pre-aggregated values in the database for query acceleration; for those that do, these techniques often constitute core components of their offerings. Some solutions in the real-time analytical databases category offer materialized views that are continually refreshed as data comes into the platform, resulting in consistently up-to-date views of query results.
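To make the two storage ideas above concrete, the following minimal Python sketch (a toy model, not any vendor's implementation) shows how columnar orientation groups values by column and how hash-based sharding routes rows to shards; the `user_id` and `region` fields are hypothetical.

```python
from collections import defaultdict

# Toy rows; the fields are hypothetical.
rows = [
    {"user_id": 1, "region": "us-east", "amount": 12.5},
    {"user_id": 2, "region": "eu-west", "amount": 7.0},
    {"user_id": 3, "region": "us-east", "amount": 3.25},
]

# Columnar orientation: values for each column are stored contiguously, so
# an aggregation over "amount" scans one array instead of reading whole rows.
columns = {key: [row[key] for row in rows] for key in rows[0]}
total_amount = sum(columns["amount"])

# Hash sharding: each row is routed to a shard by hashing a key column; all
# shards share the same schema, and each holds only a subset of the rows.
NUM_SHARDS = 4
shards = defaultdict(list)
for row in rows:
    shards[hash(row["user_id"]) % NUM_SHARDS].append(row)

print(total_amount, {shard: len(r) for shard, r in shards.items()})
```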
Figure 1. Overlap Among Data Management Categories
Real-time analytics powers use cases that span many industries and aspects of day-to-day life. Cybersecurity and fraud detection, system monitoring and alerts, IoT and sensor data collection, healthcare and emergency services, adtech, clickstream and web analytics, and gaming behavior analytics are just some of the possibilities. Deployment approaches for real-time analytical database offerings include public cloud, private cloud, containerized, and on-premises.
Real-time analytical database use cases are pertinent to organizations of all sizes. They are certainly suited for larger organizations, such as financial services, advertising, and healthcare companies, with significant amounts of data that need to be consumed and analyzed. However, plenty of midsize and smaller organizations also find benefits in use cases powered by real-time analytics such as system monitoring and alerts, and analysis of user and customer behavior. Additionally, the simple expectation of responsive, low-latency applications among everyday end users drives the relevance and urgency of adopting systems for real-time analytics, regardless of organization size.
3. Considerations for Adoption
The development of systems for real-time analytics has been largely driven by the value that arises from making data-driven decisions as events happen in the real world. Real-time analytical databases specialize in enabling instantaneous analysis on a large scale, making data available for querying by a high number of concurrent users, with the lowest latency possible.
There are other types of systems for real-time data processing, such as streaming data processing platforms and time-series databases, whose capabilities overlap with real-time analytical databases to differing degrees. However, these other types of platforms either focus exclusively on streaming data processing, leaving out the workloads and context that can be tapped into when historical data is analyzed in addition to real-time data, or they are more specific to certain use cases, as in the case of time-series databases, which partition data primarily by time before any other attributes are applied.
A number of the offerings covered in this report are managed offerings, with differing degrees of support, that are based on open source databases. Customers therefore have a wide range of support levels and deployment options to choose from, catering to a range of priorities. On one end of the spectrum are the customer-managed, open source databases themselves, which provide a degree of customizability and control that may appeal to certain customers. On the other end of the spectrum are the fully vendor-managed, cloud-native, serverless offerings, which completely abstract away all the details of configuration and provisioning from the user. And in the middle, there is a range of combinations of vendor-hosted and customer-managed or customer-configured components.
Real-time analytical databases provide support for a broad array of use cases. Some provide native support for geospatial data types, which makes them especially suited to workloads for which location is a strong factor, including food delivery, supply chain and logistics tracking, and ride-sharing apps. In addition, platforms in this category are increasingly adding support for new data types and workloads, further broadening the scope of what can be handled by a single database platform. Some real-time analytical databases can support open table formats and even unstructured data types. They can also support storing and searching of vector embeddings, facilitating use cases that make use of generative AI and increasingly creating overlap with the standalone vector database category.
Key Characteristics
Here we explore the key characteristics that may influence enterprise adoption of the technology, based on attributes or capabilities that may be offered by some vendors but not others. These criteria will be the basis on which organizations decide which solutions to adopt for their particular needs. The key characteristics for real-time analytical database solutions are:
- Storage/analytics optimizations
- Data ingestion
- Analytics preprocessing
- Schema management
- Client/tool connectivity
- Scalability
- High availability
Storage/Analytics Optimizations
To perform their core function of facilitating analytics on high volumes of streaming data with low latency and high concurrency, real-time analytical databases possess numerous storage and analytics optimizations that help drive efficient data retrieval and query performance. This characteristic is intended to capture the breadth and depth of a platform's storage and analytics optimizations, which can include column-oriented storage; the types and effectiveness of any database sharding strategies; the types and diversity of the platform's indexing capabilities, which often translate into the types of queries the platform is especially primed to support; and any use of MPP architecture or vector processing.
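As a rough illustration of the difference vectorized execution makes, the following Python sketch contrasts a scalar loop with a NumPy operation whose compiled kernels can use SIMD instructions; real databases implement this in native code rather than NumPy.

```python
import numpy as np

values = np.random.rand(1_000_000)

# Scalar style: one value per loop iteration.
total_scalar = 0.0
for v in values:
    total_scalar += v

# Vectorized style: the whole array is handed to a compiled kernel that can
# apply SIMD instructions, processing multiple values per CPU instruction.
total_vectorized = values.sum()

assert np.isclose(total_scalar, total_vectorized)
```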
Data Ingestion
In order for the database to perform any analytical queries, the data must be ingested into the system. This characteristic assesses the level of support for data ingestion from different types of data sources, including both batch and real-time sources. The ability to ingest data from real-time data sources in some manner is, of course, critical to the success of any of these real-time analytical databases. Some databases possess native ingestion capabilities for streaming data sources, which streamlines ingestion and reduces complexity and latency. Some databases also support ingestion in batch or bulk from other types of sources.
Analytics Preprocessing
This characteristic is intended to capture the level of support, if any, that the database provides for different optimizations, such as materialized views or precomputed aggregations stored in the database to speed query performance. Not all the vendors in this category make use of such preprocessing optimizations; among those that do, vendors rated high on this metric possess strong support for one or more of these optimization techniques.
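The following toy Python sketch illustrates the general idea of precomputed aggregations (not any specific vendor's implementation): an aggregate is maintained ahead of time so that a query becomes a lookup rather than a scan.

```python
events = [("us-east", 12.5), ("eu-west", 7.0), ("us-east", 3.25)]

# Precomputation step: maintain the aggregate ahead of query time.
revenue_by_region: dict[str, float] = {}
for region, amount in events:
    revenue_by_region[region] = revenue_by_region.get(region, 0.0) + amount

# Query time: a dictionary lookup instead of a scan over raw events.
def query_revenue(region: str) -> float:
    return revenue_by_region.get(region, 0.0)

assert query_revenue("us-east") == 15.75
```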
Schema Management
This criterion is intended to capture the ability, if any, of the database to make updates to the database schema dynamically and to detect and generate a schema for data as it comes into the platform. Real-time analytical databases are intended to process and make data available for querying with low latency, so usually (though not in all cases) they do not require a predefined schema. In the most advanced cases, the database can detect and generate a schema for data as it is ingested into the platform. Some databases also support schema evolution: the ability to detect and respond dynamically when the schema of the underlying source data changes.
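A minimal sketch of the schema inference idea, assuming simple JSON-like records (the field names are hypothetical):

```python
def infer_schema(records: list[dict]) -> dict[str, set[str]]:
    """Derive field names and observed value types from sample records."""
    schema: dict[str, set[str]] = {}
    for record in records:
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema

batch_1 = [{"id": 1, "ts": "2024-06-01T00:00:00Z"}]
batch_2 = [{"id": 2, "ts": "2024-06-01T00:00:01Z", "geo": "us-east"}]

schema_v1 = infer_schema(batch_1)
# Schema evolution: a new "geo" field appears in later data and the schema
# widens, rather than the ingest failing.
schema_v2 = infer_schema(batch_1 + batch_2)
assert "geo" in schema_v2 and "geo" not in schema_v1
```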
Client/Tool Connectivity
This characteristic is focused on the ability of the platform to connect to different client applications or tools, such as those for business intelligence, data visualization, or SQL querying, that help users explore their data, create real-time dashboards for system monitoring or observability, or visualize and present the data for use cases such as management analytics or reporting. While many platforms can integrate with tools and applications through standard ODBC/JDBC drivers, some further possess "deep" or "native" integrations with client applications or tools.
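As a generic example of standard driver connectivity, the following Python sketch connects over ODBC using the pyodbc library; the DSN, credentials, and table are hypothetical, and each platform documents its own drivers and connection strings.

```python
import pyodbc  # standard Python ODBC bridge

# Hypothetical DSN, credentials, and table.
conn = pyodbc.connect("DSN=realtime_db;UID=analyst;PWD=secret")
cursor = conn.cursor()
cursor.execute("SELECT region, COUNT(*) FROM events GROUP BY region")
for region, count in cursor.fetchall():
    print(region, count)
conn.close()
```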
Scalability
The ability of the platform to scale, or to exhibit resilience in handling workloads involving increasing data volumes, is captured in this metric. The ability to ingest and process large data volumes and to support fast queries over this data is necessary for organizations that need to gather a true and up-to-date picture of their business and/or systems so they can make critical decisions about real-world events. Platforms make use of various methods to scale, with some providing auto-scaling capabilities, and some fully handling resource provisioning and scaling on behalf of the user through serverless, cloud-native platforms.
High Availability
High availability is an important characteristic of real-time analytical databases because they require continuous uptime in order to support the critical use cases assigned to them. This criterion captures the structures and methods the database employs to enable high availability and fault tolerance and to withstand resource failures. One example would be the use of data replication to keep redundant or secondary nodes with copies of the data on standby, which could be activated should a primary node fail.
Table 1 shows how well the solutions discussed in this report score in each of these areas.
Table 1. Key Characteristics Comparison
Legend: Exceptional | Capable | Limited | Not Applicable
4. GigaOm Sonar
The GigaOm Sonar provides a forward-looking analysis of vendor solutions in a nascent or emerging technology sector. It assesses each vendor on its architecture approach (Innovation), while determining where each solution sits in terms of enabling rapid time to value (Feature Play) versus delivering a complex and robust solution (Platform Play).
The GigaOm Sonar chart (Figure 2) plots the current position of each solution against these three criteria across a field of concentric semicircles, with solutions set closer to the center judged to be of higher overall value. The forward-looking progress of vendors is further depicted by arrows that show the expected direction of movement over a period of 12 to 18 months.
Figure 2. GigaOm Sonar for Real-Time Analytical Databases
As you can see in Figure 2, the entire market landscape consists of vendors whose real-time analytical database forms the core, or "flagship," component of their offering; hence all the vendors in the real-time analytical databases category are placed on the Platform Play side of the Sonar.
While this is a Sonar Report and all vendors are classified as Innovation providers by default, reflecting the up-and-coming nature of the category, its cutting-edge use cases, and forward-thinking roadmaps, the placement of all vendors on the Platform Play side of the Sonar also attests to the investment and dedication these vendors have poured into their offerings. Additionally, many vendors are either already Leaders or are projected to move from Challenger to Leader in the near term. Each solution here presents a comprehensive, well-rounded offering to potential customers.
In reviewing solutions, it’s important to keep in mind that there are no universal “best” or “worst” offerings; there are aspects of every solution that might make it a better or worse fit for specific customer requirements. Prospective customers should consider their current and future needs when comparing solutions and vendor roadmaps.
5. Solution Insights
Aerospike: The Aerospike Real-Time Data Platform
Solution Overview
Aerospike is a real-time analytics database written in C for optimal performance. It was originally created as a key-value store, but it has since expanded to encompass document store, graph database, and vector database use cases and capabilities. It thus stands out within the vendor landscape as an expanded platform for a breadth of use cases and workloads.
The vendor's offerings are centered around the Aerospike Database, which its documentation says is architected with the key objectives of creating a flexible and scalable platform from which applications can be built, providing the robustness and reliability (via ACID compliance) of a traditional relational database management system (RDBMS), and providing operational efficiency with minimal manual involvement. According to the vendor, the particular attributes of Aerospike that deliver value to its customers include its high performance (millisecond response time), reliability (high availability), scalability (to handle workloads of up to petabytes of data), and reduced infrastructure needs and costs.
Strengths
The Aerospike Database leverages storage and analytics optimizations that the vendor says are key to enabling it to perform analytics at speed on petabyte-scale data volumes. Aerospike leverages data compression, indexing (primary and secondary), partitioning/database sharding, and the vendor's proprietary Hybrid Memory Architecture. The latter is described by the vendor as a structure by which indexes are placed in dynamic random-access memory (DRAM), and data is persisted in SSD but accessed at in-memory speeds. Aerospike integrates with Prometheus and Grafana for observability and monitoring workloads. Aerospike also has OpenTelemetry (OTel) integrations with Datadog, Chronosphere, ServiceNow/Lightstep, New Relic, and Amazon CloudWatch, to name a few. Aerospike makes use of synchronous active-active replication for high availability upon resource failure without the need for user intervention, and asynchronous cross-datacenter replication (XDR) with fine-grained access control, which the vendor says helps with compliance use cases.
Challenges
At the time of this research, it appeared that Aerospike doesn’t make use of analytics preprocessing capabilities such as pre-aggregations and materializations.
Purchase Considerations
Aerospike provides a number of offerings that are primarily centered around its Aerospike Database. Aerospike Cloud is the vendor’s database-as-a-service offering. Aerospike Cloud Managed Service is the vendor’s fully managed offering, providing a “white-glove” level of service, designed for enterprises and deployed in the cloud of the customer’s choice. Products complementary to those based on Aerospike Database include Aerospike Graph and Aerospike Vector Service, the vendor’s vector database offering, currently in beta.
Aerospike originated as a database that could handle the requirements of adtech use cases that used real-time bidding. From those origins, Aerospike has since grown to be used in a wide variety of workloads. According to the vendor, examples include fraud detection and prevention, use as an AI/ML feature store, customer 360 and recommendation engine use cases, application modernization and mainframe augmentation use cases, and use as an operational data store. Additionally, retrieval augmented generation (RAG) and semantic search will be part of Aerospike's offering once the vector database offering mentioned above moves into general availability.
Sonar Chart Overview
Aerospike is positioned as a Challenger on the Platform Play side of the Sonar. It has a comprehensive offering but doesn't score as well as the Leaders in the decision criteria we evaluated. It's close to the Feature Play/Platform Play axis, reflecting a balanced blend of both specialized workloads and a broader scope. Aerospike is also designated a Fast Mover based on its forward-thinking roadmap.
ClickHouse: ClickHouse Cloud, ClickHouse Server, chDB
Solution Overview
ClickHouse provides a managed offering for the open source ClickHouse database, a column-oriented, distributed database management system (DBMS) for real-time analytics that is written in C++. The managed offering, ClickHouse Cloud, is the vendor’s primary offering, a serverless, cloud-native solution intended to provide ease of administration and maintenance for operating the underlying ClickHouse database.
The vendor credits specific design and architecture choices for enabling ClickHouse to achieve analytics over large volumes of data with sub-second response times. Column-oriented storage provides scanning, aggregation, and storage efficiencies. ClickHouse also makes use of database sharding, indexing, and vector processing. The vendor also describes its "table engines," based on the MergeTree family of engines, as foundational components of ClickHouse's offerings. According to the vendor, these engines are designed for quickly inserting extremely large amounts of data into multiple tables, breaking each table down into groups of rows called "parts," which are then periodically merged in the background. These engines contribute to ClickHouse's ability to handle large-scale analytics.
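As a brief, hedged illustration of working with a MergeTree table, the following sketch uses ClickHouse's clickhouse-connect Python client; the host, table, and data are hypothetical, and details may vary by version.

```python
from datetime import datetime

import clickhouse_connect  # ClickHouse's Python client

client = clickhouse_connect.get_client(host="localhost")  # hypothetical host

# MergeTree tables accept fast inserts as sorted "parts" that background
# threads merge; ORDER BY defines the sort key used for sparse indexing.
client.command("""
    CREATE TABLE IF NOT EXISTS page_views (
        ts      DateTime,
        url     String,
        user_id UInt64
    )
    ENGINE = MergeTree
    ORDER BY (url, ts)
""")

client.insert(
    "page_views",
    [[datetime(2024, 6, 1), "/home", 42]],
    column_names=["ts", "url", "user_id"],
)

result = client.query("SELECT url, count() FROM page_views GROUP BY url")
print(result.result_rows)
```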
Strengths
Central to ClickHouse's database is its comprehensive range of storage and analytics optimizations. For data ingestion, ClickHouse Cloud includes a managed connector component known as ClickPipes. Included among ClickPipes' list of connectors are those for streaming data sources and object stores: Kafka, Confluent Cloud, Redpanda, Azure Event Hubs, Amazon Kinesis, AWS S3, and Google Cloud Storage.
ClickHouse supports analytics preprocessing in the form of materialized views, which are automatically managed and updated as new data is ingested. ClickHouse's table engines, particularly AggregatingMergeTree, compute partial aggregations with a continuous data summarization process, which assists with aggregated materialized views and in the performance of aggregation queries. For client and tool connectivity, ClickHouse supports an array of third-party integrations: Looker, Superset/Preset, Metabase, Tableau, and Power BI for business intelligence; a partnership and connector with Grafana for observability; and Deepnote and Jupyter Notebooks for data science. As a serverless offering, ClickHouse Cloud benefits from the elastic scaling of resources and high availability that such a cloud-native architecture provides.
Challenges
While ClickHouse currently has some support for updating or deleting rows, some of the functions available are comparatively "heavy." However, at the time of this research, the vendor stated that it was working on more "lightweight" versions of these functions, with updates taking effect immediately so that query results reflect changes right away. While those updated functions were still in development during our research, this demonstrates ClickHouse's awareness of areas where its capabilities can be refined and its commitment to pursuing that development.
Purchase Considerations
The original open source ClickHouse database provides the customizability and control of a self-managed offering. The ClickHouse Cloud fully managed, serverless offering provides a higher level of support for configuration, setup, and administration. ClickHouse Cloud is deployable on any of the three major public clouds. Additionally, ClickHouse's chDB offering consists of open source libraries for using the ClickHouse SQL OLAP engine in Python, Go, Rust, Node.js, and Bun.
Not specifically covered by any of the decision criteria for this report, but interesting and worth noting is an LLM-powered query suggestion capability powered by Amazon Bedrock. The query console will convert natural language typed in the console to SQL, which the vendor says will make its platform easier to use for non-technical users and assist SQL experts as well.
ClickHouse was originally developed for real-time analytics and supports many use cases across industries including adtech, marketing and web analytics, gaming, and usage-based billing.
The vendor says that ClickHouse is also used:
- As a “real-time data warehouse” for internal reporting and analytics workloads, often in tandem with business intelligence applications.
- As a datastore for observability use cases, often coupled with Grafana or a custom UI.
- As a feature store used in training machine learning models.
- For storing vector embeddings and facilitating vector search.
Sonar Chart Overview
ClickHouse is positioned as a Leader on the Platform Play side of the Sonar. It provides a comprehensive offering and scored well across the decision criteria we evaluated.
Imply: Imply Polaris, Imply Enterprise, Imply Enterprise Hybrid
Solution Overview
Imply provides a fully managed offering for open source Apache Druid, a real-time, columnar OLAP database written in Java. Open source Druid provides the core of Imply's offering, to which Imply adds capabilities such as built-in visualization, simplified setup and maintenance, support services, and platform monitoring. Components of Imply include the underlying Druid database; Pivot, a built-in, self-service visualization component; Imply Manager, a central dashboard from which administrative tasks can be managed; and Imply Clarity, a performance monitoring component. Imply is available in multiple deployment options, including Imply Polaris, Imply Hybrid, and Imply Enterprise (see further details under Purchase Considerations).
Strengths
Storage and analytics optimizations include massively parallel processing, columnar storage, indexing, and time-based partitioning. Data is automatically time-indexed after it is ingested, and it can be partitioned further based on other attributes. The database can optionally roll up (summarize or pre-aggregate) data as it is ingested, which the vendor says can assist with speeding query performance by reducing the size of data stored in the database and reducing row counts, potentially by orders of magnitude. Schema discovery capabilities allow Druid to automatically and flexibly infer schema and data types for data coming into the platform. Data can be ingested from batch or streaming sources. The built-in Pivot visualization capability of the managed Imply offering provides a graphical, drag-and-drop interface for users to explore data, set up alerts, view monitoring, and build dashboards.
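The following toy Python sketch illustrates the rollup concept described above in general terms (not Druid's actual ingestion machinery): raw events are summarized into one row per time bucket and dimension value, shrinking stored row counts.

```python
from collections import Counter

raw_events = [
    ("2024-06-01T12:00:05", "/home"),
    ("2024-06-01T12:00:31", "/home"),
    ("2024-06-01T12:00:47", "/cart"),
    ("2024-06-01T12:01:02", "/home"),
]

# Roll up at ingestion: truncate timestamps to the minute and keep one
# (minute, page) row with a count, instead of one row per raw event.
rollup = Counter((ts[:16], page) for ts, page in raw_events)

assert len(raw_events) == 4 and len(rollup) == 3  # 4 raw rows -> 3 stored
```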
Challenges
While rich built-in visualization capabilities are provided through Imply Polaris's Pivot component, its ecosystem of third-party tool integrations is more limited than those of some of the other offerings in this report.
Purchase Considerations
Imply provides multiple deployment options, with various combinations of self-hosted and Imply-managed choices available. Imply Polaris is the vendor's fully managed, cloud-based database-as-a-service offering. At the time of this research (June 2024), Imply Polaris was available on AWS, and support for Azure was in beta. Imply Hybrid provides an option whereby the control plane is hosted by Imply and the data remains in the customer's virtual private cloud instance on AWS. Imply Enterprise is the customer-hosted option, which can be deployed in any of the three public clouds or on-premises.
According to the vendor’s documentation, since the Apache Druid database includes time-based optimizations and design choices, common use cases for which the platform is particularly suited include analysis of clickstream data, time-series data, log and telemetry data, and application performance metrics. Business intelligence and reporting workloads are also supported. These practical applications span a range of industries and verticals. Analysis of event-driven data can support risk and fraud analysis; analysis of log and telemetry data can provide network performance monitoring; and analysis of web and mobile data can support advertising and personalization use cases.
Sonar Chart Overview
Imply is positioned as a Leader on the Platform Play side of the Sonar. It has strong scores in a number of the decision criteria and one of the most comprehensive ranges of deployment options.
Kinetica: Kinetica Cloud, Kinetica Developer Edition, Kinetica Enterprise Edition
Solution Overview
Kinetica provides a distributed, columnar, real-time analytical database offering. It makes heavy use of vector processing and SIMD instructions, and the vendor describes this basis in vector processing as the key to its offering's query performance at scale. Kinetica makes use of tiered storage, which, according to the vendor, allows it to balance performance and cost. Kinetica also makes use of what it calls a lockless architecture, which allows queries to be executed against data in real time even as data is being loaded or streamed into the database. While other databases often lock tables during data loading, so that new data is not made available for querying until it's been fully loaded, in a lockless architecture there are no such restrictions, and data can be queried as soon as it is ingested into the database.
Strengths
Beyond vector processing and a lockless architecture, Kinetica makes use of other storage and analytics optimizations, including columnar storage and database sharding/partitioning. Kinetica can ingest data from streaming data sources such as Kafka and Confluent, and from object stores such as GCS and AWS S3. Other sources include the Kinetica File System (KiFS) and the Hadoop Distributed File System (HDFS). Data can also be ingested from other sources through JDBC drivers, either user-configured ones, or prepackaged, third-party JDBC drivers available through CData. Kinetica supports materialized views, which can be refreshed manually, on change, on query, or at predetermined intervals. Kinetica supports connections to BI tools such as Tableau and Power BI. Kinetica also supports analytics of geospatial, time-series, and graph data, as well as vector search workloads.
Challenges
While Kinetica includes a bundled data visualization tool called Reveal, and can connect to other third-party applications including business intelligence applications such as Tableau and Power BI via ODBC/JDBC drivers, it appears that Kinetica may not have as many deep or native integrations as some other offerings in this report.
Purchase Considerations
Kinetica provides a number of deployment options. Kinetica Cloud is the vendor’s cloud-based, fully managed service, hosted by Kinetica and designed to simplify setup and administration. Kinetica is also available through the AWS Marketplace with pay-as-you-go pricing. Kinetica Developer Edition can be run locally in a prebuilt Docker container that allows developers to explore the essentials of Kinetica by including the platform itself along with preloaded examples and datasets. Kinetica Enterprise Edition allows customers to manage their own installation of Kinetica with a great degree of control over configuration and administration.
Although not specifically included under any of the decision criteria, it's worth mentioning that Kinetica recently released a SQL GPT natural language querying function powered by a native, fine-tuned LLM. This feature allows users to ask questions of the database in natural language, including questions whose answers involve complex time-series, graph, geospatial, and vector search functions.
The real-time analytics use cases that Kinetica can support span a wide range of industries and categories. The vendor’s website lists examples in the financial services industry, such as fraud detection, portfolio risk monitoring and analysis, and real-time pricing. The vendor also describes how Kinetica’s support for geospatial data types is leveraged by insurance companies for real-time geospatial database implementations, which helps them to evaluate risk and predict losses from weather events. Additionally, according to the vendor, real-time data from automobile sensors collected and analyzed in Kinetica can provide information for car manufacturers to improve fuel efficiency, automobile design, and engine performance.
Sonar Chart Overview
Kinetica is positioned as a Leader on the Platform Play side of the Sonar, attesting to its broad feature set and support for a wide range of use cases.
Materialize: Materialize
Solution Overview
Materialize is a Postgres-compatible relational database for performing analytics on real-time data, written in Rust and built around the Timely Dataflow and Differential Dataflow data processing frameworks. Materialize was created with the goal of performing analytical queries on streaming data with low latency. Materialize represents its platform as offering a blend of the advantages of data warehouses (including SQL compatibility) with the speed of streaming data platforms, without what it sees as the disadvantages of each: the expense and latency of batch analytics and the complexity and high engineering costs of streaming data platforms.
Strengths
Storage optimizations in Materialize include indexing and database sharding. Central to Materialize’s solution are materialized views. The vendor says its materialized views differ from traditional ones because they can store results for more complex queries, and they can be kept up to date as data changes. With its emphasis on incrementally maintained materialized views, analytics preprocessing is one of Materialize’s key strengths. Additionally, Materialize can ingest data from relational databases as well as streaming data sources. Materialize’s compatibility with PostgreSQL lends it the ability to integrate with other elements of the Postgres-compatible ecosystem, including SQL clients and other tools.
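To illustrate the incremental maintenance concept in the abstract (this is not Materialize's engine, which is built on Differential Dataflow), the following Python sketch patches a maintained aggregate with each insert or delete instead of recomputing it from scratch:

```python
# Materialized SUM(amount) GROUP BY region, maintained in place.
view = {"us-east": 15.75}

def apply_change(region: str, delta: float) -> None:
    """Patch the view with an insert (+amount) or delete (-amount)."""
    view[region] = view.get(region, 0.0) + delta

apply_change("us-east", 4.25)   # a new row arrives
apply_change("us-east", -12.5)  # an existing row is deleted
assert view["us-east"] == 7.5   # fresh result, no full rescan
```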
Challenges
This report captures a snapshot of each vendor’s overall product evolution and journey. At the time the research phase of this report was being conducted, Materialize was actively developing capabilities to support schema evolution, but those were not yet generally available.
Purchase Considerations
Materialize is deployable on AWS. Besides a limited free trial, pricing tiers include an on-demand, pay-as-you-go tier with billing based on compute credits used, and a prepaid, capacity-based pricing tier.
According to the vendor’s website, Materialize is suited for any analytics use case across industries and verticals that requires fresh, up-to-date data; specific examples of these include transaction monitoring and fraud detection, automation and alerting, real-time dashboards and apps, personalization, recommendations, dynamic pricing, and functioning as a “real-time feature store” for machine learning use cases.
Sonar Chart Overview
Materialize is positioned as a Challenger on the Platform Play side of the Sonar. It provides a well-rounded offering and supports a breadth of workloads and use cases; however, it did not score as well in some of the decision criteria we evaluated.
MotherDuck: MotherDuck
Solution Overview
MotherDuck provides a serverless, database-as-a-service commercial offering centered around the open source DuckDB RDBMS. In developing its commercial offering for DuckDB, MotherDuck has partnered closely with DuckDB Labs, whose team includes the co-creators of DuckDB and which supervises and provides consulting services for open source DuckDB.
With its serverless architecture, MotherDuck’s managed offering is designed for ease of use and administration; users should be able to get started with it easily and don’t have to worry about configuring or provisioning resources. MotherDuck provides a DuckDB software development kit (SDK) that allows users to extend the capabilities of the base DuckDB instance with MotherDuck’s enhancements via code in Python or a CLI. MotherDuck possesses a web UI, components of which include notebooks, a SQL IDE, and a data catalog. MotherDuck caches query results in a results panel that allows users to interactively sort, filter, and pivot data. MotherDuck is currently deployable in AWS.
Strengths
For storage and analytics optimizations, the underlying DuckDB is columnar, and it makes use of in-memory and vector processing, primary and secondary indexing, data compression, and segmentation and partitioning. DuckDB has native data ingestion capabilities for sources such as Postgres, MySQL, and SQLite. MotherDuck's platform ecosystem, which also benefits from the DuckDB ecosystem, includes integrations with products and platforms in these categories: data ingestion, data quality, data science and AI, data transformation, business intelligence, data orchestration, and reverse ETL.
A key component of MotherDuck's offering is its "dual execution" mode. In this query execution mode, users' own laptops function as local nodes that are leveraged by MotherDuck in tandem with its own nodes in the cloud. The reasoning behind this, according to the vendor, is that although customers possess huge amounts of big data, they still interact with smaller portions of that data at any given time. MotherDuck can thus move those smaller amounts of data closer to where the user is to reduce latency, improve performance, and reduce costs.
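As a small, hedged example of the local side of this model, the following sketch uses open source DuckDB's Python API; the MotherDuck connection string is shown as a commented assumption drawn from vendor documentation, since it requires an account and token.

```python
import duckdb  # open source DuckDB's in-process Python API

# Local execution against an in-process, in-memory database.
con = duckdb.connect()
con.execute("CREATE TABLE t (i INTEGER)")
con.execute("INSERT INTO t VALUES (1), (2), (3)")
print(con.execute("SELECT SUM(i) FROM t").fetchone())  # (6,)

# Connecting to MotherDuck uses an "md:" connection string (an assumption
# drawn from vendor documentation; requires an account and auth token):
# cloud = duckdb.connect("md:my_database")
```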
Challenges
At the time of this research, the vendor reported that MotherDuck was deployable only on AWS. Additionally, while the vendor said its offering currently provided limited, specialized caching capabilities, additional capabilities in the realm of analytics preprocessing remained in development.
Purchase Considerations
Beyond its unlimited free trial, MotherDuck's pricing model is usage-based, with charges incurred for storage and compute. With its dual execution mode, the vendor says it strives to make the best use of local computing resources, for which it does not charge, providing additional cost benefits for its customers.
With its basis in DuckDB, MotherDuck is suited for real-time analytics use cases across industries and verticals, including advertising, e-commerce, and finance. With its dual execution mode, MotherDuck’s solution lends itself well to edge use cases, for which data is queried in place at the edge of the network.
Sonar Chart Overview
MotherDuck is designated a Fast Mover to reflect its roadmap and pace of development. It is also positioned as a strong Challenger and is projected to become a Leader in the near future—despite having only moved into general availability in June 2024—reflecting the rapid development of its managed offering for DuckDB.
SingleStore: SingleStore Helios, SingleStore Self-Managed
Solution Overview
SingleStore, formerly known as MemSQL, supports both in-memory rowstores for transactional workloads and disk-based columnar storage for analytical workloads. The vendor describes its solution as blending the best of transactional and analytic workloads, supporting hybrid transactional/analytical processing (HTAP) as well as pure OLAP and online transaction processing (OLTP) workloads. Beyond transactional and analytic workloads, SingleStore can also support analysis of geospatial data and time-series data, and storing and searching vector data, all from its single platform. MemSQL was originally designed as an in-memory, row-oriented database. The vendor then added columnar, disk-based storage as an option, which allowed the database to handle OLAP and purely analytical workloads. This combination of transactional and analytical capabilities led the vendor to rebrand as "SingleStore," a name intended to highlight its view of itself as a single database that can handle both classes of workloads.
Strengths
For data ingestion, SingleStore can load data through its native “pipelines” component from real-time sources and from cloud object stores. When creating a table in SingleStore, users can choose from either “columnstores” (on-disk column-oriented stores) or relational “rowstores” (typical relational database tables). Columnstores make use of sharding, indexing, and sort keys, and bring the typical advantages of column-oriented storage to the data stored in them: faster, more efficient scanning and data compression for analytic workloads. Data in rowstores is “lock-free,” meaning the table is not locked while data is being inserted but rather available for querying as soon as it lands in the table, reducing latency for high-concurrency workloads. SingleStore is horizontally scalable: additional nodes can be added to allow the database to scale to handle the needs of increasing workloads. SingleStore Notebooks provides a native interface within SingleStore, based on and extending the capabilities of Jupyter Notebooks, that allows data engineers, data scientists, app developers, and others to work with their data in SQL and Python from a familiar interface.
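The following hedged Python sketch illustrates the two table types described above, using SingleStore's MySQL wire-protocol compatibility via pymysql; the host, credentials, and tables are hypothetical, and DDL details may vary by product version.

```python
import pymysql  # SingleStore is compatible with the MySQL wire protocol

# Hypothetical host and credentials; DDL sketched from SingleStore's
# documented syntax, and details may vary by version.
conn = pymysql.connect(host="svc-example.singlestore.com",
                       user="admin", password="secret", database="demo")
with conn.cursor() as cur:
    # Disk-based columnstore (the default table type): sharded across nodes
    # and sorted for analytical scans.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS events (
            ts DATETIME,
            user_id BIGINT,
            amount DOUBLE,
            SORT KEY (ts),
            SHARD KEY (user_id)
        )
    """)
    # In-memory rowstore for low-latency, transactional access patterns.
    cur.execute("""
        CREATE ROWSTORE TABLE IF NOT EXISTS sessions (
            session_id BIGINT PRIMARY KEY,
            user_id BIGINT
        )
    """)
conn.close()
```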
Challenges
SingleStore prides itself on being able to handle the challenging demands of some of the largest enterprises. As such, it has the potential to be “overkill” for the workloads of smaller businesses. However, SingleStore has certainly made itself accessible to any size of business or workload, including through its free trial version of its cloud offering, which provides customers the ability to experience the capabilities of its offering without being locked in to any type of upfront commitment.
Purchase Considerations
SingleStore Helios is the vendor's fully managed, cloud-based database-as-a-service offering. Helios is deployable on any of the three public clouds. SingleStore is also available as a Snowflake Native App on the Snowflake Data Cloud Marketplace. SingleStore Self-Managed is the vendor's customer-managed solution, providing all the core functionality of SingleStoreDB. It can be hosted on bare metal, on Kubernetes, or in a customer's virtual private cloud. SingleStore also provides a free trial version of its cloud offering for customers interested in trying out its capabilities without a lengthy or expensive commitment.
SingleStore’s architecture and capabilities make it suitable for many types of workloads, including those that require processing and facilitating low-latency queries over real-time data, across industries and verticals. The vendor says that some of its key verticals include financial services and fintech, adtech, marketing and sales, streaming media and telecom, retail and e-commerce, and healthcare, and that its full list of supported verticals includes many others.
SingleStore also supports working with vector data, which allows users to rely on SingleStore to store and search vector embeddings. SingleStore’s vector database capabilities allow it to support semantic search, hybrid vector and full-text search, image recognition, video surveillance and security, retrieval augmented generation (RAG), and more.
Sonar Chart Overview
SingleStore is positioned as a Leader on the Platform Play side of the Sonar and close to the center of the Feature Play/Platform Play axis, which is reflective of its breadth of features and its balanced blend of support for transactional and analytical workloads.
StarRocks: StarRocks
Solution Overview
StarRocks is an open source, columnar, massively parallel processing database that makes use of vector processing and SIMD instructions. StarRocks can query data loaded from batch and real-time sources. StarRocks can also query data stored in external sources through catalogs, which allow it to access the metadata of those sources and query data stored in Delta Lake, Hudi, or Iceberg formats. According to the project’s documentation, StarRocks makes use of partitioning and bucketing to improve scanning efficiency and concurrency.
Strengths
Storage and analytics optimizations include massively parallel processing architecture, an execution engine written in C++ that makes use of vector processing and SIMD optimizations, columnar storage, partitioning and indexing, and a cost-based query optimizer. StarRocks says these optimizations enable it to query data organized via a star or snowflake schema directly, without denormalization of the data. StarRocks’ asynchronous materialized views support an algorithm that provides the ability to rewrite queries against base tables into queries against a corresponding materialized view with precomputed results, without the need to modify the query statement. Separation of storage and compute lets the database scale to meet workload needs. StarRocks describes its primary key indexes as powering its ability to support real-time upserts. StarRocks says its data caching framework helps speed queries that require data from external storage by minimizing I/O from these external sources. StarRocks can integrate with a range of tools such as DBeaver, observability tools such as Datadog, and business intelligence tools such as Hex, Apache Superset, Metabase, and Tableau Desktop.
Challenges
As an open source offering (see Purchase Considerations below), StarRocks wouldn't be the best fit for organizations whose priorities revolve around a low-touch experience for administration and maintenance.
Purchase Considerations
In contrast to most of the other offerings in this report, StarRocks is an open source real-time analytical database rather than a vendor's managed offering based on an open source project. This means there will not be the same level of support for setup and maintenance that most fully managed offerings provide. However, StarRocks is an important project in the real-time analytical databases category, and, for those with the technical know-how, the level of customization that an open source project provides may be appealing.
According to the project’s website, StarRocks’ use cases for real-time analytics have included fraud analytics, observability, monitoring, and analysis of business metrics and user data.
Sonar Chart Overview
StarRocks is positioned as a Leader on the Platform Play side of the Sonar, attesting to its strong showing across the board on the decision criteria for this report.
StarTree: StarTree Cloud
Solution Overview
StarTree, founded by the creators of Apache Pinot, provides a fully managed, cloud-native, database-as-a-service offering, StarTree Cloud, that provides all the capabilities of the underlying open source Pinot real-time analytical database, along with the vendor’s enhancements. StarTree Cloud also has two subcomponents: StarTree Data Manager, an included ingestion component from which all the offering’s data sources (including both batch and streaming data sources) can be managed in a no-code fashion, and StarTree ThirdEye, an anomaly detection and root cause analysis component that is available through an additional license.
As the vendor puts it, Apache Pinot is a real-time, distributed OLAP database that has been specifically architected for low-latency, high-efficiency analytics of real-time data with high concurrency. In addition to the capabilities open source Pinot provides, the StarTree Cloud managed offering provides further enhancements that include:
- Additional ingestion sources and support.
- Additional support for managing tables and data including additional indexes and index recommendations.
- Scalable real-time upsert capabilities: StarTree Cloud's scalable upserts remove the memory limitations of open source Apache Pinot's upsert capability.
- Advanced schema evolution.
- Overall ease of administration, setup, and maintenance.
Strengths
StarTree is a columnar database that makes use of database sharding and a comprehensive range of index types (listed below). Shards, called "segments," are autogenerated time-based or other column-based partitions of table data stored in a columnar format along with the dictionaries and indexes for the columns. StarTree makes use of a comprehensive list of specialized indexes: forward, inverted, range, JSON, and vector; two types of text indexes that enable text searches; timestamp and geospatial indexes; and the star-tree index. The last of these optimizes the filter-and-aggregate query pattern and functions as what the vendor calls an "intelligent materialized view," offering the ability to partially compute pre-aggregations and to configure the balance between pre-aggregation and data scans. This set of indexes is one of the offering's key differentiators and central to the vendor's strategy for optimizing query performance. The offering provides support for ingestion of both real-time and batch data, the ability to scale to petabyte-scale workloads, replication to enable high availability, and handling for schema evolution, which allows it to respond flexibly to changes in source schema.
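As a conceptual sketch of the pre-aggregation idea behind a star-tree-style index (a simplification, not Pinot's actual data structure, which also bounds tree size with a max-leaf-records threshold), the following Python example precomputes a metric over subsets of dimensions so that filter-and-aggregate queries become lookups:

```python
from collections import defaultdict
from itertools import combinations

rows = [
    {"country": "US", "browser": "chrome", "views": 3},
    {"country": "US", "browser": "safari", "views": 1},
    {"country": "DE", "browser": "chrome", "views": 2},
]
dims = ["country", "browser"]

# Precompute SUM(views) for every subset of dimensions, so that
# filter-and-aggregate queries read a small lookup table, not the segment.
pre_agg: dict[tuple, dict[tuple, int]] = {}
for r in range(len(dims) + 1):
    for dim_subset in combinations(dims, r):
        table: dict[tuple, int] = defaultdict(int)
        for row in rows:
            table[tuple(row[d] for d in dim_subset)] += row["views"]
        pre_agg[dim_subset] = dict(table)

# SELECT SUM(views) WHERE country = 'US', answered from a precomputed table:
assert pre_agg[("country",)][("US",)] == 4
```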
Challenges
While StarTree has a prebuilt connector for Superset, and connections with Streamlit, Retool, and Tableau, it doesn’t have as many deep or native connections with other third-party tools as some of the other vendors in this space. However, to StarTree’s credit, it has acknowledged this limitation and is working to improve in this area.
Purchase Considerations
StarTree can be hosted on any of the three public clouds and in a BYOC (bring your own cloud) arrangement, where the data remains in the customer's cloud account. StarTree recently released an unlimited Free Tier, providing a workspace for developers to try out its capabilities. Standard and premium pricing tiers, designed for customers looking to get started immediately, provide initial base amounts of storage and annual billing, which can be further increased with additional storage and compute units. StarTree also provides enterprise-level tiers that it says can support workloads up to petabyte scale. The vendor describes its tiered storage approach, in which more frequently accessed data is stored locally on disk and more historical data in cloud object storage, as providing a "best of both worlds" balance between speed and cost-effectiveness.
The Apache Pinot open source database that provides the foundation for StarTree's offerings was originally developed at LinkedIn to power real-time analytics for the "who's viewed your profile" feature. According to StarTree, because Pinot originated in a social media use case, it was built to handle unbounded numbers of concurrent queries, and it has grown and expanded, along with LinkedIn itself, to handle additional new use cases and services.
The offering has since grown to power real-time analytics across a wide range of use cases and verticals. Additional examples include fraud detection, ride-share and food-delivery applications, personalized recommendations in e-commerce platforms or social media feeds, and system monitoring and observability.
Sonar Chart Overview
StarTree is positioned as a Leader on the Platform Play side of the Sonar, reflecting its strong scores across the decision criteria. It's also designated a Fast Mover. These designations attest to its forward-thinking roadmap and pace of development, as well as the investment, effort, and resources it has devoted to building its offering.
6. Editor’s Note
At the end of June 2024, OpenAI announced its acquisition of Rockset, stating plans to leverage Rockset's real-time analytical database offering to "power [OpenAI's] retrieval infrastructure." The Rockset solution is no longer commercially available, and existing customers have until September 30, 2024, to offboard.
Although the Rockset solution is no longer available for purchase, we're including its write-up because it was a strong offering built on RocksDB, and it's likely that the market will see similar solutions in the future. Additionally, the acquisition underscores the value of real-time analytical databases as a growing market: a major company in the data, analytics, and AI landscape saw enough value in this offering to acquire it. While Rockset is not the only company that had been using RocksDB as either a component of or the foundation for its database, the number of customers that Rockset accumulated demonstrates the success that such a solution can achieve.
Rockset’s original Solution Overview, Strengths, and Challenges are included below; however, the Rockset solution is not included in the scoring tables or Sonar chart for this report.
Rockset
Solution Overview
Rockset provides a fully managed, cloud-native, serverless, real-time analytical database offering that is primarily based on document-oriented database technology, but with a unique indexing strategy and architectural choices that allow it to support relational SQL queries. The centerpiece is its Converged Index, which Rockset describes as combining elements of columnar storage, vector indexing, search indexing, and a row store, applied on top of a key-value store abstraction. Rockset also makes use of sharding and a cost-based query optimizer. The optimizer supports joins and draws on the different index types within the Converged Index to optimize queries for different data access patterns.
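As a purely conceptual illustration of what combining multiple index types over a key-value abstraction can mean, the Python sketch below writes each document into a row store, a column store, and an inverted search index within one key-value space. This is our own toy model, not Rockset’s actual storage format; the documents and fields are invented.

```python
# Toy model of a "converged" index: one key-value space holding row-store,
# column-store, and inverted (search) index entries for every document.
# Illustrative only -- not Rockset's actual storage format.
kv = {}

def index_doc(doc_id, doc):
    kv[("row", doc_id)] = doc  # row store: fetch the whole document by ID
    for field, value in doc.items():
        kv[("col", field, doc_id)] = value  # column store: scan one field
        kv.setdefault(("search", field, value), set()).add(doc_id)  # inverted

index_doc(1, {"city": "NYC", "temp": 71})
index_doc(2, {"city": "SF", "temp": 60})

# A point filter uses the inverted index; an aggregation scans the column:
print(kv[("search", "city", "NYC")])                # {1}
print(sum(kv[("col", "temp", d)] for d in (1, 2)))  # 131
```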
Rockset also describes its Converged Index as the key to processing updates efficiently. Rockset’s “schemaless ingest” and “Smart Schema” capabilities allow it to ingest all data types without prebuilt schemas, automatically indexing the data and generating a schema for it to enable SQL queries. This works not only for structured data but also for semistructured data, nested objects and arrays, and mixed types, and the platform handles nulls as well.
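The sketch below shows, in greatly simplified form, what schema inference over schemaless ingest can look like: each field accumulates the set of types observed, so mixed types and nulls become part of the schema rather than errors. The field names and sample documents are invented; this is not Rockset’s Smart Schema implementation.

```python
# Greatly simplified schema inference over schemaless JSON ingest (not
# Rockset's Smart Schema implementation). Each field tracks every type
# seen, so mixed types and nulls join the schema instead of being rejected.
from collections import defaultdict

def infer_schema(docs):
    schema = defaultdict(set)
    for doc in docs:
        for field, value in doc.items():
            schema[field].add(type(value).__name__)
    return dict(schema)

docs = [
    {"id": 1, "tags": ["a", "b"], "price": 9.99},
    {"id": "2", "tags": None, "price": 10},  # mixed types and a null
]
print(infer_schema(docs))
# e.g. {'id': {'int', 'str'}, 'tags': {'list', 'NoneType'},
#       'price': {'float', 'int'}}  (set ordering may vary)
```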
Rockset enables real-time analytics use cases across many categories and industries. These include recommendations such as ad personalization and dynamic pricing, gaming analytics, logistics tracking, anomaly detection, and monitoring and observability. Rockset also enables generating vector embeddings from unstructured data to facilitate vector search, RAG, recommendations, semantic search, and analysis of time-series and geospatial data.
Strengths
One of Rockset’s signature analytics optimizations is the Converged Index mentioned above. Additionally, Rockset’s schemaless ingest and Smart Schema capabilities contribute to its ability to ingest and query data without needing to know in advance the schema or “shape” of the data. Rockset has a number of fully managed, native integrations with streaming data sources, object stores, relational databases, and other data source types. Rockset can ingest data from streaming and batch data sources through different ingestion modes.
Rockset provides a number of analytics preprocessing capabilities, including incrementally updated materializations and the ability to aggregate data as it is ingested through transformations called Rollups (sketched below). For client/tool connectivity, Rockset offers a range of client libraries, adapters, and tool integrations, including native integrations with visualization tools such as Grafana, Superset, Power BI, Tableau, and Redash. Additionally, Rockset is enriching its database with integrations that it says will help it support next-generation RAG generative AI applications, recommendations, and anomaly detection workloads. To this end, Rockset natively integrates with LangChain, LlamaIndex, and Feast Feature Store.
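To illustrate the general shape of an ingest-time rollup, the hedged sketch below aggregates events into one-minute buckets as they arrive, so queries read a small rollup table instead of every raw event. The event fields and granularity are invented; this is not Rockset’s Rollups implementation.

```python
# Conceptual ingest-time rollup (not Rockset's Rollups implementation):
# events are aggregated into one-minute buckets as they are ingested, so
# queries read the small rollup table rather than every raw event.
from collections import defaultdict

rollup = defaultdict(lambda: {"count": 0, "total_ms": 0.0})

def ingest(event):
    bucket = event["ts"] // 60  # truncate the timestamp to minute granularity
    key = (bucket, event["page"])
    rollup[key]["count"] += 1
    rollup[key]["total_ms"] += event["latency_ms"]

for e in [{"ts": 120, "page": "/home", "latency_ms": 35.0},
          {"ts": 150, "page": "/home", "latency_ms": 41.0}]:
    ingest(e)

print(rollup[(2, "/home")])  # {'count': 2, 'total_ms': 76.0}
```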
Challenges
As a serverless platform, Rockset wouldn’t be the best fit for customers with requirements to keep data on-premises.
7. Near-Term Roadmap
Real-time analytical databases are fundamentally tasked with ascertaining the current state of an organization and its data at any particular moment. Upon this foundation, decision-makers can ask informed questions, make plans, and respond to events in a timely and efficient way, and developers can build engaging, relevant, and responsive apps. These databases essentially function as a window into the state of an organization’s data at any given time, feeding timely and relevant insights to whatever applications or dashboards are built on top of them.
Real-time analytical databases support a host of structural and analytical optimizations to provide fresh data to as many users as needed. Besides many types of indexing, database sharding, parallelization, columnar storage, and vector processing/SIMD operations (the pairing of the last two is illustrated below), they also often have native support for semistructured data and other data types, such as geospatial data. This native support helps them process such data and make it available for querying with the low latency that real-time workloads and applications require. These databases continue to build out the depth and breadth of their support for different data types, including open table formats and unstructured data. In doing so, they also increase the overlap between this category and others, such as data warehouses, data lakehouses, time-series databases, and vector databases.
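A brief illustration of why columnar layout and vectorized execution go together: with each column stored contiguously, a filter-and-aggregate becomes a pair of whole-array operations, which NumPy (like a SIMD-enabled query engine) executes without per-row branching. The data below is invented for illustration.

```python
# Why columnar storage pairs with vectorized/SIMD execution (toy data):
# each column is one contiguous array, so a filter-and-aggregate is a pair
# of whole-array operations rather than a per-row loop.
import numpy as np

country = np.array(["US", "DE", "US", "FR"])  # one contiguous column
clicks = np.array([3, 2, 4, 1])               # another contiguous column

# Roughly: SELECT SUM(clicks) WHERE country = 'US'
print(clicks[country == "US"].sum())  # 7
```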
From disaster response and emergency services to supply chain management, system monitoring, and gaming analytics, use cases run the gamut of operational situations, services, and industries. Furthermore, offerings in this space have started to embrace aspects of vector databases, including support for storing vector embeddings to facilitate semantic search, retrieval-augmented generation, image recognition, text recognition and analytics, and other workloads related to generative AI. Some vendors are also bringing these technologies to bear for the data analysts who use their databases, through LLM-powered features or integrations such as natural language querying interfaces. This development is the latest installment in the overall broadening of these databases’ scope, as they address a growing array of workloads from adjacent categories, including time-series analysis, data observability, anomaly detection, and now, vector embedding storage and search.
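As a minimal sketch of the vector-search capability these databases are adding, the snippet below ranks stored embeddings by cosine similarity to a query embedding. The vectors are invented toy data, and production systems use approximate nearest-neighbor indexes rather than this brute-force scan.

```python
# Minimal brute-force vector similarity search (toy data). Real vector
# indexes use approximate nearest-neighbor structures, not a full scan.
import numpy as np

embeddings = np.array([[0.1, 0.9], [0.8, 0.2], [0.2, 0.8]])  # stored vectors
query = np.array([0.15, 0.85])

# Cosine similarity of the query against every stored embedding:
sims = embeddings @ query / (
    np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query))
print(np.argsort(sims)[::-1])  # [0 2 1]: nearest stored vectors first
```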
8. Analyst’s Outlook
Whether organizations are seeking an initial investment in real-time analytics or an enhancement of an existing solution, they should begin by evaluating the current landscape of solutions available in the market. The vendor landscape in real-time analytical databases consists entirely of pure-play solutions whose sole focus is real-time analytics; all are purpose-built for the task and able to support a broad range of workloads.
In fact, the array of workloads continues to broaden as a number of these vendors add capabilities from adjacent categories to their platforms, including data warehouse, data lakehouse, data observability, time-series, and vector database functionality. Many vendors provide a managed offering based on an open source database, and the range of offerings in this category spans the full spectrum from managed services to customer-managed solutions.
Given the variety of vendors in the real-time analytical database market, it’s important for potential buyers to strategize up front to ensure they have clear goals for adopting such a database. A real-time analytical database has powerful analytical potential, and although the category appears to have few limits on the types of analytical workloads it can address, the platform decision should not be taken lightly. All the vendors in the category are pure plays, so customers should carefully consider their existing infrastructure, namely data sources and cloud provider(s), so they can choose a solution that harmonizes well with what they already have.
Since there’s a significant range in the level of managed support for these offerings, potential buyers should also consider their organizational makeup and data privacy requirements to determine whether they’d prefer a relatively hands-free, managed SaaS solution, or whether they need or prefer one that is self-managed and/or deployable on-premises. Additionally, to the extent this information is available, customers should consider the roadmaps of potential solutions. For example, a number of platforms have introduced generative AI-powered features such as natural language querying, and several have embraced aspects of vector databases.
These questions provide a starting point for customers evaluating these solutions, but they don’t cover every consideration customers should keep in mind. Potential customers will benefit most from up-front preparation, taking stock of their own goals, pain points, use cases, current systems, and workflows; this groundwork will put them in the best position to evaluate the real-time analytical database options currently available.
As the importance of use cases requiring real-time analytics grows, platforms from categories adjacent to the real-time analytical databases category, such as more traditional analytical platforms and streaming data platforms, may add some of the features and optimizations that are currently characteristic of real-time analytical databases. This poses a challenge to current real-time analytical database vendors, as the majority of these vendors are pure-play specialists.
Additionally, as this report was going to press, AI startup OpenAI announced its acquisition of real-time analytical database vendor Rockset. While this acquisition validated the real-time analytical database category in a high-profile way, it also set the stage for future acquisitions, and the precedent established could certainly give prospective customers some pause.
However, counterbalancing these cautions, a number of these platforms are based on an open source database or technology, either as the foundation of the platform or as a core component of the offering. These open source technologies have wide ecosystems and robust communities, which help ensure their continued development independent of any single vendor.
9. Report Methodology
A GigaOm Sonar report analyzes emerging technology trends and sectors, providing decision-makers with the information they need to build forward-looking—and rewarding—IT strategies. Sonar reports provide analysis of the risks posed by the adoption of products that are not yet fully validated by the market or available from established players.
In exploring bleeding-edge technology and addressing market segments still lacking clear categorization, Sonar reports aim to eliminate hype, educate on technology, and equip readers with insight that allows them to navigate different product implementations. The analysis highlights core technologies, use cases, and differentiating features rather than drawing feature comparisons, mostly because the overlap among solutions in nascent technology sectors can be minimal. In fact, product implementations based on the same core technology tend to take unique approaches and focus on narrow use cases.
The Sonar report defines the basic features that users should expect from products that satisfactorily implement an emerging technology, while taking note of characteristics that will have a role in building differentiating value over time.
In this regard, readers will find similarities with the GigaOm Key Criteria and Radar reports. Sonar reports, however, are specifically designed to provide an early assessment of recently introduced technologies and market segments. The evaluation of the emerging technology is based on:
- Core technology: Table stakes
- Differentiating features: Potential value and key criteria
Over the years, depending on technology maturation and user adoption, a particular emerging technology may either remain niche or evolve to become mainstream (see Figure 3). GigaOm Sonar reports intercept new technology trends before they become mainstream, providing insight that helps readers understand the value of early adoption and the potential for the highest ROI.
Figure 3. Evolution of Technology
10. About Andrew Brust
Andrew Brust has held developer, CTO, analyst, research director, and market strategist positions at organizations ranging from the City of New York and Cap Gemini to GigaOm and Datameer. He has worked with small, medium, and Fortune 1000 clients in numerous industries and with software companies ranging from small ISVs to large vendors like Microsoft. The understanding of technology, and of the way customers use it, that resulted from this experience makes his market and product analyses relevant, credible, and empathetic.
Andrew has tracked the Big Data and Analytics industry since its inception, as GigaOm’s Research Director and as ZDNet’s original blogger for Big Data and Analytics. Andrew co-chairs Visual Studio Live!, one of the nation’s longest-running developer conferences, and currently covers data and analytics for The New Stack and VentureBeat. As a seasoned technical author and speaker in the database field, Andrew understands today’s market in the context of its extensive enterprise underpinnings.
11. About GigaOm
GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.
GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.
GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.
12. Copyright
© Knowingly, Inc. 2024 "GigaOm Sonar for Real-Time Analytical Databases" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.