This GigaOm Research Reprint Expires Dec 1, 2024

GigaOm Radar for Scale-Out File Storage v4.0

1. Summary

Despite the growing use of object storage, file storage remains one of the most popular ways to store data, both on-premises and in the cloud. Scale-out file storage is becoming the default choice for most organizations for several reasons:

  • Scale-out file storage can expand quickly while increasing throughput.
  • File systems are typically accessed via familiar, established network protocols, like network file system (NFS) and server message block (SMB), which keeps them the data storage system of choice for a large number of workloads, including big data analytics, artificial intelligence/machine learning/deep learning (AI/ML/DL), high-performance computing (HPC), and more.
  • Modern file systems are much more scalable than in the past, combining a familiar user interface (UI) and authentication methods with performance and scalability.
  • Legacy applications continue to drive demand for file storage. These applications are usually written to work with POSIX-compliant file systems, and the cost of refactoring them to use object storage may outweigh the benefits, making file storage the preferred option.
  • Modern scale-out solutions are mature and flexible, with most of the complexity now hidden behind the scenes. In the end, managing a large scale-out system is less time-consuming than managing several scale-up systems.
  • Solutions that support data mobility across different environments are becoming increasingly important for executing properly on hybrid IT strategies, and scale-out file storage systems are easy to implement on cloud virtual machine (VM) instances. GigaOm’s recently published “Key Criteria Report for Evaluating Cloud File Storage Solutions” highlights the growing demand for sophisticated file services both on-premises and in the cloud.

Unstructured data accounts for up to 90% of what is stored in enterprise infrastructures. That’s why storage that’s scalable and fast enough to manage interactive workloads is crucial for responding adequately to business needs. Enterprises don’t want to trade scalability and performance for the data services and flexibility they usually get from traditional scale-up network-attached storage (NAS) solutions. With the advent of multicloud capabilities, users want the flexibility to move data to where it’s needed, further increasing the demand for advanced data services. At the same time, users want solutions ready to respond to increasing regulatory requirements, data governance tasks, and risks coming from a growing number of security threats, including ransomware.

This expansion of the IT mission is why scale-out storage systems are much more balanced than in the past and encompass enterprise considerations like scalability, flexibility, efficiency, security, and performance.

This GigaOm Radar report highlights key scale-out file storage vendors and equips IT decision-makers with the information needed to select the best fit for their business and use case requirements. In the corresponding GigaOm report “Key Criteria for Evaluating Scale-Out File Storage Solutions,” we describe in more detail the capabilities and metrics that are used to evaluate vendors in this market.

This is our fourth year evaluating the scale-out file storage space in the context of our Key Criteria and Radar reports. All solutions included in this Radar report meet the following table stakes—capabilities widely adopted and well implemented in the sector:

  • File protocols
  • Data services
  • Tiering
  • Secure operations
  • System management

How to Read this Report

This GigaOm report is one of a series of documents that helps IT organizations assess competing solutions in the context of well-defined features and criteria. For a fuller understanding, consider reviewing the following reports:

Key Criteria report: A detailed market sector analysis that assesses the impact that key product features and criteria have on top-line solution characteristics—such as scalability, performance, and TCO—that drive purchase decisions.

GigaOm Radar report: A forward-looking analysis that plots the relative value and progression of vendor solutions along multiple axes based on strategy and execution. The Radar report includes a breakdown of each vendor’s offering in the sector.

2. Market Categories and Deployment Types

For a better understanding of the market and vendor positioning (Table 1), we assess how well solutions for scale-out file systems are positioned to serve specific market segments and deployment models.

For this report, we recognize the following market segments:

  • Enterprise: Optimal solutions in this category have a strong focus on flexibility, data services, and features that improve security and data protection. Scalability is another big differentiator, as is the ability to deploy the same service in different environments.
  • High performance: Optimal solutions are designed for specific workloads and use cases, such as big data analytics, AI/ML/DL, and HPC. The key differentiators in this area are performance, scalability, and GPUDirect support.

In addition, we recognize two deployment models:

  • Hardware appliance: Provided as self-contained physical devices, these appliances include all the components necessary to deliver scale-out file storage capabilities. These devices are fully supported by their vendors, and other than the management of the platform, all the customer needs to take care of is applying hotfixes or patches. This deployment model delivers simplicity at the expense of flexibility.
  • Software-defined storage (SDS): These solutions are meant to be deployed on commodity servers on-premises or in the cloud, allowing organizations to build hybrid or multicloud scale-out file storage infrastructures. These solutions provide more flexibility in terms of deployment options, cost, and hardware.

Table 1. Vendor Positioning: Market Segment and Deployment Model

Columns:
Market Segment: Enterprise | High Performance
Deployment Model: Hardware Appliance | Software-Defined Storage

Vendors:
Cohesity
DDN
Dell Technologies
Hammerspace
IBM
NetApp
Nutanix
OSNexus
Panasas
Pure Storage
Quantum
Qumulo
Quobyte
Scality
ThinkParQ
VAST Data
WEKA

Legend:
Exceptional: Outstanding focus and execution
Capable: Good but with room for improvement
Limited: Lacking in execution and use cases
Not applicable or absent

For this evaluation, we looked at offerings in a binary way, rating vendors (++) if they support that market segment and deployment model and (-) if they do not.

3. Key Criteria Comparison

Building on the findings from the GigaOm report, “Key Criteria for Evaluating Scale-Out File Storage Solutions,” Tables 2, 3, and 4 summarize how each vendor included in this research performs in the capabilities we consider differentiating and critical in this sector.

  • Key criteria differentiate solutions based on features and capabilities, outlining the primary criteria to be considered when evaluating a scale-out file storage solution.
  • Evaluation metrics provide insight into the non-functional requirements that factor into a purchase decision and determine a solution’s impact on an organization.
  • Emerging technologies show how well each vendor takes advantage of technologies that are not yet mainstream but are expected to become more widespread and compelling within the next 12 to 18 months.

The objective is to give the reader a snapshot of the technical capabilities of available solutions, define the perimeter of the market landscape, and gauge the potential impact on the business.

Table 2. Key Criteria Comparison

Columns:
Key Criteria: Object Storage Integration | Public Cloud Integration | Flash Memory Optimizations | AI/ML-Based Analytics & Management | Data Management | Ransomware Protection | Kubernetes Support | GPUDirect Support

Vendors:
Cohesity
DDN
Dell Technologies
Hammerspace
IBM
NetApp
Nutanix
OSNexus
Panasas
Pure Storage
Quantum
Qumulo
Quobyte
Scality
ThinkParQ
VAST Data
WEKA

Legend:
Exceptional: Outstanding focus and execution
Capable: Good but with room for improvement
Limited: Lacking in execution and use cases
Not applicable or absent

Table 3. Evaluation Metrics Comparison

Columns:
Evaluation Metrics: Scalability | Flexibility | Performance | Efficiency | Upgradeability | Ease of Use

Vendors:
Cohesity
DDN
Dell Technologies
Hammerspace
IBM
NetApp
Nutanix
OSNexus
Panasas
Pure Storage
Quantum
Qumulo
Quobyte
Scality
ThinkParQ
VAST Data
WEKA

Legend:
Exceptional: Outstanding focus and execution
Capable: Good but with room for improvement
Limited: Lacking in execution and use cases
Not applicable or absent

Table 4. Emerging Technologies Comparison

Columns:
Emerging Technologies: NVMe-oF & NVMe/TCP | Flash Innovations

Vendors:
Cohesity
DDN
Dell Technologies
Hammerspace
IBM
NetApp
Nutanix
OSNexus
Panasas
Pure Storage
Quantum
Qumulo
Quobyte
Scality
ThinkParQ
VAST Data
WEKA

Legend:
Exceptional: Outstanding focus and execution
Capable: Good but with room for improvement
Limited: Lacking in execution and use cases
Not applicable or absent

By combining the information provided in the tables above, the reader can develop a clear understanding of the technical solutions available in the market.

4. GigaOm Radar

This report synthesizes the analysis of key criteria and their impact on evaluation metrics to inform the GigaOm Radar graphic in Figure 1. The resulting chart is a forward-looking perspective on all the vendors in this report based on their products’ technical capabilities and feature sets.

The GigaOm Radar plots vendor solutions across a series of concentric rings, with those set closer to the center judged to be of higher overall value. The chart characterizes each vendor on two axes—balancing Maturity versus Innovation and Feature Play versus Platform Play—while providing an arrow that projects each solution’s evolution over the coming 12 to 18 months.

Figure 1. GigaOm Radar for Scale-Out File Systems

It’s important to note that there is a higher count of vendors compared to last year, as this year’s report combines both enterprise-focused and high-performance-focused solutions while last year’s split these into separate Radars. This has impacted the positioning of some vendors compared to last year. Specifically, they are no longer evaluated only against their peers in an enterprise or high-performance capacity, but across all vendors in both. This is particularly impactful for the performance evaluation metric.

As you can see in Figure 1, there are five groups of vendors spread across the four areas of the radar. Three of those groups are in the right hemisphere of the Radar, which covers Platform Play solutions.

The first group consists of Outperformers in the Innovation/Platform Play quadrant, with Cohesity, NetApp, Pure Storage, Qumulo, VAST Data, and WEKA. Cohesity’s solution focuses on capacity-oriented enterprise workloads and offers a thorough set of capabilities across data management, cyber resiliency, and data protection, with a strong pace of execution and a good roadmap. NetApp continues to capitalize on the successful execution of its vision and strategy, with strong innovation, new QLC NetApp AFF C-Series appliances launched in 2023 that improve its competitiveness in the enterprise segment, and outstanding data management and ransomware capabilities. Pure Storage expanded its FlashBlade portfolio with the capacity-oriented, QLC-based //E series in June 2023. The company is now capable of delivering on performance, capacity, and density across both the enterprise and high-performance markets. Qumulo is back among the Outperformers thanks to improvements across several areas, including a first-party Azure offering and a strong roadmap ahead with expected improvements in storage efficiency and multitenancy. VAST Data has an excellent modern flash platform built for massive scale and uncompromising performance. The company has made multiple improvements with data management-related features; however, it now needs to focus on cloud integration. WEKA continues to focus on high-performance and next-gen workloads. It has a strong multicloud presence and a dynamic roadmap that will address some of the evaluated key criteria in the future.

The second group, still in the Innovation/Platform Play quadrant, comprises three Fast Movers: Dell Technologies, Hammerspace, and Nutanix. Dell Technologies offers remarkable AI-based analytics, Kubernetes support, and ransomware protection capabilities on its PowerScale platform, which supports both enterprise and high-performance use cases. Hammerspace provides one of the best global namespace implementations, enabling standards-based global high-performance file access and automated data orchestration to bridge storage silos from any vendor in on-premises, multisite, and hybrid cloud use cases. It has significantly improved its support for high-performance implementations, thanks not only to the acquisition of RozoFS in 2023, but also to innovations in the Linux kernel that enabled a client-based parallel file system architecture for NFS. Nutanix offers a very complete scale-out file storage solution targeting enterprise workloads, including laudable data management and ransomware protection capabilities. The solution has seen improvements in availability and resiliency in 2023, and it can be deployed in AWS and Azure.

The third group, in the Maturity/Platform Play area, includes DDN and IBM. DDN maintains its strong focus on AI and HPC workloads with an updated Lustre-based EXAScaler EXA6 appliance, which delivers scalability, performance, and multitenancy (key capabilities for these types of workloads), and a cloud-based solution branded EXAScaler Cloud, which runs natively on AWS, Azure, and GCP. IBM Storage Scale continues to demonstrate its relevance with steady improvements, including snapshot immutability, ransomware protection, and containerized S3 access services for high-performance workloads, with concurrent file and object access.

The left hemisphere, which covers Feature Play solutions, holds two additional groups. The first one, in the Innovation/Feature Play area, consists of Panasas, Quantum, and Quobyte. Panasas is launching new appliances in November 2023 and has a compelling roadmap ahead, with a medium-term vision to enable cloud-native deployments. The company is expanding its feature set through in-house development and tactical partnerships, and it is moving toward the Platform Play area. Quantum is an Outperformer this year, with continued improvements to its StorNext solution and the launch of a new product, Myriad, which addresses new use cases. Myriad’s feature set is currently limited but will gradually improve thanks to an ambitious roadmap and faster release cycles. Quobyte offers linear scalability, a good architecture, and multiprotocol access, but it’s missing several of our key criteria (either included in the report last year or added this year).

The group in the Maturity/Feature Play quadrant includes OSNexus, Scality, and ThinkParQ. OSNexus continues to work on a strong architectural foundation with a great implementation of storage tiers and massive scalability; support for GigaOm’s key criteria remains very limited, but the company has a stronger roadmap this year. Scality’s solution is laser-focused on large, sequential, massively parallel workloads; while the solution may seem limited in capabilities, it truly excels in its focus area, even with some high-performance workloads. ThinkParQ BeeGFS continues to deliver a scalable, lightweight, and highly tunable file system for HPC workloads, now with GPUDirect support. Despite conscious architectural decisions that limit the applicability of GigaOm’s key criteria, the solution is still very relevant to the HPC community.

In reviewing solutions, it’s important to keep in mind that there are no universal “best” or “worst” offerings; there are aspects of every solution that might make it a better or worse fit for specific customer requirements. How a solution aligns with customer needs and context is an important purchase consideration. Prospective customers should consider their current and future needs when comparing solutions and vendor roadmaps.

Inside the GigaOm Radar

The GigaOm Radar weighs each vendor’s execution, roadmap, and ability to innovate to plot solutions along two axes, each set as opposing pairs. On the Y axis, Maturity recognizes solution stability, strength of ecosystem, and a conservative stance, while Innovation highlights technical innovation and a more aggressive approach. On the X axis, Feature Play connotes a narrow focus on niche or cutting-edge functionality, while Platform Play displays a broader platform focus and commitment to a comprehensive feature set.

The closer to center a solution sits, the better its execution and value, with top performers occupying the inner Leaders circle. The centermost circle is almost always empty, reserved for highly mature and consolidated markets that lack space for further innovation.

The GigaOm Radar offers a forward-looking assessment, plotting the current and projected position of each solution over a 12- to 18-month window. Arrows indicate travel based on strategy and pace of innovation, with vendors designated as Forward Movers, Fast Movers, or Outperformers based on their rate of progression.

Note that the Radar excludes vendor market share as a metric. The focus is on forward-looking analysis that emphasizes the value of innovation and differentiation over incumbent market position.

5. Vendor Insights

Cohesity

Cohesity’s SmartFiles is a sophisticated software-defined storage (SDS) solution that merges seamlessly with high-performance systems to create highly efficient two-tier architectures. Drawing from the core technology of the Cohesity platform, SmartFiles supports a multitude of workloads prevalent in enterprise environments. It communicates through various protocols like SMB, NFS, and S3. Notably, this system enables simultaneous data access through both file and object protocols, making it an adaptable solution for enterprises.

Cohesity Helios offers a dedicated solution for managing data that provides a comprehensive view of the environment’s health and security posture. It enables analytics-driven data management through separate dashboards for data consumption and performance analytics. The solution also analyzes third-party NAS systems for data usage patterns and automatically tiers data if required.

Data management remains one of the key differentiators for Cohesity. In fact, it has implemented an extensive series of features in this area aimed at simplifying data mobility, protection, and security. SmartFiles includes remote data replication capabilities, automated tiering between different storage systems and the cloud, transparent archiving functionality, data migration, and finally, sophisticated ransomware protection, which benefits from Cohesity’s platform-level advanced security features.

Cohesity’s ransomware protection uses ML to detect attacks early by monitoring data changes against standard patterns and measuring abnormal activity against the established baseline. It’s built on immutable snapshots and augmented with Fort Knox, a secure, cloud air-gap storage solution provided as a service. A strong zero-trust multifactor authentication (MFA) module with quorum-based approval for sensitive environmental actions is another distinguishing feature.
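The idea of measuring activity against an established baseline can be illustrated with a deliberately simple statistical check. The sketch below is not Cohesity's implementation, which the report describes only as ML-based; it assumes a hypothetical history of daily change rates and flags a new observation whose z-score exceeds a chosen threshold. The function name, the sample data, and the threshold of 3.0 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Return True if `observed` lies more than `threshold` sample standard
    deviations away from the mean of `baseline` (a plain z-score test)."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical history: fraction of files changed between daily snapshots.
baseline_change_rates = [0.020, 0.025, 0.018, 0.022, 0.021, 0.024, 0.019]

# A day close to the historical average versus a mass rewrite of files.
normal_day = is_anomalous(baseline_change_rates, 0.023)
suspicious_day = is_anomalous(baseline_change_rates, 0.60)
```

A production system would likely look at more signals than a single change rate (for example entropy or file-type shifts), but the baseline-versus-observation comparison follows the same pattern.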

In addition, customers can access Cohesity’s user behavior analytics (UBA). This capability detects risky user behaviors by identifying indicators of data exfiltration, tampering, deletion, and more. It also audits user file activities with interactive log search.

This approach is further enhanced by Cohesity DataHawk, an add-on to SmartFiles that provides automated threat intelligence by simplifying threat detection through an ML-based engine. DataHawk offers highly curated and managed indicator of compromise (IOC) threat feeds to help identify impacted backup snapshots, servers, virtual machines, and files in the event of a cyberattack. DataHawk supports data discovery and classification for finding and identifying regulated data, such as PII and data subject to HIPAA and PCI requirements, and includes many classifiers and predefined policies with ML-based pattern matching and recognition. Furthermore, it is extensible to leading SIEM and SOAR solutions, and more, via third-party integrations. Kubernetes support is still provided through a container storage interface (CSI) driver.

Strengths: Cohesity has a rich feature set with many improvements in security and data management. The solution benefits from a unique positioning thanks to the convergence of data security, data management, and data protection, making it optimally suited to cover a broad range of use cases and enterprise requirements.

Challenges: Despite a very rich feature set, Cohesity SmartFiles still does not support Kubernetes workloads fully.

DDN

DDN EXAScaler appliances provide scale-out file storage capabilities through a parallel file system based on Lustre, with a fast, hyper-converged data storage platform in a package that is easy to deploy. DDN EXAScaler’s fast parallel architecture enables scalability and performance, supporting low-latency workloads and high-bandwidth applications such as GPU-based workloads, AI frameworks, and Kubernetes-based applications. EXAScaler systems are deployable on all NVMe flash or hybrid nodes (NVMe + HDD); in May 2023, the company introduced QLC-based appliances that also support EXAScaler.

EXAScaler offers a front-end S3 object storage implementation, and it can store objects to any S3 back end via DataFlow, a data management platform tightly integrated with EXAScaler. Although it’s a separate product, most DDN users rely on DataFlow for platform migration, archiving, data protection, data movement across the cloud, repatriation, and more.

Besides physical appliances, a cloud-based solution branded EXAScaler Cloud runs natively on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Customers can quickly obtain it from each cloud provider’s marketplace. Features such as “cloud sync” enable multicloud and hybrid data management capabilities within EXAScaler for archiving, data protection, and bursting of cloud workloads.

The product offers data security features such as role-based access control (RBAC), secure multitenancy, and encryption. Lustre’s capabilities around changelog data and audit logs are built into the EXAScaler product, giving customers better insights into their data. The solution also supports immutable snapshots.

DDN offers a native EXAScaler CSI driver to support Kubernetes environments, and it has developed a unique EXAScaler parallel client that can mount the file system within a container, enabling full parallel data paths for containerized applications.

Built with AI and HPC workloads in mind, DDN excels in GPU integration, having accomplished the first NVIDIA GPUDirect integration. The EXAScaler client is deployed into the GPU node, enabling remote direct memory access (RDMA) and the monitoring of application access patterns from the GPU client all the way to the disk, providing outstanding levels of workload visibility. DDN now also provides reference architectures to support NVIDIA DGX Pod and DGX SuperPOD.

Strengths: DDN EXAScaler is based on the Lustre parallel file system and offers a scalable and performant solution that gives its customers a secure and flexible system with multitenancy, encryption, replication, and GPU integration capabilities.

Challenges: Data management capabilities in DDN EXAScaler are still limited.

Dell Technologies

Dell Technologies provides scale-out file storage services through its PowerScale solution (a successor of the Isilon platform), a distributed file system that scales from 11 TB to 186 PB in a single namespace with fast node addition. PowerScale provides auto-balancing capabilities as well as simultaneous multiprotocol access to the same data through file, object, or Hadoop protocols.

With PowerScale, Dell Technologies dissociated the file system, OneFS, from the appliance; organizations can deploy PowerScale either as an SDS solution or pre-installed on purpose-built appliances. Appliances support a broad range of storage media with NVMe TLC and QLC flash, as well as serial-attached SCSI (SAS) flash. Dell also offers hybrid and archive-oriented appliances.

OneFS supports a broad range of data services, including smart quotas, deduplication and inline compression, smart storage pools that benefit from policy-based data tiering, and automated client load-balancing through SmartConnect. Another service, SyncIQ, allows multiple data replication topologies to be configured, not only for data movement but also for high availability and disaster recovery use cases.

Currently, the solution is adjacent to cloud (it can be deployed either on-premises or in a colocation facility), but it supports policy-based cloud tiering to AWS and ECS. PowerScale is available as a native Google Cloud service operated by Dell Services. Since May 2023, customers can also use Dell APEX File Storage for AWS, a cloud offering powered by PowerScale.

PowerScale systems can be managed through their own management interface; however, the solution also seamlessly integrates with CloudIQ, Dell Technologies’ AI-based analytics platform that provides comprehensive data services, automation, and AIOps capabilities.

Dell offers a balanced set of ransomware protection capabilities in its PowerScale platform, combining proactive detection (delivered by Superna Ransomware Defender) and immutable snapshots with additional recovery orchestration capabilities. Ransomware Defender proactively monitors these platforms for threats and can trigger automated response scenarios such as locking users out, taking snapshots, and suspending scheduled replication or copy jobs. Dell also offers a smart air gap capability. The PowerScale OneFS operating system includes a comprehensive set of built-in, enterprise-grade security features, including write once read many (WORM) SmartLock compliance.

The PowerScale platform supports Kubernetes integration through Dell’s Container Storage Modules (CSM), a regularly updated open-source suite of modules developed for Dell EMC products. CSM covers storage support (through CSI drivers) and other capabilities such as authorization, resiliency, observability, snapshots, and replication. It supports a broad range of integrations with other container environments, such as OpenShift and Docker.

PowerScale includes NFSv3 support and allows GPUDirect connectivity through the NFS-over-RDMA protocol. Relevant PowerScale nodes also include Mellanox ConnectX-based NICs that enable RDMA over Converged Ethernet (RoCE) connectivity, forging a direct I/O path that eliminates CPU-induced latencies.

Strengths: Dell PowerScale offers a complete portfolio that supports a broad spectrum of use cases ranging from high performance to deep archive storage. Strong capabilities in data management, cyber resiliency, and Kubernetes are evident.

Challenges: Integration with object storage aside, cloud support is currently limited to Google Cloud. However, Dell APEX File Storage for AWS was recently launched to address this gap.

Hammerspace

Hammerspace software creates a high-performance Parallel Global File System that combines enterprise NAS data services with data orchestration to place data where it is needed, across existing or new storage from any vendor. This helps organizations overcome the siloed nature of hybrid and cloud file storage in two ways: it provides a single global namespace regardless of a site’s geographic location or whether the underlying storage is on-premises or cloud-based, and it separates the control plane (metadata) from the data plane (where content data resides). It is compliant with several versions of the NFS and SMB protocols and includes RDMA support for NFSv4.2.

The Hammerspace solution is software-defined and can scale up or out as needed to accommodate even extreme performance requirements for HPC or other resource-intensive use cases, such as AI training, checkpointing, and inferencing workloads.

The solution lets customers automate data handling through objective-based policies that govern how data is used, accessed, stored, protected, and placed around the world via a single global namespace, so users have no need to know where the resources are physically located. They simply see the file system via industry-standard file protocols. Data orchestration for workflows and other data services is automated as a background operation and can even be applied to live data while it is in use, without interrupting users. The system is monitored in real time to check whether data and storage use align with policy objectives, and it performs automated background compliance remediation in a seamless and transparent way.

Integration with object storage is a core capability of Hammerspace: data can be replicated or saved to the cloud as well as tiered automatically on object storage, thereby reducing the on-premises data footprint and leveraging cloud economics to keep storage spend under control. From a user perspective, the file is still visible via the same file share, even as it is non-disruptively orchestrated across different storage types, physical sites, or the cloud based upon policy requirements or to stage files close to compute resources for job runs.

The solution can be managed through a new management UI, which offers a high degree of customization and a data mobility viewer that shows where the data is flowing to in the environment. REST APIs and CLI commands are also supported.

The product is based on the intelligent use of metadata across file system standards and includes telemetry data (such as IOPS, throughput, and latency) as well as user-defined and analytics-harvested metadata, allowing users or integrated applications to rapidly view, filter, and search the metadata in place instead of relying on file names. Hammerspace also supports user-enriched metadata through the Hammerspace Metadata Plugin. Custom metadata will be interpreted by Hammerspace and can be used not only for classification but also to create data placement, disaster recovery, or data protection policies.

Hammerspace’s cyber-resiliency strategy relies on native immutability features, monitoring, and third-party detection capabilities. Mitigation functions include undelete and file versioning, which allow users to revert to a file version not affected by ransomware-related data corruption. Hammerspace’s ability to automate data orchestration for recovery is also a core part of the Hammerspace feature set.

The solution supports Kubernetes through a CSI plug-in that presents Hammerspace’s global namespace to clusters and pods. Containerized workloads can access data globally, regardless of whether it is local to the cluster, on-premises but remote, or located in the cloud.

Hammerspace supports NVIDIA GPUDirect via both pNFS and RDMA implementations.

Strengths: Hammerspace’s Parallel Global File System offers a very balanced set of capabilities with replication and hybrid and multicloud capabilities.

Challenges: Built-in, proactive ransomware detection capabilities are currently missing.

IBM

IBM provides scale-out file storage capabilities for high-performance workloads via a robust and proven SDS solution: IBM Storage Scale, a high-performance offering based on IBM’s General Parallel File System (GPFS). The solution is very popular within the HPC community, and IBM also positions Storage Scale as an optimized solution for AI use cases.

Two key features of IBM Storage Scale are its scalability and flexible architecture. The product can handle several building blocks on the back end: IBM NVMe flash nodes, Red Hat OpenShift nodes, capacity, object storage, and multivendor NFS nodes. The solution offers several file interfaces, such as SMB, NFS, POSIX-compliant, and HDFS (Hadoop), as well as an S3-compatible object interface, making it a versatile choice for environments with multiple types of workloads. Data placement is taken care of by the IBM Storage Scale clients, which spread the load across storage nodes in a cluster. The solution offers a single namespace and migration policies that enable transparent data movement across storage pools without impacting the user experience.

IBM Storage Scale supports remote sites and offers various data caching options as well as snapshot support and multisite replication capabilities. The solution includes policy-driven storage management features that allow organizations to automate data placement on the various building blocks based on the characteristics of the data and the cost of the underlying storage. A feature called transparent cloud tiering allows users to tier files to cloud object storage with an efficient replication mechanism.

Storage Scale’s management interface provides monitoring capabilities for tracking data usage profiles and patterns. Comprehensive data management capabilities are provided through an additional service, IBM Watson Discovery.

IBM Storage Scale includes a snapshot retention mechanism that prevents snapshot deletion at the global and fileset level, effectively bringing immutability and basic ransomware protection capabilities to the platform. Early warning signs of an attack can be provided by IBM Storage Insights or IBM Spectrum Control. Both solutions can analyze current I/O workloads against a previous usage baseline and help provide indications that an attack is in progress. Organizations can set up alerts that indicate an attack may be happening by combining multiple triggers.
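
The baseline comparison and multi-trigger alerting described above can be illustrated with a minimal sketch. It is not IBM Storage Insights or IBM Spectrum Control code: the metric names, the z-score test, and the thresholds are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def zscore(current, baseline):
    """Number of standard deviations between `current` and the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def attack_suspected(current, baselines, threshold=3.0, min_triggers=2):
    """Alert only when at least `min_triggers` metrics deviate from their baseline."""
    fired = [
        name
        for name, history in baselines.items()
        if abs(zscore(current[name], history)) >= threshold
    ]
    return len(fired) >= min_triggers, fired


# Hypothetical historical samples per metric (assumed names and values).
baselines = {
    "write_mb_s": [100, 110, 95, 105, 90],
    "renames_per_min": [4, 6, 5, 5, 5],
    "read_mb_s": [200, 210, 190, 205, 195],
}

# Hypothetical current reading: heavy writes and mass renames, normal reads.
current = {"write_mb_s": 900, "renames_per_min": 400, "read_mb_s": 202}

alert, fired = attack_suspected(current, baselines)
print(alert, fired)
```

Requiring two or more triggers before alerting is one way to reduce false positives from a single noisy metric, such as a legitimate bulk copy that raises write throughput alone.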

IBM provides Kubernetes support through the IBM Storage Scale CSI driver. This driver can present existing storage to Kubernetes clusters, and it also supports dynamic creation of directory-based or fileset-based volumes. It supports spanning volumes across multiple file systems or on remotely mounted file systems. Snapshot support is included, and volumes can be provisioned from snapshots as well.

The solution supports GPUDirect.

Strengths: IBM Storage Scale offers a robust and scalable architecture that remains popular across several verticals, particularly in science-related projects. The solution offers multiple enterprise-grade capabilities and will cater to organizations looking to support diverse storage needs in a unified high-performance platform.

Challenges: Though IBM Watson Discovery provides excellent data management capabilities, it is an add-on solution that incurs an extra charge. Advanced analytics capabilities are limited and must be developed.

NetApp

NetApp offers a scale-out file system solution based on its ONTAP technology, available with many different deployment options. The company introduced two new appliances last year: the FAS9500 and AFF A900, both of which offer in-chassis non-disruptive upgrades with more than 50% performance improvement over the previous generation, along with support for NVMe-oF and NVMe/TCP. They were complemented in 2023 with the AFF A150 (starting at 7.6 TB) and the new QLC-based AFF C-Series appliances, both providing unified block, file, and object protocol support.

NetApp delivers a seamless experience across on-premises and public cloud environments with BlueXP, a unified, SaaS-delivered multicloud control plane that brings together multiple storage and data services.

Among services offered in NetApp BlueXP, customers can find not only Cloud Volumes ONTAP (CVO), based on NetApp’s ONTAP technology, but also first-party services on hyperscalers, such as AWS (Amazon FSx for NetApp ONTAP), Azure (Azure NetApp Files), and Google, with the recently added Google Cloud NetApp Volumes.

BlueXP also supports a host of other data services, such as observability, governance, data mobility, tiering, backup and recovery, edge caching, and operational health monitoring. NetApp recently added a sustainability dashboard to BlueXP, showing power consumption (kWh), direct carbon usage (tCO2e), and heat dissipation (BTU). It also shows carbon mitigation percentages and potential gains from recommended actions (such as enabling caching, deduplication, and so forth).

Integration with object storage is a key part of the solution, and policy-based data placement allows automated, transparent data tiering on-premises with NetApp StorageGRID or in the cloud with AWS S3, Azure Blob Storage, or Google Cloud Storage, along with the ability to automatically recall requested files from the object tier. Object storage integration also extends to backup and disaster recovery use cases with NetApp BlueXP Backup and Recovery.

Data management capabilities are enabled by consistent APIs that allow data copies to be created as needed. BlueXP also offers strong data analytics features in all scanned datastores, with no requirements for them to be ONTAP-based. This service provides insights around data owners, location, access frequency, and data privileges, as well as potential access vulnerabilities, with manual or automated policy-based actions. Organizations can generate compliance and audit reports such as data subject access requests (DSARs). HIPAA and GDPR regulatory reports can also be run in real time on all ONTAP data stores.

The BlueXP platform provides advanced security measures against ransomware and suspicious user or file activities when combined with the native security features of ONTAP storage, including immutable snapshots and advanced ransomware protection (ARP). A new feature in BlueXP is the ransomware protection dashboard, which monitors security and user behavior to help identify risks and threats and instruct users on how to improve an organization’s security posture and remediate attacks.

From BlueXP, customers can also enable the Global File Cache service for branch locations, remote sites, or regional hyperscalers’ points of presence to enable local-speed, low-latency access to centralized shares through a single global namespace with full global file-locking capabilities.

Strengths: NetApp provides broad portfolio options to implement and consume a modern scale-out file system solution based on ONTAP with a complete enterprise-grade feature set, flexible deployment models, and ubiquitous service availability across public clouds. NetApp BlueXP offers next-level management and orchestration capabilities complemented by a host of SaaS-based data services, simplifying data storage and data management at scale, regardless of the chosen deployment model.

Challenges: NetApp’s rich and comprehensive ecosystem (in terms of management and advanced services) may be intimidating. To address this concern, NetApp is progressively consolidating its management and advanced data services capabilities to simplify the user experience.

Nutanix

Nutanix provides software-defined scale-out data services through Nutanix Unified Storage, a distributed storage solution bringing file, object, and volume services into one single platform with a simple, capacity-based consumption model. Nutanix Unified Storage supports, among other protocols, NFS, SMB, and S3 access for file and object applications on single or multicluster deployments. Nutanix Unified Storage is software-defined and can be deployed either as a standalone storage-only solution or as a fully integrated HCI solution.

Nutanix Cloud Clusters (NC2) also provides a frictionless multicloud solution that allows organizations to run Nutanix Unified Storage in the public cloud and with seamless data mobility between on-premises, cloud, and edge, all managed through a single pane of glass. Currently, NC2 is supported on both AWS and Azure, and an HPE GreenLake offering is also available.

Nutanix Unified Storage offers flexible, disaggregated scaling that is linear in both capacity and performance, and it supports up to 48 physical nodes per cluster, delivering up to tens of PBs in capacity and GB/s of performance. It supports all media types, ranging from NVMe flash all the way to hard drives.

Nutanix Unified Storage includes a files metrosync feature with automatic failover for zero-RPO and near-zero-RTO data protection, Smart Sync data mobility and data consolidation from multiple edge sites, and a versatile secure file share snapshot and replication capability for quick restores, protection from cyber-attacks, and disaster recovery. Notably, a built-in NFS and SMB migration capability is now available with Move 5.0 for migrating VMs and file storage from any third-party file storage.

Nutanix Unified Storage is managed through Prism Central, Nutanix’s centralized management console. Prism Central enables management across multiple locations and public clouds. The platform is built for simplicity and offers a modern interface to enable easier day-to-day management, faster deployment, and reduced maintenance activity.

Nutanix offers a compelling SaaS-based data security and global data management solution called Nutanix Data Lens (NDL). The solution helps proactively assess and mitigate unstructured data security risks by identifying anomalous activity, auditing user behavior, and adhering to monitoring requirements. An AI/ML-based anomaly detection engine monitors data activities such as potential ransomware or malware attacks and takes proactive steps to lock down the system to prevent further spread of the malware. The latest NDL release provides a Threat Containment Window to proactively detect and block threats, defend against further damage, and send alerts to begin a one-click recovery process within 20 minutes of exposure. NDL supports multiple clusters distributed geographically, including in a public cloud, and provides a single pane of glass for insights across all deployments. Furthermore, NDL is capable of baselining normalized cluster behavior across thousands of deployments to provide better anomaly detection capabilities.

Nutanix Unified Storage also provides persistent storage to Kubernetes clusters through Nutanix’s unified CSI driver, which is deployed with every Kubernetes cluster and natively integrates with Nutanix Volumes, Nutanix Files, and Nutanix Objects. The driver supports dynamic NFS share creation and integrates with Prometheus to provide metrics around Kubernetes storage consumption.

Strengths: Nutanix Unified Storage delivers impressive unified data services that meet expectations at almost every level, consolidating file, object, and block storage. It offers multiple flexible deployment options and provides compelling ransomware protection and data analytics with storage tiering.

Challenges: Although a very complete solution, Nutanix Unified Storage could benefit from expanding its public cloud footprint beyond AWS and Azure. Nutanix is, however, working on roadmap improvements in this area.

OSNexus

OSNexus offers QuantaStor, a software-defined scale-out file system built on the open source Ceph platform that can be deployed on industry-standard servers through reference configurations. The solution offers unified block, file, and object storage capabilities on top of its storage grid technology, a globally distributed control plane that can be managed as a single entity from anywhere.

The solution can be deployed on-premises or in the cloud using a virtual storage appliance. It supports all major media types such as NVMe/serial-attached SCSI (SAS), flash, and hard disk drives (HDDs). It also supports QLC-based systems that combine NVRAM and QLC 3D NAND. In fact, the company had a working persistent memory (PMEM)-based solution before Intel Optane was retired and has the expertise to implement a compute express link (CXL)-based solution once the PMEM based on the CXL specification becomes broadly available.

QuantaStor allows administrators to configure the way media tiers are to be used (data, metadata, or write log) to take the best advantage of each media’s inherent capabilities. The solution also supports the NVMe-oF protocol and integrates with Western Digital OpenFlex systems. The solution is often deployed to support large media archives, data lakes, backups, and HPC workloads. Performance focus is thus placed primarily on throughput, especially aggregate throughput.

The company has been working with Seagate to develop a solution in which QuantaStor runs on top of Seagate Corvault systems, providing 16+2 erasure coding at the Corvault layer as well as the QuantaStor layer, significantly improving data durability.
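
The capacity cost of applying 16+2 erasure coding at two layers can be worked out with a short calculation. This sketch assumes the two layers are simply stacked, so that their overheads multiply. The report does not confirm that layout, and the function name is hypothetical.

```python
from fractions import Fraction


def ec_efficiency(data_shards, parity_shards):
    """Usable fraction of raw capacity for a data+parity erasure coding scheme."""
    return Fraction(data_shards, data_shards + parity_shards)


# One 16+2 layer: 16 of every 18 shards hold data.
single_layer = ec_efficiency(16, 2)

# Assumption: the second 16+2 layer sits on top of the first, so overheads multiply.
nested = single_layer * single_layer

# Raw capacity needed per unit of usable capacity under that assumption.
raw_per_usable = 1 / nested

print(f"single layer usable fraction: {float(single_layer):.3f}")
print(f"nested usable fraction:       {float(nested):.3f}")
print(f"raw capacity per usable unit: {float(raw_per_usable):.3f}")
```

Under this assumption, roughly 79% of raw capacity remains usable, in exchange for each layer independently tolerating the loss of two shards in a stripe.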

QuantaStor provides object storage and cloud integration via a NAS gateway, which provides access to cloud-based S3 buckets using the SMB and NFS protocols. A feature called backup policies allows data replication and movement to cloud-based object storage. When files are moved, stubs are left behind and enable access to the cloud-based object.

To manage QuantaStor deployments, organizations can take advantage of a globally distributed management platform that is available on all nodes and can be accessed from anywhere. Although the management platform UI remains the preferred method of access for OSNexus’s customer base, users can also leverage REST APIs, a CLI, and a Python client interface to automate operations. In addition, QuantaStor can integrate with Grafana.

A feature called “report schedules” provides basic data management capabilities, with information about the places where capacity is being used, what folders are top consumers, and other capacity-related metrics. The feature also provides a health report that flags systems with potentially risky configuration settings.

OSNexus implements file-level immutability and immutable snapshots. It also integrates S3 object locking to enforce data governance and meet compliance requirements. Improvements such as policy-based immutable snapshots and proactive, real-time identification of ransomware attacks are on the roadmap and expected in 2024.

Kubernetes support relies on Ceph open-source modules that are included with Kubernetes distributions. GPUDirect support is absent, but it can be achieved with third-party hardware offload solutions.

Strengths: This is a solidly engineered solution that takes advantage of Ceph and wraps enterprise-grade data services around it. One of the strongest value points of QuantaStor is its flexibility, both in deployment models and in terms of supported configurations and media types.

Challenges: Areas needing improvement include AI/ML-based analytics and basic data management.

Panasas

Panasas delivers scale-out file storage capabilities through the PanFS parallel file system, which is deployed on Panasas ActiveStor appliances: ActiveStor Flash (an all-flash appliance using NVMe solid state drives [SSDs]), ActiveStor Ultra (combining SATA SSDs and HDDs), and the ActiveStor UltraXL appliance, a variant of the ActiveStor Ultra capable of supporting larger capacities. Panasas is popular among HPC users but is suitable for other use cases as well. Although technically a software-defined storage solution, PanFS is sold bundled with hardware appliances. It can be expected that the company will work on making PanFS hardware agnostic to enable native cloud deployments in the future. Hardware appliance refreshes are expected in November 2023.

PanFS consists of a POSIX-compliant parallel file system that can scale to thousands of nodes and hundreds of petabytes and uses an object store as its underlying data storage mechanism. PanFS provides a single namespace that can contain multiple volumes. Each volume can be configured as needed in terms of data services capabilities, access control, quotas, snapshot schedules, and more.

PanFS supports multiple storage tiers: each storage node comprises NVMe SSDs, SATA SSDs, and HDDs; however, QLC flash is not yet supported because of endurance-related concerns. Automated tiering and metadata placement are handled in the background and are thus invisible to users. Panasas implements per-file, node-based software erasure coding. This unique capability increases system reliability during rebuilds, notably when new nodes are added to the scale-out storage system. Similarly, striping is also done per file, reducing the likelihood of multiple failures impacting the same file.

The solution is based on a proprietary object storage architecture for storage or retrieval of data from PanFS. Panasas provides data management capabilities through PanMove Advanced, a solution based on Atempo’s Miria data management platform that enables data movements to object stores, including tiering to the cloud. Panasas has added a front-end S3 interface to PanFS with the recent release of PanFS 10. The solution does not yet support multiprotocol access. Another feature, PanMove Sync, enables parallel data replication between two Panasas clusters.

Storage analytics and data management capabilities are now available through PanView Analytics (also based on Atempo Miria), whose engine is embedded in PanFS for faster scanning. PanMove Advanced and PanView Analytics are add-on modules that customers can license directly from Panasas. A CLI is also available for power users.

Ransomware protection is basic and includes support for immutable snapshots and SELinux-related security contexts. Currently, neither RBAC nor in-flight encryption is available, but both are included on Panasas’ roadmap.

Kubernetes support is available starting with PanFS 10. GPUDirect support is on the roadmap; however, Panasas says there is no particular demand from its customer base for this capability. In the meantime, Panasas offers an alternative solution (DirectFlow) for organizations that want this functionality.

Panasas also offers a new appliance for edge deployments branded ActiveStor Edge. It leverages PanFS and is delivered in a reduced footprint to fit edge hosting requirements.

Strengths: PanFS offers a proven on-premises scale-out file system that is particularly appreciated by the HPC community. The solution provides a robust and scalable file system implementation with automated tiering and support for new media types. The company is making progress across most areas through an astute combination of built-in capabilities and tactical partnerships.

Challenges: GPUDirect support is missing. Cyber-resiliency capabilities and some basic security requirements such as RBAC and in-flight encryption need to be prioritized.

Pure Storage

Pure Storage offers a unified fast file and object (UFFO) storage solution through its FlashBlade appliances portfolio. The company expanded its portfolio in 2023 and now offers two solutions: the FlashBlade//S, built for high-performance use cases such as cloud-native apps, rapid restore, AI/ML, and a content store with online access; and the FlashBlade//E, a solution introduced in 2023 that focuses on capacity-driven use cases with lower performance requirements.

Built for massive scalability, both the FlashBlade//S and //E models are engineered to suit storage-dense systems with great energy efficiency, thanks to an all-QLC design based on proprietary and highly dense DirectFlash Modules.

FlashBlade//E implements a slightly different physical architecture than FlashBlade//S: unlike the FlashBlade//S, the FlashBlade//E doesn’t independently scale performance and capacity because of the dissociation between compute-only and compute+storage DirectFlash Modules (DFMs). However, given the focus on low-performance workloads, the DFM population rules do not have any adverse performance impact.

FlashBlade also excels in high-performance object use cases like AI analytics. FlashBlade supports asynchronous replication between FlashBlade systems and from FlashBlade to cloud-native storage such as AWS S3. While FlashBlade natively supports NFS, SMB, and S3 protocols from the same system, it currently doesn’t offer simultaneous access to the same data through both file and object protocols.

Advanced management is provided by Pure1, Pure Storage’s management, analytics, and support platform, along with monitoring and reporting of issues. Pure1 offers a tool that gives customers the ability to estimate storage costs for Evergreen//One. Organizations can consume Pure Storage products and services directly, including storage as a service (STaaS) with the Evergreen//One solution.

The solution implements several cyber-resiliency capabilities starting with immutable SafeMode snapshots, which are locked to prevent their deletion. SafeMode snapshots can also be used to create a read-only protected snapshot of a full backup, including the backup and its associated metadata catalogs. Other enhanced immutability features include multiple-party authentication processes for sensitive snapshot policy changes as well as additional mechanisms to ensure destroyed volumes or snapshots can be recovered and to prevent accidental eradication of file systems or snapshots. In addition, fleet-level data protection assessments can be made to ensure all appliances are adequately protected. From a detection standpoint, Pure1 includes anomaly detection capabilities that can be leveraged to identify and proactively alert on anomalous and potentially malicious activities. Finally, the company also offers an add-on ransomware recovery SLA for its Evergreen//One offering.

FlashBlade systems provide advanced Kubernetes support capabilities thanks to a deep integration of Portworx, Pure Storage’s flagship Kubernetes storage solution. Customers can use a cost-free, FlashBlade-specific version of Portworx Essentials for which the node count limit has been lifted. Organizations can start their cloud-native journey with Portworx Essentials directly on top of FlashBlade without having to plan for additional investments, and they can seamlessly upgrade to Portworx Enterprise later as they advance through their journey and need to scale Kubernetes services.

Strengths: With the introduction of FlashBlade//E earlier this year, Pure Storage now offers comprehensive coverage for a broad spectrum of use cases across varied performance tiers, backed by the latest innovation in QLC flash. The solution includes excellent AIOps capabilities (Pure1), Kubernetes support (Portworx), and a compelling STaaS model.

Challenges: The solution doesn’t support concurrent object and file access, though this is not a major challenge for general-purpose, enterprise scale-out file system use cases.

Quantum

Quantum now offers two scale-out file storage solutions: StorNext, and the recently introduced Myriad (generally available as of November 2023).

StorNext is massively scalable and supports a broad choice of hardware appliances and performance tiers, with a focus on video post production and large file archives. StorNext’s architecture is virtualized and containerized; it starts at 10 TB but can support a file system of up to 18 EB.

StorNext supports a variety of media tiers, including NVMe flash. It allows flexible composition of storage tiers by combining multiple media types in various pools, then allows operators to define data movement criteria to achieve optimal cost efficiencies.

StorNext supports object storage as a secondary storage tier, as a hot archive or as cold data storage. It provides S3 bucket import capabilities, the ability to tag objects with additional StorNext-related metadata (object location and name in the source file system), and supports a broad set of on-premises object stores. Quantum’s CatDV solution integrates with StorNext and can index large sets of rich media data to make it available for subsequent searches. Object storage support also extends to the cloud with support for multiple services on AWS, Azure, and Google. StorNext can be deployed in AWS via the marketplace and uses EBS as its storage backend.

Management and administrative operations are unified across all Quantum solutions. The SaaS-based Quantum Cloud-Based Analytics solution shares its environment view with Quantum’s support organization for remote support purposes, and it offers a comprehensive view of the managed environment, remote monitoring, and automated support.

Ransomware protection and immutable snapshots are not supported by StorNext.

Although based on a containerized architecture itself, the solution has no particular Kubernetes support capabilities and doesn’t support GPUDirect either.

The recently introduced Myriad offering was developed taking into account Quantum customer feedback and focuses on rapid data recovery for mission-critical data (Oracle, SAP HANA, Microsoft SQL Server); modern data lakes for analytics and AI (Elastic, Spark, Splunk); VFX (Autodesk Flame and ShotGrid); and animation rendering.

Myriad’s architecture is based on microservices orchestrated by Kubernetes, and runs on top of industry-standard hardware. Components consist of load balancer nodes, storage nodes, and a deployment node, all connected via a 100 GbE NVMe fabric; storage nodes use NVMe flash.

Myriad currently supports NFS, and will introduce SMB and S3 support in the future. Initial storage services include zero-impact snapshots, instant clones, deduplication, and compression. Replication and tiering are on the roadmap. The initial release of Myriad does not include any support for ransomware protection, Kubernetes workloads, or GPUDirect. Quantum has plans to eventually run Myriad natively in the cloud.

Strengths: Quantum offers massive scalability and a broad choice of appliances and storage tiers, with comprehensive object storage support and increasing cloud integrations. The launch of the new Myriad solution underscores the vitality of the company and its appetite for innovation.

Challenges: The lack of immutable snapshots on StorNext, a foundational capability for ransomware protection, is concerning. Myriad was recently launched and is not yet proven in the field, so initial capabilities are limited but should rapidly ramp up.

Qumulo

Qumulo has developed a software-defined, vendor-agnostic scale-out file system that can be deployed on-premises, in the cloud, or in a hybrid of the two, or even delivered through hardware vendor partnerships. The solution provides a comprehensive set of enterprise-grade data services branded Qumulo Core. These handle core storage operations (scalability and performance), data replication and mobility, security, ransomware protection, data integration, and analytics.

From a performance and capacity perspective, the solution scales linearly, providing a single namespace with limitless capacity that supports billions of large and small files and can use nearly 100% of available storage through efficient erasure code techniques. It also supports automatic data rebalancing when nodes or instances are added. The namespace enables real-time queries and metadata aggregation, significantly reducing search times. Qumulo supports NVMe flash, serial ATA (SATA)/SAS SSDs, and HDDs.

Data protection and replication, as well as mobility use cases, are well covered and include snapshots and snapshot-based replication to the cloud, continuous replication, and disaster recovery support with failover capabilities. Qumulo SHIFT is a built-in data service that enables bidirectional data movements to and from AWS S3 object stores with built-in replication, including support for immutable snapshots, and it provides organizations with more flexibility and better cost control.

Qumulo offers cloud file storage services through Cloud Q, a set of solutions designed specifically for the cloud that leverage Qumulo Core services. Organizations can deploy Cloud Q through their preferred public cloud marketplace (the solution supports AWS, Azure, and GCP) or deploy Qumulo as a fully managed SaaS offering on Microsoft Azure. AWS Outposts is also supported and a comprehensive partnership with AWS is in place that includes WAF certification and AWS Quick Start. Qumulo is expanding its delivery models through STaaS partnerships with HPE GreenLake and others.

Qumulo can be managed through an on-cluster or a cloud-based interface. It also includes a comprehensive set of REST APIs that allows users to perform proactive management and automate file system operations. The solution comes with a robust data analytics engine that provides real-time operational analytics (across all files, directories, metrics, users, and workloads), capacity awareness, and predictive capacity trends, with the ability to “time travel” through performance data. With the Qumulo Command Center, organizations can also easily manage their Qumulo deployments at scale through a single management interface.

Advanced security features include read-only snapshots that can be replicated to the cloud, and audit logging to review user activity. WORM (immutable) snapshots and snapshot-locking capabilities (protection against deletion) were implemented in 2023.

Qumulo is working on implementing data compression, extending its global namespace with geographically distributed deployments, and adding native S3 protocol support. Multitenancy capabilities are being developed incrementally and should be fully available by the end of 2023.

Qumulo offers standard Kubernetes support via Kubernetes NFS persistent volumes.
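
For readers unfamiliar with generic NFS-backed persistent storage in Kubernetes, the sketch below builds a standard `PersistentVolume` manifest that points at an NFS export. It is not Qumulo-specific: the helper name, server hostname, export path, and size are hypothetical placeholders.

```python
import json


def nfs_persistent_volume(name, server, path, size="100Gi"):
    """Build a Kubernetes PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # NFS exports can be mounted read-write by many pods at once.
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


# Hypothetical NFS endpoint and export path, used for illustration only.
manifest = nfs_persistent_volume(
    "shared-data", "nfs.example.internal", "/exports/shared"
)

# Kubernetes accepts JSON manifests as well as YAML.
print(json.dumps(manifest, indent=2))
```

With this approach, volumes are defined statically against existing exports, in contrast to the dynamic provisioning offered by vendor-specific CSI drivers.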

Strengths: Qumulo offers a comprehensive scale-out file system solution that is simple to manage and implement. It has rich and complete data services combined with a broad choice of deployment models, including seamless hybrid and cloud-based deployments, making it one of the most flexible solutions currently available.

Challenges: Although the solution is very comprehensive, some important features are currently still on the roadmap. Better data reduction and multitenancy are among them.

Quobyte

Quobyte offers a software-defined, scale-out file storage solution based on a parallel distributed POSIX-compliant file system. The solution scales linearly in capacity and performance, providing full mesh communication with up to hundreds of thousands of clients and thousands of Quobyte servers. It provides a single namespace common to all interfaces (Linux, Windows, S3, macOS, HDFS, NFS, and so on) and allows file and object access to the same datasets.

The solution is engineered to support always-on operations. As a result, usual maintenance activities such as software updates, node addition/removal, hardware replacement, policy reconfiguration, and data movements are non-disruptive. Quobyte clusters support heterogeneous server configurations that can vary in specifications, generation, capacity, and models. Because Quobyte presents a single namespace, it pools local media on servers and offers transparent migration and tiering on top of NVMe, SSDs, and HDDs. In 2023, the company saw growth in the number of use cases in the genomics area.

Quobyte offers multiprotocol access and now supports tiering to object storage on AWS and any S3-compatible object store, whether in the cloud (for example, with Azure or Wasabi) or on-premises. Multiple object storage tiers are supported and data movement is handled by Quobyte’s policy-based tiering engine. The solution supports two modes: a copy mode, by which the data source is maintained; and a tiering mode, by which the data is effectively moved to object storage. Organizations can deploy Quobyte in the cloud on AWS, Google, and Oracle platforms directly through the marketplace. However, availability on Azure is still on the roadmap.

Although the solution follows an API-first approach, it can also be managed through an extensive web-based UI and through command-line tools, and it can integrate with Prometheus. Quobyte provides real-time performance analytics but does not boast any particular data management insights.

Quobyte supports immutable snapshots; however, the solution does not include any proactive ransomware detection capabilities yet.

The solution supports Kubernetes through a CSI plug-in that provides volumes with quotas, snapshots, an access key, and multitenancy support. The Quobyte solution can be deployed itself through containers on Kubernetes, and the company provides a Helm chart to simplify installation and updates of Quobyte in containerized environments. Recently, the company also announced a partnership with SUSE to provide deeper integration capabilities with Rancher.

Strengths: Quobyte offers linear scalability and great flexibility on top of a robust architecture. Organizations will appreciate its multiprotocol support as well as its non-disruptive operations model. The feature set has improved over last year, adding AWS cloud integration and better object storage support.

Challenges: Proactive ransomware detection capabilities and AI-based analytics are currently absent.

Scality SOFS

Scality SOFS (for scale-out file system) provides a scale-out file system implementation based on Scality RING. Each Scality RING on which Scality SOFS runs offers a global namespace, making global metadata searches possible within that RING. The architecture decouples interface connectors and storage nodes with fully distributed metadata. POSIX metadata and internal data indexes permanently reside on flash media for improved performance, and a distributed lock manager ensures consistent views of the data state across the distributed RING.

The solution supports unlimited volumes with no size limits. Volumes are logical constructs that enable policies like volume protection, versioning, and multisite replication. Volume protection automatically makes data read-only after a set period, providing WORM semantics for compliance and protection against ransomware threats.
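
As a rough illustration of time-based WORM semantics, the sketch below rejects changes to a file once a configured period has elapsed since it was first written. The `ProtectedVolume` class and its methods are hypothetical and do not reflect Scality's implementation; timestamps are passed in explicitly to keep the example deterministic.

```python
from datetime import datetime, timedelta


class ProtectedVolume:
    """Hypothetical volume whose files become read-only after `lock_after`."""

    def __init__(self, lock_after: timedelta):
        self.lock_after = lock_after
        self._files = {}  # path -> (data, first_written_at)

    def is_locked(self, path, now):
        _, created = self._files[path]
        return now - created >= self.lock_after

    def write(self, path, data, now):
        if path in self._files:
            if self.is_locked(path, now):
                raise PermissionError(f"{path} is read-only (WORM)")
            created = self._files[path][1]  # keep the original timestamp
        else:
            created = now
        self._files[path] = (data, created)

    def read(self, path):
        return self._files[path][0]


vol = ProtectedVolume(lock_after=timedelta(days=1))
t0 = datetime(2024, 1, 1)
vol.write("/scan.dcm", b"v1", now=t0)
vol.write("/scan.dcm", b"v2", now=t0 + timedelta(hours=1))  # still writable

blocked = False
try:
    vol.write("/scan.dcm", b"encrypted", now=t0 + timedelta(days=2))
except PermissionError:
    blocked = True  # the late overwrite is refused
```

The late overwrite attempt, such as one made by ransomware, fails, and the last version written inside the window remains readable.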

Scality SOFS is designed for high throughput and sequential workloads such as backup targets, data archive, media management, data lakes, medical imaging data retention, and HPC storage systems. It’s ideal for scenarios where scaling and aggregate throughput or sequential access are essential.

A cloud-based edition of SOFS is built for Azure and integrates a POSIX layer into an object-based storage architecture. It uses Azure Cosmos DB for metadata storage and can be deployed via stateless VM images in Azure, scaling out by adding more VMs. With support for the SMB, NFS, and FUSE protocols, it is well suited to sequential-access and high-throughput workloads.

SOFS allows concurrent data access from multiple storage connectors, including SMBv3, NFSv4, Linux FUSE, and REST. It supports unlimited volumes and files, offers multitenancy by using different volumes (namespaces providing logical data separation), and integrates a distributed lock manager to ensure consistent views. A volume protection feature allows data to become read-only automatically after a specific period (which can be configured up front) and implements WORM semantics. Soft and hard quotas can be set up at the volume, user, and group levels.
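
The distinction between soft and hard quotas across several scopes can be sketched as follows. This is an illustrative model only, assuming that exceeding a soft limit produces a warning while exceeding a hard limit rejects the write; the `Quota` class, the `charge` function, and the scope names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    soft: int  # bytes; crossing this only produces a warning
    hard: int  # bytes; crossing this rejects the write
    used: int = 0


def charge(quotas, nbytes):
    """Charge a write against every applicable scope (volume, user, group).

    Raises OSError, without changing any usage, if a hard limit would be
    exceeded. Otherwise returns the scopes that are now over their soft limit.
    """
    for scope, quota in quotas.items():
        if quota.used + nbytes > quota.hard:
            raise OSError(f"hard quota exceeded for {scope}")
    over_soft = []
    for scope, quota in quotas.items():
        quota.used += nbytes
        if quota.used > quota.soft:
            over_soft.append(scope)
    return over_soft


quotas = {
    "volume:archive": Quota(soft=800, hard=1000),
    "user:alice": Quota(soft=50, hard=100),
    "group:imaging": Quota(soft=400, hard=500),
}

warnings = charge(quotas, 60)  # over alice's soft limit, under every hard limit

rejected = False
try:
    charge(quotas, 50)  # would push alice to 110 bytes, above her hard limit
except OSError:
    rejected = True
```

Checking all hard limits before updating any counter keeps the usage figures consistent when a write is refused.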

Management of infrastructure and SOFS volumes is performed through the RING Supervisor GUI, which provides standard analytics capabilities. The GUI also offers Grafana export of metrics and insights into the infrastructure. Scality Cloud Monitor is a recently introduced web portal that enables remote monitoring of RING instances and provides health metrics and alerts for anomalies. It is hosted on Elastic Cloud and leverages the Elastic toolset for AI/ML-driven system observability of RING/SOFS metrics, providing insights into system state and anomalous events.

The solution has multiple security features, such as RBAC, file system versioning, volume protection, and encryption at rest. Immutable volumes and volume protection also help defend against ransomware. Additional measures include asynchronous replication gap timing and the ability to prevent files on a target from being erased.

Kubernetes support is currently absent; customers can, however, export SOFS volumes via SMB and/or NFS and use the standard Kubernetes CSI plug-ins to present those volumes to their Kubernetes environment. Due to the specificity of the solution, GPUDirect capabilities need to be supported and relevant. Roadmap capabilities include multiple replications to the cloud.

Strengths: A robust scale-out file system solution designed explicitly for sequential and throughput-intensive workloads, Scality SOFS provides an excellent cloud implementation. It uses inexpensive object storage to deliver massive scalability, parallelism, and aggregated throughput.

Challenges: The solution’s niche focus and limited capabilities make it unsuitable for general-purpose deployments. The cloud edition of Scality SOFS currently supports only Microsoft Azure.

ThinkParQ

ThinkParQ, a spin-off company from the Fraunhofer Center for High Performance Computing, developed BeeGFS, a flexible, scalable, and robust scale-out file system targeted at performance-oriented environments such as HPC, AI, and DL.

BeeGFS supports multiple processor architectures (x86, ARM, OpenPOWER, and so on) and runs on top of various Linux distributions. It can be deployed on top of various file systems such as EXT, XFS, ZFS, and others, provides file services via the NFS and SMB protocols, and operates in user space. The solution also provides a broad set of connectivity options, including InfiniBand, and enables NVMe connectivity over RDMA, RDMA over Converged Ethernet (RoCE), and transmission control protocol (TCP).

BeeGFS is designed around a scalable architecture with metadata servers, storage servers, monitoring systems, and one management host per file system. The solution can be deployed on any type of commodity hardware or layered on top of block storage. Although BeeGFS is used traditionally as scratch space for high-performance computing systems, the solution can also be used for persistent data storage and includes tiering capabilities with storage pools as well as data mirroring (through buddy groups).

Due to its open-source distribution heritage, BeeGFS is available in two flavors: a free community edition and a paid enterprise edition. Resiliency, quota enforcement, access control lists, and storage pool features are available only in the enterprise edition.

The solution offers no native object storage integration capabilities, but organizations can use third-party software to enable object storage support. BeeGFS can be used natively in Oracle Cloud Infrastructure (OCI), Microsoft Azure, and AWS.

BeeGFS is usually integrated with HPC cluster managers such as Bright Cluster Manager as well as workload schedulers such as Slurm and other tools. In addition, BeeGFS offers a monitoring service based on InfluxDB that integrates into Grafana. Since version 7.4.0, monitoring has been extended to also cover system metrics of the hosts on which BeeGFS runs.

In terms of security, BeeGFS relies on encryption and security mechanisms implemented at the operating system and storage layers. Since last year, the solution has seen significant improvements from a security hardening perspective, and security features that were previously optional are now enabled by default.

BeeGFS provides Kubernetes support through a CSI plug-in that can either allow containers to access existing datasets or request ephemeral or persistent high-speed storage served by BeeGFS. The CSI driver can be managed via the BeeGFS CSI driver operator to automate deployment or perform reconfiguration and update activities. It also seamlessly integrates with Red Hat OpenShift.

GPUDirect support has been included in BeeGFS since version 7.3.0, and all BeeGFS nodes must be configured for RDMA. As with other aspects of the BeeGFS solution, tuning principles also apply to GPUDirect.

Strengths: BeeGFS provides a flexible and scalable architecture, can be deployed across multiple architectures and operating systems, and offers great tuning options that meet the expectations of HPC teams. It also supports broad connectivity protocols and can be deployed in the cloud.

Challenges: Although competitive in the HPC arena, the solution lacks capabilities that would make it useful as a general-purpose scale-out file system.

VAST Data

VAST Data offers a massively scalable storage architecture built around new flash media technologies that can deliver exabyte-scale file and object storage. VAST Data created the solution using a Disaggregated and Shared Everything (DASE) storage model that enables customers to scale capacity and performance independently and to expand both to exabyte-scale. The solution is deployed as a managed service, and a Layer 3 engineer is assigned to each customer as a “copilot.”

VAST Data provides a universal storage plane that delivers file and object capabilities across a single high-performance storage tier backed by high-capacity QLC flash and by ultra-low-latency, high-endurance, high-throughput storage-class memory (SCM). The solution leverages NVMe-oF and introduces a disaggregated, shared-everything design composed of VAST server containers running the logic of VAST (providing a global namespace accessible through NFS, NFS over RDMA, SMB, S3, and containers) and VAST NVMe enclosures providing high-density flash storage (combining SCM and QLC flash). Those components interconnect through low-latency 100 GbE or InfiniBand. An efficient implementation of data writes (via large sequential stripes) significantly reduces media wear on QLC drives and allows VAST Data to provide a 10-year guarantee on QLC flash durability.

The solution supports native replication, backups to S3-compatible object storage, NFSv4, new snapshot policies (including indestructible snapshots that can be unlocked through MFA), and telemetry/call-home data. It is worth noting that the “snap to object” feature allows data replication to any S3-compatible object store, whether in the cloud (AWS and Azure) or on-premises. The data is compressed and stored in large objects for greater efficiency.

The solution offers a modern cloud-based management platform with data flow visualization that helps users to understand how data moves across their system. Other features include capacity use projections and a dynamic wheel of data usage. A full REST API is also available.

Security features include granular RBAC (organized around multiple realms of management), data-at-rest encryption, ransomware protection, and additional auditing capabilities.

VAST Data’s Kubernetes CSI driver, used by over 30% of its customers, offers NFS over RDMA support for better performance. It uses storage pools that can be assigned to specific containers or clusters, supporting policy-based communication and QoS. VAST Data also has a dedicated plug-in for Manila, the OpenStack shared file systems service, on Red Hat OpenStack.

GPUDirect support is available with an early implementation that takes advantage of VAST Data’s NFS multipathing capabilities, which use all available ports at maximum speed and are being upstreamed into the Linux NFS client. This multipathing makes the solution particularly suited to GPU-based workloads.

VAST has been introducing many improvements and enhancements in the last year. The VAST DataStore is already a comprehensive enterprise storage solution updated with data protection and security features. The VAST Data Catalog is a metadata index that resides in the VAST DataBase. The VAST DataBase combines an exabyte-scale namespace with a tabular database. VAST DataSpace excels in element-level locking and contains a unique cache integrity architecture. And more improvements are coming in the next year.

Strengths: VAST Data offers a Kubernetes CSI driver that provides support for NFS over RDMA and storage pools that can be allocated to specific clusters or containers. The driver supports policy-based communication and QoS, and the company also provides a plug-in for Manila, the OpenStack shared file systems service, on Red Hat OpenStack.

Challenges: Other than object storage integrations, cloud support capabilities are currently limited.

WEKA

WEKA Data Platform is a software-defined storage solution that provides all-flash-array performance. It can be used on-premises, in the cloud, or both. WEKA is deployed as containers and supports a range of deployment options. It consists of a single data platform with mixed workload capabilities and supports multiple protocols such as SMB, NFS, S3, POSIX, GPUDirect, and Kubernetes CSI. All deployments can be managed through a single console.

The solution is highly versatile and specifically designed to cater to demanding environments that require low latency, high performance, and cloud scalability. It finds applications in various fields, such as AI/ML, life sciences, financial trading, HPC, media rendering and visual effects, electronic design automation (EDA), and engineering DevOps.

This solution offers a scale-out file storage system that spans various performance tiers with automatic scalability in the cloud. It supports various workloads and uses NVMe flash performance. The solution also supports HDDs in the capacity tier through object store integration and manages tiering across these two media types. It can support QLC NAND, but current flash economics do not provide any particular advantage to using this media type.

The WEKA Data Platform namespace expands into S3 object storage with dynamic data tiering, which automatically pushes cold data to the object tier from the NVMe flash tier. The platform also features snap-to-object for backup, archive, and asynchronous mirroring. It works with AWS, GCP, Azure, and OCI, allowing you to pause or restart a cluster, protect against single availability zone failure, or migrate file systems across regions. The WEKA platform can be deployed directly from the AWS, Azure, Google Cloud, and Oracle Cloud marketplaces, or it can use a certified WEKA software deployment on AWS Outposts. The WEKA platform also supports bring your own license (BYOL) for AWS, Google Cloud, Azure, and Oracle Cloud.
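
Dynamic tiering of cold data within a single namespace can be pictured with the small sketch below. It is an assumption-laden illustration rather than WEKA's mechanism: the `Entry` class, the `tier_cold_data` function, and the idle-time threshold are hypothetical, and "cold" is defined here simply as "not accessed for longer than `max_idle` seconds."

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Namespace entry: the path stays visible wherever the data lives."""
    last_access: float      # seconds since an arbitrary epoch
    location: str = "nvme"  # "nvme" or "object"


def tier_cold_data(namespace, now, max_idle):
    """Mark entries idle for more than `max_idle` seconds as object-resident.

    Returns the list of paths that were moved in this pass.
    """
    moved = []
    for path, entry in namespace.items():
        if entry.location == "nvme" and now - entry.last_access > max_idle:
            entry.location = "object"
            moved.append(path)
    return moved


namespace = {
    "/render/shot01.exr": Entry(last_access=1_000.0),
    "/render/shot02.exr": Entry(last_access=9_500.0),
}

moved = tier_cold_data(namespace, now=10_000.0, max_idle=3_600.0)
```

Both paths remain in the namespace after the pass; only the recorded location of the idle file changes, which is what lets applications keep using the same paths regardless of tier.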

The WEKA Data Platform provides data management capabilities, including creating data copies through snapshots for DevOps use cases. Metadata tagging and querying capabilities remain limited, although WEKA plans to enhance them. In the meantime, third-party engines can analyze data and augment the metadata of datasets residing on the platform.

WEKA’s monitoring platform captures and provides telemetry data and allows deep dives into certain metrics down to file system calls. The cloud-based monitoring service, WEKA Home, collects telemetry data and provides proactive support in case of detected issues. WEKA APIs cover all possible operations on the platform, enabling seamless integration with other software systems.

The WEKA Data Platform supports a broad Kubernetes ecosystem and integrates with Rancher, HPE Ezmeral, and Red Hat OpenShift. The Kubernetes CSI plug-in supports manual and dynamic volume provisioning. Further, direct quota integration is supported on a per-pod or per-container level.

WEKA supports GPUDirect and is an NVIDIA-certified partner. The company has demonstrated outstanding throughput even on single GPU server/mount points.

Strengths: WEKA has architected a robust, flexible, and massively scalable file system that is particularly well adapted for high-performance use cases. It offers automated tiering and a rich set of services via a single platform that eliminates the need to copy data through various dedicated storage tiers. Its single namespace encompassing file and object storage reduces infrastructure sprawl and complexity to benefit users and organizations alike.

Challenges: WEKA’s strong focus on performance and scalability has overshadowed organizations’ growing need for data analysis. The company has acknowledged this as an area needing improvement and, backed by a dynamic roadmap, is working on enhancing its capabilities across most of the evaluated critical criteria.

6. Analysts’ Take

The scale-out file storage market is very active. Roadmaps show a general trend toward expanding hybrid-cloud use cases, implementing AI-based analytics, rolling out more data management capabilities, and strengthening ransomware protection.

The rise of hybrid-cloud use cases for scale-out file storage can be seen in two ways:

  1. Integration with object storage is being looked at with greater scrutiny. Platforms that integrate with object storage offer better opportunities for cost optimization, and some offerings are capable of analyzing demand or access patterns of certain data sets and subsequently automating data movement to either local or cloud-based object tiers, including long-term retention.
  2. Organizations want to bring data closer to cloud-based workloads, but also to weave cloud locations into their distributed data fabric or ensure certain data sets are served from specific cloud regions.

Today, almost all solutions support several storage tiers, such as NVMe flash, SAS/SATA SSDs, and HDDs. For the high-performance market, support for at least NVMe flash is imperative. Some vendors are looking beyond these storage options and incorporating new flash media types, such as storage-class memory for improved performance and QLC NAND to improve capacity. There is a clear distinction between solutions that embrace new technologies (and some that are outright built upon them) and others whose approach is more cautious.

Two issues around storage-class memory and QLC flash were concerning last year. The issue with storage-class memory remains: architectures based on Compute Express Link (CXL) are emerging, but it may take time to test and qualify those technologies for optimal use with scale-out file systems. However, the other issue, QLC NAND’s low durability, is now less of a concern. Flash vendors are steadily improving QLC flash reliability regardless of the media form factor (proprietary or industry-standard EDSFF modules). Nevertheless, implementation of flash media wear control techniques at the architectural level remains a key differentiator from a media durability perspective.

For the enterprise segment, data management capabilities are becoming crucial. As the focus shifts away from storage and moves toward extracting the value of data, organizations critically require data insights to consume scale-out file storage services optimally. Those capabilities tie into the earlier points around cloud and object storage integration: with data management capabilities, organizations can make better-informed decisions on data placement.

A non-negligible volume of modern high-performance workloads relies on massively parallel computational capabilities that can no longer be satisfied by processor-based computing. Those workloads rely on GPU computing, and in architectures where GPU and storage are capable of delivering staggering parallel I/O throughput at very low latencies, the CPU becomes a bottleneck. Storage solutions capable of supporting NVIDIA’s GPUDirect protocol are thus essential for those cutting-edge use cases. Nevertheless, not all solutions support GPUDirect as of today, and several vendors propose alternative solutions such as pNFS over RDMA.

Kubernetes integration should not be overlooked either. Container-based workloads, while not yet predominant in enterprises, are becoming the default way of developing new applications, and most modern workloads (including GPU-based computing and AI/ML/DL) use Kubernetes. Scale-out file systems are a treasure trove through which existing datasets can be reused and fed into these modern, container-based workloads to identify potentially new outcomes and extract more valuable findings.

Security measures are now vital, and thankfully, these capabilities have been strengthened over the last year, with many vendors improving their support of RBAC, data encryption (in flight and at rest), and platform certifications (for example, FIPS 140-2), particularly those vendors seeking to sell their solutions to governments. Ransomware protection is picking up pace, as most solutions now offer immutable snapshots as a foundational capability. From there, two pathways of innovation are possible: the low-hanging fruit is the implementation of policy-based control mechanisms to lock immutable snapshots, followed by the implementation of multiple-administrator validation to change immutability policies. These should become commonplace in the next 18 to 24 months. The next step in ransomware protection is the implementation of advanced AI-based analytics to detect anomalies, proactively alert on them, and eventually thwart attacks stemming from ransomware threats.
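
Multiple-administrator validation can be sketched as a simple approval quorum: a change to an immutability policy takes effect only once enough distinct administrators have approved the same proposal. The `ImmutabilityPolicy` class, its method names, and the default quorum of two are hypothetical and are not drawn from any vendor's product.

```python
class ImmutabilityPolicy:
    """Hypothetical snapshot-retention policy guarded by a multi-admin quorum."""

    def __init__(self, retention_days, admins, approvals_required=2):
        self.retention_days = retention_days
        self.admins = set(admins)
        self.approvals_required = approvals_required
        self._pending = {}  # proposed retention_days -> set of approving admins

    def approve_change(self, admin, retention_days):
        """Record one admin's approval; apply the change once quorum is met.

        Returns True if the change was applied by this call.
        """
        if admin not in self.admins:
            raise PermissionError(f"{admin} is not an administrator")
        approvers = self._pending.setdefault(retention_days, set())
        approvers.add(admin)  # a set, so repeat approvals do not count twice
        if len(approvers) >= self.approvals_required:
            self.retention_days = retention_days
            del self._pending[retention_days]
            return True
        return False


policy = ImmutabilityPolicy(retention_days=30, admins={"alice", "bob", "carol"})

first = policy.approve_change("alice", 7)   # one approval: not applied
repeat = policy.approve_change("alice", 7)  # same admin again: still not applied
second = policy.approve_change("bob", 7)    # second distinct admin: applied
```

A single compromised administrator account therefore cannot shorten snapshot retention on its own.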

Finally, it is interesting to note that most of the vendors are back to a more innovative pace compared to last year in terms of execution, delivered features, and roadmaps. This highlights the dynamic nature of the scale-out file storage market, in terms of both demand and use cases.

7. Methodology

For more information about our research process for Key Criteria and Radar reports, please visit our Methodology.

8. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include but are not limited to adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.

9. Copyright

© Knowingly, Inc. 2023. "GigaOm Radar for Scale-Out File Storage" is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact sales@gigaom.com.