NVMe/TCP: Bridging Traditional and Next-Generation Data Center Architectures

I’ve been writing a lot lately about NVMe/TCP and other enabling data center technologies. I used the approach taken by Lightbits Labs as an example, thanks to the ability of this solution to combine efficiency and modern protocols on commodity hardware, as well as its ability to leverage modern data center accelerators. I like solutions that offer multiple options to the user, especially now that IT needs additional flexibility to respond to an increasing number of challenges.

New Year, New Challenges

Last year was crazy. The COVID pandemic had a huge impact on IT budgets, and enterprise organizations were forced to shift investments to keep up with new user needs. Most organizations were initially caught flat-footed, struggling to support work-from-home initiatives and to drive critical digital transformation processes. We saw an initial halt of IT investment, followed quickly by a sharp acceleration and diversification in spending, albeit always with a keen eye on cost savings and efficiency.

As a sector, we moved quickly from an initial adoption phase to production for new platforms like Kubernetes, even as many enterprises still struggle to properly manage hybrid storage infrastructures that provide resources to different types of environments. For example, Kubernetes requires fast resource provisioning, many small volumes, and persistent data storage that may last only a few minutes at a time. On the other hand, we maintain slower-moving physical and virtual environments where the capacity of each volume is important and the lifespan of each volume is much longer. Combining these two needs is anything but easy, especially when we add parameters such as:

  • Size of the infrastructure
  • Performance and latency
  • Infrastructure configurability (or composability)

The last item is particularly important in modern hybrid scenarios, especially if the IT organization doesn’t know how quickly its infrastructure will change over time.

Building Flexible Storage Infrastructures

IT infrastructure teams are struggling with the limitations of traditional architectures. In the past, resources were consolidated in a single scale-up or scale-out system, but today we have technology and protocols like NVMe/TCP that enable us to design storage infrastructures that take advantage of the resources available across the entire data center and see them as local.

Thanks to NVMe/TCP, it is now possible to build hyper-scalable and high-performance storage systems with commodity hardware, without the resource and topology limitations imposed by traditional solutions. It’s like having the flexibility and efficiency of public cloud storage, on premises and for organizations of all sizes. In fact, the main benefit of the added flexibility is that it combines the performance advantages of direct-attached storage with the efficiency and optimizations of storage area networks.

To give you a better understanding of this kind of next-generation infrastructure design and its benefits, let’s talk about its most common application. The majority of modern applications, like NoSQL databases or AI and big data frameworks, are architected to work in a scale-out fashion. High-availability mechanisms are embedded in the application, and every node in the cluster directly accesses its own resources without relying on a traditional shared storage system. This brings simplicity, performance, and scalability, but it hurts resource utilization, because capacity that sits idle on one node cannot be used by another. With a flexible and more configurable storage infrastructure, every node accesses network resources as if they were local, while the storage system optimizes capacity utilization and improves overall availability. NVMe/TCP takes this a step further thanks to its ability to run on standard network interfaces, simplifying the entire hardware stack and reducing the associated costs. At the end of the day, the goal is to achieve the best of both worlds in terms of performance, resource utilization, and cost savings.
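To put rough numbers on the utilization argument, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the per-node demand figures, the rule that every node gets a local drive sized for the largest consumer, and the 25% headroom on the shared pool. It also ignores replication and data protection overhead, which a real design would have to account for on both sides.

```python
# Hypothetical per-node capacity demand in TB (illustrative numbers only).
node_demand_tb = [1.0, 3.5, 0.5, 2.0, 1.0, 4.0]


def local_utilization(demand_tb, drive_tb):
    """Utilization when every node gets its own local drive of drive_tb."""
    provisioned = drive_tb * len(demand_tb)
    return sum(demand_tb) / provisioned


def pooled_utilization(demand_tb, headroom):
    """Utilization when one shared pool is sized for total demand plus headroom."""
    provisioned = sum(demand_tb) * (1.0 + headroom)
    return sum(demand_tb) / provisioned


# Assumption: identical servers, so each local drive is sized for the
# hungriest node in the cluster.
local = local_utilization(node_demand_tb, drive_tb=max(node_demand_tb))

# Assumption: the shared pool keeps 25% spare capacity for growth.
pooled = pooled_utilization(node_demand_tb, headroom=0.25)

print(f"local: {local:.0%}, pooled: {pooled:.0%}")
```

With these made-up inputs, half of the locally provisioned capacity sits unused, while the shared pool is mostly consumed. The exact figures matter less than the pattern: the more uneven the demand across nodes, the more capacity is stranded on local drives.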

Enhanced Flexibility and Kubernetes

Kubernetes is becoming the de facto standard for deploying most of the applications I mentioned above. It is an orchestrator that simplifies deployment and management of complex container-based applications. Kubernetes allocates resources depending on application needs and releases them when they are no longer necessary. This means that a single Kubernetes cluster is a highly volatile environment with thousands of containers and storage volumes continuously spun up and down.

The benefits of the enhanced flexibility introduced by modern NVMe/TCP-based infrastructures are even more visible in a Kubernetes environment. When multiple applications are consolidated in a single cluster, local storage utilization increases and becomes a source of concern because of the potential risk of data loss or long rebuild processes that may impact performance consistency and service availability. In enterprise environments this is even more of an issue, as lift-and-shift migrations can bring along traditional applications that were not designed with embedded availability and data protection mechanisms.

Kubernetes now offers a standard interface to deal with storage systems (the Container Storage Interface, or CSI), and many vendors have already developed the necessary plug-ins to make their storage compatible with the orchestrator. Combining the benefits of this improved infrastructure configurability with a CSI plug-in enables users to build extremely scalable next-generation infrastructures able to serve several types of applications concurrently while keeping TCO down.
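From the application side, asking for one of these volumes is just a standard PersistentVolumeClaim that points at a storage class backed by the CSI plug-in. The Python sketch below builds such a claim as a plain dictionary and prints it as JSON. The PVC fields are the standard Kubernetes ones, but the claim name, the size, and the storage class name "nvme-tcp-example" are hypothetical placeholders, not names taken from any vendor's documentation.

```python
import json


def make_pvc(name, size_gi, storage_class):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Single-node read/write access, typical for a block volume.
            "accessModes": ["ReadWriteOnce"],
            # Hypothetical class; it would map to whatever CSI driver
            # the cluster administrator has installed.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# Hypothetical small, short-lived volume of the kind described above.
pvc = make_pvc("scratch-db-0", 20, "nvme-tcp-example")
print(json.dumps(pvc, indent=2))
```

The resulting JSON could be saved to a file and applied with kubectl, assuming a storage class with that name exists in the cluster. The point is that the application only names a class and a size; where the capacity physically lives is the storage layer's problem.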

Wrapping Up

Once again, I single out Lightbits Labs as an example if you want to investigate this topic further. The LightOS CSI plug-in supports Kubernetes, extending the benefits of NVMe/TCP composable storage to a larger universe of applications and allowing for additional simplification, consolidation, and cost savings throughout the entire data center infrastructure. In addition, the recent introduction of snapshots and thin clones in LightOS simplifies the migration of legacy applications to Kubernetes, as the persistent storage can support traditional enterprise workflows as well as database DevOps and snapshot-based backup integration.

NVMe/TCP is a great protocol to bridge traditional data centers to the coming era. Its flexibility enables organizations to plan for fast-evolving scenarios while keeping costs down. When correctly implemented, NVMe/TCP enables the user to support multiple configuration layouts on different generations of Ethernet networks. Organizations can start small, protect their existing investments and grow over time while helping IT respond swiftly to new business needs.