I recently spent four days with the world’s leading vendors of storage technologies in Santa Clara, CA, at SDC18 (Storage Developer Conference 2018), held by SNIA (Storage Networking Industry Association).
I congratulate SNIA on organizing such a high-level event year after year. I learn so much at this conference, and it is an efficient way to stay up-to-date with the storage industry. No nonsense, no salesmen, just engineers promoting their technologies.
The focus has shifted
This year, it seems that the majority of attendees were either hardware vendors or hyperscalers. A minority of the presentations were about theory, general code libraries or high-level software solutions. Meanwhile, the majority of the presentations focused on specific standards. All were excellent. I last attended in 2016, and the difference is obvious. To be honest, I only consider technology presentations here, and leave aside SNIA's efforts on standardization, community, etc.
Most of the software topics covered were libraries for high throughput and low latency, each dedicated to a specific technology. These APIs and standards provide a low-level service, quite far from the requirements of an application that is ready to migrate from traditional file-oriented storage.
This suggests that hardware is taking a step forward, and that high-level applications are barely starting to be adapted on top of those specific backends. We can expect a surge in activity on the software side over the next few years: for example, higher-level “wrapper” APIs emerging, or software solutions unifying the standards behind higher-level semantics.
Buzzwords are born at SDC. No one seems to be intentionally trying to hype new ideas, but the large number of quality people at the conference leads to the spread of these new terms.
Storage Disaggregation (at rack-scale). In order to optimize the TCO of a platform, it is now more common to separate storage devices from the CPUs that control them. Enabled by decreasing interconnection costs, this disaggregation lets operators scale either the CPU/RAM or the storage devices, depending on the needs of the workload. More importantly, exposing the devices over the network doesn’t require mounting them on a CPU that would act as a gateway (and a bottleneck).
RDMA techniques, OS bypass. The idea is not new, but it has reached an inflection point. Stacking DB software (with logs and GC) on a filesystem (with logs and GC) on a storage device (with logs and GC) is not the best idea when seeking low and deterministic latencies and optimal bandwidths. Instead of mounting storage devices in the OS that controls them, expose the devices to the application. A similar technique is available to handle network operations.
Co-Optimization (software/hardware). With RDMA techniques available, enabling both storage and network management within the application, it becomes possible to optimize this management for the workloads of the application. As the logic is moved to the application level, hardware can be cheaper because fewer components are needed. It is another trade-off between human and hardware costs.
Computational storage, computing at the edge. This means placing either an FPGA or a low-power ARM CPU with RAM directly on the SSD, accessible over the PCI-e bus via the NVMe protocol, using a specific namespace. Each computational element is yet another opportunity to offload simple yet processor-intensive tasks from the CPU. It can even orchestrate data movement between PCI-e devices without passing through a CPU.
NVMe becomes the standard, accompanying the increase in PCI-e bandwidth. The NVMe-oF variant, a.k.a. NVMe over Fabrics, bridges the NVMe protocol over traditional network technologies. This makes sense when specific NICs are available to enable RDMA techniques (over IP for iWARP, Ethernet for RoCE, InfiniBand, or Fibre Channel).
Open-Channel SSD is a new standard API (since 2015) to address SSDs with a simple instruction set that closely matches NVMe semantics. These SSDs do not implement the FTL (flash translation layer) on the device, which allows for cheaper hardware.
SPDK (Storage Performance Development Kit), a de facto standard API to address NVMe drives, accessible either locally or over a Fabric. There is a great deal of activity around this.
KV drives, or SSD drives accessed via a <Key,Value> oriented API. These are not very well known, but they are worth citing in the context of OpenIO. Do you remember the Kinetic HDD? We can do something similar on NVMe that is blazingly fast. And they offer a high-level API that is much easier to manage than block devices through SPDK, with the same promise of low and deterministic latencies as well, and with high throughput.
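To make the last point concrete, here is a toy sketch of why a <Key,Value> interface is easier for an application to manage than raw blocks. Everything in it is hypothetical and in-memory: `BlockDevice`, `KVOnBlocks` and `KVDrive` are names I made up for the illustration, and none of this is the SPDK API or any vendor's KV command set. On a block device the application has to carry its own index and allocator; on a KV drive that bookkeeping lives in the device.

```python
BLOCK_SIZE = 4096  # assumed block size for this toy model


class BlockDevice:
    """Toy block device: fixed-size blocks addressed by LBA (hypothetical)."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("block devices only accept whole blocks")
        self.blocks[lba] = data

    def read(self, lba):
        return self.blocks[lba]


class KVOnBlocks:
    """What the application must build itself on top of raw blocks."""

    def __init__(self, device):
        self.device = device
        self.index = {}      # key -> (first LBA, value length in bytes)
        self.next_free = 0   # naive append-only allocator, no GC

    @staticmethod
    def _blocks_needed(length):
        return max(1, -(-length // BLOCK_SIZE))  # ceiling division

    def put(self, key, value):
        first = self.next_free
        count = self._blocks_needed(len(value))
        for i in range(count):
            chunk = value[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            # Pad the last chunk up to a whole block.
            self.device.write(first + i, chunk.ljust(BLOCK_SIZE, b"\0"))
        self.next_free += count
        self.index[key] = (first, len(value))

    def get(self, key):
        first, length = self.index[key]
        count = self._blocks_needed(length)
        data = b"".join(self.device.read(first + i) for i in range(count))
        return data[:length]  # strip the padding


class KVDrive:
    """Toy KV drive: the index and allocation are the device's problem."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = bytes(value)

    def get(self, key):
        return self._store[key]


value = b"x" * 5000                    # spans two blocks
on_blocks = KVOnBlocks(BlockDevice(16))
on_blocks.put(b"chunk-1", value)
drive = KVDrive()
drive.put(b"chunk-1", value)
```

Both paths return the same bytes, but only the first one forces the application to persist and recover `index` and `next_free` on its own, which is exactly the kind of logs-and-GC layer the section above argues against stacking.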
Great news for OpenIO
There was no need to attend a full day of presentations to know that OpenIO is on the right path.
What makes OpenIO SDS unique is its flexibility, thanks to its extensive usage of indirections. This design trait has its roots in the choice, from day one, to avoid static placement algorithms (Chord, Ring, or CRUSH map). Instead, we use logic to load-balance the data when placing it, and a data structure to remember the locations. Thus, OpenIO SDS is able to efficiently manage various workloads in a single namespace, thanks to the enforcement of a feature-rich storage policy applied at the moment of each object’s creation.
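The difference between the two approaches can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the OpenIO SDS code: `static_placement`, `Directory`, the policy names and the "most free space" rule are all hypothetical stand-ins. Static placement computes a location from the object name alone; the indirection approach picks a location at write time according to the storage policy and records it.

```python
import hashlib


def static_placement(name, nodes):
    """Static placement: the location is a pure function of the name."""
    digest = hashlib.sha256(name.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class Directory:
    """Dynamic placement: choose a node at write time, remember the choice."""

    def __init__(self, policies):
        # policies: policy name -> {node name: free capacity}
        self.policies = {p: dict(nodes) for p, nodes in policies.items()}
        self.locations = {}  # object name -> node name (the indirection)

    def put(self, name, size, policy):
        nodes = self.policies[policy]
        # Hypothetical load-balancing rule: the node with the most free space.
        node = max(nodes, key=nodes.get)
        if nodes[node] < size:
            raise RuntimeError("no node in this policy can hold the object")
        nodes[node] -= size
        self.locations[name] = node
        return node

    def locate(self, name):
        return self.locations[name]

    def add_node(self, policy, node, capacity):
        # A new node only attracts future writes; existing objects stay put.
        self.policies[policy][node] = capacity


directory = Directory({
    "fast": {"nvme-1": 100, "nvme-2": 80},   # e.g. a flash tier
    "capacity": {"hdd-1": 1000},             # e.g. a bulk tier
})
first = directory.put("a", 30, "fast")
second = directory.put("b", 30, "fast")
bulk = directory.put("c", 500, "capacity")
directory.add_node("fast", "nvme-3", 200)
after_growth = directory.put("d", 10, "fast")
```

With the directory, growing the "fast" tier leaves `a` and `b` where they were and simply steers new objects to the emptier node. With `static_placement`, changing the node list changes the result of the function for many existing names, so data has to be moved to match.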
As a result, it is easy to compose a platform and deploy storage tiers of different qualities, each suitable for a different workload, but all gathered in a single namespace. The maturity of extremely efficient technologies has reached a point that makes them affordable for OpenIO. It is a great opportunity for us to tune our perfect SDS into that thin software layer that will squeeze your flash arrays, even in the most demanding workloads!