Author: Nishant Lodha, Product Marketing, QLogic Corp

The Evolution of Disk and Fabric

Ever since IBM shipped the world’s first hard disk drive (HDD) as part of the RAMAC 305 system in 1956, persistent storage technology has steadily evolved. In the early 1990s, various manufacturers introduced storage devices known today as flash-based or dynamic random access memory (DRAM)-based solid state disks (SSDs). These SSDs had no moving (mechanical) components, which allowed them to deliver lower latency and significantly faster access times.

While SSDs have made great strides in boosting performance, their interface, the 6 gigabit-per-second (Gb/s) SATA 3 bus, began to hinder further advances in performance. Storage devices then moved to the PCI Express (PCIe) bus, which is capable of up to 500 MB/s per lane for PCIe 2.0 and up to roughly 1000 MB/s per lane for PCIe 3.0, with up to 16 lanes. In addition to improved bandwidth, latency is reduced by several microseconds thanks to the faster interface and the ability to attach directly to the chipset or CPU. Today, PCIe SSDs are widely available from an array of manufacturers.
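To put those per-lane figures in context, here is a minimal back-of-the-envelope sketch in Python that multiplies the approximate per-lane rates quoted above by common lane counts. It is illustrative arithmetic only; real-world throughput will be lower once encoding and protocol overhead are taken into account.

    # Back-of-the-envelope PCIe bandwidth arithmetic using the approximate
    # per-lane figures quoted above (~500 MB/s for PCIe 2.0, ~1000 MB/s for
    # PCIe 3.0). Actual throughput is lower due to encoding/protocol overhead.

    PER_LANE_MB_S = {"PCIe 2.0": 500, "PCIe 3.0": 1000}

    def link_bandwidth_mb_s(generation: str, lanes: int) -> int:
        """Theoretical one-direction bandwidth of a PCIe link with the given lane count."""
        return PER_LANE_MB_S[generation] * lanes

    for gen in PER_LANE_MB_S:
        for lanes in (4, 8, 16):
            print(f"{gen} x{lanes}: ~{link_bandwidth_mb_s(gen, lanes):,} MB/s")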

PCIe SSDs removed the hardware bottleneck of the SATA interface, but these devices continued to use the Advanced Host Controller Interface (AHCI) protocol/command set, which dates back to 2004 and was designed with rotating hard drives (HDDs) in mind. AHCI queues and reorders commands to hide the seek and rotational latency of mechanical media, but SSDs have no such latency to hide. Because the first PCIe SSDs used the AHCI command set, they were burdened with the overhead that comes with it. To become more efficient, the industry had to develop an interface that eliminated the limits imposed by AHCI.

The overhead of AHCI was not the only challenge to PCIe SSD adoption. Each SSD vendor provided a unique driver for each operating system (OS), with a varying subset of features, creating complexity for customers looking for a homogeneous, high-speed flash solution for enterprise data centers.

To enable an optimized command set and usher in faster adoption and interoperability of PCIe SSDs, industry leaders have defined the Non-Volatile Memory Express (NVMe) standard. NVMe defines an optimized, scalable command set that avoids burdening the device with legacy support requirements. It also enables standard drivers to be written for each OS and enables interoperability between implementations, reducing complexity and simplifying management.

SCSI and NVMe Differences

While the SCSI/AHCI interface comes with the benefit of wide software compatibility, it cannot deliver optimal performance when used with SSDs connected via the PCIe bus. As a logical interface, AHCI was developed to connect the CPU/memory subsystem with a much slower storage subsystem based on rotating magnetic media. As a result, AHCI introduces inefficiencies when used with SSD devices, whose behavior is far closer to that of DRAM than that of spinning media.

The NVMe device interface has been designed from the ground up to capitalize on the low latency and parallelism of PCIe SSDs and to complement the parallelism of contemporary CPUs, platforms and applications. At a high level, NVMe's advantage lies in how fully it exploits parallelism in host hardware and software: it supports up to 64K I/O queues with up to 64K commands per queue (versus AHCI's single queue of 32 commands), processes interrupts more efficiently, and requires fewer un-cacheable register accesses per command, resulting in significant performance improvements across a variety of dimensions.
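To make the queueing difference concrete, the toy Python sketch below models only the host-side shape of the two interfaces: an AHCI-style controller exposes a single 32-command queue that every core shares, while an NVMe controller can give each core its own deep submission/completion queue pair. The QueuePair class and the queue counts chosen here are illustrative assumptions, not the actual data structures defined by either specification.

    # Toy model of the queueing difference described above -- illustrative only,
    # not the on-the-wire AHCI or NVMe data structures.
    from collections import deque
    from dataclasses import dataclass, field

    @dataclass
    class QueuePair:
        """Simplified submission/completion queue pair (NVMe allows one or more per core)."""
        depth: int
        submission: deque = field(default_factory=deque)
        completion: deque = field(default_factory=deque)

        def submit(self, command) -> bool:
            """Queue a command, or report back-pressure when the queue is full."""
            if len(self.submission) >= self.depth:
                return False
            self.submission.append(command)
            return True

    # AHCI-style host: all cores funnel I/O through a single 32-entry queue.
    ahci_queues = [QueuePair(depth=32)]

    # NVMe-style host: one deep queue pair per core (the spec permits up to 64K
    # queues of up to 64K entries each; 8 cores x 1024 entries are shown here).
    nvme_queues = [QueuePair(depth=1024) for _ in range(8)]

    print("AHCI maximum outstanding commands:", sum(q.depth for q in ahci_queues))
    print("NVMe maximum outstanding commands:", sum(q.depth for q in nvme_queues))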

NVMe Deep Dive

The NVMe interface is defined in a scalable fashion so that it can support the needs of both Enterprise and Client (consumer) systems in a flexible way. NVM Express has been developed by an industry consortium, the NVM Express Workgroup. Version 1.0 of the interface specification was released on March 1, 2011, and version 1.2 followed on November 12, 2014. Expansion of the standard to include NVMe over Fabrics (including Fibre Channel) was completed in June 2016.

Today, more than 100 companies participate in the definition of the interface, which has some very impressive characteristics:

  • Architected from the ground up for current- and next-generation non-volatile memory to address Enterprise and Client system needs
  • Developed by an open industry consortium, directed by a 13-company Promoter Group
  • Architected for on-motherboard PCIe connectivity
  • Designed to capitalize on multi-channel memory access with scalable port width and scalable link speed

As a result of the simplicity, parallelism and efficiency of NVMe, it delivers significant performance gains versus SCSI. Some metrics include:

  • For 100% random reads, NVMe has 3x better IOPS than 12Gbps SAS [1]
  • For 70% random reads, NVMe has 2x better IOPS than 12Gbps SAS [1]
  • For 100% random writes, NVMe has 1.5x better IOPS than 12Gbps SAS [1]
  • For 100% sequential reads: NVMe has 2x higher throughput than 12Gbps SAS [1]
  • For 100% sequential writes: NVMe has 2.5x higher throughput than 12Gbps SAS [1]

Beyond just IOPS and throughput, the command-structure efficiencies described above also reduce CPU cycles by half and cut latency by more than 200 microseconds compared to 12Gbps SAS.

NVMe and Fibre Channel (FC-NVMe)

The Fibre Channel Protocol (FCP) has been the dominant protocol used to connect servers with remote shared storage composed of HDDs and SSDs. FCP transports SCSI commands encapsulated in Fibre Channel frames and is one of the most reliable and trusted networks in the data center for accessing SCSI-based storage. While FCP can be used to access remote shared NVMe-based storage, such a mechanism requires that the SCSI commands encapsulated and transported by FCP be interpreted and translated into NVMe commands that the NVMe storage array can process. This translation can impose performance penalties when accessing NVMe storage and in turn negates the benefits of NVMe's efficiency and simplicity.

FC-NVMe extends NVMe's simplicity and efficiency across the fabric: NVMe commands and structures are carried end-to-end, requiring no translation. Fibre Channel's inherent multi-queue capability, parallelism, deep queues, and battle-hardened reliability make it an ideal transport for NVMe. FC-NVMe implementations will be backward compatible with FCP transporting SCSI, so a single FC-NVMe adapter will support both SCSI-based HDDs and SSDs as well as NVMe-based PCIe SSDs.
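The contrast between the two access paths can be sketched schematically. The short Python model below is purely illustrative (the dictionaries stand in for real FCP and FC-NVMe frame and command formats, which it does not attempt to reproduce): on the SCSI-over-FCP path the array must translate each SCSI command into an NVMe command, while on the FC-NVMe path the NVMe command arrives ready to execute.

    # Schematic comparison of the two I/O paths described above. The dictionaries
    # and the translation step are illustrative placeholders, not real FCP or
    # FC-NVMe frame formats.

    def scsi_to_nvme_translate(scsi_cdb: dict) -> dict:
        """Extra hop required when SCSI-encapsulated FCP fronts an NVMe array."""
        return {"opcode": "nvme_read", "lba": scsi_cdb["lba"], "len": scsi_cdb["len"]}

    def read_via_fcp_scsi(lba: int, length: int) -> dict:
        scsi_cdb = {"opcode": "scsi_read16", "lba": lba, "len": length}
        fc_frame = {"payload": scsi_cdb}                    # FCP encapsulates a SCSI command
        return scsi_to_nvme_translate(fc_frame["payload"])  # array translates SCSI -> NVMe

    def read_via_fc_nvme(lba: int, length: int) -> dict:
        nvme_cmd = {"opcode": "nvme_read", "lba": lba, "len": length}
        fc_frame = {"payload": nvme_cmd}                    # FC-NVMe carries NVMe natively
        return fc_frame["payload"]                          # no translation required

    print(read_via_fcp_scsi(0x1000, 8))
    print(read_via_fc_nvme(0x1000, 8))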

Standardization

A new T11 project to define an NVMe over Fibre Channel protocol mapping has been initiated: in August 2014, INCITS T11 started an FC-NVMe working group, and the specification is expected to be complete by the end of calendar year 2016. Fibre Channel networks are a natural fit for shared remote access to high-speed NVMe because of their trusted, lossless, high-performance characteristics.

Conclusion

As next-generation, data-intensive workloads transition to low-latency NVMe flash-based storage to meet increasing user demand, the Fibre Channel industry is combining the lossless, highly deterministic nature of Fibre Channel with NVMe. FC-NVMe targets the performance, application response time, and scalability needed for next-generation data centers while leveraging existing Fibre Channel infrastructure. FCIA is pioneering this effort with industry leaders, and in time it will yield significant operational benefits to data center operators and IT managers.