Part II – EMC DSSD D5 Direct Attached Shared AFA
Let's take a closer look at how EMC DSSD D5 works, its hardware and software components, how it compares and other considerations.
How Does DSSD D5 Work
Up to 48 Linux servers attach via stateless dual-port PCIe Gen 3 x8 cards. Stateless simply means the cards do not contain any flash and are not used as storage cards; rather, they are essentially just NVMe adapter cards. With the first release, block, HDFS file and object access along with APIs are available for Linux systems. These drivers enable the shared NVMe storage to be accessed by applications using different streamlined server and storage I/O driver software stacks to cut latency. DSSD D5 is meant to be a rack-scale solution, so distance is measured as inside a rack (e.g. a couple of meters).
The 5U tall DSSD D5 supports 48 servers via a pair of I/O Modules (IOM), each with 48 ports, that in turn attach to the data plane and on to the Flash Modules (FM). Also attached to the data plane are a pair of active/active controllers that perform management tasks; however, they do not sit in the data path. This means that host clients directly access the FMs without having to go through a controller, which is the case in traditional storage systems and AFAs. The controllers only get involved when there is some setup, configuration or other management activity; otherwise they get out of the way, kind of like how management should function: there when you need them to help, then out of the way so productive work can be done.
Pardon the following hand-drawn sketches; you can see some nice pretty diagrams, videos and other content via the EMC Pulse Blog as well as elsewhere.
Note that the host client servers take on the responsibility for managing and coordinating data consistency, meaning data can be shared between servers assuming applicable software is used for implementing integrity. This means that clustering and other software that can support shared storage are able to drive low-latency, high-performance read and write activity on the DSSD D5, as opposed to relying on the underlying storage system to handle the shared storage coordination, such as in a NAS. Another note is that the DSSD D5 is optimized for concurrent multi-threaded and asynchronous I/O operations along with atomic writes for data integrity, which enable the multiple cores in today’s faster processors to be more effectively leveraged.
The data plane is a mesh, switch or expander based back plane enabling any of the 96 (2 x 48) north bound (host client-server) PCIe Gen 3 x4 ports to reach the up to 36 (or as few as 18) FMs, which are also dual pathed. Note that the host client-server PCIe dual-port cards are Gen 3 x8 while the DSSD D5 ports are Gen 3 x4. Simple math should tell you that if you are going to have 2 x PCIe Gen 3 x4 ports running at full speed, you want to have a Gen 3 x8 connection inside the server to get full performance.
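The "simple math" can be sketched as follows. This is only a back-of-the-envelope illustration: it uses the standard PCIe Gen 3 signaling rate (8 GT/s per lane with 128b/130b encoding), counts raw per-direction link bandwidth before any protocol overhead, and the function names are made up for this example rather than taken from any DSSD software.

```python
# Back-of-the-envelope PCIe Gen 3 link math (raw, per direction).
# Function names here are illustrative only, not part of any DSSD tooling.

GEN3_GT_PER_S = 8.0                # PCIe Gen 3 transfer rate per lane
ENCODING_EFFICIENCY = 128 / 130    # 128b/130b line encoding


def lane_gbytes_per_s() -> float:
    """Raw usable bandwidth of one Gen 3 lane in GB/s (bits to bytes)."""
    return GEN3_GT_PER_S * ENCODING_EFFICIENCY / 8


def link_gbytes_per_s(lanes: int) -> float:
    """Raw usable bandwidth of a Gen 3 link with the given lane count."""
    return lanes * lane_gbytes_per_s()


if __name__ == "__main__":
    two_d5_ports = 2 * link_gbytes_per_s(4)   # two x4 ports on the D5 side
    host_card = link_gbytes_per_s(8)          # one x8 slot in the server
    print(f"2 x Gen 3 x4 ports: {two_d5_ports:.2f} GB/s")
    print(f"1 x Gen 3 x8 slot : {host_card:.2f} GB/s")
```

Both figures come out the same, which is why a Gen 3 x8 slot in the server is the natural match for a dual-port x4 connection; real-world throughput will be lower once protocol overhead is included.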
Think of the data plane similar to how a SAS expander works in an enclosure or a SAS switch, the difference being it is PCIe and not SAS or other protocol. Note that even though the terms mesh, fabric, switch, network are used, these are NOT attached to traditional LAN, SAN, NAS or other networks. Instead, this is a private “networked back plane” between the server and storage devices (e.g. FM).
The dual controllers (e.g. control plane) oversee the flash management, including garbage collection among other tasks, and storage is thin provisioned.
Dual controllers (active/active) are connected to each other (e.g. control plane) as well as to the data path; however, they do not sit in the data path. Thus this is a fast path, control path approach, meaning the controllers can get involved to do management functions when needed and get out of the way of work when not needed. The controllers are hot-swappable and add global management functions including setting up and tearing down host client/server I/O paths, mappings and affinities. Controllers also support the management of CUBIC RAID data protection functions performed by the Flash Modules (FM).
Other functions the controllers implement, leveraging their CPUs and DRAM, include flash translation layer (FTL) functions normally handled by SSD cards, drives or other devices. These FTL functions include wear-leveling for durability, garbage collection and voltage power management among other tasks. The result is that the flash modules are able to spend more of their time and resources handling I/O operations instead of management tasks, in contrast to traditional off-the-shelf SSD drives, cards or devices.
The FMs insert from the front and come in two sizes, 2TB and 4TB of raw NAND capacity. What's different about the FMs vs. some other vendors' approaches is that these are not your traditional PCIe flash cards; instead they are custom cards with a proprietary ASIC and raw NAND dies. DRAM is used in the FM as a buffer to hold data for write optimization as well as to enhance wear-leveling to increase flash endurance.
The result is up to thousands of NAND dies spread over up to 36 FMs; more important, however, is that more performance is derived from those resources. The increased performance comes from DSSD implementing its own flash translation layer, garbage collection and power voltage management, among other techniques, to derive more useful work per watt of energy consumed.
EMC DSSD performance claims:
- 100 microsecond latency for small IOs
- 100GB/s bandwidth for large IOs
- 10 Million small IO IOPS
- Up to 144TB raw capacity
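As a rough sanity check on these figures, the following sketch works through the arithmetic. It assumes the 144TB maximum is simply 36 FMs at 4TB each, infers a smallest configuration of 18 FMs at 2TB each from the numbers given earlier, and uses Little's Law to estimate how many small I/Os would have to be in flight if the latency and IOPS claims held at the same time (which the claims themselves do not state). The function names are hypothetical.

```python
# Rough arithmetic on the claimed figures; an illustration, not vendor data.
# Function names are hypothetical and exist only for this example.


def raw_capacity_tb(flash_modules: int, tb_per_module: int) -> int:
    """Raw NAND capacity for a given number and size of Flash Modules."""
    return flash_modules * tb_per_module


def outstanding_ios(iops: int, latency_us: int) -> float:
    """Little's Law: I/Os in flight = arrival rate x time in system."""
    return iops * latency_us / 1_000_000


if __name__ == "__main__":
    print("Max raw capacity :", raw_capacity_tb(36, 4), "TB")
    # Assumption: 18 FMs of the smaller 2TB size is a valid configuration.
    print("Min raw capacity :", raw_capacity_tb(18, 2), "TB")
    # Assumption: 10 million IOPS and 100 microseconds apply simultaneously.
    print("I/Os in flight   :", outstanding_ios(10_000_000, 100))
```

The in-flight estimate helps explain why the design emphasizes concurrent, multi-threaded and asynchronous I/O: a single host issuing one request at a time could not come close to keeping that many operations outstanding.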
How Does It Compare To Other AFA and SSD solutions
There will be many apples to oranges comparisons as is often the case with new technologies or at least until others arrive in the market.
Some general comparisons that may be apples to oranges as opposed to apples to apples include:
- Shared and dense fast nand flash (eMLC) SSD storage
- Flash SSD storage disaggregated from the server while enabling high performance, low latency
- Eliminate pools or ponds of dedicated SSD storage capacity and performance
- Not a SAN yet more than server-side flash or flash SSD JBOD
- Underlying Flash Translation Layer (FTL) is disaggregated from SSD devices
- Optimized hardware and software data path
- Requires special server-side stateless adapter for accessing shared storage
Some other comparisons include:
- Hybrid and AFA shared via some server storage I/O network (good sharing, feature rich, resilient, slower performance and higher latency due to hardware, network and server I/O software stacks). For example EMC VMAX, VNX, XtremIO among others.
- Server attached flash SSD aka server SAN (flash SSD creates islands of technology, lower resource sharing, data shuffling between servers, limited or no data services, management complexity). For example PCIe flash SSD stateful (persistent) cards where data is stored or used as a cache along with associated management tools and drivers.
- DSSD D5 is a rack-scale hybrid approach combining direct attached shared flash with lower latency and higher performance vs. traditional AFA or hybrid storage arrays, and better resource usage, sharing, management and performance vs. traditional dedicated server flash. It complements server-side data infrastructure and scale-out application software. Server applications can reach NVMe storage via user space with block, HDFS, Flood and other APIs.
What Happened to Server PCIe cards and Server SANs
If you recall, a few years ago the industry rage was flash SSD PCIe server cards from vendors such as EMC, FusionIO (now part of SanDisk), Intel (still Intel), LSI (now part of Seagate), Micron (still Micron) and STEC (now part of Western Digital) among others. Server-side flash SSD PCIe cards are still popular, particularly newer NVMe controller based models that use the NVMe protocol stack instead of AHCI/SATA or others.
However, as is often the case, things evolve, and while there is still a place for server-side stateful PCIe flash cards either for data or as cache, there is also the need to combine and simplify management as well as streamline the software I/O stacks, which is where EMC DSSD D5 comes into play. It enables consolidation of server-side SSD cards into a shared 5U chassis, giving up to 48 dual pathed servers access to the flash pools while using streamlined server software stacks and drivers that leverage NVMe over PCIe.
Where to learn more
Continue reading with the following links about NVMe, flash SSD and EMC DSSD.
What this all means
EMC with DSSD D5 now has another solution to offer clients; granted, their challenge, as it has been over the past couple of decades, will be to educate and compensate their sales force and partners on which technology solution to position for different needs.
On one hand, life could be simpler for EMC if they only had one platform solution that would then be the answer to every problem, something that some other vendors and startups face. Likewise, if all you have is one solution, then while you can try to make that solution fit different environments, or, get the environment to adapt to the solution, having options is a good thing if those options can remove complexity along with cost while boosting productivity.
I would like to see support for other operating systems such as Windows, particularly with the future Windows 2016 based Nano, as well as hypervisors including VMware and Hyper-V among others. On the other hand I also would like to see a Sharp Aquos Quattron 80" 1080p 240Hz 3D TV on my wall to watch HD videos from my DJI Phantom Drone. For now focusing on Linux makes sense; however, it would be nice to see some more platforms supported.
Keep an eye on the NVMe space as we are seeing NVMe solutions appearing inside servers and storage systems, both external dedicated and shared, as well as some other emerging things including NVMe over Fabric. Learn more about EMC DSSD D5 here.
Ok, nuff said (for now)
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2018 Server StorageIO(R) and UnlimitedIO All Rights Reserved