SAN storage ain’t cheap. It’s probably the most expensive piece of hardware in your data center. I remember when SAN storage first came on the scene, and it was supposed to bring storage costs down compared to the sprawl of locally attached storage that we had back then. Well, costs didn’t come down; they’ve only skyrocketed.
In the early days of SAN storage, before widespread virtualization and before the advent of PCI Express, x86 servers running Windows and Linux couldn’t drive a lot of storage I/O, so the SAN storage arrays deployed for Windows and Linux were typically small ones, like the EMC Clariion and HP MSA. Larger arrays like the EMC Symmetrix and the HP XP were deployed for big iron UNIX systems that could drive high I/O.
As virtualization and the x86 architecture have matured, the amount of shared storage deployed for x86 systems has grown dramatically. This may have led you to deploy a number of small storage arrays, or you may have consolidated onto large enterprise-class storage arrays. The larger storage arrays like the EMC VMAX and the 3PAR P10000 offer enterprise-class availability, wide striping, and auto-tiering that maximize the sharing of capacity and I/O performance across all the workloads using the array.
Auto-tiering requires at least two types of disks: a fast type, like 10k or 15k RPM Fibre Channel or SAS disks, and a slower, cheaper type like 7200 RPM SATA disks. Auto-tiering is potentially a game changer in terms of storage cost. Typically, a relatively small percentage of your data requires high-performance disk, perhaps 20%, while the rest of your data could live just fine on low-performance disk. Unfortunately, it can be difficult to figure out what data to put where. Auto-tiering automatically moves blocks of data to the appropriate type of disk, depending on how often they’re hit. Once the hit rate has been analyzed and the blocks have been moved, performance is optimized. The cost savings come from buying fewer fast, expensive disks and more cheap, slow ones. Unfortunately, the storage vendors make you pay a pretty high price to license the auto-tiering feature, which can take a big bite out of your expected savings. This is a point you should argue vigorously with your storage salesman.
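The placement logic is simple to sketch. Here’s a toy Python simulation, not any vendor’s algorithm: it just ranks blocks by hit count and assigns the hottest ones to the fast tier. The block IDs and tier capacity are made up for illustration.

```python
from collections import Counter

def tier_blocks(io_trace, fast_tier_capacity):
    """Assign the most frequently hit blocks to the fast tier."""
    hits = Counter(io_trace)                       # hit count per block
    ranked = [blk for blk, _ in hits.most_common()]
    fast = set(ranked[:fast_tier_capacity])        # hottest blocks -> FC/SAS
    slow = set(ranked[fast_tier_capacity:])        # everything else -> SATA
    return fast, slow

# A workload where a small set of blocks absorbs most of the I/O:
trace = [1, 1, 1, 2, 2, 2, 2, 3, 4, 5, 1, 2, 1, 2, 6]
fast, slow = tier_blocks(trace, fast_tier_capacity=2)
print(fast)   # {1, 2} -- the heavily hit blocks land on fast disk
```

Real arrays do this per-extent on a schedule, with hysteresis to avoid thrashing blocks back and forth, but the idea is the same.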
There’s another, less obvious benefit of auto-tiering. SATA disks are known to have a higher failure rate and a shorter lifespan than enterprise-class FC and SAS disks. With auto-tiering enabled, heavily accessed data is moved off of your SATA disks, leaving behind lightly accessed data. This more leisurely life for your SATA disks should extend their lifespan and reduce failures.
A complement to, or indeed an alternative to, auto-tiering is the use of a layer of solid state storage in front of your mechanical disks. By caching reads and writes on flash before flushing them to slower disk, very good performance can be achieved while relying on slow SATA disk. Now flash isn’t cheap, but it’s getting cheaper as we speak.
Various storage vendors have employed flash in different ways. There are two primary form factors for flash: the disk drive form factor, where SSD disks are installed into the array, and the PCI card form factor, which can be deployed in a PCI slot within the storage array controller. The PCI form factor provides potentially higher performance, since it’s attached to the controller’s processors via a fast I/O bus. The disk form factor is attached to the controller, like your mechanical disks, through SAS or Fibre Channel cables, and is subject to the performance penalties associated with RAID. However, very large flash capacities can be achieved by building a RAID of SSD disks, whereas the PCI form factor’s scalability is limited to a few TB.
Flash can in fact be used in an auto-tiering scenario or a cache scenario, as I mentioned above. In the case of auto-tiering, a good deal of performance benefit can be had, assuming that you’ve got a small amount of very heavily hit data. For example, in a VDI environment, assuming your VDI instances are cloned from a common boot disk, that boot disk will be hit very hard during mass reboots, the so-called boot storm. If that boot disk is stored in flash, the storm will likely be a non-event. Flash configured as cache will serve the same purpose: once the boot disk has been read into the flash cache, subsequent reads will be fast. The benefit of the cache model is that any data recently read will be cached, whereas flash used in auto-tiering is limited to data that has been evaluated as needing to be on flash. The point is that more performance benefit can be had using flash as cache, allowing broader use of slow mechanical disk, which is cheaper.
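The cache behavior is easy to picture with a toy model. The sketch below is a minimal LRU read cache in Python, purely illustrative and not how any vendor implements its flash layer: the first read of a block comes off slow disk, and every read after that is served from flash.

```python
from collections import OrderedDict

class FlashReadCache:
    """Toy LRU read cache standing in for a flash layer over SATA disk."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()   # block id -> data, in LRU order

    def read(self, block, read_from_disk):
        if block in self.cache:               # cache hit: fast flash read
            self.cache.move_to_end(block)
            return self.cache[block], "flash"
        data = read_from_disk(block)          # cache miss: slow SATA read
        self.cache[block] = data
        if len(self.cache) > self.capacity:   # evict least recently used
            self.cache.popitem(last=False)
        return data, "disk"

cache = FlashReadCache(capacity=2)
disk = lambda blk: f"data-{blk}"
print(cache.read("boot", disk))   # ('data-boot', 'disk')  first read misses
print(cache.read("boot", disk))   # ('data-boot', 'flash') later reads hit
```

In the boot storm example, the first VDI instance to reboot pulls the common boot disk into flash, and the hundreds that follow read it from cache.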
Now storage vendors have deployed flash as cache in various ways. If the flash is used only as read cache, then it might provide no performance benefits for writes. However, since the cache will take some of the read load off of the disks, the disks will have more idle time to service writes, so it still provides some benefit. Still, you should fully understand how flash is implemented by the storage vendor, so that you’ll understand the performance characteristics and the cost benefit of that expensive flash before you purchase their storage array.
If you’ve got a good-sized data center, you’ve probably got a Fibre Channel storage network running alongside your Ethernet network. Your hosts have two sets of interface cards, two sets of cables running to two sets of switches, administered by two sets of engineers. There’s quite a bit of money tied up in that storage network, isn’t there? What if we could get rid of it and get our data center down to one network? It can be done.
There are a number of storage protocols that can be deployed over Ethernet, including FCoE, iSCSI, NFS, and CIFS. FCoE (Fibre Channel over Ethernet) has reached a maturity level where it can be deployed at least between the hosts and the access layer Ethernet switches, which means that the host interfaces and extra cables can be reduced, but typically the storage traffic is then split off and sent over to the traditional Fibre Channel SAN where the storage is attached. To do this, the hosts must contain converged network adapters (CNAs), and the access layer switches must support FCoE.
Fibre Channel Over Ethernet
FCoE encapsulates Fibre Channel frames inside of Ethernet frames and marks them with a different EtherType than other Ethernet traffic, allowing the switches to treat the traffic with a higher priority. This is necessary since Fibre Channel traffic is very sensitive to packet loss, and Ethernet is normally quite happy to drop packets when it gets busy.
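To make the encapsulation concrete, here’s a minimal Python sketch that wraps an FC frame in an Ethernet header carrying the FCoE EtherType, 0x8906. The field layout is simplified (no FCoE version/SOF/EOF fields or FCS) and the MAC addresses are made up:

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType that identifies FCoE traffic

def fcoe_frame(dst_mac, src_mac, fc_frame):
    """Wrap an FC frame in an Ethernet header with the FCoE EtherType."""
    eth_header = struct.pack("!6s6sH", dst_mac, src_mac, FCOE_ETHERTYPE)
    return eth_header + fc_frame

frame = fcoe_frame(b"\x0e\xfc\x00\x01\x02\x03",   # illustrative dest MAC
                   b"\x00\x11\x22\x33\x44\x55",   # illustrative source MAC
                   b"FC-FRAME-PAYLOAD")           # stand-in for a real FC frame

# The EtherType sits in bytes 12-13, right after the two MAC addresses;
# it's what lets a switch classify this frame as lossless storage traffic.
ethertype = struct.unpack("!H", frame[12:14])[0]
print(hex(ethertype))   # 0x8906
```

The real standard also defines priority flow control so that frames carrying this EtherType are paused rather than dropped when a switch port gets congested.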
The encapsulation adds overhead to standard Fibre Channel traffic, which reduces performance somewhat, but FCoE is normally carried over 10G Ethernet, which is faster than 8G Fibre Channel. When all is said and done, the two provide roughly equivalent performance.
FCoE hasn’t quite matured enough to deploy end-to-end yet, but sometime soon we’ll be able to pass the FCoE traffic up to the data center core switches where the storage arrays, properly fitted with FCoE cards, will be attached. Today though, passing FCoE through multiple switches (multi-hop FCoE) is not quite ready for prime time.
iSCSI is an IP-based block protocol that is carried over Ethernet. Unlike Fibre Channel, which does not use the IP protocol, iSCSI traffic is encapsulated in an IP packet, which in turn is encapsulated in an Ethernet frame. This additional layer of encapsulation increases the overhead and therefore reduces performance as compared to FCoE and Fibre Channel. Still, it’s a useful protocol, since many network switches may not support FCoE, and hosts may not have CNAs installed to support FCoE. Many smaller data centers have adopted iSCSI as a means to avoid the cost of building a Fibre Channel SAN.
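A rough back-of-the-envelope calculation makes the extra encapsulation visible. The header sizes below are approximate, assuming no TCP/IP options, iSCSI digests, or jumbo-frame considerations; they’re here only to show why iSCSI carries more per-frame baggage than FCoE:

```python
# Approximate per-frame header overhead, in bytes.
ETH = 18          # Ethernet header + frame check sequence
IP = 20           # IPv4 header, no options
TCP = 20          # TCP header, no options
ISCSI_BHS = 48    # iSCSI basic header segment
FC_HDR = 24       # Fibre Channel frame header
FCOE_ENCAP = 14   # FCoE encapsulation header (version, SOF, padding)

# iSCSI stacks SCSI inside TCP inside IP inside Ethernet.
iscsi_overhead = ETH + IP + TCP + ISCSI_BHS
# FCoE wraps an FC frame directly in Ethernet.
fcoe_overhead = ETH + FCOE_ENCAP + FC_HDR

print(f"iSCSI: ~{iscsi_overhead} header bytes per frame")   # ~106
print(f"FCoE:  ~{fcoe_overhead} header bytes per frame")    # ~56
```

The absolute numbers are small next to a 2 KB data payload, which is why iSCSI performance is usually good enough; TCP processing on the host, not header bytes on the wire, tends to be the bigger cost.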
NFS is another way to present storage over Ethernet. Unlike FC, FCoE, and iSCSI, NFS is not a block storage protocol; rather, it is a file sharing protocol. The NFS server presents file shares over the network that can be mounted as shared network drives by clients and other servers. NFS storage can be used for VMware data stores and can also be used by Linux servers to store, for example, Oracle databases, which might be a large portion of the storage footprint in your data center. Because it’s not a block protocol, a server typically can’t boot from an NFS volume. In cases where boot-from-SAN is desired but NFS is in use, iSCSI is often used to provide boot volumes.
Using FCoE, iSCSI, and NFS enables the data center to be built without the need for a separate Fibre Channel SAN. There are some moderate trade-offs. For example, the storage I/O latency over Ethernet is a bit higher than over Fibre Channel, perhaps 5% higher. Fault-tolerant Ethernet links may take longer to detect a failure and fail over to an alternate link than Fibre Channel links, though this can be addressed with various network configurations like EtherChannel aggregation. Even with the drawbacks, storage over Ethernet can be designed to a predictable level of performance, thereby making it a viable option that can provide significant savings.
Network Attached Storage
In large data centers, file shares are commonly hosted on a dedicated NAS device, like an EMC Celerra or a NetApp. For many, the move to NAS was done to consolidate Windows file servers and to provide more scalability and high availability for their file shares. Now that VMware supports NFS, NAS arrays have become the backing storage for VMware in some data centers. Commonly, data centers contain both NAS and block storage arrays for different purposes. This is another area where savings can be found. Several arrays, like the EMC VNX and NetApp, can provide both NAS and block storage from a single array, and support the complete range of protocols and connectivity, including FC and Ethernet-based protocols. These are known as unified storage arrays.
Unified arrays allow a data center to standardize on one array type, and provide a transition to an Ethernet only storage network. NetApp in particular seems to offer a really good feature set that allows arrays to be deployed in a federated group, which enables volumes to be easily migrated from one array to another, and allows easy scale-out without building huge monolithic arrays. NetApp’s deduplication feature can save a significant amount of storage without much performance impact, which translates to more savings.
Keep in mind that I think there’s still a use case for enterprise-class block storage, but I’ve come to view them like I view Ferraris. They’re awesome cars, but a Toyota Camry can do the job just fine. I suppose we’d all love a row of EMC VMAX or 3PAR V800s in our data centers, but the cost is high, and many companies simply can’t afford it. The alternative? Federated mid-range unified storage arrays, like the NetApp.