I had the chance to sit in on an EMC pre-sales meeting today. During the meeting, the EMC pre-sales engineer introduced FAST v1 and v2 to us. I did know what FAST was before, but this presentation really opened my eyes, and the engineer was also able to answer a few of my questions about NetApp deduplication vs EMC compression. I will go through the details in this post. However, because there isn’t much data available on the Internet, I had to draw an ugly diagram to help express my ideas. I may have made mistakes, so please feel free to point them out.
What is EMC FAST v1?
As you can see from the full name of FAST (Fully Automated Storage Tiering), it is about tiering your storage automatically. As you may know, a traditional SAN contains FC disks and SATA disks. FC is fast and expensive, while SATA is cheaper but slow for random reads and writes. As the SAN administrator in a company, your job is to give the right LUNs to the appropriate servers to meet SLA requirements.
With FAST, EMC basically did the following things:
1. Add an EFD (Enterprise Flash Drive) layer.
As we all know, an SSD (solid-state drive) can be on the order of 100 times faster than an FC disk for random I/O. There are two types: SLC (single-level cell) and MLC (multi-level cell). All SSDs have a limited lifespan. So how does EMC manage to overcome this issue?
These EFDs are built from SLC flash rather than MLC, meaning they are faster than MLC SSDs. As you may have heard, SSDs wear out easily. The reason an SSD (just like your USB flash drive) gets damaged is that the same location is rewritten repeatedly. On a normal system, you write to the first block of the flash disk, wipe it, and write again. So the first block is used too many times and wears out. An EMC EFD won’t use the same spot twice until it has used all the other available spots on the SSD, a technique generally known as wear leveling.
Each EFD has three areas: a cache area (the fastest), a normal storage area and a hotspot area. All data is written to the fast cache first and then to normal storage. A spot in normal storage is retired after it has been reused a number of times, and the drive starts using a spot in the hotspot area instead, to avoid potential bad spots. The same applies to the cache area: if one of its spots is damaged, the drive starts using a spot in the normal area. According to EMC, the EFD comes with a 5-year warranty.
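To make the "don't reuse a spot until all the others have been used" idea concrete, here is a toy Python sketch. It is only my illustration of the general wear-leveling idea, not EMC's firmware; the class name, the block count and the helper function are all made up for this example.

```python
class WearLevelledFlash:
    """Toy flash device: every write goes to the least-worn block.

    Hypothetical model for illustration only, not EMC's actual algorithm.
    """

    def __init__(self, num_blocks):
        # One erase counter per block, all starting at zero.
        self.erase_counts = [0] * num_blocks

    def write(self):
        # Pick the block with the fewest erases (lowest index wins ties).
        block = min(range(len(self.erase_counts)),
                    key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1
        return block


def naive_erase_counts(num_blocks, writes):
    """Erase counters when every write hits block 0, as in the 'normal system' case."""
    counts = [0] * num_blocks
    counts[0] = writes
    return counts
```

With 4 blocks and 8 writes, the wear-leveled device spreads the erases evenly across all blocks, while the naive one puts all 8 erases on the first block.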
2. Add a virtual LUN layer.
A virtual LUN isolates the host from the actual storage details. The host doesn’t need to know which physical LUNs (FC, EFD, SATA) it is operating on. With virtual LUN technology, FAST can do its real work inside the SAN layer, invisible to the host.
3. Automatically move LUNs between tiers.
This is what FAST is for. FAST can automatically (or manually) move your LUNs to a different tier. Busy, high-demand LUNs are moved to the fastest tier (EFD) or to FC. Low-priority LUNs can be shifted to SATA, saving the fast tiers for the workloads that need them to meet their SLAs.
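The three points above can be sketched in a few lines of Python. This is just my own rough model of LUN-level tiering: the function name, the IOPS numbers and the two thresholds are hypothetical and do not come from EMC. The dictionary it returns plays the role of the virtual LUN layer, since the host only ever uses the LUN name and never sees the tier.

```python
def retier_luns(lun_iops, hot_threshold, cold_threshold):
    """Map each LUN name to a tier based on how busy it is.

    lun_iops: {lun_name: observed IOPS} (made-up measurements)
    hot_threshold / cold_threshold: hypothetical policy knobs
    """
    placement = {}
    for lun, iops in lun_iops.items():
        if iops >= hot_threshold:
            placement[lun] = "EFD"    # busiest LUNs go to flash
        elif iops >= cold_threshold:
            placement[lun] = "FC"     # moderately busy LUNs stay on FC
        else:
            placement[lun] = "SATA"   # quiet LUNs are pushed down to SATA
    return placement


example = retier_luns({"db": 5000, "mail": 800, "archive": 20},
                      hot_threshold=2000, cold_threshold=100)
```

Note that the whole LUN moves as one unit here, no matter how little of it is actually busy.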
What is FAST v2?
We have briefly introduced FAST v1 above. After EMC pushed this technology to its customers, they discovered that most customers actually bought lots of FC disks instead of SATA disks. This is because FAST v1 operates at the LUN level: every time it moves data, it has to move the whole LUN, which is slow and inefficient. So FAST v2 came to life.
FAST v2 made some big changes.
1. Let’s make a pool.
Basically, you need to create a pool first. A pool is a combination of resources from different tiers. For example, you can make a pool with 3x EFD, 4x FC and 5x SATA, all in RAID 5. Then you can create LUNs on this pool. A LUN is built across all tiers instead of sitting on just one.
2. Let’s move 1GB data segments.
In FAST v1, we move the whole LUN, which takes a very long time and may not be effective either. With this version of FAST, data is moved in 1GB segments as the smallest unit of operation. This means that if one LUN gets hit very hard, the system uses the fast cache to hold the data and starts moving the busiest segment from SATA to EFD. It then moves the other segments later on, according to the utilization of the LUNs.
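Here is a small Python sketch of why moving 1GB segments is cheaper than moving whole LUNs. Again, this is only my illustration under simple assumptions: the hit counters, the free EFD capacity and the function name are invented, and the real ranking logic inside the array is certainly more involved.

```python
SEGMENT_GB = 1  # smallest unit FAST v2 moves, as described above


def segments_to_promote(segment_hits, efd_free_gb):
    """Return the indices of the busiest segments that fit in free EFD space.

    segment_hits: access count per 1GB segment of one LUN (made-up numbers)
    efd_free_gb:  free capacity on the EFD tier (hypothetical)
    """
    # Rank segment indices from most to least accessed.
    ranked = sorted(range(len(segment_hits)),
                    key=lambda i: segment_hits[i], reverse=True)
    slots = efd_free_gb // SEGMENT_GB
    # Never promote segments that were not touched at all.
    return [i for i in ranked[:slots] if segment_hits[i] > 0]


hits = [5, 900, 0, 40, 700, 0]            # a tiny 6GB LUN
hot = segments_to_promote(hits, efd_free_gb=2)
moved_v2_gb = len(hot) * SEGMENT_GB       # only the hot segments move
moved_v1_gb = len(hits) * SEGMENT_GB      # v1 would move the whole LUN
```

In this toy case only segments 1 and 4 are promoted, so 2GB is moved instead of the full 6GB LUN. On a real LUN of several hundred GB the difference would be far bigger.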
EMC compression vs NetApp deduplication
I had an interesting conversation with the EMC engineer. EMC is promoting block-level compression on all its systems, instead of deduplication as NetApp does. This compression and decompression can be done on the fly. According to the engineer, it adds about 5% performance overhead, which you may not notice. However, it gives you a compression ratio of almost 50%, compared with a deduplication ratio that is only about 30% most of the time. As for SP (storage processor) utilization, compression costs about 5% CPU, while dedup costs around 20%.
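To put those two ratios side by side, here is a quick back-of-the-envelope calculation in Python. I am reading "ratio" as "percentage of space saved", which is my interpretation of what the engineer said, and the 10TB data set is just an example figure.

```python
def stored_gb(raw_gb, saving_pct):
    """Capacity actually consumed after saving `saving_pct` percent of raw_gb."""
    return raw_gb * (100 - saving_pct) // 100


RAW_GB = 10_000  # example data set of about 10TB (made-up figure)

with_compression = stored_gb(RAW_GB, 50)  # "almost 50%" from the engineer
with_dedup = stored_gb(RAW_GB, 30)        # "only 30% most of the time"
```

Under that reading, the same 10TB would occupy about 5TB with compression and about 7TB with deduplication. Real results depend heavily on the data, so treat these as the vendor's ballpark numbers only.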
EMC is very cautious about CPU utilization on the storage array. They reckon normal utilization should be around 25% per SP. If one of your SPs fails, the load on the remaining one becomes 50%. They don’t want deduplication to cost too much CPU at this time, at least not with current CPU horsepower. According to them, CPUs will be much more powerful in 2 years, which will not only allow deduplication and compression but also let you run VMs (such as WAN acceleration appliances) directly on the array. In short, EMC is quite a conservative company, but it does provide awesome technology, especially for the long run.
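The failover reasoning is easy to check with a few lines of Python. This is a simplified model of my own: I assume two SPs, that the feature cost is simply added to the base load of each SP as percentage points, and that the surviving SP takes over the full load of the failed one. None of these assumptions were confirmed by EMC.

```python
def load_after_failover(base_pct, feature_pct, num_sps=2):
    """CPU load on the single surviving SP, in percent.

    base_pct:    normal load per SP (25 according to EMC)
    feature_pct: extra load per SP from compression or dedup (assumed additive)
    """
    return (base_pct + feature_pct) * num_sps


plain = load_after_failover(25, 0)        # no data reduction at all
compression = load_after_failover(25, 5)  # compression: about 5% CPU
dedup = load_after_failover(25, 20)       # dedup: about 20% CPU
```

With these numbers the surviving SP sits at 50% with no data reduction, 60% with compression and 90% with dedup, which would explain why EMC prefers to wait for faster CPUs before pushing deduplication.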
Please leave your comments if you want.