


VMworld is right around the corner, and there isn't much exciting new information surfacing during the waiting period. I figure I can use this time to gather my energy and take a little dive (not a very deep one) before VMworld hits the ground.

I'm going to talk about the PSA (Pluggable Storage Architecture) as part of my VCAP study list. As usual, I'll try to make the post as simple as I can, and comments are very welcome.

I remember when I started reading about esxcli commands and the PSA concept, I was simply overwhelmed by the many different parameters and options, and like anyone else, I was lost. But once I stopped obsessing over the details of each command and started asking what these commands were telling me, everything became clear.

Please be aware: esxcli should only be used in an ESX(i) SSH session, while the vicfg-* commands should be used via vMA.

PSA Concept

To manage storage multipathing, ESXi uses a special VMkernel layer, the Pluggable Storage Architecture (PSA). The PSA is an open, modular framework that coordinates the simultaneous operation of multiple multipathing plug-ins (MPPs). The VMkernel multipathing plug-in that ESXi provides by default is the VMware Native Multipathing Plug-In (NMP). The NMP is an extensible module that manages sub plug-ins. There are two types of NMP sub plug-ins: Storage Array Type Plug-Ins (SATPs) and Path Selection Plug-Ins (PSPs). SATPs and PSPs can be built in and provided by VMware, or can be provided by a third party.

Let's keep it simple. When a VM sends a SCSI command to access data on the SAN, the VMkernel needs to know how to access that data and which path to choose. That's where the whole PSA kicks in. The PSA is a framework containing different modules and their sub-modules.

Note: the PSA has 3 layers: Multipathing layer -> SATP layer -> PSP layer

From the picture above, we can see the PSA first needs to choose between the VMware NMP (VMware's own multipathing module) and an MPP (a third-party multipathing plug-in, like EMC PowerPath). If you choose the VMware NMP, be aware that you are not necessarily using exclusively VMware components from that point on: the NMP can still load third-party sub-modules (SATPs, PSPs).

NMP & MPP layers

This layer basically decides which SATP to choose. It looks at what kind of physical hardware (SAN) you have — EMC? NetApp? Dell? — and loads the appropriate SATP and PSP to do the rest of the work. Be aware that you have a bunch of SATPs and PSPs to choose from; you can let the NMP decide, or manually assign (claim) new rules.

Please be aware that MASK_PATH (LUN masking in ESX 3.5) is considered an NMP-level plug-in.

SATP layer

Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are responsible for array-specific operations.

ESXi offers a SATP for every type of array that VMware supports. It also provides default SATPs that support non-specific active-active and ALUA storage arrays, and a local SATP for direct-attached devices. Each SATP accommodates the special characteristics of a certain class of storage arrays and can perform the array-specific operations required to detect path state and to activate an inactive path. As a result, the NMP module itself can work with multiple storage arrays without having to be aware of the storage device specifics.

After the NMP determines which SATP to use for a specific storage device and associates the SATP with the physical paths for that storage device, the SATP implements tasks that include the following:

  • Monitors the health of each physical path.
  • Reports changes in the state of each physical path.
  • Performs array-specific actions necessary for storage fail-over. For example, for active-passive devices, it can activate passive paths.

Please be aware that the SATP can also be a third-party one, but I don't have a third-party SATP loaded in this picture.

PSP layers

Path Selection Plug-Ins (PSPs) run with the VMware NMP and are responsible for choosing a physical path for I/O requests.

The VMware NMP assigns a default PSP for each logical device based on the SATP associated with the physical paths for that device. You can override the default PSP.

These path policies are reflected in the path policy setting in vCenter.

I'm not going to discuss each PSP here. If you have more questions, please refer to the VMware docs.

PSA command line

There are a few command-line tools related to the PSA:

esxcli, vicfg-mpath, vicfg-mpath35

So what exactly does each command do? What kind of information can I pull out or change?

vicfg-mpath (vicfg-mpath35 is for ESX 3.5)

This command lists all available paths, along with detailed information about your devices.

It also has the ability to disable a path and activate a path.
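For example, a hedged sketch of what that looks like from vMA or the vSphere CLI — the host name and path name below are made up, so substitute your own and check `vicfg-mpath --help` for your version first:

```shell
# Run from vMA or a vSphere CLI machine, not locally.
# Host and path names are examples only.
vicfg-mpath --server esx01.example.com --list              # list all paths and devices
vicfg-mpath --server esx01.example.com --state off \
    --path vmhba33:C0:T0:L0                                # disable one path
vicfg-mpath --server esx01.example.com --state active \
    --path vmhba33:C0:T0:L0                                # bring it back
```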

esxcli is a much more powerful command than vicfg-mpath.

You need to be aware that esxcli does much more than adjust the PSA structure. It also controls networking, software iSCSI (swiscsi), VAAI, and VM behavior at the VMkernel level.

Let's take a brief look at what esxcli can do.

As you can see from the picture, from a PSA perspective we only focus on the corestorage and nmp namespaces.

Think of esxcli as a command you can use to interact with the information the VMkernel holds. Let's take a look at what kind of information you can access.

Configuration information (or Claim Rules)

Run this command from an ESXi SSH session:

esxcli corestorage claimrule list

On the left side, we have the rule class. There are 3 rule classes (MP, FILTER, VAAI), and all of them would normally appear in the output of the last command. In this example only MP appears, because hardware acceleration is not enabled and neither is VAAI.

Rules run from the smallest to the largest number (or lowest number first, as VMware prefers). Be aware that rules 0 to 101 are reserved by VMware. Between 102 and 60000, users can create their own rules; above 60000, VMware claims the rule numbers again.
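As a small sanity check you can script yourself before passing a number to esxcli — this helper is my own throwaway, not part of any VMware tool:

```shell
# Hypothetical helper: is this claim rule number in the user-definable
# range (102-60000)? Rules 0-101 and rules above 60000 are VMware's.
user_rule_ok() {
  [ "$1" -ge 102 ] && [ "$1" -le 60000 ]
}

user_rule_ok 200   && echo "200: user-definable"
user_rule_ok 101   || echo "101: reserved by VMware"
user_rule_ok 65535 || echo "65535: reserved by VMware"
```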

When you build a rule, you actually end up with a pair of rules (runtime and file). The file entry in the class column indicates the rule is defined; the runtime entry indicates the rule has been loaded into your system.

Plugin also means module. In this example, we have NMP (the VMware module), MASK_PATH (VMware LUN masking) and MPP_1 (a third-party module for NewVend).

The LUN masking method has changed. In ESX 3.5, we used vCenter to mask the LUNs we didn't want hosts to see. In ESX 4, we have to use the command line and create a MASK_PATH rule for the dedicated LUNs.
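A hedged sketch of that MASK_PATH sequence on an ESX(i) 4 host — the rule number, path location and device ID below are invented for illustration, so double-check your own paths and the VMware docs before running anything:

```shell
# Run on the ESX(i) host itself. All numbers and IDs here are examples.
esxcli corestorage claimrule add --rule 110 -t location \
    -A vmhba2 -C 0 -T 1 -L 20 -P MASK_PATH      # define the rule (shows as class "file")
esxcli corestorage claimrule load               # load it (adds the "runtime" twin)
esxcli corestorage claiming reclaim -d naa.60060160abcdef   # hypothetical device to unclaim
esxcli corestorage claimrule run                # apply the loaded rules
esxcli corestorage claimrule list               # verify both file and runtime entries
```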

The Matches column lists the conditions for the rules to apply. You can clearly see what kind of conditions rules 0-4 will apply to, and so forth. The last rule is like the last rule of an ISA firewall: every condition not matched by a previous rule falls into this rule.

Device information

Have you ever wondered where you can see the new UUID, vml, and other information for the LUNs or devices connected to your host? You can read my post here, or use vicfg-mpath -l to do the job. You can also use esxcli nmp device list, though esxcli approaches it from the PSA perspective.

SATP & PSP options

If you want to see what kinds of SATPs and PSPs you have available to choose from, you can use:

esxcli nmp satp list

esxcli nmp psp list
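And if you decide to change what those lists return, the matching set commands look roughly like this (ESX(i) host only; the SATP name and device ID are examples — take the real ones from your own satp/device list output first):

```shell
# All names below are illustrative -- verify them against the list output.
esxcli nmp satp setdefaultpsp --satp VMW_SATP_CX --psp VMW_PSP_RR            # new default PSP for one SATP
esxcli nmp device setpolicy --device naa.60060160abcdef --psp VMW_PSP_FIXED  # per-device override
esxcli nmp device list                                                       # confirm the change
```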

All right. I believe it should be easier to understand now. The VMkernel holds lots of information about the PSA, and you can use esxcli and vicfg-mpath to retrieve it and modify it as you want. I have to say, this is an understanding doc, not a reference doc. If you do want to add a new path, mask LUNs, or use different rules, you still need to check the official docs before you actually execute any commands.

Please do leave any comments if you want.

Reference:

ESXi Configuration Guide

iSCSI SAN Configuration Guide

https://geeksilver.wordpress.com/2010/08/09/vmware-vsphere-4-1-vs-esx-3-x-storage-identifier-understanding/


vSphere 4.1 has been released for a while, but it's difficult to find a post dedicated to the differences between ESXTOP 4.1 and its previous version. This post covers the differences with some explanation. Please feel free to leave comments.

I have a few posts about ESXTOP already, so let's see what's new in version 4.1.

Note: everything I say about ESXTOP in this post applies to resxtop in the vSphere CLI or vMA as well.

What’s New in ESXTOP

First of all, VMware finally states that ESXTOP can be used to monitor the NFS protocol, which is good news for users who have NAS. The 2TB maximum datastore size is starting to hold VMFS back, and more and more companies have started to use NAS as their central data store. We used to use vscsiStats to check NFS performance; now ESXTOP finally supports it too.

CPU Power monitor

In this new version of ESXTOP, you will be able to monitor how many watts your CPU consumes.

I could quote some explanations from the VMware doc, but I am not able to explain why my idle session shows 14% for %LAT_C and 799% for %DMD.

Memory compression monitor

This is one of the new features vSphere 4.1 brought us, and ESXTOP does support it.

If you want to learn more about Memory compression, please click here

Renamed disk monitors

If you have read my previous post, you'll know VMware started to use a new naming convention in 4.1. If you haven't read it, please click here.

Since the new names are used for UUID, vml, and device identifiers, things have changed in the disk monitors. ESXTOP has a Storage Adapter panel (press d), a Storage Device panel (press u) and a VM Storage panel (press v) to read information like latency and I/O statistics. In the newer version of ESXTOP, some of the meters have been removed or replaced.

Network Multicast/broadcast monitor

Just look at the diagram and you will understand the change.

Interrupt panel Monitor

It's still there, except the name has been changed to COUNT_x.

All right. To summarize this post: there are changes in almost all the areas, but no new panels appear, and esxtop can still be used in batch and replay mode as usual.
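For completeness, here's the batch usage that carries over unchanged (the file name, sampling interval and host name are arbitrary examples):

```shell
# Capture ten 5-second samples in batch mode to a CSV you can replay
# or open in perfmon/esxplot. Run on the host itself.
esxtop -b -d 5 -n 10 > esxtop_capture.csv
# resxtop works the same way from the vSphere CLI or vMA:
resxtop --server esx01.example.com -b -d 5 -n 10 > esxtop_capture.csv
```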

Please let me know what you think. Have fun.

Reference:

https://geeksilver.wordpress.com/2010/08/09/vmware-vsphere-4-1-vs-esx-3-x-storage-identifier-understanding/

http://www.vmware.com/files/pdf/techpaper/VMW-Whats-New-vSphere41-Performance.pdf

https://geeksilver.wordpress.com/2010/08/04/vmware-vsphere-4-1-memory-compression-understanding/


I have spent some time collecting information for the VCAP. I downloaded the blueprint, but it turns out I have too much to catch up on, so I thought: why not share some of my collection while I'm searching? Here is the first post; please write to me if you have your own way of preparing for the VCAP.

OK. Today's topic is understanding storage identifiers in vSphere 4.1. I'm not quite sure whether you have noticed, but the storage identifiers in ESX 3.5 are a mess: they're different from the names you give via the SAN, and they're different on each host. vSphere 4.1 has dramatically cleaned up the chaos with new names and new rules. Let's take a detailed look.

Please be aware that all ESX 3.x examples and discussion in this post refer to ESX (not ESXi), while all ESX 4.x examples refer to ESXi 4.1 only, since we won't use ESX anymore after vSphere 4.1. Also, ESX 3.x will see different storage devices (of the same type) than ESX 4.x, so look at the keywords, not the exact labels.

There are 4 different ways to label storage in ESX 3.x.

ESX 3.X

  • vmhba<Adapter>:<Target>:<LUN> or vmhba<Adapter>:<Target>:<LUN>:<Partition>
  • vml.<VML> or vml.<VML>:<Partition>
  • /dev/sd<Device Letter> or /dev/sd<Device Letter><Partition>
  • <UUID>

There are 6 different ways to label storage in ESX 4.x.

ESX 4.X

  • naa.<NAA> or  naa.<NAA>:<Partition>
  • eui.<EUI> or eui.<EUI>:<Partition>
  • mpx.vmhba<Adapter>:C<Channel>:T<Target>:L<LUN> or mpx.vmhba<Adapter>:C<Channel>:T<Target>:L<LUN>:<Partition>
  • vml.<VML> or vml.<VML>:<Partition>
  • vmhba<Adapter>:C<Channel>:T<Target>:L<LUN>
  • /dev/sd<Device Letter> or /dev/sd<Device Letter><Partition> (ESX only, not for ESXi)

Let's take a look at what the ESX 3.x storage properties tell us about device names.

VMware uses vmhbax:y:z to identify a storage device. How about ESX 4.x?

Here they are: the new identifiers for ESX 4.x.

naa.<NAA> or eui.<EUI> to replace vmhba
vmhba was very confusing in the ESX 3.x versions: it could be read as either a path name or a device name. In ESX 4.x, vmhba exclusively identifies a path to a LUN.

NAA stands for Network Addressing Authority identifier. EUI stands for Extended Unique Identifier. The number is guaranteed to be unique to that LUN. The NAA or EUI identifier is the preferred method of identifying LUNs and the number is generated by the storage device. Since the NAA or EUI is unique to the LUN, if the LUN is presented the same way across all ESX hosts, the NAA or EUI identifier remains the same.
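The `<identifier>:<partition>` convention is easy to pick apart with plain shell string operations; here's a throwaway illustration (the NAA value itself is made up):

```shell
# Split an naa-style name into the device identifier and partition number.
dev="naa.60060160a0b1c2d3e4f51234:1"
echo "${dev%%:*}"    # device identifier: naa.60060160a0b1c2d3e4f51234
echo "${dev##*:}"    # partition number: 1
```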

mpx.vmhba<Adapter>:C<Channel>:T<Target>:L<LUN> or mpx.vmhba<Adapter>:C<Channel>:T<Target>:L<LUN>:<Partition>

Some devices do not provide the NAA number described above. In these circumstances, an MPX identifier is generated by ESX to represent the LUN or disk. The identifier takes a form similar to the canonical name of previous versions of ESX, with an mpx. prefix. This identifier can be used in exactly the same way as the NAA identifier described above, but it is used for local devices only.

vml.<VML> or vml.<VML>:<Partition>

The VML identifier can be used interchangeably with the NAA identifier and the MPX identifier. Appending :<Partition> works in the same way described above. This identifier is generally used for operations with utilities such as vmkfstools.
The vml path holds the LUN ID, GUID and partition number information, which is also stored in the volume's VMFS header. The vml construct is used by the kernel to define paths to a SCSI LUN.

/dev/sd<Device Letter> or /dev/sd<Device Letter><Partition>

This naming convention is not VMware specific: it comes from Red Hat Linux, and you won't see it in ESXi 4.1, where it's replaced by mpx. The convention is used exclusively by the service console and the open-source utilities that come with it. The <Device Letter> represents the LUN or disk and is assigned by the service console during boot; the optional <Partition> represents the partition on the LUN or disk. These names may vary from ESX host to ESX host, and may change if storage hardware is replaced. This identifier is generally used for operations with utilities such as fdisk and dd.

Note: VMware ESXi does not have a service console; disks are referred to by the VML identifier.

<UUID>

The <UUID> is a unique number assigned to a VMFS volume when the volume is created. It may be included in syntax where you need to specify the full path of specific files on a datastore. The UUID is generated, based on the UUID creation standards, on the ESX host that created the VMFS volume. It's possible (though very rare) to end up with duplicate UUIDs.

I'm going to show you a series of commands so you can see the difference between ESX 3.x and ESX 4.x.

Please leave comments if you want. Thanks


Reference:

UUID: http://blog.laspina.ca/ubiquitous/tag/vml

ESX storage: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1014953

UUID is not unique:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006250


There is a new feature that both Microsoft Hyper-V and VMware present in the latest versions of their products: memory compression. How memory compression works in Hyper-V is still unknown, but for VMware vSphere 4.1 it's pretty easy to understand. This post is dedicated to memory compression, and I hope I can give you an easy explanation.

It has always been a challenge for VMware to reclaim free memory from VMs, because Microsoft doesn't share the internal index that would tell VMware which memory pages are free. The memory used by VMs keeps accumulating, and eventually it will blow up if you overcommit memory on your host. So VMware developed the following technologies to work around this issue.

  • Transparent Page Sharing
  • Ballooning
  • Swapping
  • Memory Compression

Transparent Page Sharing

One of the biggest comparative advantages is Transparent Page Sharing. Basically, the VMkernel scans most memory pages of the VMs sitting on the same host. It divides memory into small pieces (4KB) and generates a hash signature for each 4KB block. If there is more than one identical block in host physical memory, VMware keeps one block and frees the rest back to the host. The exception: if you use hardware-assisted memory virtualization with large pages, VMware doesn't do page sharing until host memory is overcommitted. The block size and scan schedule are adjustable.

Ballooning

Ballooning works via a VMware driver installed in the guest OS. The driver runs at the kernel level and will try to claim memory when host memory is short. It pins memory pages inside the VM and tells the host to use those pinned pages as free memory. Unlike page sharing, it's not a proactive action.

Swapping

Swapping is a bad, bad choice. If your VM's memory starts being swapped to physical disk, your VM's performance will take a dramatic hit. This is the very last resort VMware uses to keep the VM alive.

Memory Compression

Here it is: the new technology of vSphere 4.1. It's not something you will use every day; it's a technology that gives your VM one last breath before it sinks into swapping. Let's see how it works.

What is Memory Compression?

The idea of memory compression is very straightforward: if the swapped out pages can be compressed and stored in a compression cache located in the main memory, the next access to the page only causes a page decompression which can be an order of magnitude faster than the disk access. With memory compression, only a few uncompressible pages need to be swapped out if the compression cache is not full.

Basically, when swapping is about to happen, the memory that would be swapped out to physical disk is compressed into a cache inside the VM's own memory space (not extra host memory). The memory to be swapped is divided into 4KB pages, and each one is compressed down to 2KB if possible. If the compression succeeds, you save 50% of the space; if it fails, the original 4KB page is swapped to physical disk anyway.
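You can get a feel for the 4KB-to-2KB rule with an ordinary compressor on your workstation. This is only an analogy using gzip, not VMware's actual compression code:

```shell
# A page full of zeros compresses far below 2KB (would go to the cache);
# a random page does not (would be swapped to disk as-is).
dd if=/dev/zero    of=page_zero bs=4096 count=1 2>/dev/null
dd if=/dev/urandom of=page_rand bs=4096 count=1 2>/dev/null
echo "zero page   -> $(gzip -c page_zero | wc -c) bytes compressed"
echo "random page -> $(gzip -c page_rand | wc -c) bytes compressed"
rm -f page_zero page_rand
```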

Therefore, it may or may not work for every page being swapped. And even when everything compresses, be aware that the cache uses the VM's own memory space, not host memory, which means it can't consume all of the VM's memory: by default, 10% of VM memory can act as the memory compression cache. Nothing else happens to compressed pages directly; whatever happens next, the VM must decompress them first and then take the other action.

Let’s see a real case.

If you look at those two lines, the gap between 28GB and 24GB of memory, memory compression kept throughput within roughly a 6% performance difference. And if you look at the two bars with swap reads, you can clearly see that even in the worst case, memory compression still manages to take more than half of the load off swapping, keeping the VM alive and operable longer.

Summary:

Memory compression gives your VMs a last chance to struggle on at an operational level. It's better than nothing, isn't it?


This is a post about VAAI (vStorage APIs for Array Integration). I have read some posts from Chad, and personally I don't think I can do a better job than he does of explaining what VAAI is. However, I can explain it from a user's angle instead of a manufacturer's perspective.

All right, let’s hit on the road.

What is vStorage API for Array Integration(VAAI)?

Essentially, VAAI is the technology VMware uses to offload certain disk activities/jobs to the SAN to save host I/O resources and improve performance, and it's enabled by default. From a user's point of view, this is a nice-to-have feature, since the tasks you can offload are the following.

•Hardware-Accelerated Locking = 10-100x better metadata scaling

–Replaces LUN locking with extent-based locks for better granularity

–Reduces the number of “lock” operations required by using one efficient SCSI command to perform pre-lock, lock, and post-lock operations

–Increases locking efficiency by an order of magnitude

–Use cases: bigger clusters with more VMs; View, Lab Manager, Project Redwood; more and faster VM snapshotting

•Hardware-Accelerated Zero = 2-10x lower number of I/O operations

–Eliminates redundant and repetitive host-based write commands with optimized internal array commands

–Use cases: reduced I/O when writing to new blocks in the VMDK for any VM; faster VM creation (particularly FT-enabled VMs)

•Hardware-Accelerated Copy = 2-10x better data movement

–Leverages the array’s native copy capability to move blocks

–Use cases: Storage VMotion; VM creation from template

What kind of benefit you can get from using VAAI?

I believe VMware View will gain a performance boost when it provisions more desktops: the SAN takes the load off the host, so it should be faster for a View host to create more snapshots.

The other one is anti-virus. For example, the new Trend Micro product uses the VMsafe API to scan and respond. Let's see what Trend says about it.

How Core Protection for Virtual Machines product works is we take a snapshot of the Virtual Machine via the VMsafe API and scan the snapshot when performing the full system scheduled & manual scans, if malware has been detected we then send an instruction to the real-time agent for the mitigation process to initiate. The real-time scanning agent on the other hand performs On-Access scanning on the virtual machines which subsequently initiates the mitigation process if malware is identified.

With lots of snapshots being created during scans, VAAI definitely helps anti-virus software from Trend or CA a great deal.

So are you wondering how much performance you can gain?

Let’s see what Chad said about it.

Just one example (of many)… using the Full Copy API:

  1. We reduced the time for many VMware storage-related tasks by 25% or more (in some cases up to 10x)
  2. We reduced the amount of CPU load on the ESX host and the array by 50% or more (in some cases up to 10x)(by reducing the impact during the operation and reducing the duration) for many tasks.
  3. We reduced the traffic on the network by 99% for those tasks.  Yes you read that right, 99%. During these activities (storage vmotion is what we used in the example), so much storage network traffic can be so heavy (in the example 240MBps worth of traffic), it impacts all the other VMs using the storage network and storage arrays.

And like Intel VT with vSphere 4, these improvements just “appear” for customers using vSphere 4.1 and storage arrays that support these new hardware acceleration offloads.


Where do these integrations occur?

There is nothing better than Chad’s blog, I just quote here.

To understand the dialog that goes with this, listen to the webcast (link below). As a point of note – while all vendors are working hard to integrate with VMware (which is good – highlights the importance of VMware in customer environments), to date, as far as I know, EMC is the only single vendor to have products available that integrate with each of the areas in green. BTW – “co-op” means it’s an area where it’s not an integration API per se, but that there is a lot of cooperative development.

What are VAAI requirements?

Well, generally speaking, anything you purchased more than two years ago won't support VAAI unless you update FLARE or the firmware. As you can see from the picture above, it requires vendor-specific VAAI SCSI commands loaded in the SAN. For users still on a CX3 (32-bit), it's time to move on, end your current SAN lease and get a new one.

Again, quoting Chad:

What do you need to benefit from this hardware acceleration of storage functions?

  • Well, of course, you need vSphere 4.1.  VAAI is supported in Enterprise and Enterprise Plus editions.
  • If you’re an EMC Unified (an EMC Celerra purchased in the last year – NS-120, NS-480, NS-960) or EMC CLARiiON CX4 customer, you need to:
    • Be running FLARE 30 (which also adds Unisphere, Block Compression, FAST VP aka sub-LUN automated tiering, FAST Cache and more).   You can read more about all the FLARE 30 goodness here and here if you’re interested in more detail about what’s new in that release.   FLARE 30 is going to be GA any day now…
    • Also, ESX hosts need to be configured to use ALUA (failover mode 4).   If you’re using a modern EMC Unified or CLARiiON CX4 array, using ALUA (with the round robin PSP or PowerPath/VE) with vSphere 4.0 or vSphere 4.1 is a best practice (for iSCSI, FC and FCoE).   We will be automating configuration of this shortly in the always free EMC Virtual Storage Integrator vCenter plugin, but for now it’s pretty easy to setup manually.
  • If you’re an EMC VMAX customer – it will be a bit longer – but not much, but VAAI support is in the next major Enginuity update scheduled for Q4 2010.
  • It is supported on all block protocols (FC, iSCSI, FCoE)

What's the catch?

  • When does a VAAI offload NOT work (and the datamover falls back to the legacy software codepath) if all of the above are true?
    • The source and destination VMFS volumes have different block sizes (a colleague, Itzik Reich, already ran into this one at a customer, here – not quite a bug, but it does make it clear – “consistent block sizes” is a “good hygiene” move)
    • The source file type is RDM and the destination file type is non-RDM (regular file)
    • The source VMDK type is eagerzeroedthick and the destination VMDK type is thin
    • The source or destination VMDK is any sort of sparse or hosted format
    • The logical address and/or transfer length in the requested operation are not aligned to the minimum alignment required by the storage device (all datastores created with the vSphere Client are aligned automatically)
    • The VMFS has multiple LUNs/extents and they are all on different arrays

Last but not Least:

First of all, there are other voices, from NetApp. From this link, you can clearly see how NetApp works with the VAAI technology. And if we think about it from the Microsoft and Citrix angle: what are they going to do to offload snapshots and VM copies from host to storage? Will VAAI become a public protocol and set of APIs adopted by all virtualization platforms?

And what happened to that missing feature of VAAI, Thin Provisioning Stun?

Reference:

http://virtualgeek.typepad.com/virtual_geek/2010/07/vsphere-41—what-do-the-vstorage-apis-for-array-integration-mean-to-you.html

http://searchvmware.techtarget.com/tip/0,289483,sid179_gci1516821,00.html

http://blogs.netapp.com/virtualstorageguy/2010/07/vmware-vsphere-vaai-demo-with-netapp.html


In my last post, I talked about Network I/O Control. I think it's better to follow up with another interesting feature of the new vSphere 4.1: Storage I/O Control.

Storage I/O Control is really easy to set up, leaving the difficult jobs to VMware to handle. However, the information on the Internet is scattered across quite a few places; I'm trying to put it together and explain it to you in an easy way.

What is Storage IO Control ?

Storage I/O Control (SIOC), a new feature offered in VMware vSphere 4.1, provides a fine-grained storage control mechanism by dynamically allocating portions of hosts’ I/O queues to VMs running on the vSphere hosts based on shares assigned to the VMs. Using SIOC, vSphere administrators can mitigate the performance loss of critical workloads during peak load periods by setting higher I/O priority (by means of disk shares) to those VMs running them. Setting I/O priorities for VMs results in better performance during periods of congestion.

There is some misunderstanding here. Some people say SIOC only kicks in when the threshold is breached. I believe that once you enable this feature, SIOC works for you all the time, keeping datastore latency close to the congestion threshold.

Prerequisites of Storage IO Control

Storage I/O Control has several requirements and limitations.
  1. Datastores that are Storage I/O Control-enabled must be managed by a single vCenter Server system.
  2. Storage I/O Control is supported on Fibre Channel-connected and iSCSI-connected storage. NFS datastores and Raw Device Mappings (RDMs) are not supported.
  3. Storage I/O Control does not support datastores with multiple extents. (We should avoid extending volumes across multiple datastores at all times, and also try to use consistent block sizes for VAAI's sake.)
  4. Before using Storage I/O Control on datastores backed by arrays with automated storage tiering capabilities, check the VMware Storage/SAN Compatibility Guide to verify whether your automated tiered storage array has been certified as compatible with Storage I/O Control. (I believe the latest EMC FAST is supported with this function, but you have to wait for the latest FLARE 30 to make it work.)
  5. All ESX hosts connecting to a datastore on which you want to use SIOC must be ESX 4.1. (You can't enable SIOC while an ESX 4.0 host is connected to that SIOC datastore.) Of course, you need vCenter 4.1 as well.
  6. Last but not least, you need an Enterprise Plus license to enable this function. 😦

How does Storage IO Control work?

There are quite a few blogs on this topic. Essentially, you can set a share level for each single VM (actually, for each VM disk) and apply limits if you have to. Those values are used when SIOC operates. Please be aware that SIOC doesn't just monitor a single point and adjust a single value to keep your latency below the dedicated threshold; it actually adjusts multiple layers of the I/O flow to make that happen (I will explain later).

Before Storage I/O Control appeared

Let me quote Scott Drummonds' article to explain the difference between the previous disk share control and SIOC.

The fundamental change provided by SIOC is volume-wide resource management. With vSphere 4 and earlier versions of VMware virtualization, storage resource management is performed at the server level. This means a virtual machine on its own ESX server gets full access to the device queue. The result is unfettered access to the storage bandwidth regardless of resource settings, as the following picture shows.

This is from Yellow-brick:

As the diagrams clearly shows, the current version of shares are on a per Host basis. When a single VM on a host floods your storage all other VMs on the datastore will be effected. Those who are running on the same host could easily, by using shares, carve up the bandwidth. However if that VM which causes the load would move a different host the shares would be useless. With SIOC the fairness mechanism that was introduced goes one level up. That means that on a cluster level Disk shares will be taken into account.

As far as I can tell, SIOC no longer starts from managing only a single host's IOPS; instead, it monitors at the datastore-wide level, watching host HBA queues and VM IOPS together to dynamically monitor and adjust I/O. If the flag goes up, SIOC comes down and adjusts I/O at multiple layers to make sure high-disk-share VMs are prioritized.

Monitoring and adjusting points:

Each VM accesses its host's I/O queue. SIOC must make sure that access is based on the I/O priority of each VM (its disk shares).

Each host has a logical device I/O queue per device. The lower limit is 4, and the upper limit is the minimum of the queue depth set by SIOC and the queue depth set in the HBA driver.

SIOC monitors the usage of the device queue in each host, the aggregate I/O requests per second from each host (be aware this aggregates I/O counts and per-I/O payload sizes, divided by seconds), and the datastore-wide I/O latency, every 4 seconds for each individual datastore. It also throttles the device queue in each host.
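To make the "datastore-wide latency" idea concrete, here's a toy calculation in the same spirit — the per-host latency samples and the 30 ms threshold below are made-up inputs, and none of this is VMware code:

```shell
# Average made-up per-host latency samples over one 4-second window and
# compare against a 30 ms congestion threshold.
samples="12 18 41 22"   # ms, one value per host sharing the datastore
avg=$(echo "$samples" | awk '{ for (i = 1; i <= NF; i++) s += $i; printf "%d", s / NF }')
if [ "$avg" -gt 30 ]; then
  echo "avg ${avg}ms: above threshold, throttle device queues"
else
  echo "avg ${avg}ms: below threshold, leave queues alone"
fi
```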

How to setup Storage I/O control?

As I mentioned above, it's fairly easy to set up as long as you go through the prerequisite list.

All you need to do is tick that box.

If you want to customize the threshold, click the Advanced button.

To see that SIOC is actually working, go to vCenter -> Datastores -> select your datastore -> Performance. You can see that none of the VM disks have more than 30ms latency. (Sorry, there is no workload in the picture.)

Reference:

http://vpivot.com/2010/05/04/storage-io-control/

http://www.yellow-bricks.com/2010/06/17/storage-io-control-the-movie/

http://www.vmware.com/files/pdf/techpaper/vsp_41_perf_SIOC.pdf


If anyone can recall, I wrote a post about vMA 4.0 before. With vSphere 4.1 released, there is a new vMA version, 4.1, to go with it.

While installing and configuring vMA 4.1, I encountered multiple errors. I would like to thank William Lam for his help on the forum. If you want to read more about vMA 4.1 scripting, please follow William's blog in the references.

What's New about vMA 4.1?

Apart from vMA using a new OS (CentOS), vSphere CLI 4.1, SDK for Perl 4.1 and an upgraded version of VMware Tools, the new version of vMA brings a different way of authentication: AD authentication. There are also some new commands to replace the old ones. I'm going to elaborate below.

Download vMA 4.1

Downloading vMA 4.1 is pretty easy. Anyone can go here to download the OVF file and related documents. vMA 4.1 can be loaded on both vCenter 4.0 and vCenter 4.1. You can get a pretty good idea of how to install it from the vma_guide. However, there are some mistakes in the docs that I would like to point out later.

Configuring vMA 4.1

When you run vMA for the first time, it gives you a wizard to configure vMA. If you miss the chance, you can run

sudo system-config-network-tui

to reactivate the wizard.

Join vMA and ESX(i) to Active Directory

Concept

First of all, let's talk about the concept behind this topic. Why do we need to join vMA and ESX(i) to AD?

The reason we join ESX(i) to AD is to ease management and use fewer usernames and passwords to control ESX(i). As you all know, vCenter is in AD already. By default, a domain admin has rights to log on to vCenter and manage it. However, ESX(i) uses a local user database and you have to use root every time to log in and execute commands.

I believe the second reason for ESX(i) to join the domain is to give domain users vCLI access. Imagine you could log on to vMA (or use vSphere CLI and your script files) with your own domain account and execute commands against vCenter and the hosts directly. No need to remember another set of usernames and passwords; everything is integrated with the same service account or domain user account.

Join ESXi to Active Directory

Connect to the vCenter that has your ESXi 4.1 host.

If you type your domain in the field and then click the "Join Domain" button, you must use "username" instead of "domain\username".

I followed the smoothblog post to configure it and got the following error, so you must not use the domain\username format.

After you join ESX(i) 4.1 to AD, you can connect to the ESX host directly with the vSphere Client, go to Permissions and add your domain account to the local user database. For the rest, you can follow the smoothblog post in the references.

Join vMA 4.1 to Active Directory

This is also a pretty straightforward operation.

Log on to vMA 4.1 with the vi-admin account (vi-user isn't enabled yet; you have to enable it manually). Then type

sudo domainjoin-cli join your_domain your_domain_admin_user

and enter the password as the vma_guide indicates. You may see the following warning after you join the domain.

Those PAM modules are part of CentOS and are designed to join vMA not only to Windows AD but to other directory services as well, so it's normal to see those warnings.

You can use sudo domainjoin-cli query to verify, as I did.

Connect to vCenter and ESX(i) Hosts

There are two different ways to authenticate your vMA to vCenter and the hosts.

Active Directory Authentication

As I mentioned above, the idea here is to let your admins log on to vMA with their own domain accounts and run commands against vCenter and the hosts without typing a username and password multiple times. Compared with fastpass authentication, vMA doesn't store the username and password on the local vMA box, which is more secure in a certain way, and you have no extra passwords to memorize.

Prerequisites:

Your vMA must have joined the domain.

Your vCenter must have joined the domain.

If you want to operate directly on a host without using "--vihost", your ESX host needs to join the domain too.

The DNS hosts file must be preconfigured so vMA knows what your vCenter/host IPs are.

Customize the server list.

Modify DNS host files

Well, the reason we set up the DNS hosts file is that we want to just type the server or host name and have it work. No one wants to type 10.163.x.x all day.

The solution is to use the hosts file, just like lmhosts on Windows.

Steps:

Open the console of vMA (or connect to vMA with an SSH tool like PuTTY).

Log in as vi-admin.

The hosts file is located at /etc.

You must use "sudo chmod a+w hosts" to make the hosts file writeable.

Use "sudo vi hosts" to add your vCenter and host IPs.

Save and quit vi.

One thing I must point out: all server names must be FQDNs, no exceptions!
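As a sketch, the entries look like this. The IPs and FQDNs below are made-up examples, and I'm writing to a local copy here to show the format; on vMA itself you would edit /etc/hosts as in the steps above.

```shell
# Two example entries in /etc/hosts format (name must be the FQDN)
cat <<'EOF' > hosts.example
10.163.1.10   vcenter01.example.local
10.163.1.21   esxhost01.example.local
EOF
grep -c 'example.local' hosts.example   # count the entries we added
```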

Customize the server list

vMA needs to know which servers you may connect to (although it can operate on only one server at a time), which servers will use AD authentication and which will use fastpass authentication. That's why you need to build a server list.

You must log on as vi-admin to build the server list.

To view the current server list:

vifp listservers -l

You must use the "-l" parameter to see the authentication method.

If the server you want is not in the list, make sure the DNS hosts file is configured, then use the following command to add it.

vifp addserver yourhost --authpolicy adauth (this is for AD authentication)

or

vifp addserver yourhost (this is for fastpass authentication)

If you add a vCenter, you must use a domain admin account, because vi-admin doesn't exist in vCenter unless you manually add it. For a host, you type the root password and vMA automatically adds the vi-admin user to the host.

Notice: there is a big trick here. If the system prompts you for a username and password, you can type "domain\username". But if you want to use domain\username on the command line, you have to write "domain\\username".

Now you are ready to connect to your server.

Steps:

1. Log in to vMA with your domain admin account (a normal domain account will work too, but it won't have rights to operate on vCenter).

2. Target your server (vCenter or Host).

You must target one object to send commands to. If you don't, you will get an error message like

"Error connecting to server at 'https://localhost/sdk/webService': Connection refused"

3. Send the command to the object.

If you target a vCenter and your command is a host-based command, you must add "--vihost your_host_name" to tell vCenter which host you want. The name must be an FQDN!

Notice: I was told by VMware Support that if you use "--vihost", you will be asked to type the username and password again!

If you target a host directly, you can just run the command and it should work.
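Putting the steps together, an AD-authenticated session looks roughly like this. The server names are made-up examples; `vifptarget` is vMA's command for setting the current target, and `vicfg-nics` stands in for any host-level command.

```shell
vifptarget -s vcenter01.example.local             # step 2: target the vCenter
vicfg-nics -l --vihost esxhost01.example.local    # step 3: host command needs --vihost (FQDN)
vifptarget -c                                     # clear the target when finished
```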

Here is the tricky thing. It should work without your typing any credentials again, but some users, like me, do get asked for the username and password again. Maybe it's a bug in vMA 4.1; I'm investigating this with VMware as I type.

——————————————————————————————————————–

New update on this issue:

I just got a call from VMware Support and they admitted this is a bug in vMA 4.1. They will fix the issue in the next release.

——————————————————————————————————————–

Fastpass authentication

This is the old authentication method from the previous version. Basically, vMA stores your credentials locally so you don't need to type them multiple times when you operate on hosts and vCenter. It works because vMA actually creates a vi-admin account on the hosts.

Prerequisites:

The DNS hosts file must be preconfigured so vMA knows what your vCenter/host IPs are.

Customize the server list.

Please check the section above for details on how to do both.

This is the reference for fastpass authentication.

Steps:

1. Log in to vMA as vi-admin.

2. Target your server (vCenter or Host).

You must target one object to send commands to. If you don't, you will get an error message like

"Error connecting to server at 'https://localhost/sdk/webService': Connection refused"

3. Send the command to the object.

If you target a vCenter and your command is a host-based command, you must add "--vihost your_host_name" to tell vCenter which host you want. The name must be an FQDN!

Reference:

http://communities.vmware.com/community/vmtn/vsphere/automationtools/vima

http://www.virtuallyghetto.com/2010/07/vma-41-active-directory-intergration.html

http://www.smoothblog.co.uk/2010/07/15/esxi-4-1-active-directory-integration/

http://www.virtuallyghetto.com/2010/07/vma-41-authentication-policy-fpauth-vs.html


Just this morning I attended the VCP4 exam. The exam has 85 questions, which you need to complete in 90 minutes. The total score is 500 and the passing score is 300.

The exam is purely single- and multiple-choice questions; easy to answer, with no simulations.

After I finished the exam, I jumped on http://mylearn.vmware.com to take a peek at the next-level exam, VCAP4.

VCAP4 has two different certificates: VCAP4 Datacenter Administration and VCAP4 Datacenter Design. I downloaded the blueprint for VCAP4 Datacenter Administration (the only one available at this time). After reading this 100% lab-based exam blueprint, I have to admit it almost knocked my socks off.

This exam requires you to know everything and how to configure everything, from FT to MSCS, from vMA to troubleshooting. With vSphere 4.1 just released, I'm pretty sure it's going to be much more complicated than before. Well, if that's what it takes to be a VCAP4, so be it.

Let’s make it work!

-Silver


This is not exactly a technology post, but I would like to use the opportunity to discuss a few interesting things developing in the VMware and virtualization world.

What's going on with VMworld 2010?

I have read some of the content lists for VMworld 2010. If you pay attention, you will notice VMware has finally started to focus on optimizing its system and developing real practical applications instead of spending all its time building the infrastructure of a future operating system. If you recall what happened at VMworld 2009, VMware pushed out its cloud system, but it was not well accepted by most companies: most people had doubts in their minds and confusion on their faces. The idea of "if the system is not broken, don't touch it", along with the falling global economy, rang much louder than anything else in IT managers' ears. So this year, at VMworld 2010, VMware has noticed how important cloud education and practical applications are. They have started turning back to pick up the lost customers they left behind, helping them with vShield, Chargeback, CapacityIQ and AppSpeed.

vSphere 4.1 is out!

Oh yes, the great 4.1 is out. It's not 5.0 or a new cloud system; it's just a service pack, according to VMware. As I mentioned before, the major theme of vSphere 4.1 is tuning the system and adding more practical functions. Why? Because vSphere 4 is not that great compared with ESX 3.5. Yes, some hardware performance went up 10%, and new features like vDS, FT and Data Recovery sound pretty. However, is it really a "must upgrade", or just another "nice to have"? If you went through the list of differences between vSphere and ESX 3.5, you would see the point. Count how many truly new features vSphere brings you. Yes, ESXi is better than ESX, but it's not a must. The new backup method is interesting, but current backups are doing OK with third-party software. DR does sound nice, but it also requires a whole new budget and SAN-level support. vDS is simply not necessary; vSS is great for the normal job. View is not as good as Citrix XenDesktop. ThinApp is not bad, but it can't compete with native support from Microsoft App-V, especially since MS has lots of experience managing and deploying applications. Don't mention vShield and VMsafe; those two are not practical at all. Let's hope vShield 2.0 will be better.

Battle within the Cloud War

Oh yes, there is a war coming. MS is attacking from both the hardware level (selling servers with a cloud OS built in) and the pure software level (hosting your MS Office applications, email and SharePoint online). VMware has shaken hands with Google, EMC and Cisco, and alliances have been formed to fight this giant software company. Microsoft, VMware and Google are already hosting their clouds internally, and virtual applications will soon spread everywhere and hit every corner of the earth. 2010 is a quiet year, but if you look up, gunpowder-coloured clouds are rolling in from the edge of the sky, with bleached lightning merging and swimming in their deep layers. Are you ready to choose a side? Have you braced yourself for the impact?

Reference:

https://vmworld2010.wingateweb.com/scheduler/catalog/catalog.jsp


I had the chance to get involved in an EMC pre-sales meeting today. During the meeting, the EMC pre-sales engineer introduced F.A.S.T v1 and v2 to us. I knew what FAST was before, but this presentation really opened my eyes, and the engineer was able to answer a few of my questions about NetApp deduplication vs EMC compression. I will bring the details into this post. However, because there isn't much data available on the Internet, I had to draw an ugly diagram to help express my ideas. I may make mistakes, so please feel free to point them out.

What is EMC FAST v1?

As you can see from the full name, F.A.S.T (Fully Automated Storage Tiering) is about tiering your storage automatically. As you may know, traditional SAN storage contains FC disks and SATA disks. FC is fast and expensive; SATA is slower for random reads/writes and cheaper. As the SAN administrator in your company, your job is to give the right LUNs to the appropriate servers to meet SLA requirements.

With F.A.S.T, it basically does the following things:

1. Adds an EFD (Enterprise Flash Disk) layer.

As we all know, SSD (solid state disk) can be 100 times faster than FC. There are two cell types, SLC (single-level cell) and MLC (multi-level cell), and all SSDs are limited-life products. So how does EMC overcome these issues?

EFDs are made of SLC SSDs, not MLC, meaning they are faster than MLC SSDs. As you may have heard, SSDs wear out easily. The reason (the same as for your USB flash drive) is rewriting the same location repeatedly: a naive system writes the first block of the flash disk, wipes it and writes it again, so that first block gets used too many times and fails early. An EMC EFD won't use the same spot twice until it has cycled through all the other available spots on the SSD.

Each EFD has three areas: a cache area (the fastest), a normal storage area and a spare area. All data is written to the fast cache first and then destaged to normal storage. A spot in normal storage is retired after being reused a number of times, and a spot from the spare area takes its place to avoid potential bad spots. The same applies to the cache area: if one of its spots is damaged, a spot from the normal area is used instead. According to EMC, the EFD has a 5-year warranty.
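The wear-leveling idea can be illustrated with a toy model. This is entirely my own sketch, not EMC's firmware: instead of hammering the same cell, writes rotate through every cell before any cell is reused.

```python
from collections import deque

class WearLeveledDisk:
    """Toy model of wear leveling: a cell is not rewritten until
    every other available cell has been used once."""
    def __init__(self, num_cells):
        self.order = deque(range(num_cells))  # round-robin queue of cells
        self.writes = [0] * num_cells         # wear counter per cell

    def write_block(self):
        cell = self.order.popleft()           # least-recently-used cell
        self.writes[cell] += 1
        self.order.append(cell)               # only reusable after all others
        return cell

disk = WearLeveledDisk(4)
used = [disk.write_block() for _ in range(8)]  # 8 writes spread over 4 cells
```

After eight writes, every cell has been written exactly twice, instead of one cell absorbing all eight and failing early.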

2. Adds a virtual LUN layer

A virtual LUN isolates the host from the actual storage details: the host doesn't need to know which physical disks (FC, EFD, SATA) it is operating on. With virtual LUN technology, FAST can do its real work underneath the SAN layer.

3. Moves LUNs between tiers automatically

This is what FAST is for. F.A.S.T can automatically (or manually) move your LUNs to a different tier. Busy, high-demand LUNs move to the fastest tier (EFD) or to FC, while low-priority LUNs can be shifted to SATA to save the fast tiers for SLA requirements.

What is FAST v2?

We briefly introduced FAST v1 above. After EMC pushed this technology to its customers, it discovered that most customers actually bought lots of FC disks instead of SATA disks, because FAST v1 operates at the LUN level: every time it moves data, it has to move the whole LUN, which is slow and inefficient. So FAST v2 came to life.

FAST v2 makes some big changes.

1. Let's make a pool

Basically, you create a pool first. A pool is a combination of resources from different tiers. For example, you can make a pool with 3x EFD, 4x FC and 5x SATA, all RAID 5. Then you create LUNs on this pool; each LUN is built across all tiers instead of sitting on one.

2. Let's move 1GB data segments

In FAST v1 we moved whole LUNs, which takes a very long time and may not be effective. With this version of FAST, data moves in 1GB segments as the smallest unit of operation. That means if one LUN gets hit very hard, the system uses the fast cache to hold the data and starts moving the busiest segments from SATA to EFD, then moves other segments later according to LUN utilization.
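A toy sketch of this sub-LUN movement (my own illustration, not EMC's implementation): rank the 1GB segments by I/O heat and promote the hottest ones that fit in the free EFD capacity.

```python
SEGMENT_GB = 1  # FAST v2 moves data in 1GB segments

def segments_to_promote(heat_by_segment, efd_free_gb):
    """Return the hottest segment IDs that fit in the free EFD space."""
    ranked = sorted(heat_by_segment, key=heat_by_segment.get, reverse=True)
    return ranked[: efd_free_gb // SEGMENT_GB]

# Hypothetical per-segment I/O rates for one LUN
heat = {"seg0": 120, "seg1": 15, "seg2": 900, "seg3": 330}
promote = segments_to_promote(heat, efd_free_gb=2)  # the two hottest segments
```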

EMC compression vs NetApp deduplication

I had an interesting conversation with the EMC engineer. EMC preaches block-level compression across all its systems instead of deduplication as NetApp does. Compression and decompression happen on the fly. It adds about 5% performance overhead, which you may not notice, but it gives you almost a 50% compression ratio, compared with a deduplication ratio that is only about 30% most of the time. On the SP side, compression costs about 5% utilization while dedup costs around 20% CPU.
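As back-of-the-envelope arithmetic (using only the rough savings ratios quoted in the conversation, not benchmark data), the capacity difference looks like this:

```python
def physical_tb(logical_tb, savings_ratio):
    """Physical capacity needed after data reduction."""
    return logical_tb * (1 - savings_ratio)

compressed = physical_tb(10, 0.50)  # ~50% compression savings
deduped    = physical_tb(10, 0.30)  # ~30% dedup savings
```

For 10 TB of logical data, compression would store roughly 5 TB against dedup's roughly 7 TB.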

EMC is very cautious about CPU utilization on its storage processors. They reckon normal utilization should be around 25% of a single CPU, so if one of your SPs fails, the load on the remaining CPU stays around 50%. They don't want deduplication eating too much CPU at this time, at least not with current CPU horsepower. According to them, CPUs will be much more powerful in two years, which will allow not only deduplication and compression but even running VMs (like WAN-acceleration appliances) directly on the array. In short, EMC is quite a conservative company, but it does provide awesome technology, especially for the long run.

Please leave your comments if you want.

-Silver