
Tag Archives: VCAP


First of all, I recommend that everyone check out this link:

http://www.kendrickcoleman.com/index.php?/Tech-Blog/vcap-datacenter-administration-exam-landing-page-vdca410.html

It has lots of good information for the VCAP-DCA exam.

I'm going to write down some conclusions from my own point of view. Please let me know if you spot any problems or mistakes.

Objective 4.2 – Deploy and Test VMware FT

Knowledge

Identify VMware FT hardware requirements

Identify VMware FT compatibility requirements

Skills and Abilities

Modify VM and ESX/ESXi Host settings to allow for FT compatibility

Use VMware best practices to prepare a vSphere environment for FT

Configure FT logging

Prepare the infrastructure for FT compliance

Test FT failover, secondary restart and application fault tolerance in a FT Virtual Machine

Tools

vSphere Availability Guide

Product Documentation

vSphere Client

—————————————-

A few things we need to learn for this objective.

1. VMware FT records only certain important activities on the primary VM and passes them to the secondary VM to execute. In other words, not every activity is passed through to the secondary VM.

The primary VM and secondary VM stay in sync with each other by using a technology called Record/Replay that was first introduced with VMware Workstation. Record/Replay works by recording the computer execution on a VM and saving it as a log file. It can then take that recorded information and replay it on another VM to have a replica copy that is a duplicate of the original VM.

Not everything is recorded. Instead, only non-deterministic events are recorded, which include inputs to the VM (disk reads, received network traffic, keystrokes, mouse clicks, etc.) and certain CPU events (RDTSC, interrupts, etc.). Inputs are then fed to the secondary VM at the same execution point so it is in exactly the same state as the primary VM.
The primary VM transmits its received network traffic and disk reads, plus roughly 20% overhead, to the secondary VM. That's why FT requires a dedicated Gigabit NIC for FT logging.
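To get a feeling for how much FT logging traffic a VM generates, here is a minimal sketch of the "network + disk + 20% overhead" estimate. The function name and the sample numbers are my own assumptions for illustration; the conversion of disk reads from MB/s to Mbit/s (times 8) follows the way I understand VMware's sizing guidance, so please verify it against the vSphere Availability Guide before relying on it.

```python
def ft_logging_bandwidth_mbps(disk_read_mbytes_per_s, net_in_mbits_per_s, overhead=0.20):
    """Rough FT logging bandwidth estimate in Mbit/s.

    disk_read_mbytes_per_s: average disk reads of the primary VM (MB/s)
    net_in_mbits_per_s:     average received network traffic (Mbit/s)
    overhead:               extra headroom, 20% as mentioned in the post
    """
    disk_mbits = disk_read_mbytes_per_s * 8  # MB/s -> Mbit/s
    return (disk_mbits + net_in_mbits_per_s) * (1 + overhead)


# Hypothetical VM: 20 MB/s of disk reads and 100 Mbit/s of incoming traffic.
estimate = ft_logging_bandwidth_mbps(20, 100)
print(f"Estimated FT logging bandwidth: {estimate:.0f} Mbit/s")
```

A few VMs like this one already take a large share of a 1 Gbit link, which is the reason for keeping FT logging on its own NIC.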

2. There are lots of comparisons between FT, HA and MSCS. When and where should we use which technology?

FT focuses on a short recovery time after a host failure. It doesn't require the VM to restart (as HA does), and it also works with guest operating systems other than Microsoft's (MSCS only works with Microsoft operating systems). It's easy to set up and easy to use.

Objective 4.3 – Configure a vSphere Environment to support MSCS Clustering

Knowledge

Identify MSCS clustering solution requirements

Identify the three supported MSCS configurations

Skills and Abilities

Configure Virtual Machine hardware to support cluster type and guest OS

Configure a MSCS cluster on a single ESX/ESXi Host

Configure a MSCS cluster across ESX/ESXi Hosts

Configure standby host clustering

Tools

Setup for Failover Clustering and Microsoft Cluster Service

• Product Documentation

• vSphere Client

——————————————–

MSCS can monitor at the application level and let the secondary machine take over. It's faster than HA and understands how the application works.

In a server cluster, each server owns and manages its local devices and has a copy of the operating system and the applications or services that the cluster is managing. Devices common to the cluster, such as disks in common disk arrays and the connection media for accessing those disks, are owned and managed by only one server at a time. For most server clusters, the application data is stored on disks in one of the common disk arrays, and this data is accessible only to the server that currently owns the corresponding application or service.

The following environments and functions are not supported for MSCS setups with this release of vSphere:

  • Clustering on iSCSI, FCoE, and NFS disks.
  • Mixed environments, such as configurations where one cluster node is running a different version of ESX/ESXi than another cluster node.
  • Use of MSCS in conjunction with VMware Fault Tolerance.
  • Migration with vMotion of clustered virtual machines.
  • N-Port ID Virtualization (NPIV)
  • With native multipathing (NMP), clustering is not supported when the path policy is set to round robin.
  • You must use hardware version 7 with ESX/ESXi 4.1.

Reference:

http://vmguy.com/wordpress/index.php/archives/1019

http://www.kendrickcoleman.com/index.php?/Tech-Blog/vcap-datacenter-administration-exam-landing-page-vdca410.html

http://communities.vmware.com/message/1569858

http://www.vmware.com/pdf/vsphere4/r41/vsp_41_mscs.pdf

http://communities.vmware.com/blogs/vmroyale/2009/05/18/vmware-fault-tolerance-requirements-and-limitations


Casual note:

Phew, I just had a good weekend playing paintball with friends. All my muscles are aching and sore this morning. I recommend everyone try it at least once in their life. Please remember to wear protection, because I shot a guy right in the crotch. Ho ho…

Right. Let's get back to business. As you may know, I'm preparing for the VCAP-DCA as well. There are some nice blogs on the Internet, and I am trying to add some of my own understanding. Enjoy.

Section 4 – Manage Business Continuity and Protect Data

Objective 4.1 – Implement and Maintain Complex VMware HA Solutions


Identify the three admission control policies for HA

According to Yellow Bricks, the three policies are: Host failures cluster tolerates (slot calculation), Percentage of cluster resources reserved, and Specify a failover host.

Identify heartbeat options and dependencies

A heartbeat is a ping-like message sent from secondary hosts to primary hosts and between primary hosts. The default failure detection time is 15 seconds.

The das.failuredetectiontime setting (in milliseconds) changes this time limit.

The das.isolationaddress setting gives you an additional address to ping.

Skills and Abilities

Calculate host failure requirements

A host considers itself isolated when it cannot ping the other primary hosts and also cannot ping the gateway or any other address specified with das.isolationaddress.

Configure customized isolation response settings

The failover heartbeat threshold is 15 seconds.

The das.failuredetectiontime setting changes this time limit.

Configure HA redundancy in a mixed ESX/ESXi environment

I don't see any particular requirement for ESXi, unless you want to use only one NIC for everything. In that case you may need to set das.allowNetworkX.

Configure HA related alarms and monitor an HA cluster

Create a custom slot size configuration

The slot size is the largest CPU reservation and the largest memory reservation among the powered-on VMs in the cluster. If no VM has a reservation, HA uses 256 MHz for CPU and 0 MB (plus memory overhead) for memory.

You can cap the slot size by setting das.slotCpuInMHz and das.slotMemInMB.

If a host has 25 CPU slots but only 5 memory slots, the number of slots available on that host is 5.
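The slot arithmetic above can be sketched in a few lines of Python. This is only my simplified model of it, not how HA is actually implemented: the function names, the VM list and the host sizes are made up for illustration, and the memory figures are assumed to already include the per-VM memory overhead.

```python
DEFAULT_CPU_SLOT_MHZ = 256  # used when no VM has a CPU reservation


def slot_size(vms, max_cpu_mhz=None, max_mem_mb=None):
    """Return (cpu_slot_mhz, mem_slot_mb) for a list of
    (cpu_reservation_mhz, mem_reservation_mb) tuples.

    max_cpu_mhz / max_mem_mb play the role of das.slotCpuInMHz and
    das.slotMemInMB: they cap the slot size.
    """
    cpu = max((c for c, _ in vms), default=0) or DEFAULT_CPU_SLOT_MHZ
    mem = max((m for _, m in vms), default=0)
    if max_cpu_mhz is not None:
        cpu = min(cpu, max_cpu_mhz)
    if max_mem_mb is not None:
        mem = min(mem, max_mem_mb)
    return cpu, mem


def slots_on_host(host_cpu_mhz, host_mem_mb, cpu_slot, mem_slot):
    """The smaller of the CPU slot count and the memory slot count wins."""
    cpu_slots = host_cpu_mhz // cpu_slot
    # A zero memory slot means memory does not limit the slot count.
    mem_slots = host_mem_mb // mem_slot if mem_slot else cpu_slots
    return min(cpu_slots, mem_slots)


# Hypothetical cluster: no CPU reservations, largest memory reservation 1024 MB.
cpu_slot, mem_slot = slot_size([(0, 512), (0, 1024)])
print(cpu_slot, mem_slot)
print(slots_on_host(6400, 5120, cpu_slot, mem_slot))
```

With these numbers the host has 25 CPU slots but only 5 memory slots, so it ends up with 5 slots, the same situation as in the example above.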

In version 4.1, HA can ask DRS to move VMs around to free up resources.

If you customize the slot size, be careful with VMs that have an unusually large reservation: they may or may not be able to restart, and DRS may or may not be able to help.

Understand interactions between DRS and HA

In version 4.1, HA may request DRS to move VMs around in order to free up some resources, but there is no guarantee that this will succeed.

Create an HA solution that ensures primary node distribution across sites

This matters most for blade systems: you need to make sure each chassis has its own primary HA host.

You can use

cat /var/log/vmware/aam/aam_config_util_listnodes.log

to find out which hosts are primary.

Another way to show the primary nodes is /opt/vmware/aam/bin/cli
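Once you know which hosts are primary, checking that every chassis has at least one of them is easy to script. Here is a minimal sketch; the host names, chassis names and the function name are all made up, and you would fill in the primary list by hand from the output of the commands above.

```python
def chassis_without_primary(host_to_chassis, primary_hosts):
    """Return the chassis names that contain no primary HA host."""
    covered = {host_to_chassis[h] for h in primary_hosts if h in host_to_chassis}
    return sorted(set(host_to_chassis.values()) - covered)


# Hypothetical layout: four blades in two chassis, both primaries in chassis-A.
hosts = {
    "esx01": "chassis-A",
    "esx02": "chassis-A",
    "esx03": "chassis-B",
    "esx04": "chassis-B",
}
primaries = ["esx01", "esx02"]

print(chassis_without_primary(hosts, primaries))  # ['chassis-B']
```

An empty list means every chassis holds at least one primary. Anything else tells you which chassis would be left without a primary if the other one failed.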

Re-election occurs when a primary host is:

  • Switched to maintenance mode
  • Disconnected from the cluster
  • Removed from the cluster
  • Reconfigured for HA

Analyze vSphere environment to determine appropriate HA admission control policy

If you have a dedicated host (one with a different CPU and memory configuration), you are more likely to use the designated failover host policy.

If you don't mind calculating the resource percentage manually, the percentage-reserved policy can be a good option. Just don't set the percentage too low, especially in an unbalanced cluster.

The number-of-host-failures policy is the traditional way. It can be inflexible and conservative.

Analyze performance metrics to calculate host failure requirements

This is a long story, and the best way to learn it is to check out the HA Deepdive on Yellow Bricks.

I'm trying to summarize whatever I understand. Please tell me if I missed anything.

Analyze Virtual Machine workload to determine optimum slot size

Analyze HA cluster capacity to determine optimum cluster size

Tools

vSphere Availability Guide

• Product Documentation

• vSphere Client


VMworld is right around the corner. There isn't much exciting new information surfacing during this waiting period, so I think I can use this time to gather my energy and take a little dive (not a very deep one, though) before VMworld kicks off.

I'm going to talk about PSA (Pluggable Storage Architecture), which is part of the VCAP study list. As usual, I try to keep my post as simple as I can, and any comments are very welcome.

I remember that when I started to read about the esxcli command and the concept of PSA, I was simply overwhelmed by so many different parameters and options. Just like anyone else, I was lost. But once I stepped back from the detailed commands and tried to understand what those commands were telling me, everything became clear.

Please be aware: in this post I run esxcli from an ESX(i) SSH session (it is also included in the vCLI). The vicfg-* commands should be run via vMA.

PSA Concept

To manage storage multipathing, ESXi uses a special VMkernel layer, the Pluggable Storage Architecture (PSA). The PSA is an open, modular framework that coordinates the simultaneous operation of multiple multipathing plug-ins (MPPs). The VMkernel multipathing plug-in that ESXi provides by default is the VMware Native Multipathing Plug-In (NMP). The NMP is an extensible module that manages sub plug-ins. There are two types of NMP sub plug-ins, Storage Array Type Plug-Ins (SATPs) and Path Selection Plug-Ins (PSPs). SATPs and PSPs can be built-in and provided by VMware, or can be provided by a third party.

Let's think it through. If a VM sends a SCSI command to access data on the SAN, the VMkernel needs to know how to access it and which path to choose. That's where the whole PSA kicks in. The PSA is a framework. It contains different modules and their sub-modules.

Note: PSA has 3 layers. Multipathing layer -> SATP layer -> PSP layer

From the above picture, we can see that the PSA first needs to choose between the VMware NMP (VMware's own multipathing module) and an MPP (a third-party multipathing plug-in, like EMC PowerPath). If you choose the VMware NMP, please be aware that you are not necessarily using only VMware components from this point on. The NMP is still able to load third-party sub-modules (SATP, PSP).

NMP&MPP layers

This layer basically decides which SATP to choose. It looks at what kind of physical hardware (SAN) you have. EMC? NetApp? Dell? It then loads the appropriate SATP and PSP to do the rest of the job. Please be aware that you have a bunch of SATPs and PSPs to choose from. You can let the NMP decide, or manually assign them with new claim rules.

Please be aware that MASK_PATH (LUN masking in ESX 3.5) is considered a plug-in at the same level as the NMP.

SATP layer

Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are responsible for array-specific operations.

ESXi offers a SATP for every type of array that VMware supports. It also provides default SATPs that support non-specific active-active and ALUA storage arrays, and the local SATP for direct-attached devices. Each SATP accommodates special characteristics of a certain class of storage arrays and can perform the array-specific operations required to detect path state and to activate an inactive path. As a result, the NMP module itself can work with multiple storage arrays without having to be aware of the storage device specifics.

After the NMP determines which SATP to use for a specific storage device and associates the SATP with the physical paths for that storage device, the SATP implements the tasks that include the following:

  • Monitors the health of each physical path.
  • Reports changes in the state of each physical path.
  • Performs array-specific actions necessary for storage fail-over. For example, for active-passive devices, it can activate passive paths.

Please be aware that a SATP can come from a third party, but I don't have any third-party SATP loaded in this picture.

PSP layers

Path Selection Plug-Ins (PSPs) run with the VMware NMP and are responsible for choosing a physical path for I/O requests.

The VMware NMP assigns a default PSP for each logical device based on the SATP associated with the physical paths for that device. You can override the default PSP.
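The "default PSP per SATP, unless you override it" logic can be pictured as a simple lookup. The sketch below is just my way of showing the idea: the function name is made up, and the SATP-to-PSP pairs are the defaults as I remember them from ESX 4.x, so treat them as an assumption and check your own host with esxcli nmp satp list.

```python
# Default PSP for a few built-in SATPs (assumed values, verify on your host).
DEFAULT_PSP_BY_SATP = {
    "VMW_SATP_DEFAULT_AA": "VMW_PSP_FIXED",  # generic active-active arrays
    "VMW_SATP_DEFAULT_AP": "VMW_PSP_MRU",    # generic active-passive arrays
    "VMW_SATP_ALUA": "VMW_PSP_MRU",          # generic ALUA arrays
    "VMW_SATP_LOCAL": "VMW_PSP_FIXED",       # direct-attached devices
}


def effective_psp(satp, device_override=None):
    """Return the PSP used for a device: the per-device override if one
    was set, otherwise the default PSP of the SATP that claimed it."""
    if device_override:
        return device_override
    return DEFAULT_PSP_BY_SATP[satp]


print(effective_psp("VMW_SATP_DEFAULT_AA"))                    # VMW_PSP_FIXED
print(effective_psp("VMW_SATP_DEFAULT_AA", "VMW_PSP_RR"))      # VMW_PSP_RR
```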

These PSPs correspond to the path selection policies you see in vCenter.

I'm not going to discuss each PSP here. If you have more questions, please refer to the VMware docs.

PSA command line

There are a few commands related to PSA:

esxcli, vicfg-mpath, vicfg-mpath35

So what exactly does each command do? What kind of information can I pull out or change?

vicfg-mpath (vicfg-mpath35 is for ESX 3.5)

This command lists all available paths, along with detailed information about your devices.

It can also disable a path and activate a path.

esxcli is a much more powerful command than vicfg-mpath.

You need to be aware that esxcli does much more than adjust the PSA structure. It also controls network, swiscsi, vaai, and vms behavior at the VMkernel level.

Let's take a brief look at what esxcli can do.

As you can see from the picture, as far as PSA is concerned we only focus on corestorage and nmp.

Think of esxcli as a command you can use to interact with the information held in the VMkernel. Let's take a look at what kind of information you can access.

Configuration information (or Claim Rules)

Run this command from an SSH connection to the ESXi server:

esxcli corestorage claimrule list

On the left side, we have the rule class. There are 3 types of rule class (MP, FILTER, VAAI). The FILTER and VAAI rules would also appear in the output of the command above. In this example they didn't, because hardware acceleration (VAAI) is not enabled.

Rules are evaluated from the smallest number to the largest (or from the lowest number, as VMware prefers to say). Be aware that rules 0 to 101 are reserved by VMware. Between 102 and 60000, users can create their own rules. Above 60000, the rules belong to VMware again.

When you build a rule, you need to build it as a pair (runtime and file). The file value in the class column indicates that the rule is defined. The runtime value indicates that the rule has been loaded into your system.

Plugin also means module. In this example, we have NMP (the VMware module), MASK_PATH (VMware LUN masking) and MPP_1 (a third-party module for NewVend).

The LUN masking method has changed. In ESX 3.5, we used vCenter to mask the LUNs we didn't want hosts to see. In ESX 4, we have to use the command line and create MASK_PATH rules for those LUNs.

The Matches column holds the conditions under which a rule applies. You can clearly see which conditions rules 0-4 apply to, and so forth. The last rule is like the last rule of an ISA firewall: everything that was not matched by a previous rule falls into this rule.
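The "lowest number first, last rule catches everything" behaviour can be sketched like this. It is only my simplified model of how I read the claim rule table, not the real PSA code: the class name, the function name and the sample rules are assumptions, and real rules can match on more things (transport, location, driver) than the vendor and model I use here.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimRule:
    number: int
    plugin: str
    # Conditions such as {"vendor": "NewVend"}; an empty dict matches anything.
    matches: dict = field(default_factory=dict)


def claiming_plugin(rules, path):
    """Walk the rules from the lowest number up and return the plugin
    of the first rule whose conditions all match the path."""
    for rule in sorted(rules, key=lambda r: r.number):
        if all(path.get(key) == value for key, value in rule.matches.items()):
            return rule.plugin
    return None


# Hypothetical rule set modelled on the table discussed above.
rules = [
    ClaimRule(65535, "NMP"),  # catch-all, like the last firewall rule
    ClaimRule(200, "MPP_1", {"vendor": "NewVend"}),
    ClaimRule(101, "MASK_PATH", {"vendor": "DELL", "model": "Universal Xport"}),
]

print(claiming_plugin(rules, {"vendor": "NewVend", "model": "X100"}))          # MPP_1
print(claiming_plugin(rules, {"vendor": "EMC", "model": "SYMMETRIX"}))         # NMP
print(claiming_plugin(rules, {"vendor": "DELL", "model": "Universal Xport"}))  # MASK_PATH
```

Note that the order in which the rules are written doesn't matter, only their numbers do. That's why a user rule has to get a lower number than the catch-all rule to have any effect.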

Device information

Have you ever wondered where you can see the new UUID, vml, and other information for the LUNs or devices connected to your host? You can read my post linked in the references, or you can use vicfg-mpath -l to do the job. You can also use esxcli nmp device list, but esxcli presents the information from the PSA point of view.

SATP & PSP options

If you want to see which SATPs and PSPs you have and can choose from, you can use

esxcli nmp satp list

esxcli nmp psp list

All right. I believe it should be easy to understand now. The VMkernel holds lots of information about PSA. You can use esxcli and vicfg-mpath to get that information and modify it as you want. I have to say, this is a doc for understanding, not a reference doc. If you do want to add a new path, mask LUNs, or use different rules, you still need to check the official docs before you actually execute any commands.

Please do leave a comment if you want.

Reference:

ESXi Configuration Guide

iSCSI SAN Configuration Guide

https://geeksilver.wordpress.com/2010/08/09/vmware-vsphere-4-1-vs-esx-3-x-storage-identifier-understanding/