Troubleshooting VMware ESXi

Advanced Troubleshooting of VMware ESXi Server

-/var/log/vmksummary.log – host rebooted
-/var/log/boot.gz – slow boot (check from the line where the VMkernel modules loaded successfully)
-DCUI – Alt+F12 – live troubleshooting (VMkernel log messages)
-If the host does not have persistent logging: during the boot phase press Shift+O to get the boot prompt, then enable serial logging
-Validate the esx.conf file: check the keyword perennialResvd against the NAA ID of the LUN
Slow boot occurs when RDM devices are not tagged as perennially reserved
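
The perennially reserved check above can also be done from the device list rather than by reading esx.conf. The following is a minimal sketch, assuming the usual layout of `esxcli storage core device list` output (an unindented device ID line followed by indented fields, one of which is "Is Perennially Reserved"). The function name and the sample device IDs are hypothetical.

```shell
# Print the ID of every device whose "Is Perennially Reserved" flag is false.
# Input (stdin): text laid out like `esxcli storage core device list` output.
unreserved_devices() {
    awk '
        /^[^ \t]/ { dev = $1 }                          # unindented line = device ID
        /Is Perennially Reserved: false/ { print dev }  # flag belongs to the last ID seen
    '
}

# Hypothetical sample, trimmed to the relevant fields.
sample_device_list='naa.600000000000000000000000000000a1
   Display Name: Example RDM LUN
   Is Perennially Reserved: false
naa.600000000000000000000000000000b2
   Display Name: Example datastore LUN
   Is Perennially Reserved: true'

printf '%s\n' "$sample_device_list" | unreserved_devices
```

On a host, the same function would be fed with `esxcli storage core device list | unreserved_devices`, and an RDM LUN can then be tagged with `esxcli storage core device setconfig -d <naa.id> --perennially-reserved=true`.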

  • hostd-probe.log – check whether the hostd service is up or down, and look for unresponsive events
    vmInventory.xml – contains the list of VMs registered to the host
  • Powering on a VM – vmware.log (if there is an issue, check the vmware.log files located in the VM directory)

-Storage issues
vmkernel.log – check for SCSI sense codes (ScsiDeviceIO)
H:0x0 D:0xcc P:0x4 Valid sense data: 0x5 0x24 0x45 (H = host, D = device, P = plugin; the three sense values are sense key, ASC, and ASCQ)
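
To make the status line easier to read when scanning a long log, the fields can be pulled out with sed. This is a small sketch, assuming the "H:.. D:.. P:.. Valid sense data: .. .. .." layout shown above. The function name is hypothetical and the sample line simply reuses the values from these notes.

```shell
# Extract host/device/plugin status and the sense triple from vmkernel.log lines.
# Input (stdin): log text; lines that do not match the layout are skipped.
parse_scsi_status() {
    sed -n 's/.*H:\(0x[0-9a-fA-F]*\) D:\(0x[0-9a-fA-F]*\) P:\(0x[0-9a-fA-F]*\).*sense data: \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\).*/host=\1 device=\2 plugin=\3 sense_key=\4 asc=\5 ascq=\6/p'
}

# Sample line built from the values in the notes (not a real log excerpt).
sample_scsi_line='ScsiDeviceIO: command failed H:0x0 D:0xcc P:0x4 Valid sense data: 0x5 0x24 0x45.'

printf '%s\n' "$sample_scsi_line" | parse_scsi_status
```

On a host this would typically be run as `grep ScsiDeviceIO /var/log/vmkernel.log | parse_scsi_status`.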

vobd.log – storage (iSCSI or NFS) and network issues

Network congestion – pktcap-uw (run from source to destination)

fdm.log – HA issues
To check the host ID for HA: /opt/vmware/fdm/fdm/prettyPrint.sh hostlist | less
Then go to fdm.log and search for the host ID (it will show whether the host is a slave or the master)

2- Commands

  • ESXCLI
  • vsish (VMkernel sysinfo shell)
    Hardware
    vsish /bios (check BIOS version)
    vsish /hardwareinfo
    Network
    vsish /pnic/vmnic>stats – dropped transmit or dropped receive packets
    vsish /portsets/vswitch>stats
    Storage
    vsish /scsifw/adapter – status, dropped messages, latency, usage
    vsish /scsifw/devices – status, dropped messages, latency, usage

Check whether a VM has enough vCPUs allocated or is using over 100% – use the vsish command –
First get the VM cartel ID
#esxcli vm process list (get cartel ID)
#ps | grep -E -i 'vmx|vmname' (get cartel ID)
Run vsish
>get-sched>vcpus>cartelID>groupID (it will show the group ID; you can also get the group ID by running esxtop -v)
>get /sched/groups/groupID/stats/cpustatsDIR/cpuStats
Look for the Demand Entitlement Ratio: a value of 100 or above means the VM is getting CPU properly; a value of 150
or above means the VM needs more vCPUs allocated; a value around 50 or 25 means the VM is over-provisioned
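
The rule of thumb above can be written as a small helper for use in scripts. This is only a sketch: the thresholds are the ones from these notes, not official VMware guidance, the function name is hypothetical, and values between 50 and 100 are reported as unclassified because the notes give no guidance for that range.

```shell
# Classify a Demand Entitlement Ratio using the thresholds from the notes.
# $1 = ratio as a whole number (for example 120).
classify_der() {
    ratio=$1
    if [ "$ratio" -ge 150 ]; then
        echo "needs-more-vcpu"       # 150 and above: VM needs more vCPUs
    elif [ "$ratio" -ge 100 ]; then
        echo "ok"                    # 100 to 149: VM is getting CPU properly
    elif [ "$ratio" -le 50 ]; then
        echo "over-provisioned"      # 50 or below: VM is over-provisioned
    else
        echo "unclassified"          # 51 to 99: not covered by the notes
    fi
}

classify_der 160
classify_der 25
```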

vim-cmd (register a VM, power a VM on or off, graceful shutdown, host maintenance mode, etc.)
vim-cmd vimsvc/task_list (shows tasks that are currently running or stuck, such as a stuck snapshot consolidation or VMware Tools installation)

-vmkfstools

vmkfstools -i to clone a disk and consolidate its snapshots (if the snapshot chain is broken)
vmkfstools -e (to check the VM snapshot chain, PID and CID in the vmdks)
vmkfstools -t10 (check inode integrity of the vmdk file)
vmkfstools -x to check and repair a vmdk file
    vmkfstools -x check locationof.vmdk
    vmkfstools -x repair locationof.vmdk

MEMSTATS

Get the cartel ID (virtual machine stats are checked based on the VMX cartel ID)

memstats -r vm-stats -s name:balloonTGT:B=ballooned:swapTgt:swapped:memSize:mapped:active
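
Since both the vsish and memstats checks start from the cartel ID, it can help to pull it out by VM name. This is a minimal sketch, assuming the usual `esxcli vm process list` layout in which the "VMX Cartel ID" field appears before the "Display Name" field within each VM block. The function name and the sample VMs are hypothetical.

```shell
# Print the VMX cartel ID for the VM whose display name matches $1.
# Input (stdin): text laid out like `esxcli vm process list` output.
cartel_id_for() {
    awk -v name="$1" '
        /^ *VMX Cartel ID:/ { cartel = $NF }        # remember the latest cartel ID
        /^ *Display Name:/ {
            sub(/^ *Display Name: /, "")            # keep only the name itself
            if ($0 == name) print cartel
        }
    '
}

# Hypothetical sample with two VMs; unrelated fields are omitted.
sample_vm_list='web01
   World ID: 1001
   Process ID: 0
   VMX Cartel ID: 1000
   Display Name: web01

app01
   World ID: 2101
   Process ID: 0
   VMX Cartel ID: 2100
   Display Name: app01'

printf '%s\n' "$sample_vm_list" | cartel_id_for app01
```

On a host this would be used as `esxcli vm process list | cartel_id_for <vmname>`.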

pktcap-uw (network packet capture and analysis) – use this for packet drops at the VM, uplink, or host level
pktcap-uw --trace --vmk vmk0 | less
pktcap-uw --trace --uplink

esxtop – real-time performance monitoring

3- ESXi Configuration Files
/etc/vmware/esx.conf – hardware, storage, and network information
/etc/vmware/hostd/vmInventory.xml – registered VM list
/etc/vmware/hostd/authorization.xml – authorizes the connection between vCenter and the host
/etc/vmware/vpxa/vpxa.cfg – vCenter and ESXi connectivity
/etc/vmware/vmkiscsid/iscsid.conf – iSCSI configuration file
/etc/vmware/fdm – FDM configuration files with clusterconfig, hostlist, and VM metadata
/etc/vmware/license.cfg – license configuration file
