Advanced Troubleshooting of VMware ESXi Server
-/var/log/vmksummary.log – host reboots
-/var/log/boot.gz – slow boot (read from the line where the VMkernel modules loaded successfully)
-DCUI – Alt+F12 (live VMkernel log messages for troubleshooting)
-If the host does not have persistent logging: during the boot phase press Shift+O to get the boot prompt and enable serial logging
-In the esx.conf file, validate the keyword perennialResvd against the NAA ID of the LUN
Boot is slow when RDM devices are not tagged as perennially reserved
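A minimal sketch for spotting devices that are not tagged, assuming the device listing prints each device ID at column 0 followed by indented properties that include an "Is Perennially Reserved" line. The function name and the sample IDs in the usage note are hypothetical.

```shell
# List device IDs whose "Is Perennially Reserved" property is false.
# Reads a device listing on stdin (assumed layout: device ID at column 0,
# properties indented underneath it).
not_perennially_reserved() {
    awk '
        /^[^ \t]/ { dev = $1 }                              # remember the current device ID
        /Is Perennially Reserved: false/ { print dev }      # report untagged devices
    '
}
```

On a host the input would typically come from `esxcli storage core device list | not_perennially_reserved`. A reported RDM LUN can then be tagged with `esxcli storage core device setconfig -d <naa.id> --perennially-reserved=true`.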
-hostd-probe.log – check whether the hostd service is up or down, and unresponsive events
-vmInventory.xml – contains the list of VMs registered to the host
-Powering on a VM – vmware.log (if there is an issue, check the vmware.log files located in the VM directory)
-Storage issues
vmkernel.log – check for SCSI sense codes (ScsiDeviceIO)
H:0x0 D:0xcc P:0x4 valid sense data 0x5 0x24 0x45 (H – host, D – device, P – plugin; the three sense values are sense key, ASC, ASCQ)
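A small sketch for pulling the status fields out of one such log line, assuming the line carries the fields in the form "H:0x.. D:0x.. P:0x.. Valid sense data: 0x.. 0x.. 0x..". The function name and the sample line in the usage note are hypothetical, not taken from a real log.

```shell
# Print host/device/plugin status and sense key/ASC/ASCQ from one log line.
# Prints nothing if the line does not match the assumed layout.
parse_scsi_status() {
    printf '%s\n' "$1" | sed -n 's/.*H:\(0x[0-9a-fA-F]*\) D:\(0x[0-9a-fA-F]*\) P:\(0x[0-9a-fA-F]*\).*[Vv]alid sense data: \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\).*/host=\1 device=\2 plugin=\3 sense_key=\4 asc=\5 ascq=\6/p'
}
```

Example: `parse_scsi_status 'cmd failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.'` prints the six fields on one line, which makes it easier to grep and count recurring codes.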
vobd.log – storage (iSCSI or NFS) and network issues
Network congestion – pktcap-uw (run from source to destination)
fdm.log – HA issues
To check the host ID for HA: /opt/vmware/fdm/fdm/prettyPrint.sh hostlist | less
Then go to fdm.log and search for the host ID (it shows whether the host is a slave or the master)
2- Commands
- ESXCLI
- vsish (VMkernel Sys Info Shell)
Hardware
vsish /bios (check BIOS version)
vsish /hardwareinfo
Network
vsish /pnic/vmnic>stats – dropped transmit or dropped receive packets
vsish /portsets/vswitch>stats
Storage
vsish /scsifw/adapter status – dropped messages, latency, usage
vsish /scsifw/devices status – dropped messages, latency, usage
Check whether a VM has the right number of vCPUs allocated or is using over 100% CPU, using vsish:
First get the VM cartel ID
#esxcli vm process list (get cartel ID)
#ps | grep -E -i 'vmx|vmname' (get cartel ID)
Run vsish
>get-sched>vcpus>cartelID>groupID (shows the group ID; you can also get the group ID from the GID column in esxtop)
>get /sched/groups/groupID/stats/cpustatsDIR/cpuStats
Look for the Demand Entitlement Ratio: a value of 100 or above means the VM is getting its CPU properly; a value of 150
or above means the VM needs more vCPUs allocated; a value around 50 or 25 means the VM is over-provisioned
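The thresholds above can be sketched as a small helper, assuming the ratio has already been read from the cpuStats output as a whole number. The function name and labels are hypothetical, and values between 50 and 100 are not covered by these notes, so the sketch reports them as unclassified.

```shell
# Classify a Demand Entitlement Ratio value using the thresholds in these notes.
# $1 must be a whole number.
classify_der() {
    ratio=$1
    if [ "$ratio" -ge 150 ]; then
        echo "needs-more-vcpu"      # 150 or above: VM needs more vCPUs
    elif [ "$ratio" -ge 100 ]; then
        echo "ok"                   # 100 or above: VM is getting its CPU
    elif [ "$ratio" -le 50 ]; then
        echo "over-provisioned"     # around 50 or 25: too many vCPUs
    else
        echo "unclassified"         # assumption: range not described in the notes
    fi
}
```

Example: `classify_der 160` prints needs-more-vcpu and `classify_der 25` prints over-provisioned.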
vim-cmd (register a VM, power a VM on or off, graceful shutdown, host maintenance mode, etc.)
vim-cmd vimsvc/task_list (shows tasks that are currently running or stuck, such as a stuck snapshot consolidation or a Tools installation)
vmkfstools
vmkfstools -i – clone a disk; can be used to consolidate a snapshot chain into a new VMDK if the chain is broken
vmkfstools -e – check the VM snapshot chain (PID, CID in the VMDKs)
vmkfstools -t10 – check inode integrity of the VMDK file
vmkfstools -x – check and repair a VMDK file:
vmkfstools -x check locationof.vmdk
vmkfstools -x repair locationof.vmdk
MEMSTATS
Get the cartel ID (VM stats are checked based on the VMX cartel ID)
memstats -r vm-stats -s name:balloonTGT:B=ballooned:swapTgt:swapped:memSize:mapped:active
pktcap-uw (network packet capture and analysis tool) – use this for packet drops at the VM, uplink, or host level
pktcap-uw --trace --vmk vmk0 | less
pktcap-uw --trace --uplink
esxtop – real-time performance monitoring
3- ESXi Configuration Files
/etc/vmware/esx.conf – hardware, storage, and network information
/etc/vmware/hostd/vmInventory.xml – registered VM list
/etc/vmware/hostd/authorization.xml – authorizes the connection between vCenter and the host
/etc/vmware/vpxa/vpxa.cfg – vCenter and ESXi connectivity
/etc/vmware/vmkiscsid/iscsid.conf – iSCSI configuration file
/etc/vmware/fdm – FDM configuration files: clusterconfig, hostlist, VM metadata
/etc/vmware/license.cfg – license configuration file