ESXTOP -CPU, RAM, Memory Counters - VMwareTroubleShooting

Troubleshooting CPU, memory, and disk with counter thresholds:

CPU

%RDY – >10 – Percentage of time a VM was waiting to be scheduled. Values between 5 and 10% already deserve attention. Possible reasons: too many vCPUs, too many vSMP VMs, or a CPU limit setting.

%CSTP – >3 – Relevant if you are using vSMP virtual machines. It shows the percentage of time a ready-to-run VM has spent in the co-deschedule state. If the value is >3, decrease the number of vCPUs of the VM concerned.

%MLMTD – >1 – Percentage of time a ready-to-run vCPU was not scheduled because of a CPU limit setting. Remove the limit for better performance.

%VMWAIT – 100 – Percentage of time a VM was waiting for some VMkernel activity to complete (such as I/O) before it could continue. Includes %SWPWT and “blocked” time, but not idle time (as %WAIT does). Possible causes: a storage performance issue, or latency to a device in the VM configuration, e.g. a USB device or a serial or parallel pass-through device.

%SWPWT – >5 – Percentage of time a VM has to wait for swapped pages to be read from disk. A possible reason is memory overcommitment. Pay attention if %SWPWT is >5!

%SYS – >10 – Percentage of time spent by the system processing interrupts and performing other system activities on behalf of the world. Possible cause: a VM with high I/O.
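The CPU thresholds above can be checked mechanically once you have one sample of counter values per VM. The following is only a minimal sketch: the threshold numbers are taken from the list above, while the dictionary layout and the function name `flag_cpu_counters` are made up for illustration. %VMWAIT is left out because the list gives its value as 100 rather than as a "greater than" threshold.

```python
# Thresholds copied from the CPU list above (values in percent).
# The dict layout and the function name are illustrative assumptions.
CPU_THRESHOLDS = {
    "%RDY": 10.0,
    "%CSTP": 3.0,
    "%MLMTD": 1.0,
    "%SWPWT": 5.0,
    "%SYS": 10.0,
}


def flag_cpu_counters(sample):
    """Return the counters in `sample` whose value exceeds its threshold."""
    return {
        name: value
        for name, value in sample.items()
        if name in CPU_THRESHOLDS and value > CPU_THRESHOLDS[name]
    }


# Hypothetical sample for one VM.
sample = {"%RDY": 12.4, "%CSTP": 0.8, "%MLMTD": 0.0, "%SWPWT": 6.1, "%SYS": 2.0}
print(flag_cpu_counters(sample))  # {'%RDY': 12.4, '%SWPWT': 6.1}
```

In this made-up sample, %RDY and %SWPWT are flagged, which would point towards vCPU sizing and memory overcommitment as the first things to look at.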

Disk

DAVG/cmd >25 Latency at the device driver level. Indicator for storage performance problems.

KAVG/cmd >3 Latency caused by the VMkernel. Possible cause: queuing (wrong queue depth parameter or wrong failover policy).

GAVG: >25 GAVG = DAVG + KAVG

ABRTS/s >1 Commands aborted per second. If the storage system has not responded within 60 seconds, VMs running a Windows operating system will issue an abort.

RESET/s >1 Number of commands reset per second.
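To make the relation GAVG = DAVG + KAVG and the disk thresholds above concrete, here is a small sketch. It assumes the latency values are in milliseconds per command; the function name `disk_warnings` and its parameters are hypothetical and not part of esxtop.

```python
def disk_warnings(davg_ms, kavg_ms, aborts_per_s=0.0, resets_per_s=0.0):
    """Derive GAVG from DAVG and KAVG and list the counters above threshold.

    Thresholds are the ones from the disk list above; latencies are
    assumed to be in milliseconds per command.
    """
    gavg_ms = davg_ms + kavg_ms  # GAVG = DAVG + KAVG
    warnings = []
    if davg_ms > 25:
        warnings.append("DAVG/cmd")
    if kavg_ms > 3:
        warnings.append("KAVG/cmd")
    if gavg_ms > 25:
        warnings.append("GAVG")
    if aborts_per_s > 1:
        warnings.append("ABRTS/s")
    if resets_per_s > 1:
        warnings.append("RESET/s")
    return gavg_ms, warnings


# Hypothetical values: the device itself is fine, but kernel queuing
# pushes the total latency over the limit.
print(disk_warnings(20.0, 6.0))  # (26.0, ['KAVG/cmd', 'GAVG'])
```

Splitting the total this way shows where the latency comes from: a high DAVG points at the storage side, a high KAVG at queuing inside the VMkernel.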

NUMA Node – in ESXTOP, press “M” and change fields D, G

N%L – <80 – Percentage of VM memory located on the local NUMA node. If this value is less than 80 percent, the VM will experience performance issues.

NLMEM: VM Memory (in MB) located at local Node

NRMEM: VM Memory (in MB) located at remote Node

NMN: NUMA node where the VM is located
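The NUMA counters above fit together in a simple way if you assume that N%L is just the local share of the VM's memory, i.e. NLMEM / (NLMEM + NRMEM). That relation is an assumption of this sketch, and the function name `numa_locality_percent` is made up for illustration.

```python
def numa_locality_percent(nlmem_mb, nrmem_mb):
    """Local share of VM memory in percent, from NLMEM and NRMEM (both in MB).

    Assumes N%L = NLMEM / (NLMEM + NRMEM) * 100.
    Returns None when no memory is reported at all.
    """
    total_mb = nlmem_mb + nrmem_mb
    if total_mb == 0:
        return None
    return 100.0 * nlmem_mb / total_mb


# Hypothetical VM: 6144 MB on the local node, 2048 MB on a remote node.
locality = numa_locality_percent(6144, 2048)
print(locality)       # 75.0
print(locality < 80)  # True, so below the 80 percent threshold
```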

Memory

MCTLSZ: >1 Amount of guest physical memory (in MB) the ESXi host is reclaiming through the balloon driver. A reason for this is memory overcommitment.

SWCUR: >1 Memory (in MB) that has been swapped by the VMkernel. Possible cause: memory overcommitment.

SWR/s, SWW/s >1 Rate at which the ESXi host is reading from or writing to swapped memory. Possible cause: memory overcommitment.

CACHEUSD >1 Memory (in MB) compressed by the ESXi host.

ZIP/s >1 Values larger than 0 indicate that the host is actively compressing memory.

UNZIP/s >1 Values larger than 0 indicate that the host is accessing compressed memory. Reason: memory overcommitment.

Memory State:

high: enough free memory available (normal TPS cycles)

clear <100% of minFree: ESXi actively calls TPS to collapse pages

soft <64% of minFree: Host reclaims memory by balloon driver + TPS

hard <32% of minFree: Host starts to swap, compress + TPS / no more ballooning

low <16% of minFree: ESXi blocks VMs from allocating more RAM +
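The memory state list above is a ladder of thresholds on free memory relative to minFree, and can be read as a simple classifier. The sketch below is only a literal reading of that list: the function name `memory_state` and its inputs are hypothetical, and it does not model how the host actually measures free memory or moves between states.

```python
def memory_state(free_mb, min_free_mb):
    """Map free memory, as a percentage of minFree, to a state name.

    Boundaries follow the list above: <16% low, <32% hard,
    <64% soft, <100% clear, otherwise high.
    """
    if min_free_mb <= 0:
        raise ValueError("min_free_mb must be positive")
    percent_of_min_free = 100.0 * free_mb / min_free_mb
    if percent_of_min_free < 16:
        return "low"
    if percent_of_min_free < 32:
        return "hard"
    if percent_of_min_free < 64:
        return "soft"
    if percent_of_min_free < 100:
        return "clear"
    return "high"


# Hypothetical host with minFree = 1000 MB.
for free_mb in (1200, 900, 500, 200, 100):
    print(free_mb, memory_state(free_mb, 1000))
# 1200 high
# 900 clear
# 500 soft
# 200 hard
# 100 low
```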
