delete vmkdump-files on ESXi hosts (and hidden vmkfstools options)

Recently I wanted to unmount a VMFS volume to remove it gracefully from all hosts. Unfortunately I couldn't, because some files were still locking the volume. These files were located in [volume]/vmkdump. Trying to remove them with rm did not work because they were locked.

HOW TO find the locking host of a locked file
  • [Using lsof does not work for me because the files are not listed on any of my hosts]
  • Execute: vmkfstools -D 11111111-2222-3333-4444-555555555555.dumpfile. “-D” is a hidden option that shows file metadata and locking information. The owner field identifies the locking host as an ID like aaaaaaaa-bbbbbbbb-cccc-dddddddddddd, where dddddddddddd is the MAC address of a NIC of the locking host.
  • Use esxcli network nic list to list NICs and MAC addresses of the host.
  • [I am not 100% sure, but I think some time ago the MAC address of the Management Console portgroup was used to identify a locking host. So if you can’t find your host by physical MAC, use esxcfg-vmknic -l to check these MACs too.]
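Matching the owner ID against the NIC list is easier when the last segment is written as a MAC address. The following is a minimal sketch of that conversion. It assumes the owner ID has the form shown above, with the last 12 hex digits being the MAC. The function name owner_to_mac and the sample ID are made up for illustration.

```python
def owner_to_mac(owner_id):
    """Turn the last segment of a lock owner ID into a colon-separated MAC.

    Assumes the layout described above: aaaaaaaa-bbbbbbbb-cccc-dddddddddddd,
    where the final 12 hex digits are the MAC of a NIC of the locking host.
    """
    last = owner_id.strip().split("-")[-1].lower()
    if len(last) != 12 or any(c not in "0123456789abcdef" for c in last):
        raise ValueError("unexpected owner ID format: %r" % owner_id)
    # Group the 12 hex digits into six pairs, as esxcli prints MAC addresses.
    return ":".join(last[i:i + 2] for i in range(0, 12, 2))


# Hypothetical owner ID, not taken from a real host.
print(owner_to_mac("aaaaaaaa-bbbbbbbb-cccc-001122334455"))
```

Compare the result with the MAC column of esxcli network nic list (or esxcfg-vmknic -l) on each host. Those tools may print the MAC in a different letter case.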

I started to investigate the situation. These files are created by a host, which uses them as a target for core dumps when a dump becomes necessary. But a dump partition is also created during installation, you may ask. Yes, there is! But these partitions are sized 100 MB in 5.1 and earlier, whereas ESXi 5.5 creates a partition 2.5 GB in size. A host upgraded to ESXi 5.5 therefore needs more dump space than its 100 MB partition offers, so it may create the files mentioned above during the boot process.

How to check the partition size of your dump-partition
  1. Show the partition table of your boot disk by using partedUtil on your host. First find out the ID of your boot device. Either look it up in the GUI by checking the boot controller and the Identifier of the device, which is an ID like naa.00000000000000000000000000000000, or list the content of /vmfs/devices/disks. You see at least 2 entries for each device/disk. For the boot device you see more entries, like: naa.00000000000000000000000000000000, naa.00000000000000000000000000000000:1, naa.00000000000000000000000000000000:2, …, naa.00000000000000000000000000000000:8. The numbers at the end identify the partitions.
  2. Execute partedUtil getptbl /vmfs/devices/disks/naa.00000000000000000000000000000000. You get a line for each partition. Columns are: 1: partition number; 2: start-sector; 3: end-sector; 4: ID; 5: type; 6: attribute. The partition of type vmkDiagnostic is your dump-partition.
  3. To calculate the size of your dump-partition:
    (end-sector – start-sector) * 512 / 1024 / 1024 = size (MB).
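The calculation in step 3 can be sketched as a small script that reads getptbl output and applies the formula to every vmkDiagnostic line. This is only a sketch: the function name dump_partition_sizes is hypothetical, the sample output is illustrative rather than copied from a real host, and the 512-byte sector size is taken from the formula above. If the end sector is inclusive, the exact size is one sector (512 bytes) larger than the formula yields, which does not matter at this scale.

```python
SECTOR_BYTES = 512  # sector size assumed by the formula above


def dump_partition_sizes(getptbl_output):
    """Return {partition number: size in MB} for each vmkDiagnostic line."""
    sizes = {}
    for line in getptbl_output.splitlines():
        fields = line.split()
        # Partition lines have six columns:
        # number, start-sector, end-sector, ID, type, attribute.
        if len(fields) != 6 or fields[4] != "vmkDiagnostic":
            continue
        number, start, end = int(fields[0]), int(fields[1]), int(fields[2])
        sizes[number] = (end - start) * SECTOR_BYTES / 1024 / 1024
    return sizes


# Illustrative output only; the values on your host will differ.
sample = """gpt
445 255 63 7168000
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
9 1843200 7086079 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
"""

for number, size_mb in sorted(dump_partition_sizes(sample).items()):
    print("partition %d: %.1f MB" % (number, size_mb))
```

Lines that do not have six columns, such as the label and geometry lines at the top, are skipped.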

Handle vmkdump-files
  • Check coredump files by running esxcli system coredump file list. The output shows the absolute path of each file and whether it is active.
  • To show the active and the configured file, use esxcli system coredump file get.

Remove vmkdump-files

Execute the following commands on the locking host:

  • To un-configure the dump-file, execute: esxcli system coredump file set -u.
  • To remove the previously configured dump-file, execute: esxcli system coredump file remove -f /vmfs/volumes/volume/vmkdump/11111111-2222-3333-4444-555555555555.dumpfile
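If several dump-files have to be cleaned up, the two commands can be wrapped in a short script. The sketch below only builds the exact command lines quoted above and prints them by default. The helper names removal_commands and remove_dumpfile are hypothetical, and it assumes a Python interpreter with the subprocess module is available where you run it.

```python
import subprocess


def removal_commands(dumpfile):
    """Build the two esxcli calls from the list above, in order."""
    return [
        # Un-configure the currently configured dump-file.
        ["esxcli", "system", "coredump", "file", "set", "-u"],
        # Remove the previously configured dump-file.
        ["esxcli", "system", "coredump", "file", "remove", "-f", dumpfile],
    ]


def remove_dumpfile(dumpfile, dry_run=True):
    """Print the commands (dry run) or execute them on the locking host."""
    for cmd in removal_commands(dumpfile):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


# Placeholder path from the article; dry run only prints the commands.
remove_dumpfile(
    "/vmfs/volumes/volume/vmkdump/11111111-2222-3333-4444-555555555555.dumpfile"
)
```

Keeping the dry run as the default makes it possible to review the commands first and pass dry_run=False only on the host that holds the lock.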

Links
  • More information on using partedUtil here
  • More about hidden options for vmkfstools (like --activehosts) here
  • Configuring ESXi coredump to file instead of partition here
  • Configuring a diagnostic coredump partition on an ESXi 5.x host here
