After recovering from a power failure, I noticed VMs were not migrating to one of the vSphere 5 hosts in an HA/DRS cluster. That host had a status message that read “Unable to apply DRS resource settings on host”. Oddly enough, that status message would go away and reappear periodically. Trying to manually vMotion a VM to the host would fail with the helpful message “A specified parameter was not correct.” Oh! Why didn’t you say so?! Ergh.
I tried putting the host in maintenance mode, removing it from the cluster and even removing it from vCenter Server. None of these steps helped. Time to get in the weeds!
I collected the logs from the host (“Administration|Export System Logs”) and started perusing. In the hostd.log file, this is what caught my eye:
2012-06-05T19:30:15.309Z [FFBB0B90 info ‘TaskManager’ opID=f2ea5ac7] Task Created : haTask-ha-root-pool-vim.ResourcePool.createResourcePool-125613723
2012-06-05T19:30:15.310Z [35AB7B90 error ‘ResourcePool ha-root-pool’ opID=f2ea5ac7] Duplicate name ‘Server Virtualization’
2012-06-05T19:30:15.310Z [35AB7B90 info ‘Default’ opID=f2ea5ac7] AdapterServer caught exception: vim.fault.DuplicateName
2012-06-05T19:30:15.310Z [35AB7B90 info ‘TaskManager’ opID=f2ea5ac7] Task Completed : haTask-ha-root-pool-vim.ResourcePool.createResourcePool-125613723 Status error
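If the exported log bundle is large, a quick filter can surface entries like the ones above. This is only a minimal sketch, assuming a POSIX shell and a hostd.log already extracted to a local path; the function name is my own invention, not a VMware tool.

```shell
#!/bin/sh
# Print the hostd.log lines that point at a resource pool name collision.
# Usage: find_duplicate_pool_errors /path/to/hostd.log
# (find_duplicate_pool_errors is a hypothetical helper name.)
find_duplicate_pool_errors() {
    log_file="$1"
    # -F treats each pattern as a fixed string; each -e adds one pattern.
    grep -F -e "Duplicate name" -e "vim.fault.DuplicateName" "$log_file"
}
```

Any output at all suggests the host is tripping over a resource pool name it already has on record.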
Okay, so now I have something to go on: the resource pools didn’t get removed properly and couldn’t be recreated because of a duplicate name. This is easy enough to fix.
- Enable and start the DCUI, SSH and ESXi Shell
- Either get on the console (Alt+F1 to get to tech support mode) or connect via SSH, and log in to the shell as root
- Run this command:
# mv /etc/vmware/hostd/pools.xml /tmp
- Return to the DCUI (Alt+F2 from the console, or run “dcui” from SSH) and log on as root
- Restart Management Agents (under “Troubleshooting Options”)
- Stop SSH & ESXi Shell
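For anyone who would rather stay in the shell, the file move and agent restart above can be sketched roughly as follows. Treat it as an illustration under assumptions, not the procedure I actually ran: the function names are hypothetical, the timestamped backup name is my own addition, and restarting the agents via the `/etc/init.d/hostd` and `/etc/init.d/vpxa` init scripts is assumed to be equivalent to the DCUI’s restart option.

```shell
#!/bin/sh
# Move a file into a backup directory under a timestamped name and print
# the new path. Returns non-zero if the source file does not exist.
# (backup_pools_file is a hypothetical helper name.)
backup_pools_file() {
    src="$1"
    dest_dir="$2"
    [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
    dest="$dest_dir/$(basename "$src").$(date +%Y%m%d-%H%M%S)"
    mv "$src" "$dest" && echo "$dest"
}

# Restart the host agent and the vCenter agent from the shell.
# Assumption: this stands in for the DCUI restart of the management agents.
restart_management_agents() {
    /etc/init.d/hostd restart
    /etc/init.d/vpxa restart
}

# Example, run as root on the affected host:
#   backup_pools_file /etc/vmware/hostd/pools.xml /tmp && restart_management_agents
```

Keeping a timestamped copy rather than a bare move just makes it easier to put the file back if the restart does not help.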