VMware is great! It offers wonderful functionality, like VMotion, High Availability, Distributed Resource Scheduler and much more, right from a Graphical User Interface that any 11-year-old can understand.
I really mean that. But beware! This wonderful GUI allows you to destroy your Virtual Infrastructure without warning. So you’d better know what you are doing.
I’d like to point you to a known issue from ESX version 3.0.1 that I found to be still present in version 3.5. Its results are not End-Of-The-World-type mayhem, but they will still make even the most experienced administrator break a sweat.
I’m talking about the Rescan Storage dialog:
Clicking OK while both checkboxes are checked is like playing Russian Roulette with your ESX server. Most of the time you’ll be able to have a cup of coffee while this is running. But sometimes… BANG! … your ESX host hangs and you’ll be the one doing the running. Because when ESX hangs, all its VMs hang. Which causes a lot of your users to want to call you. And they won’t want to discuss the weather.
The problem is caused by a deadlock on the HBA, and the only way out is a hard reset of your server.
So how do we prevent this horrible event from ever occurring, you ask? Simple. Remember to never check more than one box at a time when using this dialog.
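If you script your rescans instead of clicking through the dialog, the same "one box at a time" habit can be sketched roughly as below. This is a minimal sketch, not a tested fix: I am assuming a storage-system object that exposes the `RescanAllHba` and `RescanVmfs` methods of the VMware API's `HostStorageSystem`, the function name `rescan_in_two_passes` is my own invention, and the 30-second pause is a guess rather than a documented value. I have not verified that rescanning this way avoids the HBA deadlock.

```python
import time


def rescan_in_two_passes(storage_system, pause_seconds=30, sleep=time.sleep):
    """Run the two rescans one after the other instead of in one go.

    storage_system is assumed to behave like the VMware API's
    HostStorageSystem (hypothetical wiring, see the note below the block).
    """
    # Pass 1: the "new storage devices" checkbox, on its own.
    storage_system.RescanAllHba()

    # Let the HBAs settle before the next pass.
    # The default of 30 seconds is an assumption, not a documented value.
    sleep(pause_seconds)

    # Pass 2: the "new VMFS volumes" checkbox, on its own.
    storage_system.RescanVmfs()
```

With a library such as pyVmomi, the object to pass in would typically be a host's `configManager.storageSystem`. Treat that wiring as an assumption too, and try it on a host you can afford to reset first.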
So, how’s the weather over there?