Recovering phantom VMs

Ok this will be a post about something that had me baffled for 3 days…
We had an ESXi server with 2 live VMs: a Windows 2008 R2 DC and a Windows 2008 R2 Exchange 2010 server.
Someone at work somehow managed to delete both VMs’ .vmdk descriptor files, .vmx files and snapshots.
So you would think this was a genuine epic fail. The VMs were no longer listed in vSphere, and the datastore held only the flat .vmdk files (luckily!) and some delta files.

However, upon investigation it turned out that the VMs were still running?!
They were phantom VMs. The domain was still functioning and so was Exchange, but there was nothing to see in ESXi.

You don’t need to be a rocket scientist to realize that this is both good and bad: if a VM is restarted for any reason, then the VM and the domain are toast. But it did give me the chance to think about the best recovery option. Cloning was not an option since the VMs weren’t listed in vSphere, and neither was Veeam or anything similar. There were also about 4 snapshots per machine, which had been deleted too, so the snapshot chain was gone as well.

So here’s how I did it.
I’ve tested 2 options and both were successful.

Option 1 = Take a backup with Backup Exec or any other decent software. Then recreate the .vmdk (disk descriptor file), recreate the VM, and restore the backup taken with BE. (I did this with the 2008 DC.)

Option 2 = Do a V2V conversion of the running VM, resulting in a new VM with all your settings preserved. (I did this with the Exchange 2010 server.)

Option 1:

TAKE A BACKUP OF ALL THE FILES YOU NEED FIRST!!! There is no guarantee that you will have a working VM after this!!

-Log into the ESX host using SSH or the console
-cd into the right folder (/vmfs/volumes/yourdatastore/yourphantomVM)
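-To recreate a .vmdk (disk descriptor file):
First you need the size of the FLAT file in bytes. You can check this by issuing “ls -l yourfile-flat.vmdk”; the size in bytes is the number just before the date. (Plain “du” reports disk usage in blocks rather than the file size in bytes, so don’t rely on it here.)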
-then enter this command to create the files:
vmkfstools -c -youramountofbytes- -a lsilogic -d thin temp.vmdk
This will create a temp.vmdk descriptor (plus a temp-flat.vmdk) with an lsilogic controller (this is the default one, but in your case it could be buslogic or something else, so please check).
It will also mark the disk as thin to save space. If your FLAT.vmdk is thick provisioned, you will need to remove the ddb.thinProvisioned line from the descriptor later.
-Now we have the needed .vmdk file (temp.vmdk), so we can remove the newly created temp-flat.vmdk since we don’t need it (careful: do not delete your original FLAT file!):
rm temp-flat.vmdk
-Next we need to rename the new temp.vmdk to match your FLAT file, without the “-flat” part (so for myvm-flat.vmdk the descriptor is myvm.vmdk).
To do so issue: “mv temp.vmdk -yourfilenamereflectingtheFLATfile-.vmdk”
-Now we need to edit the descriptor file to point to the right disk.
vi yourfile.vmdk
-In case you’re not familiar with the vi editor: press “i” (or the “Insert” key) to be able to edit the file and “Escape” to leave edit mode. To save your changes enter “:w”, and to exit vi enter “:q”.
-Now edit the file so it resembles:

# Disk DescriptorFile

# Extent description
RW 8388608 VMFS "YOUR FLAT FILE.vmdk"

# The Disk Data Base

ddb.virtualHWVersion = "4"
ddb.geometry.cylinders = "522"
ddb.geometry.heads = "255"
ddb.geometry.sectors = "63"
ddb.adapterType = "lsilogic"
ddb.thinProvisioned = "1"


-Now that we have our new disk descriptor file, we can recreate the VM.
-Knowing you have a backup of the phantom, you can power it down.
-Log into vSphere/vCenter and create a new VM; before completing the wizard, choose to edit the hardware.
-Remove the disk attached to the VM, click “Add”, add a new hard disk and browse to the new .vmdk you made.
-Hopefully this will boot your VM without any problems and all is well.
-You may hit the same issue I had: only the base disk could be recovered, so the VM comes up in the state it was in before any snapshot was taken. This is because the snapshots are toast as well. Therefore I hope you listened and made a backup using Backup Exec or something similar, so you can restore the AD/system state etc.
-Now you only need to clean up a bit: the new VM will have its disks residing in another folder, so move them to the new folder (carefully!)
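
Since one typo here can cost you the only copy of your disk, it may help to print the commands for review before typing them on the host. Below is a minimal dry-run sketch of the Option 1 commands: it only echoes what would be run and touches nothing. The file name myvm-flat.vmdk, the 4 GB size and the helper function names are made-up examples, not something from my setup. The sector calculation assumes that the RW value in the extent line counts 512-byte sectors, which is what the 8388608 in the example descriptor corresponds to for a 4 GB disk.

```shell
#!/bin/sh
# Dry-run helper: prints the Option 1 commands and the extent line for review.
# It does NOT touch the datastore. All values below are hypothetical examples.

FLAT="myvm-flat.vmdk"   # example flat file name (assumption)
BYTES=4294967296        # size in bytes, as reported by: ls -l "$FLAT"
ADAPTER="lsilogic"      # or buslogic etc., depending on the original VM

# The descriptor name is the flat file name without the "-flat" part.
descriptor_name() {
    echo "${1%-flat.vmdk}.vmdk"
}

# Assumption: the RW field of the extent line counts 512-byte sectors.
sectors_from_bytes() {
    echo $(( $1 / 512 ))
}

# Print the command sequence from the steps above without executing it.
# Arguments: flat file name, size in bytes, adapter type.
print_commands() {
    echo "vmkfstools -c $2 -a $3 -d thin temp.vmdk"
    echo "rm temp-flat.vmdk"
    echo "mv temp.vmdk $(descriptor_name "$1")"
}

print_commands "$FLAT" "$BYTES" "$ADAPTER"
echo "RW $(sectors_from_bytes "$BYTES") VMFS \"$FLAT\""
```

Swap in your own file name and byte count, compare the printed lines with what you expect, and only then run them by hand in the VM’s folder. File names containing spaces would need quoting in the real commands.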

Option 2:


-Remote desktop to the VM, or use another PC for the conversion
-Download and install the VMware Converter Standalone client.
-Follow the wizard and choose “convert this powered on machine”
-Choose your ESX host as the destination
-Select the appropriate options for the VM to be created (the defaults are normally fine); just check the disks, and mark them as “Thin” if needed.
-Then run the converter.
-When it is done, log into your ESX host and you will see a fresh VM.
-Edit the VM’s settings and disconnect the network adapter (to prevent conflicts with the phantom VM, which is still running)
-Boot the VM and check everything using the vSphere console. When all looks swell, you can power off your phantom VM and start using the new one!

V-79-57344-34108 – An unexpected error occurred when cleaning up snapshot volumes. Confirm that all snapped volumes are correctly resynchronized with the original volumes.

Last week I had another interesting issue with Backup Exec.
As you can see from the title, the job ended with:

V-79-57344-34108 – An unexpected error occurred when cleaning up snapshot volumes. Confirm that all snapped volumes are correctly resynchronized with the original volumes.

and another :

V-79-57344-65033 – Directory not found. Cannot backup directory D:\HOMEDIRS\BLA\Desktop and its subdirectories.

Needless to say, the directory was of course available, and the backup had run fine for ages until then…

But first a little scenario of the problem:

-ESX 4 Host
-Windows 2008 R2 SP1 with shares on drive “D:”
-Backup of C: has no problem whatsoever
-VSS errors filling up event log during backup of D:

So to isolate the problem I checked every goddamn Symantec KB, every MS KB… all to no avail.
When checking vCenter, nothing unusual!

But then, when I was out of ideas and just for the fun of it connected directly to the ESX host instead of vCenter, I saw a snapshot task for the said VM that had been hanging at 95% for about a week already!! (It came from the Backup Exec VMware backup.)

So I cancelled this snapshot task, but still no luck with my D: disk.

Then my colleague spotted something I hadn’t seen: for some reason we had 2 VSS providers:
-Backup Exec VSS Provider
-BEVSS Provider

As it turns out, when making backups of whole VMs, Backup Exec registers an extra VSS provider (BEVSS Provider) that shows up in the Windows services console, and removes it again when the backup is complete. But because the snapshot hung, the provider didn’t get removed, and that somehow messed up Backup Exec!

Now to solve this:

There is a script that gets called after a VM backup job (post-thaw-script.bat),
located in “c:\windows”.

Run this script manually and reboot; this should fix the issue and remove the extra VSS provider!!

more info here: