Power down the VM - if that is not possible or takes a long time - kill the VM.
For reasons that do not matter right now the VM was started with a misconfigured snapshot-chain.
If you continue to use it like this, the chances of recovering the normal state of the VM drop quickly.
Do NOT allow any filesystem checks to continue - abort any chkdsk or fsck operations.
Do NOT run any defragmentation tools.
After you have shut down or killed the VM, make a copy of all vmware.log files, the vmx-file, the vmsd-file
and all small text-size vmdk-files.
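The copy step can also be scripted. The following is only a minimal sketch: the function name collect_evidence and the 64 KiB cut-off for "small text-size" vmdk-files are assumptions made for this example, not VMware-defined values. It copies files and never moves or edits them.

```python
import shutil
from pathlib import Path

# Assumption: descriptor-only vmdk files are tiny text files. 64 KiB is an
# arbitrary cut-off chosen for this sketch, not a VMware-defined limit.
SMALL_VMDK_LIMIT = 64 * 1024


def collect_evidence(vm_dir, backup_dir):
    """Copy logs, vmx, vmsd and small vmdk files; return the copied names."""
    vm_dir, backup_dir = Path(vm_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(vm_dir.iterdir()):
        if not path.is_file():
            continue
        name = path.name.lower()
        is_log = name.startswith("vmware") and name.endswith(".log")
        is_config = name.endswith((".vmx", ".vmsd"))
        is_small_vmdk = (name.endswith(".vmdk")
                         and path.stat().st_size <= SMALL_VMDK_LIMIT)
        if is_log or is_config or is_small_vmdk:
            shutil.copy2(path, backup_dir / path.name)  # copy only, never move
            copied.append(path.name)
    return copied
```

Large data chunks such as *-flat.vmdk files are skipped on purpose; they belong in the full backup, not in this quick copy.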
Now don't act in haste.
Go away from that VM and do not touch it for the next hour ... drink a coffee ... relax.
Once the danger of doing anything very stupid is gone, go back to the VM.
If mission critical data is missing make a full backup of the complete VM as it is now.
The next steps are best done with a copy ...
Follow this checklist :
1. - find out which vmdk was configured in the vmx-file used during last start - check vmware.log and the vmx-file
2. - find out which vmdk was used during last successful starts - check the older vmware.logs
3. - check if the vmdk that was used during last successful starts is still present
4. - repeat the first steps in case the VM uses more than one vmdk
5. - now you should know WHAT is wrong
6. - ask co-workers if they changed settings , edited the vmx-file or moved or deleted files
7. - analyse the history of this VM - was it ever expanded, moved or imported ?
8. - check if any automatic backup-tools ever touched this VM
9. - now you should know WHY something went wrong - does it make sense ? - if not go back to 1.
10. - look at the involved vmx-file and vmdk-descriptions - find a way to correct them with minimal edits
11. - does one edit of the vmx plus one edit of the CID-values per vmdk fix it ? - if NO ... are you sure ?
12. - are you sure ?
13. - make your minimally invasive edits
14. - find out if any vmem files are present - remove or rename them
15. - inspect the vmsd file - adjust it or if unsure remove the file
16. - find out if any vmsn-files are present - if unsure what to do with them - remove them
17. - find out if any vmss-files are present - remove them
18. - boot the VM into a LiveCD that does not automount anything - check if state of data is the expected one
19. - optional - copy lost data from the LiveCD to a network-share
20. - boot VM from harddisk into SAFE mode
It is expected behaviour that on first start the original Operating system complains and wants to check the disks.
At this point allow it and cross your fingers ....
DO NOT SHORTCUT THIS CHECKLIST
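Two of the lookups in the checklist are easy to get wrong by eye: which vmdk the vmx-file references (step 1) and which vmem, vmsn or vmss files are lying around (steps 14, 16 and 17). The sketch below only reads and reports, it changes nothing. The function names are made up for this example, and it assumes the usual vmx layout of one key = "value" pair per line.

```python
import re
from pathlib import Path

# One vmx entry per line: key = "value". Lines starting with # are skipped.
VMX_LINE = re.compile(r'^\s*([^#\s=][^=]*?)\s*=\s*"(.*)"\s*$')


def vmx_disk_references(vmx_text):
    """Return {device: vmdk path} for every *.fileName entry naming a vmdk."""
    disks = {}
    for line in vmx_text.splitlines():
        match = VMX_LINE.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key.lower().endswith(".filename") and value.lower().endswith(".vmdk"):
            disks[key[: -len(".fileName")]] = value
    return disks


def leftover_state_files(vm_dir):
    """List vmem, vmsn and vmss files found in the VM directory."""
    wanted = {".vmem", ".vmsn", ".vmss"}
    return sorted(p.name for p in Path(vm_dir).iterdir()
                  if p.is_file() and p.suffix.lower() in wanted)
```

Compare the output of vmx_disk_references against the vmdk names in the older vmware.logs before you decide on any edit.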
This often happens during Snapshot-manager operations or with normal VMs that use any type of growing virtual disks.
The VMware GUI then pops up a message and tells you to free up some space , continue or abort the operation.
Do not answer this popup.
Very likely both options will fail so the only possible way is to fix the problem at the root.
Free up some space NOW - migrate other VMs to a different datastore on ESX
or move some other data to a portable USB-disk on hosted platforms.
When done - answer the popup.
The most portable format is twoGbMaxExtentSparse
This format
- can be stored on USB devices formatted with FAT32
- can be stored on one or more DVDs without need to split the file first
- can be natively used by all hosted platforms
- can be imported to ESX with vmkfstools
These vmdk-types use embedded descriptions : monolithicSparse , streamOptimized
In case you want to edit this embedded description do NOT use hex-editors.
These files can be very large, so it may take a long time to load them into a hex-editor.
Also you have to be very careful not to change the size of the file while editing.
The much safer and easier way is to extract the embedded description first.
On a Windows host this can be done with the tool dsfo.exe
After you have done your edits re-inject the description with dsfi.exe
The dsfok-tools were created by Dariusz Stanislawek
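If you only want to read the embedded description without touching the vmdk, a few lines of Python can stand in for the extraction step. This is a sketch, not a replacement for the dsfok-tools: it assumes the header layout from VMware's published virtual disk format description (magic KDMV, then version, flags, capacity, grainSize, descriptorOffset and descriptorSize, with offsets counted in 512-byte sectors). The function name is made up for this example, and the file is opened read-only.

```python
import struct

SECTOR = 512
# Leading fields of the hosted sparse extent header (little-endian):
# magic, version, flags, capacity, grainSize, descriptorOffset, descriptorSize.
HEADER = struct.Struct("<4sIIQQQQ")


def read_embedded_descriptor(vmdk_path):
    """Return the embedded descriptor text without modifying the vmdk."""
    with open(vmdk_path, "rb") as handle:
        fields = HEADER.unpack(handle.read(HEADER.size))
        magic, desc_off, desc_size = fields[0], fields[5], fields[6]
        if magic != b"KDMV":
            raise ValueError("not a hosted sparse vmdk (missing KDMV magic)")
        if desc_off == 0 or desc_size == 0:
            raise ValueError("this extent carries no embedded descriptor")
        handle.seek(desc_off * SECTOR)
        raw = handle.read(desc_size * SECTOR)
    # The descriptor area is padded with NUL bytes up to its full size.
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace")
```

Writing an edited description back is the risky part, because the descriptor area must keep its exact size. The sketch leaves that step out on purpose.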
The types monolithicFlat and vmfs are very similar.
They both use a *-flat.vmdk data chunk.
To use an existing *-flat.vmdk with either of these two types a small manual edit
in the description is sufficient.
1. set createType : monolithicFlat for use with Workstation or vmfs for use with ESX
2. check the extent description: use FLAT for Workstation and VMFS for ESX
3. check the offset value : use 0 for Workstation and remove the offset for ESX
4. check the ddb.virtualHWVersion and adjust if necessary
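Steps 1 to 3 can be written down as a small helper that returns the createType and the extent line for either target. This is a sketch under assumptions: the function name and the target labels "workstation" and "esx" are made up for this example, and the extent line follows the usual form of access mode, size in sectors, extent type, file name and optional offset.

```python
def flat_extent_settings(flat_name, sectors, target):
    """Return (createType, extent line) for an existing *-flat.vmdk.

    target is "workstation" or "esx" (labels chosen for this sketch).
    sectors is the size of the flat file in 512-byte sectors.
    """
    if target == "workstation":
        # Workstation: type FLAT with an explicit offset of 0.
        return "monolithicFlat", f'RW {sectors} FLAT "{flat_name}" 0'
    if target == "esx":
        # ESX: type VMFS and no offset value.
        return "vmfs", f'RW {sectors} VMFS "{flat_name}"'
    raise ValueError("target must be 'workstation' or 'esx'")


# Usage: a 4 GB flat file holds 4 * 1024**3 // 512 = 8388608 sectors.
create_type, extent = flat_extent_settings("disk-flat.vmdk", 8388608, "esx")
```

Step 4, the ddb.virtualHWVersion, still has to be checked by hand against the platform you move the disk to.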
Let's have a look at a broken snapshot-chain.
The example you will see next is taken from a case I had in the German forum recently.
The 3 descriptor-files are taken from a corrupt snapshot-chain - the user had accidentally moved the basedisk and had started it without the snapshots. He noticed the problem just in time, did not do anything in the VM and powered it down at once. He had used disks of type monolithicSparse (one piece growing).
So he had to extract the embedded descriptorfiles first - see
Here is the result
When he had moved back the basedisk to the original directory VMware complained:
can't open snapshots because basedisk has been changed!
Let's find out how VMware noticed that ...
look at the descriptions of the snapshots: each of them has a line
parentFileNameHint = "path_to_parent" - this is used to find the parent-disk when VMware launches a vmx-file that has a snapshot as disk-reference.
These paths must point from child - child - .... - child - parent.
Check if this is correct in this case: yes - no problem.
Now every disk-description also uses two lines with CID-notes.
The CID is a random value that gets autocreated by VMware.
To check if any changes have been made to a parent - a child always notes the CID value of its parent.
If this is still the same at the next start everything is fine - if not, all alarm bells ring.
Let's check this case:
the second child expects this value parentCID=06ebe4dc on its parent.
the first child has this CID=06ebe4dc and expects this parentCID=daf6cf10 on its parent
the parent has this CID=3160ee93
Noticed anything ?
Yes - the CID of the parent-disk is not the one that was expected by the child.
So this snapshot chain is broken - you can not use it in a VM any longer.
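The same comparison can be done mechanically. The sketch below takes the descriptor texts ordered from the newest child down to the basedisk and reports every place where a child's parentCID does not match the CID of the disk below it. The function names are made up for this example, and the check only reads the CID lines.

```python
import re


def read_cids(descriptor_text):
    """Return (CID, parentCID) from a descriptor, lower-cased."""
    # Anchoring at the line start keeps "CID=" from matching "parentCID=".
    cid = re.search(r"^\s*CID\s*=\s*([0-9a-fA-F]{8})", descriptor_text, re.M)
    parent = re.search(r"^\s*parentCID\s*=\s*([0-9a-fA-F]{8})",
                       descriptor_text, re.M)
    if not cid or not parent:
        raise ValueError("descriptor lacks a CID or parentCID line")
    return cid.group(1).lower(), parent.group(1).lower()


def check_chain(descriptors):
    """descriptors: texts ordered from newest child to basedisk.

    Returns a list of (child_index, expected_cid, found_cid) mismatches.
    """
    cids = [read_cids(text) for text in descriptors]
    problems = []
    for index in range(len(cids) - 1):
        expected = cids[index][1]      # what the child noted about its parent
        found = cids[index + 1][0]     # what the parent actually carries
        if expected != found:
            problems.append((index, expected, found))
    return problems
```

Fed with the three descriptors from this case, the only mismatch reported is between the first child (expects daf6cf10) and the parent (carries 3160ee93).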
Can we fix that?
That depends - if you have done no changes to the basedisk you may be lucky and come out with a disk that is clean and readable. If you have done something like a defrag in the basedisk while you used it on its own the result can force a checkdisk at startup and can end with a complete data loss.
So whenever you do something like this don't panic - think twice before you do anything at all. First understand what has gone wrong!
Back to the example:
This is the repaired version of the descriptor-files that I sent back to the poster.
Unless you have misconfigured VMware to not write logs the last log contains
all data to restore a lost descriptor for a vmdk from scratch. This also applies to embedded descriptors!
Proceed like this:
1. find out which disktype you need
2. copy a sample descriptor from the examples
3. grab data from the log and replace the entries from the sample
- 3a. doublecheck the parameter adapterType against the vmx-file.
- 3b. if unsure about disk-geometry - compare against the disk-geometry-table
A working vmdk-descriptor must declare these values:
If the vmdk uses several data chunks you must calculate or better guess the size of each chunk.
The type twoGbMaxExtentSparse usually has chunks of 4192256 sectors.
The type twoGbMaxExtentFlat usually has chunks of 4193792 sectors.
In both types the last chunk varies in size to fit the overall capacity.
Warning: if a vmdk was created with custom size chunks or was expanded earlier you can NOT follow this rule-of-thumb for calculating the size of the several extent description lines !
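The rule-of-thumb can be turned into a few lines of Python that print the extent description lines for a given capacity. It is a sketch with the same limits as the rule itself: it is wrong for disks with custom chunk sizes or disks that were expanded earlier. The function name is made up, and the file naming (-s001, -s002 ... for sparse and -f001, -f002 ... for flat) is assumed to follow the usual convention, so compare it with the files that actually exist.

```python
# Usual chunk sizes in sectors for the two split disk types.
CHUNKS = {
    "sparse": (4192256, "SPARSE", "s", ""),
    "flat": (4193792, "FLAT", "f", " 0"),   # flat extents carry an offset
}


def extent_lines(base_name, total_sectors, kind="sparse"):
    """Return the extent lines for a split vmdk of total_sectors sectors."""
    chunk, keyword, letter, suffix = CHUNKS[kind]
    lines, remaining, index = [], total_sectors, 1
    while remaining > 0:
        size = min(chunk, remaining)   # the last chunk takes the remainder
        lines.append(
            f'RW {size} {keyword} "{base_name}-{letter}{index:03d}.vmdk"{suffix}')
        remaining -= size
        index += 1
    return lines


# Usage: a 5 GB disk has 5 * 1024**3 // 512 = 10485760 sectors.
for line in extent_lines("disk", 10485760, "sparse"):
    print(line)
```

For the 5 GB example this gives two full chunks of 4192256 sectors and a last chunk of 2101248 sectors.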
Once we have all the parameters together we can now re-create the description
In case you have a good last vmware.log, extracting the last used vmx-file is quite easy:
Search through the log for DICT-entries - the log first lists the preference settings
and then prints the vmx-parameters that were used.
The relevant section starts at keyword: CONFIGURATION - next section is USER DEFAULTS.
All entries in between are part of the vmx-file. Just copy that section and delete the time-stamp in front of the line ...
The example shows a vmware.log - the embedded vmx-file is shown in blue.
Procedure works on all platforms.
You need to wrap the arguments in quotes - compare the black embedded text with the output vmx listed below.
After extraction and final polishing the file should look like this.
Note that "FALSE" is the same as "false", "TRUE" is the same as "true".
The sequence of lines does not matter.
You can sort the lines alphabetically if you like.
Undefined parameters with = "" can be skipped.
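The extraction can be scripted as well. The sketch below relies only on what is described above: it looks at lines containing DICT, starts collecting at the CONFIGURATION marker, stops at the next marker (USER DEFAULTS), drops everything in front of DICT including the time-stamp and wraps each value in quotes. The function name is made up, and treating every DICT line without an = sign as a section marker is an assumption of this example.

```python
def vmx_from_log(log_text):
    """Rebuild vmx lines from the DICT entries of a vmware.log."""
    lines, in_config = [], False
    for raw in log_text.splitlines():
        _, found, rest = raw.partition("DICT")
        if not found:
            continue                     # not a DICT entry at all
        rest = rest.strip()
        if "=" not in rest:
            # Section marker: collect only while inside CONFIGURATION.
            in_config = "CONFIGURATION" in rest
            continue
        if in_config:
            key, _, value = rest.partition("=")
            value = value.strip().strip('"')
            lines.append(f'{key.strip()} = "{value}"')
    return "\n".join(lines)
```

Entries that come out as = "" can be deleted afterwards, and the result should still be compared by eye against the log before it is used as a vmx-file.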
This creates a vmdk named cd-slices-disk.vmdk of the nominal size 4690 Mb with each single chunk exactly 670 Mb.
This should fit on a set of CDs easily.
The first line creates the first chunk.
The other lines simply expand the first chunk multiple times.
Comparison of both types of sparse (growing) virtual disks
monolithicSparse = a growing disk in one piece
twoGbMaxExtentSparse = a growing disk split into pieces that have a max size of 2 Gb
The one-piece type seems to be a reasonable choice at first sight - following the simple straight logic:
one virtual disk = one file
Let's have a deeper look at the differences: