help recovering from file corruption?

rebel

Aug 27, 2009, 9:42:54 PM

Yesterday I discovered that one HDD partition had extensive corruption to files
and folders. Unfortunately the imaged backup is the same - corrupted. So I
have lost a lot of valuable files.

Having tried looking at some corrupted files with a hex editor, I decided that I
might as well run CHKDSK and let it do its worst. In one case, it declared a
folder corrupt and placed the entire folder into a file. At the end of its pass
it had created 524 ".CHK" files. Looking into these I found *some* had
seemingly valid contents, albeit with some crud appended to the end in many
cases. There were two such file types I recognised from the editor display. I
copied all 524 to another drive for "processing".

I wrote a quick'n'dirty QB45 prog to identify which of the 524 were of each of
these two types, and to rename them and move them to another folder. Then the
subtleties of what I really needed/wanted to do emerged.

Type 1 files - ah yes. The original pre-corruption format is ascii, with a
specific start line and end line. However the .CHK files containing them often
contain more_than_one file, separated by more crud. I was preparing to read and
write out each file on a line-by-line basis, but the crud contains random
characters which include delimiters AND EOF's. Faced with EOF's, LINE INPUT#
errors with "input past EOF" and I'm not sure if I can just RESUME NEXT and
carry on looking for the (ascii) start line of the second, third etc
file_within_the_file. Similarly, if I use INPUT$(n, #filenum) I may not capture the
identifying strings correctly. I don't want to resort to binary reads and
building strings one character at a time unless there is no alternative.
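One way around the EOF problem is to skip line-oriented reads altogether: load the whole .CHK file as raw bytes and search it for the start and end lines, so stray delimiters and Ctrl-Z characters in the crud never reach an input routine. Below is a minimal sketch in Python rather than QB45. The BEGIN RECORD / END RECORD markers are hypothetical stand-ins, since the real start and end lines are not given here.

```python
def extract_blocks(data: bytes, start: bytes, end: bytes) -> list:
    """Return every span that begins with `start` and runs to the next `end`."""
    blocks = []
    pos = 0
    while True:
        begin = data.find(start, pos)
        if begin == -1:
            break  # no further start line
        stop = data.find(end, begin + len(start))
        if stop == -1:
            break  # start line without an end line: truncated file, skip it
        stop += len(end)
        blocks.append(data[begin:stop])
        pos = stop  # resume the search after this block
    return blocks


# Hypothetical markers; substitute the real start and end lines.
START = b"BEGIN RECORD"
END = b"END RECORD"

# Two embedded files separated by crud that includes Ctrl-Z (0x1A) bytes.
sample = (
    b"\x00\x1a junk"
    + b"BEGIN RECORD\r\nline 1\r\nEND RECORD"
    + b"\x1a\x1a,,\x00"
    + b"BEGIN RECORD\r\nline 2\r\nEND RECORD"
    + b"\xff tail"
)
blocks = extract_blocks(sample, START, END)
print(len(blocks))  # 2
```

The same idea should carry over to QB45: OPEN ... FOR BINARY with GET # into a string buffer, then INSTR to locate the markers, which avoids assembling strings one character at a time. QB's string length limit has to be kept in mind when sizing the buffer.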

IrfanView announced that Type 2 was a .JPEG file with an incorrect extension -
renaming the extension worked and the image was intact despite the
aforementioned "crud". But out of over 4000 images originally on the partition
it is difficult to place these back where they belong without the date
information. Fortunately these are "Exif" image files which contain the
camera's date/time in the file header. Question: Rather than the .CHK file's
creation date/time, how (from DOS, and preferably from a QB prog) can I change
the dates on these files to match that contained within the file? I can do each
file manually using Xtreme, but that is incredibly tedious and error-prone, so
I'd prefer a programmatic solution, but I'm no whizz at batch files so QB45
would be the tool of choice.

I'd welcome any constructive suggestions on each of the above. TIA

Todd Vargo

Aug 27, 2009, 10:19:55 PM

Don't beat yourself up. I would just use Jhead to set the file date/time to
the stored date/time.

http://www.sentex.net/~mwandel/jhead
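Jhead handles this directly (its usage notes list a -ft option that sets the file time from the Exif time). For anyone curious what is involved, here is a rough Python sketch of the same idea, not Jhead's actual code. It assumes a well-formed Exif APP1 segment, prefers DateTimeOriginal (tag 0x9003) over DateTime (tag 0x0132), and treats the stored time as local time. The function names are made up for illustration.

```python
import os
import struct
import time


def _read_ifd(tiff, order, offset):
    """Return {tag: (type, count, value_or_offset)} for one TIFF directory."""
    (count,) = struct.unpack_from(order + "H", tiff, offset)
    entries = {}
    for i in range(count):
        tag, typ, n, value = struct.unpack_from(
            order + "HHII", tiff, offset + 2 + 12 * i
        )
        entries[tag] = (typ, n, value)
    return entries


def _ascii(tiff, entry):
    """Decode an ASCII (type 2) entry whose data is stored at an offset."""
    typ, n, offset = entry
    if typ != 2:
        return None
    return tiff[offset:offset + n].split(b"\x00")[0].decode("ascii", "replace")


def _tiff_datetime(tiff):
    order = {b"II": "<", b"MM": ">"}.get(tiff[:2])  # byte-order mark
    if order is None:
        return None
    (ifd0_offset,) = struct.unpack_from(order + "I", tiff, 4)
    ifd0 = _read_ifd(tiff, order, ifd0_offset)
    if 0x8769 in ifd0:  # pointer to the Exif sub-directory
        sub = _read_ifd(tiff, order, ifd0[0x8769][2])
        if 0x9003 in sub:  # DateTimeOriginal
            return _ascii(tiff, sub[0x9003])
    if 0x0132 in ifd0:  # DateTime
        return _ascii(tiff, ifd0[0x0132])
    return None


def exif_datetime(data):
    """Return the Exif 'YYYY:MM:DD HH:MM:SS' string from JPEG bytes, or None."""
    if data[:2] != b"\xff\xd8":
        return None  # no JPEG start-of-image marker
    pos = 2
    try:
        while pos + 4 <= len(data) and data[pos] == 0xFF:
            marker = data[pos + 1]
            (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
            payload = data[pos + 4:pos + 2 + length]
            if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
                return _tiff_datetime(payload[6:])
            if marker == 0xDA:
                break  # start of scan: image data follows, no more headers
            pos += 2 + length
    except struct.error:
        pass  # truncated or damaged header
    return None


def restamp(path):
    """Set the file's access/modification time from its Exif date; True on success."""
    with open(path, "rb") as f:
        stamp = exif_datetime(f.read())
    if not stamp:
        return False
    try:
        when = time.mktime(time.strptime(stamp, "%Y:%m:%d %H:%M:%S"))
    except (ValueError, OverflowError):
        return False  # date field present but not parseable
    os.utime(path, (when, when))
    return True
```

Calling restamp() on each recovered file in turn would restamp the lot. Trailing crud after the image data does not matter here, because the date sits in the header near the start of the file.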

--
Todd Vargo
(Post questions to group only. Remove "z" to email personal messages)

rebel

Aug 27, 2009, 11:59:03 PM

On Thu, 27 Aug 2009 22:19:55 -0400, "Todd Vargo" <tlv...@sbcglobal.netz> wrote:

>Don't beat yourself up. I would just use Jhead to set the file date/time to
>the stored date/time.
>
>http://www.sentex.net/~mwandel/jhead

Wow! That's one handy little utility. Took a whole five minutes to download,
read the usage notes and restamp all the target files. Many thanks for that
one.

Now for the other type of files ...

rebel

Aug 28, 2009, 8:26:42 AM

I found that programmatically clearing the crud hurdle was a PITA, and wound up
using cut and paste from within Wordpad. The difficulty was in reading one
character at a time and assembling strings when a character matched, far too
fiddly.

Todd Vargo

Aug 28, 2009, 9:39:56 PM

rebel wrote:

> I found that programmatically clearing the crud hurdle was a PITA, and wound up
> using cut and paste from within Wordpad. The difficulty was in reading one
> character at a time and assembling strings when a character matched, far too
> fiddly.

A bit of explanation on the "crud". The file system is made of clusters,
usually in some multiple of 512 bytes (typically 4KB to 32KB depending on
OS). As files are created and deleted, the data remains on the HD but only
the file pointer is actually destroyed. So for a 5 byte file for example, a
full cluster is used by that file. The file system keeps track of which
cluster(s) hold the file and how many bytes are actually used. The remaining
bytes in the cluster are called slack space. Slack space holds residual data
from some other file that was deleted some time ago.
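To put numbers on that, here is a small sketch in Python. The function name is made up for illustration, and it assumes an empty file is allocated no clusters at all.

```python
def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to a file but not occupied by its data."""
    if file_size == 0:
        return 0  # assumption: an empty file gets no clusters
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


print(slack_bytes(5, 4096))      # 4091: a 5-byte file in one 4KB cluster
print(slack_bytes(10000, 4096))  # 2288: three clusters (12288 bytes) allocated
```

So the 5-byte file in a 4KB cluster carries 4091 bytes of leftover data with it.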

So what does all this mean you ask? Well, when scandisk/chkdsk saves a .chk
file, it does not know the original file name or size so it saves the entire
cluster (file data and slack). Depending on original file size, multiple
.chk files can make up larger files but the .chk files are not guaranteed to
be in the correct order. The .chk files are usually garbage so consider
yourself lucky to have salvaged any files at all.

rebel

Aug 29, 2009, 2:33:40 AM

On Fri, 28 Aug 2009 21:39:56 -0400, "Todd Vargo" <tlv...@sbcglobal.netz> wrote:

>rebel wrote
>>
>> I found that programmatically clearing the crud hurdle was a PITA, and wound up
>> using cut and paste from within Wordpad. The difficulty was in reading one
>> character at a time and assembling strings when a character matched, far too
>> fiddly.
>
>A bit of explanation on the "crud". The file system is made of clusters,
>usually in some multiple of 512 bytes (typically 4KB to 32KB depending on
>OS). As files are created and deleted, the data remains on the HD but only
>the file pointer is actually destroyed. So for a 5 byte file for example, a
>full cluster is used by that file. The file system keeps track of which
>cluster(s) hold the file and how many bytes are actually used. The remaining
>bytes in the cluster are called slack space. Slack space holds residual data
>from some other file that was deleted some time ago.
>
>So what does all this mean you ask? Well, when scandisk/chkdsk saves a .chk
>file, it does not know the original file name or size so it saves the entire
>cluster (file data and slack). Depending on original file size, multiple
>.chk files can make up larger files but the .chk files are not guaranteed to
>be in the correct order.

Yes, I was aware of that. Interestingly, some of the crud contained XP
installation stuff, and XP has never been near this box (or HDD) at any time.

>The .chk files are usually garbage so consider
>yourself lucky to have salvaged any files at all.

I *do* consider I was lucky - 38 .jpgs and 46 files of another type recovered
intact, and one file recovered nearly complete that can be rebuilt.

Todd Vargo

Aug 30, 2009, 11:03:49 PM

rebel wrote:
> Interestingly, some of the crud contained
> XP installation stuff, and XP has never been near this box (or HDD)
> at any time.

Apparently, someone inserted an XP installation disk and canceled at some
point during the installation process. <shrug>