Data recovery project

rick-...@uiowa.edu

Nov 3, 2009, 6:08:23 PM

According to Peter Norton's Programmer's Guide to the IBM PC, "In DOS
versions 1 and 2, the first available cluster is always allocated to
the file. Later versions of DOS select clusters by more complicated
rules that we won't go into."

Well, let's go into those rules. Anybody? Specifically for MS-DOS 6.

This may help me with the data recovery project.

I'm writing QB45 programs to try to recover lost data from an older
PC (386).

An untrained user deleted the current database files and proceeded to
use the computer for a day, generating new db files with that day's
worth of data.

My first step was a backup of valid files to another drive.

My next step was Norton's data recovery software, version 6. I had
little luck with the unerase command: none of the unerased files had
usable data. Next up was a search for related data. When I tried to
save a found sector, Norton saved the entire search area to my
external zip drive rather than the one cluster/sector. So I now have
an 82,777 KB file, pretty much the entire drive stored in one file, to
work with on a Pentium 4 machine.

Given the amount of valid data still on the drive and some 20 backups
of the same data on the same drive, using Norton Disk Editor to find
and recover the data would be very tedious. There are some 5,000
unique records in the valid data, repeated 20 times over in the
backups.

The valid data is a set of 7 lines of text, repeated any number of
times. Each person's data has its own file with a minimum of 1 set of
7 lines, repeated anywhere from 0 to 550 times in that file. The first
line is always the software version number "1.07", followed by the
same set of fields with differing values and lengths, one per line,
with a CR/LF at the end of each line. The data files do not have an
end-of-file marker. I assume the software relied on the file size in
the (sub)directory entries when appending new observations to the
files.

My next step was to write a QB45 program that changed all non-printing
bytes to spaces. I think that will help avoid crashes due to some
control characters.
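
For anyone following along, here is a minimal sketch of that clean-up step. It is in Python rather than QB45 (an assumption that Python is available on the machine holding the image), and the names sanitize and sanitize_file are made up for illustration. It also assumes CR and LF are left alone so the 7-line records keep their line breaks; the post does not say whether the original program did that.

```python
# Sketch: blank out non-printing bytes in a raw disk image, chunk by chunk.
# Assumption: CR (13) and LF (10) are kept so record lines stay intact.

def build_table():
    """Map each byte value to itself if printable ASCII or CR/LF, else to a space."""
    keep = set(range(32, 127)) | {10, 13}
    return bytes(b if b in keep else 32 for b in range(256))

TABLE = build_table()

def sanitize(data):
    """Return a copy of data with every non-printing byte replaced by a space."""
    return data.translate(TABLE)

def sanitize_file(src_path, dst_path, chunk_size=64 * 1024):
    """Stream src_path to dst_path so the whole image never sits in memory."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(sanitize(chunk))
```

Because every byte maps to exactly one byte, the output has the same size as the input, so byte offsets in the cleaned copy still line up with sectors in the original image.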

Since the file now contains every byte of the hard drive, it has
'lines' of data that exceed QB's maximum string length and I get a
stack error when trying to use LINE INPUT to read each line. I suppose
I could read the file 1 byte at a time and construct short strings,
looking for the '1.07' string that is the start of valid data.

Another thought is to use the fixed sector/cluster size and split the
file into 8,000 cluster-sized files. Then search each file to see if
it might contain valid data and go from there.
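
A rough sketch of the cluster idea, without writing thousands of small files: read the image one cluster at a time and note which ones contain the marker. This is Python, not QB45. The 2048-byte cluster size is only a placeholder assumption (the real value has to come from the drive's boot sector), and the function name is hypothetical. The index returned is a position within the image file, which is not necessarily the DOS cluster number, since the image may include the boot sector, FATs and root directory.

```python
CLUSTER_SIZE = 2048      # assumption: replace with the value from the boot sector
MARKER = b"1.07\r\n"     # assumption: version line followed by CR/LF

def clusters_with_marker(f, cluster_size=CLUSTER_SIZE, marker=MARKER):
    """Return indexes of cluster-sized blocks of binary file object f that contain marker."""
    hits = []
    index = 0
    while True:
        cluster = f.read(cluster_size)
        if not cluster:
            break
        if marker in cluster:
            hits.append(index)
        index += 1
    return hits
```

One limitation of this approach: a record that continues into the next cluster is still found by its first line, but a marker that itself straddles a cluster boundary is missed.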

Given the way DOS 'deletes' files, lost valid data could reside in the
unused portions of currently assigned clusters/sectors or it could be
in available clusters. Heh, I can see old Business school files in
those areas too but no Swiss bank account numbers. Gee, those
business people are boring.

An example of valid data:
1.07
11-Nov-2009 5:00.00 PM
John Smith
0
1234.5678
123456
1
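
Going only by the sample above, a candidate set of seven lines could be screened like this. It is a Python sketch, not the original software's logic. The timestamp format string is inferred from the single sample, the function name is hypothetical, and the check on the remaining five fields (non-empty) is an assumption, since their meanings are not stated.

```python
from datetime import datetime

VERSION = "1.07"
LINES_PER_RECORD = 7
STAMP_FORMAT = "%d-%b-%Y %I:%M.%S %p"  # assumption: inferred from the one sample

def looks_like_record(lines):
    """Return True if seven text lines resemble the sample record."""
    if len(lines) != LINES_PER_RECORD or lines[0].strip() != VERSION:
        return False
    try:
        datetime.strptime(lines[1].strip(), STAMP_FORMAT)
    except ValueError:
        return False
    # Assumption: the remaining fields are never blank.
    return all(line.strip() for line in lines[2:])
```

A check like this helps reject stray "1.07" hits that are not followed by a plausible record, for example text from unrelated files.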

Well I'm off to a funeral, back in a couple days. This will give me
something to think about.

Rick

Auric__

Nov 3, 2009, 6:50:41 PM

On Tue, 03 Nov 2009 23:08:23 GMT, rick-...@uiowa.edu wrote:

> An untrained user deleted the current database files and proceeded to
> use the computer for a day, generating new db files with that day's
> worth of data.

First things first: shoot that user.

--
I find any story of seduction that involves flatulence highly suspect.

DOS Guy

Nov 3, 2009, 8:09:59 PM

"rick-...@uiowa.edu" wrote:

> I'm writing QB45 programs to try to recover lost data from an
> older PC (386).

How can anything being done on a 386 these days be important enough to
embark on such a recovery task?

ArarghMai...@not.at.arargh.com

Nov 3, 2009, 9:58:53 PM

On Tue, 3 Nov 2009 15:08:23 -0800 (PST), "rick-...@uiowa.edu"
<rick-...@uiowa.edu> wrote:

<snip>

>
>This may help me with the data recovery project.
>

<snip>
Unless there are backups, my guess is that you are probably hosed.

I once spent several weeks recovering data from a toasted drive, many,
many years ago.

It was a dinky little ST-225, which is a 20 MB drive.

--
ArarghMail911 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the extra stuff from the reply address.

H-Man

Nov 5, 2009, 1:06:39 PM

Well, from your post I'm guessing that you are simply looking for the
"1.07" that defines valid data.

If this is the case I'd proceed by opening the file for random access. Read
the data in manageable chunks and use INSTR() to find the valid data. In
this case make sure the chunks overlap by at least the length of the valid
data string so that you don't miss it. Keep track of the data chunk you are
working on and the returned INSTR() value, and you'll have your file byte
index of the start of valid data. Have your program save all indexes of the
valid data. If you know the record length, then go back and extract the valid
data to a file, maybe in a CSV format so that you can work with it. If you
don't know the record length you'll need to find some rule that the valid
data follows, or you'll need to extract everything manually.
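
The overlapping-chunk search described above can be sketched as follows. This is Python standing in for QB45, with bytes.find playing the role of INSTR(). The function name, the chunk size and the choice of "1.07" plus CR/LF as the marker are assumptions for illustration. The overlap is one byte less than the marker length, which is enough to catch a marker split across two reads without reporting any hit twice.

```python
def find_marker_offsets(f, marker=b"1.07\r\n", chunk_size=32 * 1024):
    """Return the byte offset of every occurrence of marker in binary file object f."""
    overlap = len(marker) - 1
    offsets = []
    tail = b""   # last `overlap` bytes carried over from the previous window
    base = 0     # file offset of the first byte of the current window
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(marker)
        while pos != -1:
            offsets.append(base + pos)
            pos = window.find(marker, pos + 1)
        # Keep the end of this window so a marker split across reads is still seen.
        tail = window[-overlap:] if overlap else b""
        base += len(window) - len(tail)
    return offsets
```

Typical use would be to open the image in binary mode, for example open("image.bin", "rb") with a hypothetical file name, pass the file object in, and save the returned offsets for the extraction pass.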

--
HK

rick-...@uiowa.edu

Nov 5, 2009, 1:06:44 PM

> How can anything being done on a 386 these days be important enough to
> embark on such a recovery task?

This particular PC is dedicated to timing events in an environment
where it can't tolerate missing a single count. It has a custom ISA
bus expansion card that does the timing and interfaces with the
sensors. All previous attempts to install and run with Windows failed
as Windows is usually too busy doing other useless tasks and that takes
precious cycles away from the timing effort. This 386/25 MHz is fast
enough to keep up. Earlier and slower 286s could not always keep up
and would occasionally crash. If you look at XP processes when you
have a DOS app running, it's far worse. Windows wants 99% of the
processor time to monitor potential interrupts from the DOS app.

Yeah, I'd love to shoot the abuser but I think the better approach is
to write more foolproof db software, which I have done.

It's Windows and the GUIs that need faster processors. Avoid those
and a 386 or a 486 can do anything. You certainly don't need a
Pentium to run your toaster. An 8086 can handle those tasks for far
less money. There are probably more pokey little processors being
made these days for devices than ever. Surely far more than are going
into computers.

rick