
Question: recovering overwritten data


Patrick Juola

Apr 22, 1997, 3:00:00 AM

In article <5jies8$8...@mozo.cc.purdue.edu> jl...@math.purdue.edu writes:
>A friend of mine argues that it's impossible to retrieve data
>from magnetic storage media (i.e., a hard drive) if that data has
>already been overwritten, but I've heard of cases in which faint
>magnetic traces could be detected with special equipment in order
>to do just this. I know this isn't really the right place for this
>to go, but it's sort of (if you stretch real hard) on topic.
>
>I'll rephrase: Is it possible for someone to access plaintext from
>a hard drive even if I've explicitly overwritten each bit of the file
>with some pseudo random bit? If so, about how many writes would be
>appropriate to erase the data completely?

Yes, it's possible. How many times you should overwrite your data
depends on how good your magnetic detectors are. Bruce Schneier
recommends (_Applied Cryptography_, 8.9) overwriting seven times
and admits that even that doesn't erase it "completely." Ultimately,
it's another risk-cost tradeoff.

-kitten

James Lee

Apr 22, 1997, 3:00:00 AM

A friend of mine argues that it's impossible to retrieve data
from magnetic storage media (i.e., a hard drive) if that data has
already been overwritten, but I've heard of cases in which faint
magnetic traces could be detected with special equipment in order
to do just this. I know this isn't really the right place for this
to go, but it's sort of (if you stretch real hard) on topic.

I'll rephrase: Is it possible for someone to access plaintext from
a hard drive even if I've explicitly overwritten each bit of the file
with some pseudo random bit? If so, about how many writes would be
appropriate to erase the data completely?

Thanks,

--
James Lee <jl...@math.purdue.edu> - http://www.math.purdue.edu/~jlee
Key Fingerprint = 2F 06 48 C4 8F 80 C1 43 4C A7 AA 6A 8C C9 40 19
PGP mail encouraged - Key ID: 1024/222952E9 - Finger for public key

Fons Adriaensen

Apr 22, 1997, 3:00:00 AM

In article <5jig8n$i...@news.ox.ac.uk> pat...@gryphon.psych.ox.ac.uk (Patrick Juola) writes:

> >I'll rephrase: Is it possible for someone to access plaintext from
> >a hard drive even if I've explicitly overwritten each bit of the file
> >with some pseudo random bit? If so, about how many writes would be
> >appropriate to erase the data completely?
>

> Yes, it's possible. How many times you should overwrite your data
> depends on how good your magnetic detectors are. Bruce Schneier
> recommends (_Applied Cryptography_, 8.9) overwriting seven times
> and admits that even that doesn't erase it "completely." Ultimately,
> it's another risk-cost tradeoff.
>

If you want to keep a file for a long time, you can use the following
procedure.

Encrypt your data before writing it to disk, and store the key with it.
The encryption is only to randomise the data, and doesn't have to be
strong. Overwrite the file each hour with a new version encrypted with a
different (random) key.

To delete the file, overwrite it 10 times with random data, as suggested
before.

If the random data do not erase the file completely, there will be traces
from many versions interfering with each other, making recovery much more
difficult.
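
The randomise-and-rewrite step can be sketched as follows. This is a minimal illustration in Python, not the poster's own code: the function names, the 16-byte key length, and the SHA-256 counter keystream are all assumptions chosen for the example. Because the key is stored next to the data, the scheme gives no secrecy; it only makes each stored version look different from the last.

```python
import hashlib
import os

KEY_LEN = 16  # bytes of random key stored in front of the data (assumed size)


def _keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` bytes by hashing key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def randomise(data: bytes) -> bytes:
    """Return key || (data XOR keystream); a fresh key is drawn on every call."""
    key = os.urandom(KEY_LEN)
    stream = _keystream(key, len(data))
    return key + bytes(a ^ b for a, b in zip(data, stream))


def derandomise(blob: bytes) -> bytes:
    """Invert randomise() using the key stored at the front of the blob."""
    key, body = blob[:KEY_LEN], blob[KEY_LEN:]
    stream = _keystream(key, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

Calling randomise() on the same plaintext once an hour yields a different byte pattern each time, which is the property the procedure relies on.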

--
F.A.

Roger Fleming

Apr 23, 1997, 3:00:00 AM

jl...@math.purdue.edu pondered long before writing:
[...]


>I'll rephrase: Is it possible for someone to access plaintext from
>a hard drive even if I've explicitly overwritten each bit of the file
>with some pseudo random bit?

The answer to question 1 is yes, but not with normal hardware. The thing is
that the "ADC" connected to the read head has only 1 bit resolution. However,
the magnetised domain is actually at some intermediate value, dependent on its
previous history of reads and writes. I have read of a simple experiment where
the read head was connected to a 16 bit ADC, the results fed into a commercial
digital signal processing package, and hey presto, once overwritten data
reappears.
More sophisticated processing, or fancier hardware, is believed to be able
to go much deeper. Other effects which can be exploited include track
misalignment (one track may not be written _exactly_ over the top of an older
one, although it's close enough for normal reading) and 3 dimensional memory
(the surface of the medium may only be showing a current bit value - at least
to 1 bit resolution - but older bits may still be present deeper in the
medium). TTBOMK, these effects - unlike the improved bit resolution method -
have not been publicly demonstrated, but it is generally agreed that they
could be made to work with sufficiently sophisticated hardware.
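
The difference between a 1-bit and a 16-bit read-out can be illustrated with a toy model. The sketch below assumes, purely for illustration, that each cell reads as the new level plus a copy of the old level 30 dB down, with no noise and perfect linearity; real drives are not this well behaved, and none of the names below come from an actual tool.

```python
import math

RESIDUAL = 10 ** (-30 / 20)   # old pattern assumed 30 dB below the new one (about 0.032)
FULL_SCALE = 2.0              # assumed ADC input range is [-2.0, +2.0)


def level(bit: int) -> float:
    """Map a bit to an idealised magnetisation level of -1.0 or +1.0."""
    return 1.0 if bit else -1.0


def read_analog(current, previous):
    """Toy model: each cell reads as the new level plus a faint copy of the old one."""
    return [level(c) + RESIDUAL * level(p) for c, p in zip(current, previous)]


def adc(sample: float, bits: int) -> int:
    """Quantise a sample to a signed integer code with `bits` bits of resolution."""
    steps = 2 ** (bits - 1)
    code = math.floor(sample / FULL_SCALE * steps)
    return max(-steps, min(steps - 1, code))


def recover(current, previous, bits):
    """Return (current bits, guessed previous bits) as seen through a `bits`-bit ADC."""
    cur_out, prev_out = [], []
    for sample in read_analog(current, previous):
        code = adc(sample, bits)
        cur = 1 if code >= 0 else 0
        # Subtract the ideal code for the current bit; the sign of what is left
        # is the guess for the overwritten bit (None if nothing is left).
        rest = code - adc(level(cur), bits)
        cur_out.append(cur)
        prev_out.append(None if rest == 0 else (1 if rest > 0 else 0))
    return cur_out, prev_out
```

With bits=1 the converter acts as a plain comparator and every guess for the previous bit comes back as None; with bits=16 the residual survives quantisation and, in this idealised model, the previous bits can be read off.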
So the brief summary is:
1. The average software hacker can't do it, period. You must have special
hardware.
2. A well equipped hardware engineer should be able to read through one
overwrite without too much trouble.
3. People like government intelligence services can probably read through
quite a few, if they think it's worth their effort.

> If so, about how many writes would be
>appropriate to erase the data completely?

Estimates vary wildly, because we don't really know. One oft quoted standard
specifies overwriting once with 0s, once with 1s, then a third time with
(IIRC) pseudo random data. As Peter Gutmann points out in his rather in-depth
analysis of this subject (sorry, can't remember where it lives), things like
RLL encoding make it a definitely non-trivial task to translate this simple
software model into what actually happens on the drive. It also depends
on the type of drive. Peter came up with a procedure that is pretty likely to
produce very good security on nearly all drives; it requires not 3, or 7, but
35 overwrites, and very specifically calculated overwrites at that.

As a final note, it is claimed that the Australian government standard for
declassifying magnetic media containing secret information, requires the
complete physical destruction of the disk.

----------------------------------------------------------------------
| Please remove antispam before replying. |
| Speaking only for myself. |
| "Reorganisation is a splendid method of producing |
| the illusion of progress whilst creating confusion, |
| inefficiency and demoralisation." |
| -Petronius Arbiter, AD60. |
----------------------------------------------------------------------

Thor Arne Johansen

Apr 23, 1997, 3:00:00 AM

James Lee wrote:
>
> A friend of mine argues that it's impossible to retrieve data
> from magnetic storage media (i.e., a hard drive) if that data has
> already been overwritten, but I've heard of cases in which faint
> magnetic traces could be detected with special equipment in order
> to do just this. I know this isn't really the right place for this
> to go, but it's sort of (if you stretch real hard) on topic.
>
> I'll rephrase: Is it possible for someone to access plaintext from
> a hard drive even if I've explicitly overwritten each bit of the file
> with some pseudo random bit? If so, about how many writes would be
> appropriate to erase the data completely?
>
> Thanks,
>
> --
> James Lee <jl...@math.purdue.edu> - http://www.math.purdue.edu/~jlee
> Key Fingerprint = 2F 06 48 C4 8F 80 C1 43 4C A7 AA 6A 8C C9 40 19
> PGP mail encouraged - Key ID: 1024/222952E9 - Finger for public key


Very few things are impossible, but I would say it is close to
impossible.

I never read or heard about any experiments where they actually recover
overwritten data.

Disk drives operate on the limits of what the heads, media and
electronics can handle. When you overwrite a data pattern, you suppress
the old pattern by about 30dB, and you also introduce nonlinearities.

This makes it next to impossible to recover the data by processing
the readback signal.

The fact that it is possible to measure the overwrite ratio means
that there is information left after overwriting or erasing. But
detecting the presence of a signal does not mean we can recover the
information present in the signal.

However, being in the data recovery business, I would be very interested
in talking to anyone who claims to have recovered overwritten data, or
has access to technology which makes this possible. :)


--
Thor Arne Johansen +47 62 81 00 99 Direct
Department Manager, R&D +47 62 81 01 00 Switch Board
Ibas Laboratories, Norway +47 62 81 01 10 Fax
http://www.ibas.no

Kaz Kylheku

Apr 23, 1997, 3:00:00 AM

In article <5jjsv2$k...@falcon.pacit.tas.gov.au>,

Roger Fleming <ro...@nospam.police.tas.gov.au> wrote:
>2. A well equipped hardware engineer should be able to read through one
>overwrite without too much trouble.

Heck, I have been able to _hear_ the previous data of an audio tape that
was recorded without the proper bias adjustments for the high remanence
characteristic of the tape. The improperly set bias wasn't able to completely
degauss the previous track.

>As a final note, it is claimed that the Australian government standard for
>declassifying magnetic media containing secret information, requires the
>complete physical destruction of the disk.

With today's low HD prices, I wouldn't give a second thought to this bit of
common sense from down under. However, if some agency is breathing down your
neck, you may not have enough time to do this (nor to wipe files for that
matter).

The best thing is to never store any sensitive plaintext on your hard disk at
all. Use a cryptographic filesystem with a strong cipher. When you no
longer need the data, by all means destroy the disk. But in the event of an
enemy raid, you still have some measure of security since the opponent
has to spend time cryptanalyzing your drive. In choosing the cryptosystem,
assume that the ciphertext will be available for analysis to someone somewhere
somehow until the day you die (and beyond). Even if you destroy the
drive, you can't be absolutely sure that someone doesn't have a copy of some of
its contents stashed away.

Jason Dufair

Apr 23, 1997, 3:00:00 AM

Is a cryptographic filesystem a reality or just theory at this point? I
am not aware of such a thing for NT or Unix. Pointers please, if you
have them.

John Savard

Apr 24, 1997, 3:00:00 AM

Thor Arne Johansen <th...@ibas.no> wrote:

>I never read or heard about any experiments where they actually recover
>overwritten data.

>Disk drives operate on the limits of what the heads, media and
>electronics can handle. When you overwrite a data pattern, you suppress
>the old pattern by about 30dB, and you also introduce nonlinearities.

Although I do remember hearing about someone recovering some
overwritten data, up to 10 overwrites on some tracks, this was back in
the days of 1103 type disk drives.

The media is not the bottleneck, since we're talking about looking at
oscilloscope traces, so that old data that is 30 dB down _is_
recoverable, as a variation in the pulse shape for the main signal.

A completely different kind of electronics - highly sensitive,
low-noise, and linear analog electronics would be used. And the drive
would have to be physically disassembled in a clean room. For
exploiting misalignment, techniques like exploiting the effects of
magnetism on the reflection of polarized light would be involved -
with no electronics. The bits would be picked up using an _optical
microscope_.

I tend to agree that, in general, very little overwritten data could
be recovered - some of the immediately previous contents, and perhaps
any data that wasn't overwritten for a long time. But, if someone is
willing to spend thousands of dollars recovering the contents of one
hard disk, and has millions of dollars of equipment to do it with, I
wouldn't consider it reasonable to completely eliminate the
possibility of success. (Although you're very probably right that with
today's disk drives, it's less likely than with disk drives using
older technology at much lower densities.)

John Savard

Paul Leyland

Apr 24, 1997, 3:00:00 AM

Jason Dufair <*fu...@iquest.net*> writes:

> Is a cryptographic filesystem a reality or just theory at this point? I
> am not aware of such a thing for NT or Unix. Pointers please, if you
> have them.


I can't speak for NT, but a Unix pointer is:

ftp://ftp.ox.ac.uk/pub/crypto/misc/cfs.1.3.shar.gz

Paul
--
Paul Leyland <p...@oucs.ox.ac.uk> | Hanging on in quiet desperation is
Oxford University Computing Services | the English way.
13 Banbury Road, Oxford, OX2 6NN, UK | The time is gone, the song is over.
Tel: +44-1865-273200 Fax: 273275 | Thought I'd something more to say.
PGP KeyID: 0xCE766B1F

Thor Arne Johansen

Apr 24, 1997, 3:00:00 AM

John Savard wrote:
>
> Thor Arne Johansen <th...@ibas.no> wrote:
>
> >I never read or heard about any experiments where they actually recover
> >overwritten data.
>
> >Disk drives operate on the limits of what the heads, media and
> >electronics can handle. When you overwrite a data pattern, you suppress
> >the old pattern by about 30dB, and you also introduce nonlinearities.
>
> Although I do remember hearing about someone recovering some
> overwritten data, up to 10 overwrites on some tracks, this was back in
> the days of 1103 type disk drives.
>
> The media is not the bottleneck, since we're talking about looking at
> oscilloscope traces, so that old data that is 30 dB down _is_
> recoverable, as a variation in the pulse shape for the main signal.

I'm afraid I don't understand?

Noise from the media, heads and electronics will still be present if
you digitize the signal and show it on an oscilloscope?

Furthermore, the S/N ratio is typically 30-40 dB, so the overwritten
signal to noise ratio will be 0-10dB. Also the overwrite process will
introduce nonlinearities, so it will be necessary to use quite
aggressive equalization to recover the data. This will cause the noise
to be more correlated with the data, which in turn makes it harder to
detect the data.
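
The arithmetic behind the 0-10 dB figure, as a small sketch. It assumes the dB values are amplitude ratios (20 log10) and that overwrite suppression simply subtracts from the fresh S/N; the function names are made up for the example.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to an amplitude (voltage) ratio."""
    return 10 ** (db / 20)


def residual_snr_db(fresh_snr_db: float, overwrite_suppression_db: float) -> float:
    """S/N of the overwritten pattern: fresh S/N minus the overwrite suppression."""
    return fresh_snr_db - overwrite_suppression_db


# Fresh S/N of 30 to 40 dB with 30 dB of overwrite suppression.
for fresh in (30.0, 40.0):
    left = residual_snr_db(fresh, 30.0)
    print(f"fresh S/N {fresh:.0f} dB -> old pattern S/N {left:.0f} dB, "
          f"amplitude ratio to noise {db_to_amplitude_ratio(left):.2f}")
```

At the 0 dB end the old pattern is no stronger than the noise it sits in, which is the crux of the objection.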

Yet another difficulty is the spindle motor jitter. Because of this it
is not safe to assume that the new bits will align with the old bits. In
other words even for older peak detect channels you have transitions
that are partially easy and partially hard (with and against the old
magnetization).

>
> A completely different kind of electronics - highly sensitive,
> low-noise, and linear analog electronics would be used. And the drive
> would have to be physically disassembled in a clean room. For
> exploiting misalignment, techniques like exploiting the effects of
> magnetism on the reflection of polarized light would be involved -
> with no electronics. The bits would be picked up using an _optical
> microscope_.
>

The electronics in a disk is not very noisy. Of course if you apply
exotic technologies such as cryogenic cooling you can improve the S/N,
but you will still have noise from the heads and media.

As far as using the misalignment due to non-repeatable spindle runout
there is at least one problem with that, namely edge erasure. At the
track edges there will be a band that is effectively erased, and AFAIK
this will happen even if the disk has guard bands.

> I tend to agree that, in general, very little overwritten data could
> be recovered - some of the immediately previous contents, and perhaps
> any data that wasn't overwritten for a long time. But, if someone is
> willing to spend thousands of dollars recovering the contents of one
> hard disk, and has millions of dollars of equipment to do it with, I
> wouldn't consider it reasonable to completely eliminate the
> possibility of success. (Although you're very probably right that with
> today's disk drives, it's less likely than with disk drives using
> older technology at much lower densities.)
>
> John Savard

As I said in my first post, very few things are impossible.

On older peak detect disks, maybe it is possible to employ high
resolution imaging techniques such as MFM imaging, or as you suggested,
Kerr effect imaging, combined with advanced signal processing, to
recover data.

And if we can make it work on one type of disks, it is reasonable to
think that it can be done for more modern disks as well (probably
costing lots more, and taking much longer time).

I guess my point is that even if it is conceivable, I don't think it is
feasible, or even doable...yet.

Jot Powers

Apr 24, 1997, 3:00:00 AM

In article <335F6DB9...@ibas.no>, Thor Arne Johansen <th...@ibas.no> writes:
>John Savard wrote:
>>
>> Thor Arne Johansen <th...@ibas.no> wrote:

[*snip comments to now to save bandwidth*]

Peter Gutmann presented an absolutely fascinating paper at the
6th Annual Usenix Security Symposium. Regrettably, the full
text of the paper is only available on-line to Usenix members.

The URL for the abstract and full paper for members is:

http://www.usenix.org/publications/library/proceedings/sec96/gutmann.html

My own summary from what I remember of the presentation:

o If you need security from a Fortune 500 company for your data on tape,
disk drive or floppy drive, destroy the media.
o Don't keep values in RAM for long periods of time.

(I believe Bruce Schneier has published an attack on secure cards used
by the banking system using this last little known fact, but I don't
have a URL available. It's probably somewhere on http://www.counterpane.com)

-Jot

--
Jot Powers j...@tmp.medtronic.com
Unix System Administrator, Medtronic Micro-Rel
"Subtlety is the art of saying what you think and getting out of the way
before it is understood."

parker_rob

Apr 27, 1997, 3:00:00 AM

Thor Arne Johansen (th...@ibas.no) wrote:

>James Lee wrote:
>>
>> I'll rephrase: Is it possible for someone to access plaintext from
>> a hard drive even if I've explicitly overwritten each bit of the file
>> with some pseudo random bit? If so, about how many writes would be
>> appropriate to erase the data completely?

>Very few things are impossible, but I would say it is close to
>impossible.

>I never read or heard about any experiments where they actually recover
>overwritten data.

>Disk drives operate on the limits of what the heads, media and
>electronics can handle. When you overwrite a data pattern, you suppress
>the old pattern by about 30dB, and you also introduce nonlinearities.

>This makes it next to impossible to recover the data by processing
>the readback signal.

Writing a 0 on top of a 1 gives a weaker result than writing a 0 on top
of another 0. With a sensitive head and equipment that can see that
difference, you could read data that was underneath, perhaps even several
layers back. Ordinary drive controllers probably couldn't do this (they
are only supposed to care about what is *currently* there), but it is
supposedly possible with the right equipment.

You're not trying to pick out a faint signal lost behind a strong one.
With the formatting of disks, the bits are pretty much in exactly the
same place, so you only have to read the exact signal level with enough
precision to distinguish whether the value you read was overwritten on
a 0 or on a 1.

-Rob Parker


Thor Arne Johansen

Apr 28, 1997, 3:00:00 AM

parker_rob wrote:
>
>
> Writing a 0 on top of a 1 gives a weaker result than writing a 0 on top
> of another 0. With a sensitive head and equipment that can see that
> difference, you could read data that was underneath, perhaps even several
> layers back. Ordinary drive controllers probably couldn't do this (they
> are only supposed to care about what is *currently* there), but it is
> supposedly possible with the right equipment.
>
> You're not trying to pick out a faint signal lost behind a strong one.
> With the formatting of disks, the bits are pretty much in exactly the
> same place, so you only have to read the exact signal level with enough
> precision to distinguish whether the value you read was overwritten on
> a 0 or on a 1.
>
> -Rob Parker

Again, I'm not saying it is impossible, I'm just saying it's damn hard.

People keep saying it is just a matter of detecting the current bits,
then figuring out what the *ideal* waveform for those bits is, and
subtracting the ideal waveform from the actual waveform, and voila, the
old pattern remains.

There are at least two problems with this: First it assumes LINEARITY,
which is not a valid assumption under these circumstances. Second, even
if the linearity assumption holds, we will see a signal about 30dB
weaker than a fresh signal. Knowing that the drives operate on the
limits as far as density vs. S/N goes, I think it will be very hard to
reliably detect data.

What you describe above is generally referred to as hard (0->1, 1->0)
and easy (1->1, 0->0) transitions. This is a nonlinear process, and is
very hard to predict for random data patterns. To complicate things even
more, we have to consider spindle motor jitter, and different coding
schemes. Because of these it is not valid to assume that one new bit
only affects one old bit. Depending on coding and spindle jitter, one
transition can affect as many as 4-5 bits (EEPRML). It is going to be a
mess to unwind the observed pattern to obtain the old pattern. And if
you consider more than two layers....

I guess the only way to convince me is to show me a successful
experiment where they actually recover overwritten data :)

John Savard

Apr 28, 1997, 3:00:00 AM

Thor Arne Johansen <th...@ibas.no> wrote:

>John Savard wrote:
>> The media is not the bottleneck, since we're talking about looking at
>> oscilloscope traces, so that old data that is 30 dB down _is_
>> recoverable, as a variation in the pulse shape for the main signal.

>I'm afraid I don't understand?

>Noise from the media, heads and electronics will still be present if
>you digitize the signal and show it on an oscilloscope?

The noise will still be present - but I did not say anything about
digitizing the signal before showing it on an oscilloscope. I am
talking about digitizing it, bit by bit, by hand, and viewing it in
all its nonlinear glory.

>Yet another difficulty is the spindle motor jitter. Because of this it
>is not safe to assume that the new bits will align with the old bits.

Which makes the old bits even easier to pick out - by eye.

>As I said in my first post, very few things are impossible.

>On older peak detect disks, maybe it is possible to employ high
>resolution imaging techniques such as MFM imaging, or as you suggested,
>Kerr effect imaging, combined with advanced signal processing, to
>recover data.

>And if we can make it work on one type of disks, it is reasonable to
>think that it can be done for more modern disks as well (probably
>costing lots more, and taking much longer time).

>I guess my point is that even if it is conceivable, I don't think it is
>feasible, or even doable...yet.

I can't be sure of that - and, it looks as though, as disks become
more advanced, the time lag between the disks that are in use, and the
disks on which overwritten data can be recovered will grow.

John Savard

parker_rob

Apr 28, 1997, 3:00:00 AM

Thor Arne Johansen (th...@ibas.no) wrote:
>parker_rob wrote:
>>
>> Writing a 0 on top of a 1 gives a weaker result than writing a 0 on top
>> of another 0. With a sensitive head and equipment that can see that
>> difference, you could read data that was underneath, perhaps even several
>> layers back. Ordinary drive controllers probably couldn't do this (they
>> are only supposed to care about what is *currently* there), but it is
>> supposedly possible with the right equipment.
>>
>> You're not trying to pick out a faint signal lost behind a strong one.
>> With the formatting of disks, the bits are pretty much in exactly the
>> same place, so you only have to read the exact signal level with enough
>> precision to distinguish whether the value you read was overwritten on
>> a 0 or on a 1.

>Again, I'm not saying it is impossible, I'm just saying it's damn hard.

Given that the original post (the one you had responded to) talked about
recovery of overwritten plaintext, I'm assuming that the original poster
cares about what a well-funded opponent (such as a government) would be
capable of. I believe this is well within their abilities. I don't know
if the technique has been used for any commercial data-recovery.

>People keep saying it is just a matter of detecting the current bits,
>then figuring out what the *ideal* waveform for those bits is, and
>subtracting the ideal waveform from the actual waveform, and voila,
>the old pattern remains.

But that's not how it works. It does not require that one signal be
subtracted from another to recover the original. This is not one
transmission signal masking another. All that is necessary is for
the detector to be able to distinguish the difference between an
overwritten 0 and an overwritten 1 by its effect on the strength of
the magnetization for encoding current data.

>There are at least two problems with this: First it assumes LINEARITY,
>which is not a valid assumption under these circumstances. Second,
>even if the linearity assumption holds, we will see a signal about
>30dB weaker than a fresh signal. Knowing that the drives operate on
>the limits as far as density vs. S/N goes, I think it will be very
>hard to reliably detect data.

It can't operate on the limit of detectability. It has to be reliable
so that the disk will provide the correct data back after it was written.
You don't have much opportunity for retransmission (ie rewriting the
data if there was a problem); getting close enough that errors are at
all likely starts to impact performance, and I/O rate is more of a
limiting factor than storage capacity (you can always add more disks
if you need them). Also remember that normal operation expects the
data to be read in a single pass. Someone attempting to recover data
that was overwritten might easily make many passes over the same data
if it helps to read the magnetization with greater precision.

>What you describe above is generally referred to as hard (0->1, 1->0)
>and easy (1->1, 0->0) transitions. This is a nonlinear process, and is
>very hard to predict for random data patterns.

It doesn't have to be linear. You're reading a digital value not
analog, so the exact waveform is irrelevant. All that matters is that
you distinguish between the two possibilities for the overwritten bit.
If a 1 on top of a 0 is only 99% as strong a magnetization as a 1 on
top of another 1, then you only have to determine which strength it
is closer to. Since it is a permanent recording, not a transient
one-time signal, you can sample it over and over to weed out the noise.
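
The averaging argument can be tried out on a toy model. The sketch below takes the 99% figure at face value and assumes independent Gaussian read noise, perfectly aligned bits, and no nonlinearity, none of which is established in this thread; the noise level and pass count are arbitrary choices for illustration.

```python
import random

SAME = 1.00         # assumed level when the new bit equals the old one
FLIPPED = 0.99      # assumed level when the new bit overwrote the opposite value
NOISE_SIGMA = 0.02  # assumed read noise, twice the size of the effect itself


def read_once(current, old, rng):
    """One noisy read-back of the magnitude of every cell."""
    return [(SAME if c == o else FLIPPED) + rng.gauss(0.0, NOISE_SIGMA)
            for c, o in zip(current, old)]


def guess_old(current, magnitudes):
    """Decide per cell whether the old bit matched the current one."""
    threshold = (SAME + FLIPPED) / 2
    return [c if m > threshold else 1 - c for c, m in zip(current, magnitudes)]


def averaged_read(current, old, passes, rng):
    """Average `passes` independent reads; noise shrinks roughly as 1/sqrt(passes)."""
    totals = [0.0] * len(current)
    for _ in range(passes):
        for i, m in enumerate(read_once(current, old, rng)):
            totals[i] += m
    return [t / passes for t in totals]


rng = random.Random(1997)
old = [rng.randint(0, 1) for _ in range(64)]
current = [rng.randint(0, 1) for _ in range(64)]

single = guess_old(current, read_once(current, old, rng))
many = guess_old(current, averaged_read(current, old, 2000, rng))
print("errors after 1 pass:     ", sum(a != b for a, b in zip(single, old)))
print("errors after 2000 passes:", sum(a != b for a, b in zip(many, old)))
```

The model says nothing about jitter or hard and easy transitions, so it shows only that averaging helps when the noise really is independent from pass to pass.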

> To complicate things
>even more, we have to consider spindle motor jitter, and different
>coding schemes. Because of these it is not valid to assume that one
>new bit only affects one old bit. Depending on coding and spindle
>jitter, one transition can affect as many as 4-5 bits (EEPRML). It is
>going to be a mess to unwind the observed pattern to obtain the old
>pattern. And if you consider more than two layers....

I'm assuming this will depend on how well the bits/bytes are sync'd
on top of the old ones. But I don't believe that a misalignment could
make recovery much more difficult. And if the encoding system is
designed so that it can tolerate the expected variation in rotation
timing, then the "old signal" of physical magnetization can be
reconstructed into the actual data bits that it encodes.

>I guess the only way to convince me is to show me a successful
>experiment where they actually recover overwritten data :)

I don't know if this has ever been done publicly. I first heard of
the possibility many years ago. I believe it was in the manual for
Norton Utilities, which included a program to overwrite files, unused
space, or entire disks multiple times with data patterns designed to
obliterate the old traces. I believe it mentioned government standards
for the secure erasure of sensitive information.

This may have been more a concern for floppies, and disk technology may
have shifted in ways that makes this more difficult than it was thought
to be back then, but I certainly wouldn't count on that. Whether or
not you'll be able to do this for commercial data-recovery is another
matter.

-Rob Parker


Frank Montmarquet

Apr 28, 1997, 3:00:00 AM

parker_rob wrote:
>
> Thor Arne Johansen (th...@ibas.no) wrote:
> >parker_rob wrote:
> >>
> >> Writing a 0 on top of a 1 gives a weaker result than writing a 0 on top
> >> of another 0. With a sensitive head and equipment that can see that
> >> difference, you could read data that was underneath, perhaps even several
> >> layers back. Ordinary drive controllers probably couldn't do this (they
> >> are only supposed to care about what is *currently* there), but it is
> >> supposedly possible with the right equipment.
> >>
>
This problem was discussed several months ago on this group. Do a
search. A fellow from IBM claimed that a commercially available machine
was produced that could go 3 layers deep, and a non-commercially available
machine 7 layers!

> >There are at least two problems with this: First it assumes LINEARITY,
> >which is not a valid assumption under these circumstances. Second,
> >even if the linearity assumption holds, we will see a signal about
> >30dB weaker than a fresh signal. Knowing that the drives operate on
> >the limits as far as density vs. S/N goes, I think it will be very
> >hard to reliably detect data.
>
> It can't operate on the limit of detectability. It has to be reliable
> so that the disk will provide the correct data back after it was written.
> You don't have much opportunity for retransmission (ie rewriting the
> data if there was a problem); getting close enough that errors are at
> all likely starts to impact performance, and I/O rate is more of a
> limiting factor than storage capacity (you can always add more disks
> if you need them). Also remember that normal operation expects the
> data to be read in a single pass. Someone attempting to recover data
> that was overwritten might easily make many passes over the same data
> if it helps to read the magnetization with greater precision.
>

When I was in college, a while back, I noticed that there was an NMR
(nuclear magnetic resonance) machine running night and day in one of
the labs. It was hooked up to a PDP8. The output spectrum on the tube
looked like good white noise. I soon found out that someone was trying to
obtain a C13 NMR spectrum of an organic compound using only the natural
amounts of C13. By running the scan millions of times over several
months the noise canceled and the signal came through, and a good spectrum
was obtained. The same could be done with a disk at the limits of
detectability.

aa-2@no_spam.deltanet.com@deltanet.com

Apr 29, 1997, 3:00:00 AM

In <336461B2...@ibas.no>, Thor Arne Johansen <th...@ibas.no> writes:

>parker_rob wrote:
>>
>>
>> Writing a 0 on top of a 1 gives a weaker result than writing a 0 on top
>> of another 0. With a sensitive head and equipment that can see that
>> difference, you could read data that was underneath, perhaps even several
>> layers back. Ordinary drive controllers probably couldn't do this (they
>> are only supposed to care about what is *currently* there), but it is
>> supposedly possible with the right equipment.
>>
>> You're not trying to pick out a faint signal lost behind a strong one.
>> With the formatting of disks, the bits are pretty much in exactly the
>> same place, so you only have to read the exact signal level with enough
>> precision to distinguish whether the value you read was overwritten on
>> a 0 or on a 1.
>>
>> -Rob Parker

>
>Again, I'm not saying it is impossible, I'm just saying it's damn hard.
>
>People keep saying it is just a matter of detecting the current bits,
>then
>figuring out what the *ideal* waveform for those bits is, and
>subtracting
>the ideal waveform from the actual waveform, and voila, the old pattern
>remains.

As I recall, the DoD spec for sanitizing disks which contained "secret"
material was:

1. Write all locations with 0
2. Write all locations with 1
3. Write all locations with random data, and leave it there.

As I recall, for "top secret" data, physical destruction of the disk
was required.


Roger Fleming

unread,
Apr 29, 1997, 3:00:00 AM4/29/97
to

rpa...@loc3.tandem.com (parker_rob) pondered long before writing:


>
>Thor Arne Johansen (th...@ibas.no) wrote:
>>parker_rob wrote:

[...]


>capable of. I believe this is well within their abilities. I don't know
>if the technique has been used for any commercial data-recovery.

I don't know if it has been used for commercial data recovery - it would
probably have to be _very_ valuable data, and that is _usually_ backed up 9
ways to Sunday.
[...]


>It can't operate on the limit of detectability. It has to be reliable
>so that the disk will provide the correct data back after it was written.

No, not at all. There are applications where probabilistic recovering of
some bits would be very useful indeed to an attacker. Suppose you left virtual
memory on and your 96 bit 3 Way key got written to a swapfile. Sensibly, you
overwrite free space afterward. If an attacker finds 75% of the bits with a
16% error rate, and loses 25% of them altogether, he can optimize a brute
force attack to an effective strength of about 42 bits. Enough to make an
infeasible attack quite practical.

[...]


>>I guess the only way to convince me is to show me a successful
>>experiment where they actually recover overwritten data :)
>
>I don't know if this has ever been done publicly. I first heard of
>the possibility many years ago. [...]

I am searching my archives for the reference now, but I am pretty sure someone
posted the (successful) results of such an attempt to this newsgroup.

Hal Murray

unread,
Apr 30, 1997, 3:00:00 AM4/30/97
to

In article <5k43j2$4...@falcon.pacit.tas.gov.au>, ro...@nospam.police.tas.gov.au (Roger Fleming) writes:

> >>I guess the only way to convince me is to show me a successful
> >>experiment where they actually recover overwritten data :)
> >
> >I don't know if this has ever been done publicly. I first heard of
> >the possibility many years ago. [...]
>
> I am searching my archives for the reference now, but I am pretty sure someone
> posted the (successful) results of such an attempt to this newsgroup.


This may be urban legend by now, but I'm pretty sure that the early work
in this area was done at IBM many years ago. Probably over 10.


I don't know enough about disk physics to know if modern disks would
be easier or harder to recover data from.

Clearly if you write once it will be very hard for normal hardware and
software to recover anything at all. Writing several passes makes
it less likely that it will be recovered in a fancy lab.

But modern disks are complicated. Maybe the old data is still on
a track that went bad and was remapped. Maybe the old data was
written off to the side of the track just a bit... [20 years ago,
disks had an option to try reading with a slight offset in order
to recover data from tracks that had gone bad.]

If there is/was anything really really important on a disk, I'd send it
to the crusher rather than trade it in for a replacement.

Phil Ekstrom

unread,
Apr 30, 1997, 3:00:00 AM4/30/97
to

In article <33649c1...@nntp.netcruiser> sew...@netcom.ca (John Savard) writes:
>From: sew...@netcom.ca (John Savard)
>Subject: Re: Question: recovering overwritten data
>Date: Mon, 28 Apr 1997 12:50:30 GMT

>Thor Arne Johansen <th...@ibas.no> wrote:

>>John Savard wrote:
>>> The media is not the bottleneck, since we're talking about looking at


<Lots of speculation snipped>

Very good, guys, but it is speculation I hear. An urban legend, perhaps?

I repeat my question, asked here earlier: Does anyone know of anyone who will
DO any of this for a legitimate purpose for money? I looked hard some time
ago and did not find anyone. There are lots of legitimate applications for
the technology, and lots of data recovery houses. But does anyone know of...

Phil Ekstrom


Brooks Hilliard

unread,
Apr 30, 1997, 3:00:00 AM4/30/97
to

James Lee wrote:
>
> A friend of mine argues that it's impossible to retrieve data
> from magnetic storage media (i.e., a hard drive) if that data has
> already been overwritten, but I've heard of cases in which faint
> magnetic traces could be detected with special equipment in order
> to do just this. I know this isn't really the right place for this
> to go, but it's sort of (if you stretch real hard) on topic.
>
> I'll rephrase: Is it possible for someone to access plaintext from
> a hard drive even if I've explicitly overwritten each bit of the file
> with some pseudo random bit? If so, about how many writes would be
> appropriate to erase the data completely?
>
> Thanks,
>
> --
> James Lee <jl...@math.purdue.edu> - http://www.math.purdue.edu/~jlee
> Key Fingerprint = 2F 06 48 C4 8F 80 C1 43 4C A7 AA 6A 8C C9 40 19
> PGP mail encouraged - Key ID: 1024/222952E9 - Finger for public key

I had an assignment approximately two years ago where recovery of data
was involved, and some of the data that was desired to be recovered
had been overwritten. On one machine, the disk had been low-level
formatted.

A seven-figure embezzlement was involved which included some US Govt.
funds . . . but not DoD.

At the time, I contacted every data recovery service, every law
enforcement official and every industry expert that I could find. I
also posted to several newsgroups (but not this one). None of the
data recovery services claimed to be able to recover overwritten data.
Several others whom I reached claimed that they knew someone who had
recovered overwritten data, but in every instance when I contacted the
person referred to, they said it was not them, but someone else . . .
who I then contacted, etc. In short, after several months of
investigation I was never able to find ANYONE who had actually done
it.

I did, however, run up against some national security "walls" and was
led to believe that the federal spooks have equipment to do it but it
was not available for any assignment other than "national security" .
. . civilian agency funds recovery did not qualify. I was also told
that the cost of the recovery would be (to the best of my
recollection) about $50K/100MB per layer of overwriting and that the
level of accuracy decreases by 50% at each layer. In other words, if
what I learned is to be trusted (and I kind of doubt that it is) 50%
of the bits would be wrong in the first overwritten layer (the same as
guessing) 75% in the second layer, etc.

Personally, I'm very skeptical. It's all very well and good to talk
about the theory of how it might be done but, in the (slightly
modified) words of Jerry Maguire: "Show me the BITS!"

I've never seen it done and don't believe it can be done.


John Savard

unread,
Apr 30, 1997, 3:00:00 AM4/30/97
to

EKS...@PACIFICRIM.NET (Phil Ekstrom) wrote:

>I repeat my question, asked here earlier: Does anyone know of anyone who will
>DO any of this for a legitimate purpose for money? I looked hard some time
>ago and did not find anyone. There are lots of legitimate applications for
>the technology, and lots of data recovery houses. But does anyone know of...

No, I don't. But, urban legend or no, the question of whether this
could theoretically be done, for illegitimate purposes, or by major
governments, does have implications for data security.

John Savard

Paul Howard

unread,
Apr 30, 1997, 3:00:00 AM4/30/97
to

bro...@bizauto.com (Brooks Hilliard) writes:

> I was also told that the cost of the recovery would be (to the best
> of my recollection) about $50K/100MB per layer of overwriting and
> that the level of accuracy decreases by 50% at each layer. In other
> words, if what I learned is to be trusted (and I kind of doubt that
> it is) 50% of the bits would be wrong in the first overwritten layer
> (the same as guessing)

You can stop right there! After one layer, you've spent $50K for
every 100MB of totally useless data. (Unless you have some way of
knowing which 50 percent of the bits are right.)

> 75% in the second layer, etc.

No, it'll still be 50 percent in the second layer, I hope!

-- Paul

--
Paul G. Howard, AT&T Labs - Research, Holmdel NJ USA

SimGraphics

unread,
May 1, 1997, 3:00:00 AM5/1/97
to

In article <33681c06...@lh2.rdc1.az.home.com>,
Brooks Hilliard <bro...@bizauto.com> wrote:
>. . civilian agency funds recovery did not qualify. I was also told
>that the cost of the recovery would be (to the best of my
>recollection) about $50K/100MB per layer of overwriting and that the
>level of accuracy decreases by 50% at each layer. In other words, if
>what I learned is to be trusted (and I kind of doubt that it is) 50%
>of the bits would be wrong in the first overwritten layer (the same as
>guessing) 75% in the second layer, etc.
>
>Personally, I'm very skeptical.

Personally, I've seen it done. On an MFM disk drive with a modified Western
Digital controller. The drive was, if I remember correctly, a Seagate ST251
(40MB), where you get analog signals on one of the cables between the drive
and the controller. The controller was a full-length ISA board, the type
of which I don't recall.

The mods to the setup included:
1) modified microcode for the controller CPU. It was in the socketed
EPROM chip. The microprogram had less than 4K instructions and the
controller CPU was some well documented general purpose microcontroller
chip, for which a disassembler was already available.
2) jumpering the drive and controller to allow seeking by 1/16 of a track
instead of full tracks. This was, BTW, a standard (albeit undocumented)
feature of Seagate drives.
3) connections to the Tektronix digital storage oscilloscope.

This was a university project; no money changed hands.

As far as reliability of recovered bits I don't recall any numbers.
Some recollections:
-) outer tracks were easier to recover.
-) it was particularly easy to recover from overwrite made when the
drive and platter were still cold, right after powering the disk up.
-) multipass computer analysis of the data gathered with storage
oscilloscope.
-) for successful recovery one had to have access to the actual
controller(s) used to write to the disk. The analysis software
used calibration data gathered from performing test recordings
of known patterns.

Boy, those were the days. Do you still remember wondering whether
you parked disk heads correctly?

Sylvester

Warlockd

unread,
May 1, 1997, 3:00:00 AM5/1/97
to Roger Fleming

Ok then, what about an "External" force..

For example: My professor uses a bulk videotape eraser <you know, the
thing that you plug into the wall and wave over your videotapes, and it
'magically' erases them?> to erase his excess computer disks. Is it even
conceivable to try to read them? What if I used it on a hard drive: would
it damage it beyond repair, or could I just reformat it? Are disks low-res
enough for a data-recovery scheme like the one you described?

I am curious about this because I have about 10 hard drives in my room
that I want to get rid of <aka, selling>, but I don't want to check
whether I left important data on them, one by one :)


Roger Fleming wrote:
> Estimates vary wildly, because we don't really know. One oft quoted standard
> specifies overwriting once with 0s, once with 1s, then a third time with
> (IIRC) pseudo random data. As Peter Gutmann points out in his rather in depth
> analysis of this subject (sorry, can't remember where it lives), things like
> RLL encoding make it a definitely non-trivial task to translate this simple
> software model into what actually happens on the drive. It also depends
> on the type of drive. Peter came up with a procedure that is pretty likely to
> produce very good security on nearly all drives; it requires not 3, or 7, but
> 31 overwrites, and very specifically calculated overwrites at that.
>

Tom Womack

unread,
May 1, 1997, 3:00:00 AM5/1/97
to

Brooks Hilliard (bro...@bizauto.com) wrote:

[interesting detective story]
[Mr Hilliard meets the spooks]

: I was also told
: that the cost of the recovery would be (to the best of my
: recollection) about $50K/100MB per layer of overwriting and that the
: level of accuracy decreases by 50% at each layer. In other words, if
: what I learned is to be trusted (and I kind of doubt that it is) 50%
: of the bits would be wrong in the first overwritten layer (the same as
: guessing) 75% in the second layer, etc.

... so 99% of the bits would be wrong by the seventh layer. So just invert
all the bits to get a 1% error rate. I don't think the statement makes
very much sense (unless it goes p(bit inverted) = 1/2 ( 1-1.5^(-layers)) or
similar)

I'm not an engineer, I don't know how I'd do it, I believe it might be
possible.

Anil Das

unread,
May 1, 1997, 3:00:00 AM5/1/97
to

bro...@bizauto.com (Brooks Hilliard) writes:
> I was also told
> that the cost of the recovery would be (to the best of my
> recollection) about $50K/100MB per layer of overwriting and that the
> level of accuracy decreases by 50% at each layer. In other words, if
> what I learned is to be trusted (and I kind of doubt that it is) 50%
> of the bits would be wrong in the first overwritten layer (the same as
> guessing) 75% in the second layer, etc.

If you select a random series of bits, you would expect
50% of them to match whatever data was on disk. So what
are you saying here?

If you are saying that only 50% of the bits can
be recovered reliably in the first layer and this rate
goes down by half for each further layer, you will have 75%
bits correct in the first layer, 62.5% bits correct in the
second layer and so forth.
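
The arithmetic behind those figures can be written out as a short sketch. It assumes the reading given above: a fraction of the bits is recovered reliably (50% in the first layer, halving with each further layer), and every other bit is a coin-flip guess that is right half the time. The function names are invented for this illustration.

```c
#include <assert.h>

/* Expected fraction of bits that come out right when a fraction
   `reliable` is read back correctly and the rest are coin-flip guesses. */
double fraction_correct(double reliable)
{
    assert(reliable >= 0.0 && reliable <= 1.0);
    return reliable + (1.0 - reliable) / 2.0;
}

/* Reliably recovered fraction for a given overwritten layer (1 = the
   most recently overwritten), assuming it starts at 50% and halves
   with each further layer. */
double reliable_at_layer(int layer)
{
    double r = 1.0;
    int i;

    assert(layer >= 1);
    for (i = 0; i < layer; i++)
        r /= 2.0;
    return r;
}
```

Under these assumptions `fraction_correct(reliable_at_layer(1))` is 0.75 and `fraction_correct(reliable_at_layer(2))` is 0.625, matching the 75% and 62.5% quoted above; the value drifts toward 0.5 (pure guessing) as the layer count grows.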

--
Anil Das

Roger Fleming

unread,
May 2, 1997, 3:00:00 AM5/2/97
to

Warlockd <warl...@netdot.com> pondered long before writing:

>Ok then, what about an "External" force..

>For example: My professor uses a bulk videotape eraser [...]
>to erase his excess computer disks. Is it even conceivable to try to
>read them?

I don't know how thorough a job these things do, but it certainly
should be possible to make one which would completely erase _everything_. The
field it generates affects the entire surface of the disk, and could penetrate
throughout the depth of the medium. Every hysteresis cycle knocks a few more
dB (probably tens of dB) from the signal; and it does hundreds of cycles.

>What if I used it on the hard drive, would it damage it beyond repair or
>could I just reformat it? [...]

It depends on whether your drive, and/or BIOS, supports low level
formatting. Many, these days, do not. In that case, you'll be hosed. I'd
also be a little nervous about the powerful fields generated by such
gadgets inducing dangerous overvoltages in the on board chips. Personally, I
wouldn't try this. But if you do, be sure to tell us how you go!!

[...]


>I am curious about this because I have about 10 hard drives in my room I
>want to get rid of <aka, selling> but don't want to see if I left important data
>on them, one by one:)

I take it these are not installed, and you don't want to bother with putting
them back in the machine. Well, surely even a small old drive is worth the few
minutes that will take, rather than risk wrecking it. And your buyers will
probably want to see the thing working.
As for actually getting rid of the data, a format will probably do (you should
do this anyway when selling hard disks, to avoid accidentally selling licensed
software). Only if the data (possibly including old swapfiles) is potentially
quite valuable to some hostile person do you need to worry about more
complicated methods.

----------------------------------------------------------------------

Brooks Hilliard

unread,
May 2, 1997, 3:00:00 AM5/2/97
to

On Wed, 30 Apr 1997 20:59:40 GMT, Paul Howard <p...@research.att.com>
wrote:

>You can stop right there! After one layer, you've spent $50K for
>every 100MB of totally useless data. (Unless you have some way of
>knowing which 50 percent of the bits are right.)
>
>> 75% in the second layer, etc.
>
>No, it'll still be 50 percent in the second layer, I hope!

I agree with you . . . just repeating what I was told back a couple of
years ago. I didn't ask for any explanation of what the 50% referred
to; perhaps it was 50% of the "bytes" that it was claimed would come
out okay. The price was too high for my client to consider even if
the accuracy was 100%.

Brooks Hilliard

unread,
May 2, 1997, 3:00:00 AM5/2/97
to

On Thu, 1 May 1997 07:32:21 GMT, si...@netcom.com (SimGraphics) wrote:

>Personally, I've seen it done. On an MFM disk drive with modified Western
>Digital controller. The drive was, if I remember correctly, Seagate ST251
>(40MB) where you get analog signals on one of the cables between the drive
>and the controller. The controller was a full length ISA board, the type
>of which I don't recall.

> . . .

Sylvester--

Thank you for responding to my post. You are the first person I've
corresponded with who has actually seen a recovery of overwritten
data. Although I have no client requirement to do such a recovery
today, I would be very interested to get a contact to the person or
lab that did it.

>Boy, those were the days. Do you still remember wondering whether
>you parked disk heads correctly?

Indeed I do . . . in fact, I'm afraid I go back a bit farther than
that (to the days of drum memory storage).

Regards, Brooks


SimGraphics

unread,
May 3, 1997, 3:00:00 AM5/3/97
to

In article <336b1661...@lh2.rdc1.az.home.com>,
Brooks Hilliard <bro...@bizauto.com> wrote:
>today, I would be very interested to get a contact to the person or
>lab that did it.

Hi Brooks!

I forgot to mention one thing: I was studying in Warsaw, Poland. The guy
who showed us the demonstration was either from Warsaw University of
Technology or Lodz University of Technology. I can't recall his last
name, his first name or nickname was "Samuel". Sorry, I have no more
details. The demo was held in the "Institute of Semiconductor Technologies
and Optoelectronics" in Warsaw University of Technology, around 1987.

Sylvester

Brooks Hilliard

unread,
May 5, 1997, 3:00:00 AM5/5/97
to

Sylvester--

On Sat, 3 May 1997 06:35:17 GMT, si...@netcom.com (SimGraphics) wrote:
>I wrote:
>> I would be very interested to get a contact to the person or
>>lab that did it.
>
>Hi Brooks!
>
>I forgot to mention one thing: I was studying in Warsaw, Poland. The guy
>who showed us the demonstration was either from Warsaw University of
>Technology or Lodz University of Technology. I can't recall his last
>name, his first name or nickname was "Samuel". Sorry, I have no more
>details. The demo was held in the "Institute of Semiconductor Technologies
>and Optoelectronics" in Warsaw University of Technology, around 1987.

Believe me, I don't doubt what you saw . . . but from a practical
standpoint it is the same "dry hole" I've turned up in the past. I
*really* would like to find someone I can talk to, who's actually read
an over-written section of a disk.

BH

SimGraphics

unread,
May 6, 1997, 3:00:00 AM5/6/97
to

In article <33705ccd....@lh2.rdc1.az.home.com>,
Brooks Hilliard <bro...@bizauto.com> wrote:
>Believe me, I don't doubt what you saw . . . but from a practical
>standpoint it is the same "dry hole" I've turned up in the past. I
>*really* would like to find someone I can talk to, who's actually read
>an over-written section of a disk.

Hi Brooks!

I'm sorry to disappoint you. I've checked the web sites for the Warsaw
University of Technology, and I could not locate Samuel Grabski
(or Grabowski). http://www.imio.pw.edu.pl and http://www.elka.pw.edu.pl.

Re: "dry hole". I know the feeling. Once upon a time I was trying to
research software security (a.k.a. software license enforcement). I
couldn't find anyone to talk to. So I did some research (with the
debugger and oscilloscope) myself. I was very surprised next time
I tried to talk to the same people. All of a sudden I was "in", people
were very willing to exchange the information with me.

I don't know anything about the disk recovery business. But I know
a lot about the business of selling snake oil for software security.
It is VERY DIFFICULT to get certain types of information. There may be
the same social/psychological factors involved in your pursuit as there
were in mine.

Look at what Peter Gutmann had to go through to get a better look
at hardware security devices.

Sylvester

Dave Schulman

unread,
May 7, 1997, 3:00:00 AM5/7/97
to

Will Simmons wrote:
>
> In article <5k331e$q...@gazette.loc3.tandem.com>,
> rpa...@loc3.tandem.com (parker_rob) wrote:
>
[SNIP]
>
> Based on discussions with folks who do government work (admittedly not
> personal knowledge), hard drives containing classified information are
> physically destroyed to prevent data recovery for reasons you have
> adverted to in this thread.

Worth checking out in this context: the so-called "Green Book" of
the Department of Defense's "Rainbow Book" series, now somewhat dated.
Its document number is NCSC-TG-025, and its official title is _Department
of Defense Trusted Computer System Security Evaluation Criteria: A Guide
to Understanding Data Remanence in Automated Information Systems_.

You can get a free copy of this publication by writing to:

INFOSEC Awareness Division
ATTN: Y13
Fort George G. Meade, MD 20755-6000

Or call: (410) 684-7661 / (800) 688-6115

Some of the Rainbow Books (the Orange Book, at least) are available
under URL ftp://ftp.cert.org/pub/info/

--
Dave Schulman Validation Engineer, FTI
Nortel, Inc. Dept. 3K57 (ESN = 263)
400 Perimeter Park Drive (919) 905-4844 (Voice)
Morrisville, NC 27560 (919) 905-2549 (FAX)

Will Simmons

unread,
May 7, 1997, 3:00:00 AM5/7/97
to

In article <5k331e$q...@gazette.loc3.tandem.com>,
rpa...@loc3.tandem.com (parker_rob) wrote:

>I don't know if this has ever been done publicly. I first heard of
>the possibility many years ago. I believe it was in the manual for
>Norton Utilities, which included a program to overwrite files, unused
>space, or entire disks multiple times with data patterns designed to
>obliterate the old traces. I believe it mentioned government standards
>for the secure erasure of sensitive information.
>
>This may have been more a concern for floppies, and disk technology may
>have shifted in ways that makes this more difficult than it was thought
>to be back then, but I certainly wouldn't count on that. Whether or
>not you'll be able to do this for commercial data-recovery is another
>matter.
>
> -Rob Parker

Based on discussions with folks who do government work (admittedly not
personal knowledge), hard drives containing classified information are
physically destroyed to prevent data recovery for reasons you have
adverted to in this thread.

Some commercial programs, citing "DOD requirements," provide multiple
overwrite capability (Symantec's MacTools is one with which I am familiar).
However, it is my understanding that physical destruction of hard drives
is the only approved course where important information is concerned.
Presumably someone with hands-on experience can confirm this.

In this connection, the media took note of what a fine job of hard disk data
recovery was done in Col. Oliver North's case. Perhaps he and Ms. Hall
shied away from destroying federal property in their somewhat bizarre
situation.

-- Will --

Kaz Kylheku

unread,
May 8, 1997, 3:00:00 AM5/8/97
to

In article <AF9677909...@192.0.2.1>,
Will Simmons <wsim...@world.std.com> wrote:
>In article <5k331e$q...@gazette.loc3.tandem.com>,
>rpa...@loc3.tandem.com (parker_rob) wrote:
>
>>I don't know if this has ever been done publicly. I first heard of
>>the possibility many years ago. I believe it was in the manual for
>>Norton Utilities, which included a program to overwrite files, unused
>>space, or entire disks multiple times with data patterns designed to
>>obliterate the old traces. I believe it mentioned government standards
>>for the secure erasure of sensitive information.
>>
>>This may have been more a concern for floppies, and disk technology may
>>have shifted in ways that makes this more difficult than it was thought
>>to be back then, but I certainly wouldn't count on that. Whether or
>>not you'll be able to do this for commercial data-recovery is another
>>matter.
>>
>> -Rob Parker
>
>Based on discussions with folks who do government work (admittedly not
>personal knowledge), hard drives containing classified information are
>physically destroyed to prevent data recovery for reasons you have
>adverted to in this thread.
>
>Some commercial programs, citing "DOD requirements," provide multiple
>overwrite capability (Symantec's MacTools is one with which I am familiar).

Here is a UNIX hack for doing the same thing. It requires the BSD random()
functionality which is not common to all systems.

You just run this program as ``shred <number> [ list of files ]'',
and it will wipe them one by one the given number of times. The first
pass is all zero bits, the second pass is all 1 bits, and then it
starts using a random number generator. For this reason, the number of
passes must be at least 3.

/*
 * ``Shred''
 * A simple file shredding program
 * by Kaz Kylheku
 * <k...@cafe.net>
 */

#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* memset() */
#include <limits.h>

#define HAVE_RANDOM     /* remove if no random() or initstate() */
                        /* to get the lame standard C rand() */

static void usage(void);
static void shred(char *name, long iter);

int main(int argc, char **argv)
{
    long int iter;
    char *endptr;

    (void) argc;        /* arguments are walked through argv only */

    if (!*++argv) {
        usage();
        exit(EXIT_FAILURE);
    }

    /* first argument: number of overwrite passes */
    iter = strtol(*argv++, &endptr, 0);

    if (*endptr != '\0') {
        usage();
        exit(EXIT_FAILURE);
    }

    if (iter < 3) {
        fputs("the count should be at least 3\n", stderr);
        exit(EXIT_FAILURE);
    }

    /* remaining arguments: files to wipe, one by one */
    for (; *argv; argv++)
        shred(*argv, iter);

    return 0;
}

static void usage(void)
{
    char *text =
        "usage: shred <iterations> [ ... [ [ file_1 ] file_2 ] ... file_n ]\n";
    fputs(text, stderr);
}

static void shred(char *file, long iter)
{
    off_t len, writ;
    int fd = open(file, O_RDWR | O_SYNC);
#ifdef HAVE_RANDOM
    static char state[256];     /* must outlive the calls to random() */
#endif
    unsigned char buf[1024];
    long i;
    size_t j, chunk;

    if (fd < 0) {
        perror(file);
        return;
    }
    len = lseek(fd, 0, SEEK_END);
    if (len < 0) {
        perror(file);
        close(fd);
        return;
    }

#ifdef HAVE_RANDOM
    initstate((unsigned) getpid(), state, sizeof state);
#else
    srand((unsigned) getpid());
#endif

    for (i = 0; i < iter; i++) {
        if (lseek(fd, 0, SEEK_SET) < 0) {
            perror(file);
            break;
        }
        writ = len;
        while (writ > 0) {
            /* never write past the original end of the file */
            chunk = writ < (off_t) sizeof buf ? (size_t) writ : sizeof buf;
            switch (i) {
            case 0:     /* first pass: all zero bits */
                memset(buf, 0, chunk);
                break;
            case 1:     /* second pass: all one bits */
                memset(buf, UCHAR_MAX, chunk);
                break;
            default:    /* later passes: pseudo-random bytes */
                for (j = 0; j < chunk; j++)
#ifdef HAVE_RANDOM
                    buf[j] = (unsigned char) (random() % (UCHAR_MAX + 1));
#else
                    buf[j] = (unsigned char) (rand() % (UCHAR_MAX + 1));
#endif
                break;
            }
            if (write(fd, buf, chunk) != (ssize_t) chunk) {
                perror(file);
                close(fd);
                return;
            }
            writ -= (off_t) chunk;
        }
    }
    close(fd);
}

Jerry Leichter

unread,
May 9, 1997, 3:00:00 AM5/9/97
to Kaz Kylheku

> Here is a UNIX hack for doing the same thing. It requires the BSD
> random() functionality which is not common to all systems....

Don't trust this program to do anything useful! It simply loops over
the file repeatedly, writing blocks of 0's, then 1's, then random bytes,
using the Unix write() function.

However, Unix disk I/O is all buffered. When write() returns, all you
know is that the data you wrote has been copied into the block buffer
cache. Unix will write the data "eventually" - when it needs to free up
some cache blocks, or every couple of minutes. However, if you write to
the same block again before it leaves the cache - as is almost certain
with this program for all but files larger than the block buffer cache,
which is in the multi-megabyte range on contemporary systems - the new
data will simply replace the old in the cache. What will almost
certainly happen when you run this program is that you'll write 0's,
then 1's, then a bunch of random values *to the buffer cache*, and only
the *last* random values you wrote will actually make it to the disk.
Fine if your point was to erase the RAM holding the buffer cache!

The program could be improved by calling fsync() on the file after each
pass of shred(). fsync() forces stuff out of the buffer cache. For a
directly-connected disk, that means sending the data out to the disk
controller. For a disk accessed through NFS, it means sending the
blocks to the system the disk is mounted on. Unfortunately, it does
*not* mean forcing *that* system to write the data to the disk. In
fact, there is *no* way I know of to over-ride any buffering done by the
remote system, hence no way to ensure that you're doing anything other
than exercising the RAM in the server. Some servers even have special
hardware (battery-backed memory) that allows them to put off writing stuff
to the physical disk for a *long* time.
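
A minimal sketch of the fsync()-per-pass idea described above might look like the following. The helper name `overwrite_pass` and its parameters are made up for this illustration and are not part of the posted program. As the paragraph says, fsync() only pushes the data as far as the disk controller (or the NFS server), so this narrows the problem without solving it.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first `len` bytes of the open file
   `fd` with the byte `fill`, then ask the kernel to flush its buffer
   cache for that file. Returns 0 on success, -1 on any error. */
int overwrite_pass(int fd, off_t len, unsigned char fill)
{
    unsigned char buf[1024];
    off_t left = len;

    memset(buf, fill, sizeof buf);
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;

    while (left > 0) {
        /* never write past the requested length */
        size_t chunk = left < (off_t) sizeof buf ? (size_t) left : sizeof buf;
        ssize_t n = write(fd, buf, chunk);

        if (n <= 0)
            return -1;
        left -= n;      /* a short write just means another loop trip */
    }

    /* Flush this pass out of the buffer cache before the next pass can
       replace it there. This reaches the controller or remote server
       at best, not necessarily the disk surface. */
    return fsync(fd) == 0 ? 0 : -1;
}
```

A caller would invoke it once per pattern, for instance `overwrite_pass(fd, len, 0x00)` followed by `overwrite_pass(fd, len, 0xFF)`, checking the return value each time.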

Even for a locally connected system, there are many ways for this kind
of program to fail. The best the system can do, on an fsync(), is
forward the data to the disk controller. Disk controllers these days
are intelligent devices with extensive on-board cache. They may well
hold on to the data for a while - at least until the heads are on the
right track, which is a couple of milliseconds. For a small file - one
containing a key, for example - a program like this could easily do
multiple passes, multiple fsync()'s, send the data to the controller
multiple times - all while the controller is caching the data, waiting
for the heads to get to the right position. Again, what you'll be
writing to repeatedly is some RAM, not the disk surface.

Finally, all this assumes that the repeated writes will really go to the
same block! In a log-based file system, every time you write to a
block, you get a whole new, clean block. The old one sticks around,
quite unchanged, until a file system cleaner process gathers it up for
re-use.

In summary: Nice idea, but a wasted effort; with modern systems, it's
simply impossible to write a user-level program that truly erases the
blocks of a file. If you really need this feature, you'll have to find
a system that supports it, with at least special driver calls that force
stuff out to the disk, often with special functions in the controller
(accessed by special driver calls) that ensure that what you wrote
really and truly makes it to the disk surface when you think it does.

-- Jerry
