There are other ways of making a message difficult to decrypt. Perhaps
the best is to use a space-compression program on it first. When
decrypting a message, people (and computers) take advantage of
redundancy (sp?) in a file -- the same redundancy which space
compression programs remove. Using compress(1) before using crypt(1) or
a DES-based system is an excellent idea, as long as the person at the
other ends knows to uncompress(1) the message afterwards.
(By the way, speaking of crypt(1) -- its been terribly broken. There's
a program at MIT by Bob Baldwin that breaks even small samples of
crypt(1) generated cyphertext. Don't use crypt(1))
Both of the above systems for making a encrypted message harder to
decode -- randomly adding information and compressing the message
first -- are classified under modifying the encryption algorythim.
Whenever using a crypto-system, you've got to assume that the bad guy
is in posession of the entire algorythim and the complete cypthertext.
The trick is to make the encryption good enough that, as long as the
key is unknown, it is not possible to decrypt the file.
We "believe" that DES does this, although nobody knows for sure
(nobody that is allowed to tell, that is).
Using pseudorandom number generators for encryption is another
interesting idea. I wrote an article on that once, a long time ago,
but it was never published (to my knowledge). There are more secure
systems, however.
Simson L. Garfinkel
MIT Media Lab
Summer: Weizmann Institute of Science, Israel
As you (sim...@wisdom.bitnet) said, adding random bits or compressing
data really amounts to a modification to the encryption algorithm. The
Typical Assumption of cryptographic system design is that the enemy
knows the algorithm. The only secret is the key and the plaintext.
Sometimes, the pattern of the original plaintext (layout) may be public
knowledge also -- this is quite realistic since *everybody* likes to
use standard formats (file formats, source level indentation style,
etc). Occasionally, this criterion is weakened even more to allow the
opponent to have a few copies of matching plaintext available (a mole,
perhaps -- or a user who knows what his *own* entry looks like). The
strength of a cryptographic system is determined by the probability of
having the opponent discover the key given x amount of ciphertext.
The main method used to crack simple block ciphers is frequency
analysis. Using data compression will certainly help. The problem is
that the compression key for the file must be transmitted also, and it
usually resides at a standard place in a standard format -- more
information for the opponent.
One method that I considered for block ciphers is as follows: assuming
that data expansion is okay, choose as a function of the key a mapping
from the space of ciphertext to plaintext such that the map is
surjective and (the important part) the number of ciphertext blocks
that map onto each plaintext block approximates the frequency of that
plaintext block. (This approximation can be made as close as desired
by allowing greater expansion.) This induces an equivalence class
relation on the space of ciphertext blocks. The inverse "function"
will map plaintext blocks to equivalence classes. Now, when we
encrypt, we apply this function to the plaintext block to obtain an
equivalence class. From this equivalence class *randomly* chose a
representative (use true random hw).
Immune to frequency analysis, no special location for uncompress keys.
Comments?
-Bsy
--
-Bsy
Um, actually, there may be a serious problem here: compress(1)ed files
start with stereotyped header information that could offer a cryptanalyst
the opportunity for a known-plaintext attack. Something needs to be done
about that before generally recommending this approach (which is a good
one otherwise).
--
EDEC: Stupidly non-standard
brain-damaged incompatible Henry Spencer @ U of Toronto Zoology
proprietary protocol used. {allegra,ihnp4,decvax,pyramid}!utzoo!henry
This technique has routinely been used in both ciphers and codes for
centuries. It doesn't make them unbreakable, but it does make it harder.
Computerizing eliminates some of the weaknesses of manual methods, notably
the tendency to pick ciphertext versions systematically. (In particular,
virtually all codes had multiple ciphertext equivalents for "STOP", but
it was all too common for code clerks to memorize one or two of them to
speed things up. Cryptanalysts could sometimes tell when a new code clerk
arrived because of a sudden surge of new groups for "STOP".)
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry
Interesting idea, it just so happens that a project around here may be
using something very similar to that. The scheme is using Lempal Ziv
compression to compress the text, resulting in essentially random
information, with redundent text removed. The output of that is then
fed through a 256 byte translate table. Thus both sides must have only
the translate table to communicate back and forth. Does anyone know
how resistent to cracking this method would be? It hasn't been
implemented yet, so if someone finds a hole in it, we'd like to know
about it soon.
>
> (By the way, speaking of crypt(1) -- its been terribly broken. There's
> a program at MIT by Bob Baldwin that breaks even small samples of
> crypt(1) generated cyphertext. Don't use crypt(1))
Interesting, is that program commonly available?
> Simson L. Garfinkel
> MIT Media Lab
> Summer: Weizmann Institute of Science, Israel
--
Kenneth Ng:
Post office: NJIT - CCCC, Newark New Jersey 07102
uucp(for a while) ihnp4!allegra!bellcore!argus!ken
!psuvax1!cmcl2!ciap!andromeda!argus!ken
*** WARNING: NOT k...@bellcore.uucp ***
bitnet(prefered) k...@njitcccc.bitnet or k...@orion.bitnet
Spock: "Captain, you are an excellent Starship Captain, but as
a taxi driver, you leave much to be desired."
Kirk: "What do you mean, 'if both survive' ?"
T'Pow: "This combat is to the death"
The problem I can see with the system is that it is possible for a person to
both decode and encode a message from both sides. This means that all that
is really needed to find out how the translation table works is for someone
to feed a know set of text into the translator, then look at the output.
Once the translation is broken, it would probably be easy to break the
(de)compression system by using the same technique as the above.
My favorite system for any set up is still adding a random number to each
letter in the message. This also will break up most of the common character
occurence methods of breaking. Of course a repeatable sequence of randoms
is needed. A random number generator that does not repeat for >10,000 loops
would work fairly well as long as the key is changed often.
--
The Wumpus UUCP: {seismo,allegra,decvax}!rochester!ur-tut!aptr
BITNET: aptrccss@uorvm
Disclaimer: "Who? When? Me? It was the Booze!" - M. Binkley
It depends on how private your L-Z compressor is. The translate table
is a monoalphabetic substitution; by itself, any bright 8th-grader could
break it. Your only hope for security is to make sure that nobody knows
the details of the format that your compressor is producing. Using the
standard "compress" utility is a very bad idea in this context. You want
to avoid programs which put magic numbers, fixed-format headers, or
compression tables of a standardized form at the beginning (end, etc.) of
the compressed data. I don't know enough about the details of L-Z to be
sure if such things can be avoided, but I have my doubts.
A good data-compression algorithm is a fine preliminary to encryption,
since it fouls up the frequency information royally. But by itself,
DATA COMPRESSION IS NOT ENCRYPTION. Your security still depends mainly
on the encryption step that follows. In particular, your security
depends *entirely* on the encryption step if you make the standard
assumption that a would-be snooper knows your basic algorithms. I am
not an expert in this area, but even from my limited knowledge, I would
emphatically recommend that you use something less flimsy than a simple
substitution as your encryption scheme. I am not kidding when I say
that a bright 8th-grader could break that.
One idea you might consider would be your proposal as described,
followed by the Unix crypt(1) command. Crypt(1) by itself is fatally
insecure, and methods for breaking it have been published... but they
aren't trivial, the programs for doing them aren't widely available, and
putting translated compressed data rather than ASCII plaintext in underneath
should make life a lot harder for the nosy. The compression removes the
frequency information and the translation would obscure compression's
headers somewhat. Don't forget to change keys often, and to change BOTH
the translation table and the crypt(1) key. (Also, if you are protecting
something important, get expert advice [not just mine] first.)
How true indeed!!! However, there is a (undocumented) flag to get around
this problem. It was put in for compatibility with a very old version of
compress, but you just found a good use for it:
-n Suppress 3-byte header on the compressed file
It must be specified to BOTH compress(1) and uncompress(1).
Compressed data is very good for encryption, because it contains almost no
redundancy, and no constant data (with the header gone). Unlike Huffman
coding, there is no compression table in the data at all.
regards,
joe
--
Full-Name: Joseph M. Orost
UUCP: ihnp4!vax135!petsd!joe
ARPA: vax135!petsd!joe@BERKELEY
Phone: (201) 758-7284
Location: 40 19'49" N / 74 04'37" W
US Mail: MS 313; Concurrent Computer Corporation; 106 Apple St
Tinton Falls, NJ 07724
The LZ-compress header exists only to communicate information about
the data file (number of bits in the compression and algorithm) that
aren't interesting in this context (they can be sent by some other
secure channel or hard-wired into the encryption/decription code).
One interesting thing about LZ -- one bad bit in the file may garble
the entire remainder of the file. This, in itself, should make the
codebreaker's work harder.
Martin Minow
decvax!minow
The compress program's output is more compressed as it goes along, so that
the beginning is in practically plaintext. I think this is what an earlier
poster meant when referring to a header weakness, not just the 3-byte header.
(Other compression programs tend to have large headers with compression
tables in them.)
--
Scot E. Wilcoxon Minn Ed Comp Corp {quest,dicome,meccts}!mecc!sewilco
45 03 N 93 08 W (612)481-3507 {{caip!meccts},ihnp4,philabs}!mecc!sewilco
Laws are society's common sense, recorded for the stupid.
The alert question everything anyway.
It IS?? Here is an od -c of the first part of the replied-to article, put
through compress:
0000000 037 235 220 F 344 274 i 003 202 \r 235 : x 302 204 030
0000020 S ' L 031 031 ! 322 240 q 003 207 F 210 6 e 306
0000040 214 \t 1 247 314 235 4 l 306 274 001 321 344 215 033 020
0000060 A 352 234 001 021 003 007 \b 030 9 t 304 260 241 c 206
0000100 \f 226 9 p 330 P 245 \f 233 0 y Z X ) #
0000120 g N 032 223 : @ 330 ! j 324 $ \b ! d 270
0000140 210 001 c * \b 033 / d 320 x 201 c 306 016 020 F
0000160 351 224 001 A 207 216 034 2 c \ T 251 2 004 212 002
0000200 ( a 350 240 I Z 366 354 F 203 \b 025 2 t \b Q
0000220 " E 213 030 5 r 364 \b R 244 202 200 003 223 v 374
0000240 030 362 \r 220 300 i 327 266 005 201 b 212 H : 212
0000260 270 \0 q 245 360 033 < & S ( p 342 q 316 031 201
0000300 u 340 314 I 352 246 \f 035 027 c 344 344 201 C G 301
0000320 224 : b 324 d 244 223 264 g R 9 a 334 220 031 310
0000340 & 017 210 0 d 310 244 q 263 R L 032 : s ^ 210
0000360 311 # v 016 e 221 m 340 310 ) 3 307 ( s 262 e
0000400 360 ` 026 S 306 314 233 355 312 270 211 = 233 316 Q
0000420 7 242 233 p 237 023 346 L 231 026 I 210 $ 345 Q 243
0000440 306 214 307 031 E 306 026 024 > ( @ D \ e $ U
0000460 222 033 , 260 204 C 013 ) 235 321 202 N / 305 4 S
...
Now if someone can show how this is "plaintext" I'd be mildly interested...
> I think this is what an earlier
>poster meant when referring to a header weakness, not just the 3-byte header.
>(Other compression programs tend to have large headers with compression
>tables in them.)
>--
>Scot E. Wilcoxon Minn Ed Comp Corp {quest,dicome,meccts}!mecc!sewilco
>45 03 N 93 08 W (612)481-3507 {{caip!meccts},ihnp4,philabs}!mecc!sewilco
> Laws are society's common sense, recorded for the stupid.
> The alert question everything anyway.
--
------------------------------- Disclaimer: The views contained herein are
| dan levy | yvel nad | my own and are not at all those of my em-
| an engihacker @ | ployer or the administrator of any computer
| at&t computer systems division | upon which I may hack.
| skokie, illinois |
-------------------------------- Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
go for it! allegra,ulysses,vax135}!ttrdc!levy
The primary use of this idea was to enable communication across a
finite bandwidth channel and to make the data reasonably secure.
It'll probably never carry national security secrets. The L-Z
compression combined with the translate table seems to be a reasonably
fast way to both speed up transmission and somewhat encrypt the
information. Thank you for your advice.
> Henry Spencer @ U of Toronto Zoology
> {allegra,ihnp4,decvax,pyramid}!utzoo!henry
--
Kenneth Ng:
Post office: NJIT - CCCC, Newark New Jersey 07102
uucp(for a while) ihnp4!allegra!bellcore!argus!ken
!psuvax1!cmcl2!ciap!andromeda!argus!ken
*** WARNING: NOT k...@bellcore.uucp ***
bitnet(prefered) k...@njitcccc.bitnet or k...@orion.bitnet
Please resend any mail between 10 Aug and 16 Aug:
the mailer broke and we had billions and billions of
bits scattered on the floor.
True, unless you figure out which bit is wrong the receiver may never
be able to decrypt the rest of the message. Now if you were to
purposely invert every XXX bit, hmmmm.
The compress program's output is more compressed as it goes along, so that
the beginning is in practically plaintext. I think this is what an earlier
The compress program's output is more compressed as it goes along, so that
It is. The characters have been expanded from 8 to 9 bits. The ninth bit
is on the 'left' of the other eight, and is shifted on the right of the
following byte. The ninth bit is used as compression begins.
>
>Now if someone can show how this is "plaintext" I'd be mildly interested...
Here's an expansion of the first two lines of the example, where "^" points
at the bits above, "." and the character point at bits below, and "---" are
just spacing. (Or just read the translation along left margin)
>0000000 037 235 220 F 344 274 i 003 202 \r 235 : x 302 204 030
(blocked, 16 bits)
0000000 037 235 220 01000110
F ^^^^^^^^---F
11100100
r ^^^^^^^---r.
10111100
o ^^^^^^---o..
01101001
m ^^^^^---m...
00000011
^^^^--- ....
10000010
l ^^^---l.....
00001101
t ^^---t......
10011101
u ^---u.......
00111010
01111010
x ^^^^^^^^---x
11000010
a ^^^^^^^---a.
10000100
! ^^^^^^---!..
00011000
c ^^^^^---c...
>0000020 S ' L 031 031 ! 322 240 q 003 207 F 210 6 e 306
0000020 01010011
u ^^^^---u....
00100111
a ^^^---a.....
01001100 -------
e ^^---e......
00011001
2 ^---2.......
00011001
00100001
! ^^^^^^^^---!
11010010
i ^^^^^^^---i.
10100000
h ^^^^^^---h..
01110001
n ^^^^^---n...
00000011
p ^^^^---p....
10000111
4 ^^^---4.....
01000110
! ^^---!......
10001000
m ^---m.......
00110110
01100101
e ^^^^^^^^---e
11000110
c ^^^^^^^---c
Since I did that by hand, that's quite enough. Apparently the article got
to levy@ttrdc through the path mecc!ihnp4!cuae2!ltuxa!ttrdc. Ttrdc is
apparently not running a version of news which puts the local site name
in saved messages.
I suspect that a lot could be done to decrypt the output, if the cryptanalyst
knew that "compress" was being used on the plaintext.
--
John Gilmore {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu jgil...@lll-crg.arpa
May the Source be with you!