Message Confidentiality and Integrity - Prepend Hash versus Append Hash

221 views
Skip to first unread message

Jeffrey Walton

unread,
Aug 8, 2008, 4:57:20 PM8/8/08
to
Hi All,

Given a message m, are the two below equivalent (prepending a hash
versus appending a hash):

E_k( h(m) || m ) and E_k( m || h(m) )

Handbook of Applied Cryptography tells me to use E_k( m || h(m) ) to
achieve confidentiality and integrity [1].

In practice, appending the hash is more difficult (in terms of
programming logic) if the message is not on a BLOCKSIZE boundary,
since we have to deal with 'tail bytes' before padding.

Things become much simpler if I can perform E_k( h(m) || m ) since
h(m) is always on a BLOCKSIZE boundary.

Jeff

[1] Online book: http://www.cacr.math.uwaterloo.ca/hac/, p. 365.

karl malbrain

unread,
Aug 8, 2008, 5:20:59 PM8/8/08
to

It doesn't make any difference which end you put the hash onto -- the
two parts of the message are disjoint. N.b it's hard to imagine how
to avoid padding the message to a BLOCKSIZE boundary.

karl m

David Wagner

unread,
Aug 8, 2008, 6:20:56 PM8/8/08
to
Jeffrey Walton wrote:
>Given a message m, are the two below equivalent (prepending a hash
>versus appending a hash):
>
>E_k( h(m) || m ) and E_k( m || h(m) )
>
>Handbook of Applied Cryptography tells me to use E_k( m || h(m) ) to
>achieve confidentiality and integrity [1].

It looks to me like this might be a (very) rare case where
HAC is just wrong. I wouldn't recommend using E_k( m || h(m) );
it is insecure against chosen-plaintext attacks.

Attack: I choose two messages x,y arbitrarily, let M = x || h(x) || y,
and trick the sender into sending M. He will send the ciphertext
C = E_k( x || h(x) || y || h(M) )
Then I truncate the ciphertext C right after the point where the last
bit of h(x) is encrypted, to get a new ciphertext c. In many encryption
schemes, the truncation of a ciphertext decrypts to a corresponding
truncation of the original message. In this case, if I send c to the
final recipient, the recipient will decrypt c = E_k( x || h(x) ) to
get x with a valid hash, think that x is legitimate, and conclude that
the sender wanted to send x. This is a violation of message integrity,
because the sender only approved transmission of M but the receiver
thought x was approved.

In general, if you don't know anything special about the encryption
algorithm E_k(), you shouldn't use either of those options.

My advice: Always use both encryption + a MAC anywhere that
you think you need encryption or confidentiality.
Encrypt-then-authenticate is a standard approach that is generically
secure (i.e., you don't have to worry about bad interactions between
the encryption and MAC algorithm, assuming each is secure on its own).
In particular, in encrypt-then-authenticate, you compute C = E_k(m),
then you transmit C || MAC_k'(C), where k,k' are independent keys.

For some modes of operation E_k, E_k( h(m) || m ) may be secure, but
the details may be a bit tricky. If encrypt-then-authenticate is a
real problem and you need to use just a hash, it may be possible to
work out which choices of E_k are safe with this construction, if you've
got a cryptographer who is knowledgeable about this sort of thing. I
can't remember off the top of my head, unfortunately. But generally
speaking, for most projects, I'd say just use encrypt-then-authenticate
and spend your energy elsewhere.

>In practice, appending the hash is more difficult (in terms of
>programming logic) if the message is not on a BLOCKSIZE boundary,
>since we have to deal with 'tail bytes' before padding.
>
>Things become much simpler if I can perform E_k( h(m) || m ) since
>h(m) is always on a BLOCKSIZE boundary.

Understandable.

Last piece of advice: Writing your own encryption software, or
your own packet encryption code, is normally something it is better
to avoid, where possible. I'd think really hard whether I could use
an existing time-tested standard (TLS? SSH? GPG? OpenVPN?) rather
than designing and implementing my own secure channel layer.

Jeffrey Walton

unread,
Aug 9, 2008, 1:29:53 PM8/9/08
to
Hi David,

Thank you very much.

> It looks to me like this might be a (very) rare case where
> HAC is just wrong. I wouldn't recommend using E_k( m || h(m) );
> it is insecure against chosen-plaintext attacks.

I'll have to revisit the section. Perhaps I took something out of
context.

> My advice: Always use both encryption + a MAC anywhere that
> you think you need encryption or confidentiality.

I wanted a MAC, but I'm not allowed to change the protocol at this
point.

> Encrypt-then-authenticate is a standard approach that is generically
> secure

Same problem as above... no signatures at this point.

> Writing your own encryption software, or your own packet encryption
> code, is normally something it is better to avoid, where possible.

Agreed. Unfortunately some networks - such as Mobitex - don't have the
infrastructure in place. I'm in that situation similar to 'those who
don't use TCP are bound to reinvent it with UDP'.

Jeff

Jeffrey Walton

unread,
Aug 9, 2008, 4:17:12 PM8/9/08
to
Hi Karl,

Are you claiming that neither construction is not adequate? When you
used the word 'disjoint', I am taking it to mean a situation similar
to a CBC/MAC combination where we use the the same key/iv for both
encryption and the MAC. In this case, the encrypted MAC is independent
of both plaintext and ciphertext [1].

>
> karl m-

Jeff

[1] HAC, Example 9.88, p. 367.

David Wagner

unread,
Aug 9, 2008, 5:51:01 PM8/9/08
to
Jeffrey Walton wrote:
>> It looks to me like this might be a (very) rare case where
>> HAC is just wrong. I wouldn't recommend using E_k( m || h(m) );
>> it is insecure against chosen-plaintext attacks.
>
>I'll have to revisit the section. Perhaps I took something out of
>context.

You know, that was my first reaction too: it's so rare for HAC
to goof that I wondered if I had missed something.

But I re-read the section, and if you took something out of context,
I can't see what it would be. Strange as it is, in this case it looks
to me like maybe HAC just missed something here. I guess you just got
lucky -- with that kind of luck, go buy a lottery ticket, or something.

Or maybe I'm missing some context, too. That's probably more likely
than an error in HAC...

>I wanted a MAC, but I'm not allowed to change the protocol at this
>point.

Uh oh. That sounds ominous. Can you elaborate why not? Perhaps
there are ways to deal with whatever restrictions you're under.

(If this is some silly bureaucratic proclamation from on high that
"must not be questioned", you have my sincerest sympathy.)

In any case, if you share with us the exact constraints you are under,
perhaps we can help you select a mode of operation that gets the best
security possible, given those constraints.

>> Encrypt-then-authenticate is a standard approach that is generically
>> secure
>Same problem as above... no signatures at this point.

FYI, in this context encrypt-then-authenticate refers to encrypting
then MACing the ciphertext (including the IV and everything else needed
for decryption) -- no signatures involved.

karl malbrain

unread,
Aug 9, 2008, 7:08:22 PM8/9/08
to
> [1] HAC, Example 9.88, p. 367.- Hide quoted text -
>
> - Show quoted text -

No, I'm just looking at the transmissison and encryption level. After
the hash is created from the message you just have to encrypt and get
both to the receiver and the order of transmission doesn't matter.
The hash can be the residue from CBC with an IV, or SHA-256, or
whatever. Of course if you use CBC to generate the hash and turn
around and reuse CBC for the encryption, you're going to have to
reprocess the incoming text in the same order you created the hash.

karl m

biject

unread,
Aug 9, 2008, 10:10:13 PM8/9/08
to

There is a way out.
I see bad that either approach has weaknesses. However this if you
first do a UNBWTS of (m || h(m))
then do your Encryption pass it would be much stronger than either
doing it before or after or seperately. My on choice would be to treat
message as bits for the UNBWTS but bytes would still work.

David A. Scott
--
My Crypto code
http://bijective.dogma.net/crypto/scott19u.zip
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptograhic
system is only as strong as its weakest link"

Kristian Gjųsteen

unread,
Aug 10, 2008, 4:30:42 AM8/10/08
to
David Wagner <daw-...@taverner.cs.berkeley.edu> wrote:
>Jeffrey Walton wrote:
>>> It looks to me like this might be a (very) rare case where
>>> HAC is just wrong. I wouldn't recommend using E_k( m || h(m) );
>>> it is insecure against chosen-plaintext attacks.
>>
>>I'll have to revisit the section. Perhaps I took something out of
>>context.
>
>You know, that was my first reaction too: it's so rare for HAC
>to goof that I wondered if I had missed something.
>
>But I re-read the section, and if you took something out of context,
>I can't see what it would be. Strange as it is, in this case it looks
>to me like maybe HAC just missed something here. I guess you just got
>lucky -- with that kind of luck, go buy a lottery ticket, or something.
>
>Or maybe I'm missing some context, too. That's probably more likely
>than an error in HAC...

If we want to defend the HAC, my impression is that this is just one
example technique given with the understanding that may be useful. They
list at least one case where it fails: stream ciphers under known
plaintext attacks. If care is taken when formatting the message, it
might be reasonably secure using CBC mode. (I don't know, though.)

But the section is anyway unfortunate. I don't think it reflects today's
state of the art. While HAC is extremely valuable, it is eleven years
old. Some things have moved on.

--
Kristian Gjųsteen

Phil Carmody

unread,
Aug 10, 2008, 5:39:34 AM8/10/08
to
d...@taverner.cs.berkeley.edu (David Wagner) writes:
> Jeffrey Walton wrote:
>>Given a message m, are the two below equivalent (prepending a hash
>>versus appending a hash):
>>
>>E_k( h(m) || m ) and E_k( m || h(m) )
>>
>>Handbook of Applied Cryptography tells me to use E_k( m || h(m) ) to
>>achieve confidentiality and integrity [1].
>
> It looks to me like this might be a (very) rare case where
> HAC is just wrong. I wouldn't recommend using E_k( m || h(m) );
> it is insecure against chosen-plaintext attacks.
>
> Attack: I choose two messages x,y arbitrarily, let M = x || h(x) || y,
> and trick the sender into sending M. He will send the ciphertext
> C = E_k( x || h(x) || y || h(M) )
> Then I truncate the ciphertext C right after the point where the last
> bit of h(x) is encrypted, to get a new ciphertext c. In many encryption
> schemes, the truncation of a ciphertext decrypts to a corresponding
> truncation of the original message.

Truncation can surely be countered by including the length, e.g.

E_k( len(x) || H(x) || x )

?

Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration

Jeffrey Walton

unread,
Aug 10, 2008, 4:14:19 PM8/10/08
to
Hi Kristian,

On Aug 10, 4:30 am, Kristian Gjøsteen <kristiag+n...@math.ntnu.no>
wrote:

Personally, I'd like to see a revised printing. Especially a good
evaluation of some of the recently proposed block cipher modes.

> --
> Kristian Gjøsteen-

Jeffrey Walton

unread,
Aug 10, 2008, 4:18:06 PM8/10/08
to
Hi Phil,

On Aug 10, 5:39 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:


> d...@taverner.cs.berkeley.edu (David Wagner) writes:
> > Jeffrey Walton  wrote:
> >>Given a message m, are the two below equivalent (prepending a hash
> >>versus appending a hash):
>
> >>E_k( h(m) || m ) and E_k( m || h(m) )
>
> >>Handbook of Applied Cryptography tells me to use E_k( m || h(m) ) to
> >>achieve confidentiality and integrity [1].
>
> > It looks to me like this might be a (very) rare case where
> > HAC is just wrong.  I wouldn't recommend using E_k( m || h(m) );
> > it is insecure against chosen-plaintext attacks.
>
> > Attack: I choose two messages x,y arbitrarily, let M = x || h(x) || y,
> > and trick the sender into sending M.  He will send the ciphertext
> >   C = E_k( x || h(x) || y || h(M) )
> > Then I truncate the ciphertext C right after the point where the last
> > bit of h(x) is encrypted, to get a new ciphertext c.  In many encryption
> > schemes, the truncation of a ciphertext decrypts to a corresponding
> > truncation of the original message.
>
> Truncation can surely be countered by including the length, e.g.
>
>   E_k( len(x) || H(x) || x )

I believe this may work, but in the end I liked Wagner's
recommendation of E_k( h(m) || m ). A few
reasons:
* len(x) requires additional research to evaluate the validity of the
construction
* len(x) is an additional step, which increases surface attack area
* len(x) is probably extraneous due to h(x) (if the construction
E_k( h(m) || m ) is sound)
* h(x) || x subverts the truncation
* h(x) || x provides additional mixing of the plaintext in CBC mode
* h(x) || x is easier to program in practice

>
> Phil
> --

Jeffrey Walton

unread,
Aug 10, 2008, 4:27:06 PM8/10/08
to
Hi David,

On Aug 9, 10:10 pm, biject <biject.b...@gmail.com> wrote:
>
> [SNIP]


>
>  There is a way out.
> I see bad that either approach has weaknesses. However this if you
> first do a UNBWTS of (m || h(m))
> then do your Encryption pass it would be much stronger than either
> doing it before or after or seperately. My on choice would be to treat
> message as bits for the UNBWTS but bytes would still work.

I haven't had any success with a CBC/MAC change. I don't believe they
will buy into anything that is not FIPS approved.

> David A. Scott
> --

Jeff


David Wagner

unread,
Aug 10, 2008, 7:06:17 PM8/10/08
to
Kristian Gjųsteen wrote:
>If care is taken when formatting the message, it
>might be reasonably secure using CBC mode. (I don't know, though.)

It's not secure against chosen-plaintext attacks, even using CBC
mode with reasonable formatting; I gave the attack in my original
post.

David Wagner

unread,
Aug 10, 2008, 7:08:08 PM8/10/08
to
Jeffrey Walton wrote:
>I haven't had any success with a CBC/MAC change. I don't believe they
>will buy into anything that is not FIPS approved.

CMAC is FIPS approved, and it's essentially a CBC-MAC done right.
It certainly has my recommendation: if it fits your needs, it has
excellent security properties, especially when used with AES.

http://en.wikipedia.org/wiki/CMAC
http://csrc.nist.gov/publications/nistpubs/800-38B/SP_800-38B.pdf

David Wagner

unread,
Aug 10, 2008, 7:14:21 PM8/10/08
to
Jeffrey Walton wrote:
>I believe this may work, but in the end I liked Wagner's
>recommendation of E_k( h(m) || m ).

I didn't recommend that; I recommended using a MAC.

This construction is also susceptible to truncation attacks in
some cases, e.g., if using CBC mode with an explicit IV. The attacker
chooses arbitrary messages x,y of appropriate length, and constructs
the message m = x || h(y) || y. The attacker persuades the legitimate
sender to encrypt and transmit the message m. The attacker transmits
c = E_k( h(m) || x || h(y) || y)
The attacker takes an appropriate suffix of the ciphertext,
corresponding to
c' = E_k(h(y) || y).
If the mode of operation is CBC mode with an explicit IV, and if x is
chosen of the right length, it is possible to choose such a suffix.
Then the attacker can block reception of c and replace it with c', and
the recipient will think that message y was intended, even though the
sender only authorized transmission of message x. That violates
message authenticity.

Whether this is a security risk or not depends upon the formatting
of these messages and the upper layers of the networking stack (e.g.,
how they use the plaintext that pops out of the decryption engine).
Nonetheless, at best it is a layering violation and a certificational
weakness, and at worst it is a security risk. There are alternative
constructions that do not have this shortcoming and that don't seem
to cost any more than this scheme. For this reason, I would normally
not recommend this scheme for general-purpose use. Unless there are
some special circumstances, I'd normally recommend sticking with the
tried-and-true approaches, e.g., Encrypt-then-Authenticate with a MAC
keyed independently from the encryption algorithm.

If there is some reason why the tried-and-true approaches are not
appropriate, then we'll need to see a detailed list of the actual
requirements.

biject

unread,
Aug 10, 2008, 7:28:43 PM8/10/08
to

Well if you stuck with dumb management which is a growing problem
then
just do what they think makes them happy and your better off not
really
caring if it works. Maybe someday you could work for the competition
and
get a chance to do it right.

Jeffrey Walton

unread,
Aug 10, 2008, 9:29:27 PM8/10/08
to
Hi David,

On Aug 10, 7:14 pm, d...@taverner.cs.berkeley.edu (David Wagner)
wrote:


> Jeffrey Walton  wrote:
> >I believe this may work, but in the end I liked Wagner's
> >recommendation of E_k( h(m) || m ).
>
> I didn't recommend that; I recommended using a MAC.

My apologies. I've read a lot of your posts over the years (including
those where you correct others in cases like this). I should have paid
better attention to detail.
>
> [ SNIP ]


>
> If there is some reason why the tried-and-true approaches are not
> appropriate, then we'll need to see a detailed list of the actual
> requirements.

It is not so much that they are not appropriate - it is more that
management is hoping that 'tweaking' an existing implementation with
flaws is less painful than implementing the proper implementation.
That is, it is felt the hash can be added and tested rather quickly.
Adding a MAC\HMAC would require adding more routines that include
additional key management. Unfortunately, adding key management for
MACing is not part of the projects [current] life cycle schedule. They
acknowledge the short coming and want to address it in the future.

The current threat model is not unlike what we would expect on the
internet. Unfortunately, SSL\TLS\IPSEC\VPN is not available on all
platforms where the company is doing business.

On the good side, the current implementations includes AES in CBC mode
and SHA-256 (since the move from MD5 was relatively painless).

In the end, I find it worrisome (at best) that there is derivation
from tried-and-true practices. In the case where the product does not
have SSL\TLS\IPSEC\VPN (sorry about our luck), we should not be
reinventing protocols. We should simply implement current best
practices. I feel a bunch of folks a lot smarter than me have spent a
lot of time thinking about the current best practices.

Jeff

David Wagner

unread,
Aug 10, 2008, 10:24:51 PM8/10/08
to
Jeffrey Walton wrote:
>It is not so much that they are not appropriate - it is more that
>management is hoping that 'tweaking' an existing implementation with
>flaws is less painful than implementing the proper implementation.
>That is, it is felt the hash can be added and tested rather quickly.
>Adding a MAC\HMAC would require adding more routines that include
>additional key management. Unfortunately, adding key management for
>MACing is not part of the projects [current] life cycle schedule. They
>acknowledge the short coming and want to address it in the future.

I'll share some counterarguments, even though I realize that
these are probably not the kinds of issues I can help with. You
can make of them what you will, or ignore them if they don't
address the real issues.

It's not obvious to me why a hash would be that much easier to add
and test than a MAC; I'd guess that either case amounts to make a call
to a cryptographic library. There's no need for any user-visible key
management burden; one can use standard key separation techniques to
derive both the encryption key and the MAC key from the connection key.
e.g., Where you would previously use the connection key K, you now derive
Ke = PRF(K, "encrypt")
Ka = PRF(K, "authenticate")
and encrypt with key Ke, then MAC the result using key Ka. There are
multiple ways to instantiate the PRF (e.g., AES; SHA1-HMAC). As a
result this change looks transparent to me, in the sense that it does not
change the interface to the encryption layer at all, it merely changes
the particular algorithms in use. I realize that this may require more
cryptographic knowledge to implement.

The good news is that the cryptography is rarely the weakest point
in most products, so even if the project fails to use the best known
cryptographic algorithms and practices, that may be far from the greatest
security risk overall.

Kristian Gjųsteen

unread,
Aug 11, 2008, 3:21:40 AM8/11/08
to

I realize this wasn't in any way clear, but by "careful formatting of
the message", I'm thinking about a really restrictive format, such as
a sequence of key=value pairs, where the key comes from a small set of
letter strings and the value is a string of digits.

The assumption would be that it is hard to find a message x such that h(x)
satisfies the formatting requirements. That's not very unreasonable, but
perhaps somewhat unorthodox. I haven't got a security proof, though, so
I won't promise security, but I think your attack fails, at least.

This is all nonsense, of course, using a plain hash is at most slightly
cheaper than using a MAC.

--
Kristian Gjųsteen

David Wagner

unread,
Aug 11, 2008, 12:42:53 PM8/11/08
to
Kristian Gjųsteen wrote:
>I realize this wasn't in any way clear, but by "careful formatting of
>the message", I'm thinking about a really restrictive format, such as
>a sequence of key=value pairs, where the key comes from a small set of
>letter strings and the value is a string of digits.
>
>The assumption would be that it is hard to find a message x such that h(x)
>satisfies the formatting requirements. That's not very unreasonable, but
>perhaps somewhat unorthodox.

OK, got it, now I see what you're after. Makes sense. Thanks.

Reply all
Reply to author
Forward
0 new messages