
MITM detection in the browser


John Nagle

May 30, 2016, 3:44:25 PM
to dev-secur...@lists.mozilla.org
We need general, automatic MITM detection in HTTP.

It's quite possible. An MITM attack has a basic quality that makes it
detectable - each end is seeing different crypto bits for the same
plaintext. All they have to do is compare notes.

There are out-of-band ways to do this, such as certificate pinning and
certificate repositories. But these haven't achieved much traction.

Doing it in-band is difficult, but possible. An early system, for one of
the Secure Telephone Units (STU), displayed a 2-digit number to the user
at each end, based on the crypto bits. The users were supposed to
compare these numbers by voice, and if they matched, they were probably
not having a MITM attack. An MITM attacker would need to fake the voices
of the participants to break that.

This is the insight that makes MITM detection possible. You can force
the MITM to have to tell a lie to convince the endpoints. More than
that, if you work at it, you can force the MITM to have to tell an
*arbitrarily complex* lie. You can even force the MITM to have to tell a
lie about the future traffic on the connection. That means they have to
take over the entire conversation and fake the other end.

As an example, suppose a server sending a page sends, at the beginning
of the page, a hash value which is based on the contents of the page
about to be sent, and also based on the first 64 bytes of the crypto
bits of the connection. The browser checks this. The MITM attacker now
has a problem. If the attacker didn't know about this, the MITM attack
immediately sounds an alarm at the browser. If the attacker does know
about this, they can compute their own hash. But they haven't seen the
content the hash covers, because the page hasn't been transmitted yet.
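A rough sketch of this commitment (illustrative only; `crypto_bits` stands in for key material the server and browser share via the TLS session, which a MITM terminating two separate sessions would not have):

```python
import hashlib

def preamble_hash(page: bytes, crypto_bits: bytes) -> bytes:
    # Commit to the full page plus the first 64 bytes of the
    # connection's key material, before the page body is sent.
    h = hashlib.sha256()
    h.update(crypto_bits[:64])
    h.update(page)
    return h.digest()

def browser_check(received: bytes, page: bytes, crypto_bits: bytes) -> bool:
    # The browser recomputes the hash once the page has arrived.
    # A MITM sees different crypto bits, so its hash cannot match.
    return preamble_hash(page, crypto_bits) == received
```

The function names are hypothetical; the point is only that the commitment binds page content the MITM has not yet seen to key material the MITM does not share.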

So the attacker either has to buffer up the entire page before they can
send any of it, or fake the page based on some source like a cache.
Buffering up the entire page adds delay. The server can add to that
delay by deliberately stalling for some seconds before sending the last
few bytes of the page. If the MITM attack adds 10 seconds before every
page begins to load, it's obvious what's happening. The browser could
even check this; if the first byte of the page doesn't appear within N
seconds, don't display it.
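The browser-side timing check could be as simple as this sketch (the deadline value is a policy choice, not part of any spec):

```python
import time

FIRST_BYTE_DEADLINE = 5.0  # "N seconds"; a policy choice, not a spec value

def first_byte_within_deadline(read_first_byte) -> bytes:
    # Refuse the page if the first byte took too long to appear,
    # on the theory that a buffering MITM adds conspicuous delay.
    start = time.monotonic()
    byte = read_first_byte()
    if time.monotonic() - start > FIRST_BYTE_DEADLINE:
        raise TimeoutError("first byte too late; possible buffering MITM")
    return byte
```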

Faking the page is a lot of work, especially if it's customized. A
cache won't be enough. Users will notice if they get a generic page
instead of their personal social network page.

This would be a good feature to add to HTTP2, because it has one
persistent connection which, once validated, is good for many pages.
With HTTP2, you could have one validation stream with delays
running in parallel with other streams.

Nobody seems to be doing enough with in-band MITM detection. There's
[1], but that requires "previously established user authentication
credentials." Facebook has a scheme which relies on MITM attackers not
knowing how to MITM Flash content.[2] That's a form of security through
obscurity, but it does detect most attacks at the proxy and hostile WiFi
level.

Should Mozilla be active in this area?

John Nagle

[1] http://www.cc.gatech.edu/~traynor/papers/dacosta-esorics12.pdf
[2]
http://www.scmagazine.com/researchers-detect-ssl-mitm-attacks-method-implemented-by-facebook/article/346994/

Peter Gutmann

May 31, 2016, 3:45:56 AM
to na...@animats.com, dev-secur...@lists.mozilla.org
John Nagle <na...@animats.com> writes:

>As an example, suppose a server sending a page sends, at the beginning of the
>page, a hash value which is based on the contents of the page about to be
>sent, and also based on the first 64 bytes of the crypto bits of the
>connection. The browser checks this. The MITM attacker now has a problem. If
>the attacker didn't know about this, the MITM attack immediately sounds an
>alarm at the browser. If the attacker does know about this, they can compute
>their own hash. But they haven't seen the content the hash covers, because
>the page hasn't been transmitted yet.

That's actually really clever, a web-enabled commitment scheme, but one that
takes advantage of the master secret to avoid having to use ZKPs and other
complications. It's a bit like a flipped version of what some broadcast
protocols like TESLA do (based on, AFAIK, Anderson et al's Guy Fawkes
protocol, "A New Family of Authentication Protocols") where you send the data
and MAC but withhold the key, in your case you send the MAC (with key
implicitly shared) but withhold the data.
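The contrast might be sketched like this (hypothetical helper names; `session_key` stands for the key implicitly shared through the TLS master secret):

```python
import hashlib
import hmac

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

# TESLA / Guy Fawkes style: send (data, MAC) now, reveal the key later.
# The flipped version: the key is already implicitly shared through the
# TLS session, so send the MAC now and reveal the data later.
def send_commitment(session_key: bytes, future_data: bytes) -> bytes:
    return mac(session_key, future_data)

def verify_on_arrival(session_key: bytes, data: bytes,
                      commitment: bytes) -> bool:
    return hmac.compare_digest(mac(session_key, data), commitment)
```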

You should post this to e.g. the cryptography list to see what people think...
maybe we could call it Nagle's Algorithm...

Peter.

Richard Z

May 31, 2016, 8:18:44 AM
to John Nagle, dev-secur...@lists.mozilla.org
On Mon, May 30, 2016 at 12:44:05PM -0700, John Nagle wrote:
> We need general, automatic MITM detection in HTTP.
>
> It's quite possible. An MITM attack has a basic quality that makes it
> detectable - each end is seeing different crypto bits for the same
> plaintext. All they have to do is compare notes.
>
> There are out-of-band ways to do this, such as certificate pinning and
> certificate repositories. But these haven't achieved much traction.
>
> Doing it in-band is difficult, but possible. An early system, for one of the
> Secure Telephone Units (STU), displayed a 2-digit number to the user at each
> end, based on the crypto bits. The users were supposed to compare these
> numbers by voice, and if they matched, they were probably not having a MITM
> attack. An MITM attacker would need to fake the voices of the participants
> to break that.

VoIP/ZRTP does something very similar.
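For comparison, a ZRTP-style short authentication string can be derived from the negotiated key roughly like this (a sketch only; real ZRTP specifies its own SAS derivation in RFC 6189):

```python
import hashlib

def short_auth_string(session_key: bytes) -> str:
    # Derive a short, human-comparable code from the session key,
    # like the STU's 2-digit display or ZRTP's SAS. Both ends compute
    # it and compare verbally; a MITM yields two different keys and,
    # with high probability, two different codes.
    digest = hashlib.sha256(b"SAS|" + session_key).digest()
    return format(int.from_bytes(digest[:2], "big") % 100, "02d")
```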

> This is the insight that makes MITM detection possible. You can force the
> MITM to have to tell a lie to convince the endpoints. More than that, if
> you work at it, you can force the MITM to have to tell an *arbitrarily
> complex* lie. You can even force the MITM to have to tell a lie about the
> future traffic on the connection. That means they have to take over the
> entire conversation and fake the other end.
>
> As an example, suppose a server sending a page sends, at the beginning of
> the page, a hash value which is based on the contents of the page about to
> be sent, and also based on the first 64 bytes of the crypto bits of the
> connection. The browser checks this. The MITM attacker now has a problem.
> If the attacker didn't know about this, the MITM attack immediately sounds
> an alarm at the browser. If the attacker does know about this, they can
> compute their own hash. But they haven't seen the content the hash covers,
> because the page hasn't been transmitted yet.
>
> So the attacker either has to buffer up the entire page before they can send
> any of it, or fake the page based on some source like a cache. Buffering up
> the entire page adds delay. The server can add to that delay by deliberately
> stalling for some seconds before sending the last few bytes of the page. If
> the MITM attack adds 10 seconds before every page begins to load, it's
> obvious what's happening. The browser could even check this; if the first
> byte of the page doesn't appear within N seconds, don't display it.

Clever, although in practice I think many pages don't take 10 s to load, and
thus the caching-delay attack could often still work. Also, many servers
don't have full advance control over what they will send to the user - on-the-fly
generated content is just as unpredictable for them as it is for a potential
attacker. There are on-the-fly delivered ads, tag clouds, social buttons...

But if you did go ahead with that - could it be done so that the hash is
cryptographically signed and can be "saved" together with the page?
This would address another major shortcoming of TLS - that the user
can never prove he received some content from, say, his online bank.


> Nobody seems to be doing enough with in-band MITM detection. There's [1],
> but that requires "previously established user authentication credentials."

seems fair for situations where demands on security are very high.

Imho OpenPGP has some very strong points and I would love to see PGP over http.
With manual out of band verification it is much more secure than relying on
the integrity of the 200+ TLS CAs, while for the lower-risk situations key-ring
signing may be a good alternative. Also, with PGP, even if a MITM attack
were to succeed temporarily, there is a much better chance of proving at any
later time that the attack happened, which may make certain attack scenarios
less interesting.
> _______________________________________________
> dev-security-policy mailing list
> dev-secur...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-security-policy


Richard

--
Name and OpenPGP keys available from pgp key servers

Peter Kurrasch

Jun 1, 2016, 3:18:54 PM
to John Nagle, mozilla-dev-s...@lists.mozilla.org
It's an interesting idea but not without some issues. Essentially you are proposing a mechanism for in-band-data-over-TLS to determine if the end-to-end encryption has been compromised, correct? (I'm deliberately avoiding the term MITM as I think it carries extra baggage that is distracting for now.)

I think having a "preamble code" (for lack of a better term?) as mentioned could be difficult for sites that are heavy on the dynamic content--they'd have to buffer up the final page then hash it anyway. If the server can do it, so could any MITM appliance. I think a "postamble code" is the way to go.

Also, any detection method that relies on timing would have to be a non-starter almost out of necessity. Propagation of data throughout the internet is wacky enough that it would be extremely difficult to get down a timing model that works in all cases.

I also took a look at the ref [1] you provided. An interesting idea as well, though wildly impractical. I can't imagine that any of the top 1000 sites on the internet could even implement such a thing because they tend to have pages that pull in data from far flung corners of the world. I don't think any but the most trivial sites would ever work. I also have a problem with the salted password that is mentioned. I don't think it's as secure as would be wanted.

So the key to making something like this work is to figure out the algorithm for producing "the code" to use. Obviously it has to incorporate knowledge of the TLS data but then what else can be used in a secure manner? The password idea from [1] is clever, but if we realize it won't work what is a good alternative?

I hope you don't feel I'm trying to discourage more thought on this. My intention is only to offer a way to look at this that might help focus additional work and conversation.



John Nagle

Jun 1, 2016, 3:40:04 PM
to Peter Kurrasch, dev-secur...@lists.mozilla.org
On 06/01/2016 12:18 PM, Peter Kurrasch wrote:
> It's an interesting idea but not without some issues. Essentially you
> are proposing a mechanism for in-band-data-over-TLS to determine if the
> end-to-end encryption has been compromised, correct?

I'm suggesting that it's time to look for one. I'm not recommending
what I proposed; that's just a proof of concept.

> I think having a "preamble code" (for lack of a better term?) as
> mentioned could be difficult for sites that are heavy on the dynamic
> content--they'd have to buffer up the final page then hash it anyway. If
> the server can do it, so could any MITM appliance. I think a "postamble
> code" is the way to go.

If you do this entirely as a postamble, an attacker can also do it
without increasing delays. They just copy the page as it is
transmitted, then replace the postamble. The idea is to force the
attacker to hash data it hasn't seen yet, or fake data that it can hash.

> Also, any detection method that relies on timing would have to be a
> non-starter almost out of necessity. Propagation of data throughout the
> internet is wacky enough that it would be extremely difficult to get
> down a timing model that works in all cases.

That's partly why I was thinking HTTP2. It might be possible to
authenticate the HTTP2 connection, which contains multiple streams,
in parallel with other data transfer. A nice browser feature would
be to inhibit input to password fields and cookie responses until
authentication is complete.

> I also took a look at the ref [1] you provided. An interesting idea as
> well, though wildly impractical. I can't imagine that any of the top
> 1000 sites on the internet could even implement such a thing because
> they tend to have pages that pull in data from far flung corners of the
> world. I don't think any but the most trivial sites would ever work. I
> also have a problem with the salted password that is mentioned. I don't
> think it's as secure as would be wanted.

I'm not recommending that; I just wanted to properly cite other
work on the problem.

> So the key to making something like this work is to figure out the
> algorithm for producing "the code" to use. Obviously it has to
> incorporate knowledge of the TLS data but then what else can be used in
> a secure manner? The password idea from [1] is clever, but if we realize
> it won't work what is a good alternative?

There are at least three approaches that can work:

- A prior shared secret. The password-based approach mentioned and
certificate pinning are in this category.

- Some separate data channel that is not compromised in
real time by the same party compromising the main channel.
The Facebook trick with a Flash-based channel is one such.
That's vulnerable once the attacker knows about it.

- Timing constraints that force the attacker to try to predict
future content. That's what I discussed.

What I want to do is to get people thinking about this as a
solvable problem. There's a general impression that MITM attacks are
fundamentally not detectable. But that's not actually the case.
It's hard, but not impossible. The key concept here is that an
MITM attacker can be forced to jump through hoops to make the
attack work, and it may be possible to make those hoops so hard to
jump through that it can't be done.

I'm posting this because CA problems such as the Symantec/Blue Coat
cert are becoming more common. Blocking those when found is like
signature-based virus detection - it protects only against old attacks.
A better technical solution is needed.

John Nagle

Eric Mill

Jun 1, 2016, 6:12:15 PM
to na...@animats.com, dev-secur...@lists.mozilla.org, Peter Kurrasch
On Wed, Jun 1, 2016 at 3:39 PM, John Nagle <na...@animats.com> wrote:

>
> What I want to do is to get people thinking about this as a
> solvable problem. There's a general impression that MITM attacks are
> fundamentally not detectable.


That might be a general impression among the community, but there are
people working on it. There is a "channel binding" standard that generates
a keypair that is bound to a particular TLS session [1], and this made it
into U2F:

https://fidoalliance.org/fido-technotes-channel-binding-and-fido/

Which means that the U2F token can include its "view" (the fingerprint of
the TLS session key seen by the client) of the TLS session in the package
it signs and sends up to the server. If the server's TLS session has a
different fingerprint, then it follows that there's an intermediary session
in the middle.
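The comparison could be sketched as follows (illustrative only; the actual U2F and token-binding wire formats and fingerprint definitions differ):

```python
import hashlib

def tls_fingerprint(session_material: bytes) -> bytes:
    # Fingerprint of the TLS session as seen by one endpoint.
    return hashlib.sha256(session_material).digest()

def server_accepts(client_signed_fp: bytes, server_session: bytes) -> bool:
    # The token signs the client's view of the TLS session. If the
    # server's own session fingerprint differs, there is an
    # intermediary session in the middle.
    return client_signed_fp == tls_fingerprint(server_session)
```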

The "token binding" working group referenced there is where effort is being
directed now:
https://datatracker.ietf.org/wg/tokbind/documents/

Not saying it fully meets your goals, just pointing out that there is
active work on this issue, and that versions of this work have been
deployed in production via U2F.



> I'm posting this because CA problems such as the Symantec/Blue Coat
> cert are becoming more common.


As discussed on another thread on this mailing list[2], the Symantec / Blue
Coat cert wasn't a CA problem. It was disclosed by Symantec, included in
their audits, and Symantec's possession of the key didn't give Blue Coat
powers to generate certificates for domains not under their control.

-- Eric

[1] https://tools.ietf.org/html/rfc5929
[2]
https://groups.google.com/d/msg/mozilla.dev.security.policy/akOzSAMLf_k/Y1D4-RXoAwAJ





--
konklone.com | @konklone <https://twitter.com/konklone>

Jakob Bohm

Jun 17, 2016, 8:17:28 AM
to mozilla-dev-s...@lists.mozilla.org
I'm a bit late to the party, but here is a variant that works with
dynamic pages and is backwards compatible, even without HTTP2 or other
persistent connections:


1. In the HTTP response headers, the server sends a new extra header
whose value is

  Base64(TLS-session-mac-algorithm-mac(
      lcase(url) | Base64(random 258-bit value),
      fixed-key-derivation-function(current-session-binding-token-string
          or master-secret, depending on TLS version)))

This must be sent by the server less than 5 seconds after the applicable
GET request, even if later parts of the HTTP header have not yet been
computed (for example, it can be sent before the Content-Length has been
computed).

2. If the HTTP response uses an unspecified length (chunked encoding),
a trailer "header field" is added that holds Base64(random 258-bit
value) as its value.

3. If the HTTP response uses a specified length (and thus cannot supply
an HTTP trailer), the sequence "MITM_DETECT_MAGIC_STRING" |
Base64(random 258-bit value) | "MITM_DETECT_MAGIC_STRING" must occur
anywhere within the last 1024 bytes of the response (with the last such
pattern being the one that counts). This can be done invisibly to
humans for many file formats, including HTML, XML, PNG, etc.

In most cases the server would also delay transmission of the tail
holding the random value while relying on the rendering engine being
able to complete the entity display without this last bit of data.

The trick here is that the random value cannot be predicted by the
MITM, yet the server can generate it trivially without knowing the
dynamic page elements. Also the HTML compatibility rules make the page
show normally in browsers that don't look for the MITM detection data.
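A sketch of the keyed header and the tail-value extraction (hypothetical helper names; the key-derivation step is deliberately abstract, as it is in the proposal):

```python
import base64
import hashlib
import hmac
import re

MAGIC = b"MITM_DETECT_MAGIC_STRING"

def header_value(url: str, random_value: bytes,
                 session_secret: bytes) -> str:
    # MAC over lcase(url) | Base64(random value), keyed by a value
    # derived from the TLS session (derivation details abstracted).
    key = hashlib.sha256(b"mitm-detect-kdf|" + session_secret).digest()
    msg = url.lower().encode() + b"|" + base64.b64encode(random_value)
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

def tail_random_value(body: bytes) -> bytes:
    # The last MAGIC|Base64(value)|MAGIC pattern inside the final
    # 1024 bytes of the response is the one that counts.
    window = body[-1024:]
    pattern = re.escape(MAGIC) + rb"([A-Za-z0-9+/=]+)" + re.escape(MAGIC)
    matches = re.findall(pattern, window)
    return base64.b64decode(matches[-1]) if matches else b""
```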

Keeping around 33 bytes of random data or a 43-character string during
generation of each page is also a low and affordable server runtime
cost, as is the transmission of less than 150 extra bytes per page (43-char
hash, 43-char random value, two HTTP headers, or 1 HTTP header + 2
magic strings + tags to hide the magic string).

Another trick is that this work can be done in a web server or SSL
accelerator rather than in a page generator such as PHP, which would
generally not be trusted to access the TLS secrets of the session and
might not return their first byte of output during the first 5
seconds. Similarly, the checking can be done in the HTTP(S) client
code which is less exposed to scripting based attacks than the
rendering engine and other DOM parts of the browser.

A final limitation is that the browser needs to know, out-of-band,
whether the absence of MITM detection headers is normal or a sign of an
MITM removing those headers.

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

ene...@develop-project.ru

Jun 23, 2016, 7:20:13 PM
to mozilla-dev-s...@lists.mozilla.org
On Friday, June 17, 2016 at 3:17:28 PM UTC+3, Jakob Bohm wrote:
> The trick here is that the random value cannot be predicted by the
> MITM, yet the server can generate it trivially without knowing the
> dynamic page elements. Also the HTML compatibility rules make the page
> show normally in browsers that don't look for the MITM detection data.

An MITM can always generate its own random tail, and the target user will
never find out.