May update on esp8266 disassembly project


Paul Sokolovsky

May 13, 2017, 8:18:48 AM
to esp82...@googlegroups.com
Hello,

This is a routine update on the progress of the ESP8266 reverse
engineering project, hosted at
https://github.com/pfalcon/xtensa-subjects/tree/master/2.0.0-p20160809

There are continued updates to the main disassembly listing,
https://github.com/pfalcon/xtensa-subjects/blob/master/2.0.0-p20160809/out.lst .
It keeps shrinking, which is a sign that uninteresting raw
bytes continue to turn into more interesting ASCII messages, etc.

The latest feature is a cross-reference of which function accesses which
MMIO address (rendered so far only for BootROM, as usual):
https://pfalcon.github.io/xtensa-subjects/2.0.0-p20160809/mmio.html

This is similar to the cross-reference done by Alex long ago:
http://esp8266-re.foogod.com/wiki/Memory_Accesses_(IoT_RTOS_SDK_0.9.9) .
Except that they're indexed in opposite directions: mmio.html shows which
addresses each function accesses, while Memory_Accesses shows which
functions access each address. They are produced by different tools:
Alex's ad hoc esp8266 hacking tool vs. my ScratchABlock tool, which is a
generic program analysis/decompilation framework that has nothing to
do with esp8266 and is just applied to it on this occasion.
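
To illustrate the two indexing directions, here is a minimal Python
sketch (not code from either tool) that turns a per-function view into a
per-address view. The uart_div_modify entry is taken from mmio.html;
"hypothetical_func" and the helper name invert_index are made up for
illustration only.

```python
from collections import defaultdict

# Per-function view (mmio.html style). "hypothetical_func" is invented
# for this example; it is not a real BootROM function.
by_function = {
    "uart_div_modify": [
        "0x60000014 + (i16)0xf00 * (i16)$a2",
        "0x60000020 + (i16)0xf00 * (i16)$a2",
    ],
    "hypothetical_func": [
        "0x60000014 + (i16)0xf00 * (i16)$a2",
    ],
}


def invert_index(by_function):
    """Build the per-address view (Memory_Accesses style) from the per-function one."""
    by_address = defaultdict(list)
    for func, accesses in by_function.items():
        for access in accesses:
            by_address[access].append(func)
    # Sort function names so the result does not depend on insertion order.
    return {access: sorted(funcs) for access, funcs in by_address.items()}


for access, funcs in invert_index(by_function).items():
    print(access, "<-", funcs)
```

Both views carry the same information; which one is handier depends on
whether you start from a function or from a register.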

To show another difference: ScratchABlock tries to recover MMIO access
expressions, not just addresses. So, the link above shows, for example:

uart_div_modify
['0x60000014 + (i16)0xf00 * (i16)$a2',
'0x60000020 + (i16)0xf00 * (i16)$a2']

Which should give a good hint that there's more than one UART block,
spaced 0xf00 apart, and indeed we know that's true. That's the whole
point: ScratchABlock should be a great helper for forgetful vendors,
who throw their hardware at the unsuspecting children of the Earth,
but forget to release the documentation, together with the proofs that
their hardware doesn't eavesdrop, steal, or kill. Now (well, not
now, it's an ongoing process, started long ago, and not by me) children of
the Earth won't need to bother forgetful vendors with stupid questions,
but will be able to find out the answers themselves. It's a win-win
situation.
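
As a rough sketch of what the recovered expressions imply, the Python
below evaluates them for UART numbers 0 and 1. It assumes that "(i16)"
denotes a signed 16-bit cast and that $a2 holds the UART number; the
helper names i16 and mmio_addr are invented for this example.

```python
def i16(value):
    """Interpret the low 16 bits of value as a signed 16-bit integer (assumed meaning of "(i16)")."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def mmio_addr(base, a2):
    """Evaluate 'base + (i16)0xf00 * (i16)$a2', wrapped to 32 bits."""
    return (base + i16(0xF00) * i16(a2)) & 0xFFFFFFFF


for uart_no in (0, 1):
    print(uart_no, hex(mmio_addr(0x60000014, uart_no)), hex(mmio_addr(0x60000020, uart_no)))
# prints:
# 0 0x60000014 0x60000020
# 1 0x60000f14 0x60000f20
```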


--
Best regards,
Paul mailto:pmi...@gmail.com

Vlad Ivanov

May 24, 2017, 3:35:53 AM
to esp8266-re
Hi Paul,

I've been looking into ESP8266 RE for quite a while too and I decided to try a different approach. Since there are no xtensa decompilers but there are plenty for other architectures, I'm trying to implement a binary translator from xtensa to arm which would produce assembly files which, in turn, can be compiled into elf and passed to one of the existing decompilers. I think it would be best to produce ARM binaries with a small number of functions and not translate the whole thing. xtensa and ARM are quite similar (as both are RISC) and many operations can be translated practically 1:1. I've tried translating by hand and got readable results. Anyway, the repo is almost empty as of now, but I plan to get first results in June https://github.com/resetnow/xtensa2arm

Regards,

Vlad

Vlad Ivanov

May 24, 2017, 3:39:37 AM
to esp8266-re
And regarding your project — impressive work! I think with enough knowledge about ABI and some data flow reconstruction this can get as far as generating C AST and code for some I/O accesses.

Paul Sokolovsky

May 28, 2017, 6:33:17 PM
to Vlad Ivanov, esp82...@googlegroups.com
Hello,

On Wed, 24 May 2017 00:35:53 -0700 (PDT)
Vlad Ivanov <vlad.x.do....@gmail.com> wrote:

> Hi Paul,
>
> I've been looking into ESP8266 RE for quite a while too and I decided
> to try a different approach.

Sure, if you read my mails, you know I'm not surprised at all that
everyone tries different approaches ;-).

> Since there are no xtensa decompilers
> but there are plenty for other architectures,

Well, "plenty" doesn't mean "good". The guys writing those decompilers
also each tried their own approach, with the expected results (usually
a lack of anything useful for anybody else).

> I'm trying to implement
> a binary translator from xtensa to arm which would produce assembly
> files which, in turn, can be compiled into elf and passed to one of
> the existing decompilers. I think it would be best to produce ARM
> binaries with a small number of functions and not translate the whole
> thing.

For some analyses, a non-sparse call graph is required. For example, if
a function is never called, it's fair to assume that its return type is
void. If it's called 10 times, but no results from it are used, it's fair
to make the same assumption. But you never know if an 11th call will
change that assumption.
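
A toy Python sketch of that reasoning (not ScratchABlock code; the
function name and the list-of-booleans input format are assumptions for
illustration): each entry says whether one known call site uses the
returned value.

```python
def guess_return_type(result_used_at_each_call):
    """Guess 'void' unless at least one known call site uses the result.

    The guess is only as good as the call graph: any call site that is
    missing from the list could overturn it.
    """
    return "non-void" if any(result_used_at_each_call) else "void"


print(guess_return_type([]))                      # never called
print(guess_return_type([False] * 10))            # 10 calls, result ignored
print(guess_return_type([False] * 10 + [True]))   # the 11th call uses it
```

This is why translating only a small number of functions can mislead
such an analysis: the call site that matters may be the one left out.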

> xtensa and ARM are quite similar (as both are RISC) and many
> operations can be translated practically 1:1.

Well, there's RISC and RISC. Xtensa and ARM are on opposite sides
of RISC, with Xtensa being pure MIPS-style RISC (thus, MIPS being the
closest arch to Xtensa), while ARM is full of ugly CISCy features like
a flags register. Surely, you (almost) can translate pure, high-level
RISC like Xtensa/MIPS into ARM. Almost, because various issues will pop
up, like Xtensa having 16 true general-purpose regs, while ARM has 15,
with one taken by PC.
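
To make the register issue concrete, here is a deliberately naive Python
sketch. The identity mapping a0..a14 -> r0..r14 and the helper
translate_add are assumptions for illustration, not how xtensa2arm works;
only a three-register "add" is handled.

```python
# Xtensa has a0..a15; on ARM r15 is the PC, so only r0..r14 are available.
XTENSA_REGS = [f"a{i}" for i in range(16)]
ARM_FREE_REGS = [f"r{i}" for i in range(15)]

# zip() stops at the shorter list, so a15 ends up without a partner.
REG_MAP = dict(zip(XTENSA_REGS, ARM_FREE_REGS))
UNMAPPED = [reg for reg in XTENSA_REGS if reg not in REG_MAP]


def translate_add(line):
    """Translate 'add aX, aY, aZ' into 'add rX, rY, rZ' (hypothetical helper)."""
    mnemonic, operands = line.split(None, 1)
    if mnemonic != "add":
        raise ValueError(f"only 'add' is handled in this sketch: {line!r}")
    regs = [op.strip() for op in operands.split(",")]
    for reg in regs:
        if reg not in REG_MAP:
            raise ValueError(f"{reg} has no ARM register left, needs a spill")
    return "add " + ", ".join(REG_MAP[reg] for reg in regs)


print(translate_add("add a2, a3, a4"))  # add r2, r3, r4
print(UNMAPPED)                         # ['a15']
```

The simple cases really are 1:1; it's the leftover register (and the
flags) where a translator has to start making decisions.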

Actually, you can translate Xtensa into nicely readable C code,
that's what I have been doing for a couple of years now:
https://github.com/pfalcon/xtensa-subjects/blob/master/2.0.0-p20160809/out.lst
is the whole ESP8266 codebase translated into such a format.

You can postprocess it a bit further into fully valid C, build for
any arch, pass into any decompiler...

> I've tried translating
> by hand and got readable results. Anyway, the repo is almost empty as
> of now, but I plan to get first results in June
> https://github.com/resetnow/xtensa2arm

Good luck, and keep us posted!

>
> Regards,
>
> Vlad


Paul Sokolovsky

May 28, 2017, 6:34:46 PM
to Vlad Ivanov, esp82...@googlegroups.com
Hello,
The whole idea of a sound dataflow analysis is that knowledge of ABIs
and other minor details isn't important. Well, as mentioned in the
previous mail, that requires a tightly connected call graph. When dealing
with something like a library, where the functions are loosely, shallowly
connected, that doesn't work that well, and requires bringing
in ABIs for heuristic filtering. And that's exactly the situation when
dealing with ESP8266 BootROM.
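
A minimal Python sketch of what such ABI-based filtering could look like
(this is an illustration, not ScratchABlock's actual heuristic). It
assumes the Xtensa CALL0 convention, where arguments are passed in
a2..a7, and additionally assumes parameters occupy a contiguous run
starting at a2; the name filter_params is hypothetical.

```python
# Assumption: CALL0 ABI argument registers, in passing order.
CALL0_ARG_REGS = ["a2", "a3", "a4", "a5", "a6", "a7"]


def filter_params(live_in):
    """Keep only registers that plausibly are parameters.

    live_in is the set of registers a function reads before writing,
    as a dataflow analysis of that function alone would report. It
    typically also contains non-parameters such as the return address,
    the stack pointer, and callee-saved registers being spilled.
    """
    params = []
    for reg in CALL0_ARG_REGS:
        if reg not in live_in:
            break  # assumed: arguments are passed without gaps
        params.append(reg)
    return params


print(filter_params({"a0", "a1", "a2", "a3", "a12"}))  # ['a2', 'a3']
```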

I'm pretty close to getting results of parameters/returns recovery for
BootROM functions (in the context of BootROM itself, i.e.
underestimated). It would be interesting to compare them with the ones
you'll get with your approach.

Vlad Ivanov

May 29, 2017, 3:52:33 AM
to esp8266-re

> MIPS being the closest arch to Xtensa

The only decompiler for MIPS I could find was Retargetable Decompiler, and I have stumbled upon several problems with its MIPS support. I agree it would be better (and probably easier — MIPS has more general-purpose registers) to translate to MIPS.