IDA Free - viable?


Piers

Nov 17, 2010, 6:20:20 PM
to Magic Lantern firmware development
With kind instructions, I've downloaded the IDA 5.7(?) demo, which
runs happily under Wine (I've said it before, I'll say it again, Win32
is the new Java). Loaded in my ROM, converted whatever to an ELF,
loaded the .idc file. It does stuff, I see stuff. But then while I'm
doing the whole "woah. What is this app? What am I looking at?" thing
the demo times out and exits.

Given that I have a LOT of learning to do, and saving would be nice,
is it possible to use indy's database with the older Free IDA (4.x I
think)? Don't want to start the process if I know it's doomed.

PG

Alex

Nov 17, 2010, 6:48:08 PM
to ml-d...@googlegroups.com
No, the old IDA doesn't know ARM at all. The timeout in the demo
version is really nasty, but with IDAPython, it often crashes before
timing out...

Another option is GPL disassembling:
http://chdk.wikia.com/wiki/GPL_Disassembling
http://chdk.wikia.com/wiki/GPL_Tools

It's not user friendly for browsing/annotating the code, but I'm using
this method for my latest scripts and it's orders of magnitude faster
than IDA. I'll post details on the wiki soon.

It shouldn't be difficult to annotate a .dis file with IDC names. See
this for some primitive IDC parsing routines:
https://bitbucket.org/a1ex/magic-lantern/src/tip/idc2stubs.py
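
For illustration, here is a minimal sketch of pulling names out of an IDC file. It assumes the database was dumped as lines of the form MakeName(0XFF067520, "DebugMsg"); the address is made up, the exact statements in a given .idc may differ, and this is not the code from idc2stubs.py. The name parse_idc_names is hypothetical.

```python
import re

# Assumed IDC line shape: MakeName(0XFF067520, "DebugMsg");
_MAKENAME = re.compile(
    r'MakeName(?:Ex)?\s*\(\s*(0[xX][0-9A-Fa-f]+)\s*,\s*"([^"]+)"')

def parse_idc_names(idc_text):
    """Return {address: name} for every MakeName(...) call found in the text."""
    return {int(addr, 16): name for addr, name in _MAKENAME.findall(idc_text)}
```

The resulting dictionary can then be used to replace a branch target in a .dis line with its name.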

> --
> http://magiclantern.wikia.com/
>
> To post to this group, send email to ml-d...@googlegroups.com
> To unsubscribe from this group, send email to ml-devel+u...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/ml-devel?hl=en

Piers

Nov 17, 2010, 8:01:35 PM
to Magic Lantern firmware development


On Nov 18, 10:48 am, Alex <broscutama...@gmail.com> wrote:
> No, the old IDA doesn't know ARM at all. The timeout in the demo
> version is really nasty, but with IDAPython, it often crashes before
> timing out...

Well, that's a relief ;-)

Really want to see what's going on in that sounddev task. Guess I'll
just have to pounce on it, get some screenshots, learn ARM ....

Yes, I've been reading your stubs, look forward to more ...

Alex

Nov 18, 2010, 3:41:18 AM
to ml-d...@googlegroups.com
Here is a script which disassembles a .bin with GPL tools, and
annotates it with names from an IDC database. It is inspired by the
CHDK disassemble.pl, but rewritten from scratch since I don't know
perl. Actually, around 80% of code is reused from another script of
mine (not yet public), and writing this helped me find a few subtle
bugs :)

Its output is not as clean as IDA's, it's a bit more low level, but
it might be useful for browsing the code. Be sure to use an editor
which handles large files (I used Geany).

It's highly recommended to use an IDC file with this script. If you
don't have one, the CHDK disassemble.pl will give a better output.

Sample output:

ff011c6c: e1a03004 mov r3, r4
ff011c70: e28f2f83 add r2, pc, #524 ; *'\x07ASSERT : %s'
ff011c74: e3a01006 mov r1, #6 ; 0x6
ff011c78: e3a0008b mov r0, #139 ; 0x8b
ff011c7c: eb0155da bl @DebugMsg
ff011c80: e59f020c ldr r0, [pc, #524] ; =0xFF011E94 (property)
ff011c84: e3a02004 mov r2, #4 ; =0x4 (GUI_GetMWBCaption)
ff011c88: e28d1008 add r1, r13, #8 ; 0x8
ff011c8c: eb011469 bl @prop_request_change
ff011c90: e8bd80fe pop {r1, r2, r3, r4, r5, r6, r7, r15}

To run it, you have to put both the binary and the IDC in the same
folder, rename the binary so its name includes the load address, and
just run the script (without arguments). Input files are autodetected.
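
As a rough sketch of what "its name includes the load address" could mean in practice, the snippet below extracts an 8-digit hex address from a file name such as 550D_108_05_0xff010000.bin (that name appears later in this thread). The function name and the exact pattern are assumptions for illustration, not taken from disasm.py.

```python
import re

def load_address_from_name(filename):
    """Return the load address embedded in a name like '..._0xff010000.bin'."""
    match = re.search(r"0[xX]([0-9a-fA-F]{8})", filename)
    if match is None:
        raise ValueError("no 0x........ load address in %r" % filename)
    return int(match.group(1), 16)
```

With this convention, a dump saved as ROM0.BIN would be renamed to something like 0xFF010000_ROM0.BIN before running the script.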

Please tell me if you can run the script without issues. Dependencies are here:
http://magiclantern.wikia.com/wiki/IDAPython/Firmware_matching#Using_the_scripts

disasm.py

Piers

Nov 18, 2010, 6:26:24 AM
to Magic Lantern firmware development
Got this:

subprocess.CalledProcessError: Command '['arm-elf-objcopy',
'--change-addresses=0xff010000', '-I', 'binary', '-O', 'elf32-littlearm',
'-B', 'arm', '0xFF010000', 'ROM0.BIN', 'tmp.elf']' returned non-zero exit
status 1

investigating, but I wonder if my arm-elf-objcopy is too new or
something ...

PG

Alex

Nov 18, 2010, 6:34:36 AM
to ml-d...@googlegroups.com
Try to remove spaces from your input file name (put an underscore).

Will fix in next version.
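
The traceback above lists '0xFF010000' and 'ROM0.BIN' as two separate arguments, which suggests the command line was built as one string and split on spaces. A minimal sketch of one way around that is to build the argument list directly, so a file name with spaces stays a single argument. The objcopy options are copied from the traceback; the function names and the default tool name are assumptions, and this is not necessarily how the fixed disasm.py does it.

```python
import subprocess

def objcopy_command(bin_path, load_addr, elf_path="tmp.elf",
                    tool="arm-elf-objcopy"):
    """Build the objcopy argument list; each file name is one list element."""
    return [tool,
            "--change-addresses=0x%08x" % load_addr,
            "-I", "binary", "-O", "elf32-littlearm", "-B", "arm",
            bin_path, elf_path]

def bin_to_elf(bin_path, load_addr, elf_path="tmp.elf"):
    """Run objcopy; raises CalledProcessError if it exits non-zero."""
    subprocess.check_call(objcopy_command(bin_path, load_addr, elf_path))
```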

Alex

Nov 18, 2010, 6:44:36 AM
to ml-d...@googlegroups.com
Fixed (or at least it works for me).

Another bug fixed: end of functions is also marked correctly now.

For best results, set the tab size at 8 when viewing the output.
disasm.py

Antony Newman

Nov 18, 2010, 7:35:36 AM
to ml-d...@googlegroups.com
Piers,
 
As assemblers go ... ASM is quite nice (if assemblers can be!).
 
Alex's introduction is a great starting point:
http://magiclantern.wikia.com/wiki/ASM_introduction
 
If you get stuck on a specific piece of ASM - I'm sure that one of us will be able to shed some light on what is going on locally.
 
Regards,
Antony

Piers

Nov 18, 2010, 4:31:15 PM
to Magic Lantern firmware development
Can't believe I put spaces in! (it was the 0-9a-f regexp that made me
do it).

That worked, and I'll try the bug fixed one tonite, thanks!

PG

Alex

Nov 18, 2010, 5:47:22 PM
to Magic Lantern firmware development
Great. Now if you have ideas about how to improve the code appearance,
or improving the ability to browse the code, that would be nice.

HTML with hyperlinks on subroutines? That would be good for browsing.
Can Doxygen be configured to understand this kind of file, to generate
a HTML?

Piers

Nov 18, 2010, 9:25:21 PM
to Magic Lantern firmware development


On Nov 19, 9:47 am, Alex <broscutama...@gmail.com> wrote:
> HTML with hyperlinks on subroutines? That would be good for browsing.
> Can Doxygen be configured to understand this kind of file, to generate
> a HTML?

That's exactly what I was thinking, with a little menu or ToC of the
subs at the top (I'm very used to XCode's popup menu of subs at the
top). Def needs syntax highlighting, we're so spoilt for that nowadays
I can barely function if the text's only one colour.

Doxygen I don't know well, but I know there's a multitude of HTML
templating engines out there - though once you tag it up properly it's
more CSS than anything.

PG

Piers

Nov 18, 2010, 10:25:05 PM
to Magic Lantern firmware development
Good resource, I speak a little 6502, 68K and PIC so I should be able
to get somewhere. Of course my patience may be the stumbling block
here ..

PG

Alex

Nov 19, 2010, 2:31:50 AM
to Magic Lantern firmware development
Neither firefox nor chrome can handle a 60M html, so it has to be
split in many files (maybe one for each function?). But we'll lose the
ability to search in the entire firmware, like in a big text file.

Alex

Nov 23, 2010, 4:45:08 AM
to Magic Lantern firmware development
Here's a preview of the HTML dump of the firmware. For now, the script generates the following:
- for each function: disassembly (with hyperlinks to called functions), references, call summary and a code flow graph (like in the attachments)
- list of functions, sorted by name, address, size, calls to them, calls made by them
- list of strings in the firmware
- raw disassembly (there are some areas not marked as function, and those can be seen only there)

If you want to try the new script, just ask. It still has a few glitches, that's why I'm not attaching it right now. It will be a component of this: http://magiclantern.wikia.com/wiki/GPL_Tools/ARM_console

Now I need a nice and lightweight CSS for this, and a navigation menu. Does anyone want to help?

sub_ff229578.htm
sub_ff229578.svg
sub_ff370090.htm
sub_ff370090.svg

Alex

Nov 23, 2010, 4:50:33 AM
to Magic Lantern firmware development
P.S. Save them to disk in the same folder and open them with Firefox.
Chrome does not show the links in the SVG.


Piers

Nov 23, 2010, 6:46:12 PM
to Magic Lantern firmware development
... and safari doesn't know how big the SVG is, so displays it in a
little window with scroll bars.

Happy to help with some CSS - of course first it'll need to have some
classes applied to it.

I'm thinking the code body would work quite well as a table, as in:

<table><tr>
<td class='address'><a href="ff220000.htm">FF229584</a></td>
<td class='data'>e28f0e21</td>
<td class='instr'>add</td>
<td class='params'>r0, pc, #528</td>
<td class='comment'>*'prop_setnumwritemultipartly'</td>
</tr></table>

"params" might need to get styled even further (consts different
colours to regs, etc). Certainly going to make the files bigger!!

Of course, tables let you style columns - which would work perfectly
here - but you can only set background colours and borders, nothing
textual which is what we'd want.

PG

Piers

Nov 24, 2010, 1:12:38 AM
to Magic Lantern firmware development
Cut&Paste this into an html file and see what you think ...


<html>
<head><title>ASM function: sub_FF370090 at 0xff370090 in 550D_108_05_0xff010000.bin</title>
<style>
body, p {
  font-family: sans-serif;
}
pre, table {
  font-family: 'andale-mono', monaco, courier, monospace;
  font-size: 12px;
}
td {
  padding: 2px 8px;
  border-bottom: 1px solid #eee;
}
table {
  border-collapse: collapse;
  margin: 5px;
}
.data {
  background-color: #aaa;
  color: white;
  font-size: 80%;
  border: 1px;
  /*border-radius: 2px;*/
}
.comm {
  color: green;
}
.knum {
  color: #a00;
}
.inst {
  color: darkblue;
  font-weight: bold;
}
.addr {
  background-color: #eee;
  font-size: 80%;
}
</style>
</head>

<body>
<h1>ASM function: sub_FF370090 at 0xff370090 in
550D_108_05_0xff010000.bin</h1>

<embed src='sub_ff370090.svg' align='right' >
// Start of function: <a href="sub_ff370090.htm">sub_FF370090</a>
<pre>
<table>
<tr><th>Address</th><th>Data</th><th></th><th></th></tr>
<tr><td class=addr><a href="ff360000.htm">FF370090</a>:</td><td class=data>e5912008</td><td class=inst>ldr</td><td class=parm>r2, [r1, <span class="knum">#8</span>]</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF370094</a>:</td><td class=data>e3520010</td><td class=inst>cmp</td><td class=parm>r2, <span class="knum">#16</span></td><td class=comm>; 0x10</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF370098</a>:</td><td class=data>0591100c</td><td class=inst>ldreq</td><td class=parm>r1, [r1, <span class="knum">#12</span>]</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF37009C</a>:</td><td class=data>02811f41</td><td class=inst>addeq</td><td class=parm>r1, r1, <span class="knum">#260</span></td><td class=comm>; 0x104</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700A0</a>:</td><td class=data>0a000007</td><td class=inst>beq</td><td class=parm>ff3700c4</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700A4</a>:</td><td class=data>e3520011</td><td class=inst>cmp</td><td class=parm>r2, <span class="knum">#17</span></td><td class=comm>; 0x11</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700A8</a>:</td><td class=data>0591100c</td><td class=inst>ldreq</td><td class=parm>r1, [r1, <span class="knum">#12</span>]</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700AC</a>:</td><td class=data>028110f8</td><td class=inst>addeq</td><td class=parm>r1, r1, <span class="knum">#248</span></td><td class=comm>; 0xf8</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700B0</a>:</td><td class=data>0a000003</td><td class=inst>beq</td><td class=parm>ff3700c4</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700B4</a>:</td><td class=data>e3520027</td><td class=inst>cmp</td><td class=parm>r2, <span class="knum">#39</span></td><td class=comm>; 0x27</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700B8</a>:</td><td class=data>0591100c</td><td class=inst>ldreq</td><td class=parm>r1, [r1, <span class="knum">#12</span>]</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700BC</a>:</td><td class=data>02811e13</td><td class=inst>addeq</td><td class=parm>r1, r1, <span class="knum">#304</span></td><td class=comm>; 0x130</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700C0</a>:</td><td class=data>13a01000</td><td class=inst>movne</td><td class=parm>r1, <span class="knum">#0</span></td><td class=comm>; 0x0</td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700C4</a>:</td><td class=data>e5801000</td><td class=inst>str</td><td class=parm>r1, [r0]</td><td class=comm></td></tr>
<tr><td class=addr><a href="ff360000.htm">FF3700C8</a>:</td><td class=data>e12fff1e</td><td class=inst>bx</td><td class=parm>r14</td><td class=comm></td></tr>
<tr><td class=addr></td><td class=data></td><td class=inst></td><td class=parm></td><td class=comm></td></tr>
</table>
// End of function: sub_FF370090
</pre>
<h2>References:</h2>
<pre>
<a href="sub_ff370090.htm">sub_FF370090+16</a>: 0xff3700c4: pointer to 0xe5801000
<a href="sub_ff370090.htm">sub_FF370090+32</a>: 0xff3700c4: pointer to 0xe5801000
</pre>
<h2>Calls:</h2>
<pre>
</pre>
<h2>Called by:</h2>
<pre>
<a href="ff360000.htm">FF370170</a>: ebffffc6 bl <a href="sub_ff370090.htm">sub_FF370090</a>
<a href="ff360000.htm">FF370280</a>: ebffff82 bl <a href="sub_ff370090.htm">sub_FF370090</a>

</pre>
</body>
</html>

Alex

Nov 24, 2010, 3:40:32 AM
to Magic Lantern firmware development
Nice!

I didn't like tables at first, but you got a really nice style for the
disasm body.

For a navigation bar, I'm thinking of using the following items:
- Go to: [search box] (can be done in javascript: you paste an address
or a name, and it should identify the right HTML)
- Functions: by name / address / size / calls to / calls from / source
file
- Strings
- Data (names other than functions)
- State machines (see http://a1ex.bitbucket.org/ML/states/index.html )
- Tasks / semaphores / message queues / properties / eventprocs (see
http://magiclantern.wikia.com/wiki/IDAPython/Tracing_calls )
- space for others :)

Do you think it's better to have the menu at the left or at the top?
Since there are many items, it may be good at the left, but it may eat a
bit of horizontal space. I've tried to put small code diagrams to the
right of the disassembly.

For a left menu, I'd like something like this:
http://alexdu.github.com/sketch-lib/

but with more horizontal space.

Can you (or anyone good at javascript) try to make a box into which
one should type a number or a name, and should be redirected to a page
(chosen from a huge list)?

Here's the code that generates the links. I've used links to an
arbitrary address (in this case it goes to the raw dump), to a
function name (it goes to the detailed page of that function, with
disasm/graph/call/refs), and to a function offset (this is the most
general: if it finds the function, it goes there, to the desired ASM
line, and if not, it goes to that address in the raw dump).

Can that be done in javascript without slowing down the browser? There
are about 18000 functions in the firmware. The index of functions
already feels slow in HTML.

granul = 0x2000  # address range covered by each raw-dump page

# Fun, D, which_func and funcoff come from the rest of the script (not shown).

def addrfile(x):
    """
    >>> addrfile(0x12345678)
    '12344000.htm'
    """
    return '%08x.htm' % ((x // granul) * granul)

def funcfile(fun):
    """
    >>> funcfile(Fun(D[0],0X8017F4))
    'sub_008017f4.htm'
    """
    return 'sub_%08x.htm' % fun.addr

def link2addr(x):
    """
    >>> link2addr(0x123456)
    '<a href="00122000.htm#A123456">123456</a>'
    """
    return '<a href="%s#A%X">%X</a>' % (addrfile(x), x, x)

def link2func(fun):
    """
    >>> link2func(Fun(D[0],0X8017F4))
    '<a href="sub_008017f4.htm">sub_8017F4</a>'
    """
    return '<a href="%s">%s</a>' % (funcfile(fun), fun.name)

def link2funcoff(dump, addr):
    """
    >>> link2funcoff(D[0], 0X8017F8)
    '<a href="sub_008017f4.htm#A8017F8">sub_8017F4+4</a>'
    >>> link2funcoff(D[0], 0X123456)
    '<a href="00122000.htm#A123456">ROMBASE+0x99456</a>'
    """
    try:
        # Address inside a known function: link to that function's page.
        return '<a href="%s#A%X">%s</a>' % (
            funcfile(dump.Fun(which_func(dump, addr))), addr, funcoff(dump, addr))
    except Exception:
        # No enclosing function found: fall back to the raw dump page.
        return '<a href="%s#A%X">%s</a>' % (
            addrfile(addr), addr, funcoff(dump, addr))
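
On the speed question, the lookup behind a "Go to" box does not have to scan all 18000 entries. The sketch below shows the idea in Python (the same logic would carry over to JavaScript): a dictionary handles names, and a binary search over sorted start addresses handles addresses. The FUNCS table, its (start, end, name) layout and the function name resolve are hypothetical; only the file-naming scheme mirrors addrfile() and funcfile() above.

```python
import bisect

GRANUL = 0x2000  # same page granularity as addrfile() in the post

# Hypothetical function table: (start, end_exclusive, name), sorted by start.
FUNCS = [
    (0x008017F4, 0x00801810, "sub_8017F4"),
    (0xFF370090, 0xFF3700CC, "sub_FF370090"),
]
STARTS = [f[0] for f in FUNCS]
BY_NAME = {f[2].lower(): f for f in FUNCS}

def resolve(query):
    """Map a typed function name or hex address to 'page' or 'page#anchor'."""
    q = query.strip().lower()
    if q in BY_NAME:                       # exact name: O(1) dictionary hit
        return "sub_%08x.htm" % BY_NAME[q][0]
    addr = int(q, 16)                      # accepts 'ff3700c4' or '0xff3700c4'
    i = bisect.bisect_right(STARTS, addr) - 1   # last function starting <= addr
    if i >= 0 and FUNCS[i][0] <= addr < FUNCS[i][1]:
        return "sub_%08x.htm#A%X" % (FUNCS[i][0], addr)
    # Not inside any known function: fall back to the raw dump page.
    return "%08x.htm#A%X" % ((addr // GRANUL) * GRANUL, addr)
```

A query that is neither a known name nor a hex number raises ValueError, which a real search box would turn into a "not found" message.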

Piers

Nov 24, 2010, 9:02:44 PM
to Magic Lantern firmware development
Going to have to contemplate this for a few days - there will
certainly be some fairly large challenges in doing this with just
client side JS, but there's a huge wealth of JS tools and libs out
there.

Also need to find a python templating engine - I'm used to Ruby which
has a few, but I'm sure there's something good for python. They just
make it a bit easier than 'print "<tag>" + myvar + "</tag>" + ...' -
you cut out a lot of quotes and plusses (and quite a few prints as
well).

Are you currently doing any actual parsing of the gnu disasm output? I
can give you the regexps I'm using to get everything in the right
place (so far only two and it's not rocket science but if you want, I
can give)

PG

Alex

Nov 25, 2010, 2:15:02 AM
to Magic Lantern firmware development
Yes, I'm parsing the disasm output in order to do symbolic code
emulation.
http://magiclantern.wikia.com/wiki/IDAPython/Symbolic_emulation
(I've developed this on IDAPython and the port to GPL tools is almost
done)

So I split each instruction into address, raw value, mnemonics,
condition/mode/whatever flags, operands, extract each operand, and if
the operand is complex, parse each term. That's already done (just not
integrated with the HTML output). But if you have better regexps, I
won't refuse them.

Here are mine:

a) split line by tabs => no regex needed (line structure from gnu
disasm is fixed => easy to parse)

b) split instruction into mnemonics and flags (that's ugly and maybe
slow; I wrote a small script to get all the instructions from
http://www.heyrick.co.uk/assembler/qfinder.html )

c) mode and condition flags:
import re

_rmode = re.compile("(IA|IB|DA|DB|FD|FA|ED|EA)")

_rcond = re.compile("(AL|NV|EQ|NE|VS|VC|MI|PL|CS|CC|HI|LS|GE|LT|GT|LE)")


d) split a string on commas while respecting matching parentheses
(i.e. a,b,(c,d),[e,f] => a ; b ; (c,d) ; [e,f] )
arglist = re.findall(r"(\([^\(\)]+\)|\[[^\[\]]+\]|\{[^\{\}]+\}|[^,]+)", args)

(for some reason, this one does not like spaces in the input string, but
I don't know why)

_rmnem = re.compile(
    "(ABS|ACS|ADC|ADD|ADF|ADRL|ADR|ALIGN|AND|ASL|ASN|"
    "ASR|ATN|BIC|BKPT|BLX|BLE|BLT|BLS|BL|BX|B|CDP|CLZ|CMF|CMN|CMP|CNF|COS|"
    "DVF|EOR|EXP|FABS|FADD|FCMP|FCPY|FCVTDS|FCVTSD|FDIV|FDV|FIX|FLD|FLT|"
    "FMAC|FMDHR|FMDLR|FML|FMRDH|FMRDL|FMRS|FMRX|FMSC|FMSR|FMSTAT|FMUL|FMXR|"
    "FNEG|FNMAC|FNMSC|FNMUL|FRD|FSITO|FSQRT|FST|FSUB|FTOSI|FTOUI|FUITO|LDC|"
    "LDF|LDM|LDR|LFM|LGN|LOG|LSL|LSR|MCRR|MCR|MLA|MNF|MOV|MRC|MRRC|MRS|MSR|"
    "MUF|MUL|MVF|MVN|NEG|NOP|NRM|OPT|ORR|PLD|POL|POP|POW|PUSH|QADD|QDADD|"
    "QDSUB|QSUB|RDF|RFC|RFS|RMF|RND|ROR|RPW|RRX|RSB|RSC|RSF|SBC|SFM|SIN|"
    "SMLAL|SMLAW|SMLA|SMULL|SMULW|SMUL|SQT|STC|STF|STM|STR|SUB|SUF|SWI|SWP|"
    "TAN|TEQ|TST|UMLAL|UMULL|URD|WFC|WFS)")
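
A likely reason the comma-splitting regex in d) dislikes spaces: the last alternative, [^,]+, also matches spaces, so in "a, (c,d)" the piece after the first comma starts at the space and is consumed as " (c" before the parenthesis alternative gets a chance. As a sketch of an alternative that tolerates spaces, a small depth counter can do the split without a regex. The name split_args is made up for this example, and it assumes brackets are balanced.

```python
def split_args(args):
    """Split on top-level commas, keeping (), [] and {} groups intact."""
    parts, current, depth = [], [], 0
    for ch in args:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            # Top-level comma: close the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts
```

For example, split_args("r1, [r1, #12]") keeps the bracketed operand together as the second item.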



Suggestions for Python templating engine? I've never used that. A
quick google search => http://markup.sourceforge.net/

Also, Python has some nice string formatting capabilities:
http://docs.python.org/library/string.html#format-examples

Alex

Nov 25, 2010, 2:26:29 AM
to Magic Lantern firmware development
That huge regex is from b); sorry for out-of-order placement. Consider
this mail as a patch :)

On Nov 25, 9:15 am, Alex <broscutama...@gmail.com> wrote:
> Yes, I'm parsing the disasm output in order to do symbolic code
> emulation. http://magiclantern.wikia.com/wiki/IDAPython/Symbolic_emulation

Piers

Nov 25, 2010, 5:47:40 AM
to Magic Lantern firmware development
Ah,

I wasn't bothering so much about parsing out the mnemonics ... looks
like you have that in hand (but I've only styled them all bold, black,
monospace - so they won't actually stand out).

I just had a split on tabs, then for the params /#[0-9]*/ and
/r[0-9]+/
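
A minimal sketch of how a pattern like the first one could be used to mark up numeric constants in the params column, producing the span class "knum" seen in the earlier HTML sample. The function name is hypothetical, the pattern here requires at least one digit, and registers are left unstyled because no class for them appears in the thread.

```python
import re

_CONST = re.compile(r"#-?[0-9]+")   # immediate values such as #16 or #-4

def mark_up_params(params):
    """Wrap each numeric constant in <span class="knum">...</span>."""
    return _CONST.sub(
        lambda m: '<span class="knum">%s</span>' % m.group(0), params)
```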

Also trying to find a way to get that svg in the same place with
webkit and mozilla ...

PG

Piers Goodhew

Nov 25, 2010, 5:55:40 AM
to ml-d...@googlegroups.com
slightly more styled up ...
sub_ff229578.htm.zip

Alex

Nov 25, 2010, 6:05:18 AM
to ml-d...@googlegroups.com
Good, if you can fix that SVG, it would be nice.

I'm experimenting right now with a simple decompiler. Does not handle
loops, but uses sympy to keep track of the data flow. I'm testing it
by decompiling Magic Lantern itself (your build, codenamed Bastardos).
Here's the latest test:

Assembler: attached.

Decompiler output:
FIO_Open(arg0, 0x1000, arg2, ...) => ret_FIO_Open_8C3BC
if ret_FIO_Open_8C3BC == -1:
    return ret_FIO_Open_8C3BC
FIO_ReadFile(ret_FIO_Open_8C3BC, arg1, arg2, FIO_Open) => ret_FIO_ReadFile_8C3E0
FIO_CloseFile(handle=ret_FIO_Open_8C3BC) => ret_FIO_CloseFile_8C3F4
if arg2 == ret_FIO_ReadFile_8C3E0:
    return arg2
DebugMsg(50, 3, msg='%s: size=%d rc=%d', arg0, arg2, ret_FIO_ReadFile_8C3E0, unk_R4, unk_R5, ...)
return -1
!end

Original source code from ML:
size_t
read_file(
    const char * filename,
    void * buf,
    size_t size
)
{
    FILE * file = FIO_Open( filename, O_RDONLY | O_SYNC );
    if( file == INVALID_PTR )
        return -1;
    unsigned rc = FIO_ReadFile( file, buf, size );
    FIO_CloseFile( file );

    if( rc == size )
        return size;

    DebugMsg( DM_MAGIC, 3, "%s: size=%d rc=%d", filename, size, rc );
    return -1;
}

I will include that in HTML, too (but only for functions without
loops, since I don't know how to handle them).

The slowest steps are analyzing the code paths (due to exponential
complexity) and graph rendering. So, analyzing an entire firmware
(18000 subs) will take about one day (maybe you'll have to run the
script overnight).

Decompiling the magic lantern 550D build (83 auto-identified
functions) takes around 1 minute, so a rough estimate is 1 second
needed per subroutine.

If you skip graph generation and decompiling (both are done from the
code paths), it may take around 1 hour, I guess.
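
As a back-of-envelope check using only the figures quoted above (estimate_hours is a made-up helper): at the rough rate of 1 second per subroutine, 18000 subroutines come to about 5 hours, so the one-day estimate leaves headroom, presumably because firmware functions can be larger and slower to analyze than the ones in the ML build.

```python
def estimate_hours(n_functions, seconds_per_function=1.0):
    """Total analysis time in hours at a fixed per-function cost."""
    return n_functions * seconds_per_function / 3600.0

# 83 functions in about 60 s is roughly 0.7 s each, rounded up to 1 s above.
per_function = 60.0 / 83
whole_firmware = estimate_hours(18000)   # 5.0 hours at 1 s per function
```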


sub_0008c39c.htm
sub_0008c39c.svg

Alex

Nov 25, 2010, 6:11:33 AM
to Magic Lantern firmware development
Does not show well in Firefox (some text is clipped).

On Nov 25, 12:55 pm, Piers Goodhew <pi...@u-h-p.com> wrote:
> slightly more styled up ...

Alex

Nov 25, 2010, 6:17:55 AM
to Magic Lantern firmware development
Just noticed the mouseover effect, and the menu is fixed while the
page is scrolling, done without frames. Cool!

Piers

Nov 25, 2010, 6:47:16 AM
to Magic Lantern firmware development
CSS is your friend ;-)

Where's the text clipped? (or is that part of the scrolling?) Which OS
are you using?

PG

Alex

Nov 25, 2010, 7:13:22 AM
to ml-d...@googlegroups.com
Here. Scroll bar is at the top. I'm using Linux (Ubuntu).

clipped.png

Piers

Nov 25, 2010, 5:11:31 PM
to Magic Lantern firmware development
Doesn't do that in my ubuntu 10 VM ... anyway, I'll move it and try to
find some sort of portable way to make it sit below consistently..

PG

Alex

Nov 26, 2010, 7:57:27 AM
to Magic Lantern firmware development
It could be due to my high-resolution screen. I'll take a look, too, but
I'm not very good at CSS.

Latest test: it guesses the arguments of a function call from a given
address. This will be included in the "Called by" section, for each
function.

In [167]: print bkt.find_func_call(0x8cb7c,6)

fprintf(ret_FIO_CreateFile_8CB20,
        '# Magic Lantern %s (%s)\n# Build on %s by %s\n',
        '0.1.9_bastardos', 'fb42383275a2+ (550d) tip',
        '2010-11-23 10:41:34', 'pie...@Piers-MacBook-Pro-2.local')

Asm code:

8CB54: e59fc120 ldr r12, [pc, #288] ; **'2010-11-23 10:41:34'
8CB58: e58dc000 str r12, [r13]
8CB5C: e59fc11c ldr r12, [pc, #284] ; **'piersg@Piers-MacBook-Pro-2.local'
8CB60: e59f111c ldr r1, [pc, #284] ; 0x8cc84: pointer to '# Magic Lantern %s (%s)\n# Build on %s by %s\n'
8CB64: e59f211c ldr r2, [pc, #284] ; **'0.1.9_bastardos'
8CB68: e59f311c ldr r3, [pc, #284] ; **'fb42383275a2+ (550d) tip'
8CB6C: e59f711c ldr r7, [pc, #284] ; pointer to fprintf
8CB70: e1a00006 mov r0, r6
8CB74: e58dc004 str r12, [r13, #4]
8CB78: e1a0e00f mov r14, r15
8CB7C: e12fff17 bx r7 // <- this is where the call was made

Theory is here: http://magiclantern.wikia.com/wiki/IDAPython/Backtracing

Piers

Nov 27, 2010, 5:45:08 AM
to Magic Lantern firmware development
I think I've found a way (to not chop top line).

So, I'll take another look at that Python templating engine, but the
general way these things work is: you, the Parsing Guy, get everything
into variables (and the lines of code, of course, into an array of
variables). I, the Styling Guy, then create a template which
expects variables with the right names, and you call render on my
template with your vars.

The templates mix HTML and code. A lot like (what I assume) PHP or ASP
pages do.

Here's some pseudocode. Let's say we've got one variable "func_name"
and an array of hashes (python has them, yeah? Key-Value pairs) called
"code"; the template might look like this (assume real code goes in
between <% %> pairs):

<html>
<head>
<title> <%=func_name%></title>
</head>
<body>
<h1><%=func_name%></h1>

<table>
<% for line_o_code in code[] do%>
<tr><td><%=line_o_code{'addr'}%></td><td><%=line_o_code{'opcode&params'}%></td></tr>
<% endofloop%>
</table>
</body>
</html>
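
For comparison, here is a minimal sketch of the same division of labour in Python 3 using only the standard library's string.Template, rather than any particular templating engine. The template text is the styling side, and render() is what the parsing side would call with its variables. The names PAGE, ROW and render, and the dictionary keys addr and instr, are assumptions for this example.

```python
from html import escape
from string import Template

# Styling side: the page skeleton and one table row.
PAGE = Template("""<html>
<head><title>$func_name</title></head>
<body>
<h1>$func_name</h1>
<table>
$rows
</table>
</body>
</html>
""")
ROW = Template("<tr><td>$addr</td><td>$instr</td></tr>")

def render(func_name, code):
    """Parsing side: fill the template from a name and a list of dicts."""
    rows = "\n".join(
        ROW.substitute(addr=escape(line["addr"]), instr=escape(line["instr"]))
        for line in code)
    return PAGE.substitute(func_name=escape(func_name), rows=rows)
```

A real templating engine adds loops and conditionals inside the template itself, which is what removes the row-building code from the Python side.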

Alex

Nov 27, 2010, 5:52:32 AM
to Magic Lantern firmware development
Like this? http://www.cheetahtemplate.org/examples.html

Alex

Nov 27, 2010, 5:56:52 AM
to Magic Lantern firmware development
A big list of templating engines: http://wiki.python.org/moin/Templating

On Nov 27, 12:52 pm, Alex <broscutama...@gmail.com> wrote:
> Like this? http://www.cheetahtemplate.org/examples.html

Piers Goodhew

Nov 27, 2010, 7:06:57 AM
to ml-d...@googlegroups.com
So Cheetah seems just fine.

Here's a quick abstraction of what I've already done into a template. A few quick notes:

  • I've spun the CSS off into a separate file now (now that we might be dealing with >1 page)
  • I think you're supposed to save the file as a .tmpl - not sure but that's what I've done.
  • It expects the following variables:
    • funcname - name of function
    • $base_addr - of the function, not the whole rom
    • $file_name - of the binary
    • $state_machine - filename/path of SVG
    • lines - array of dictionaries, containing:
      • 'address' - of the line (BTW why is this a hyperlink?)(and should we not make them all anchors?)
      • 'data' - raw
      • 'inst' - mnemonic
      • 'params' - rest of instruction **with registers and numeric consts already marked up**
      • comment
    • There are three arrays at the end which *should* be dictionaries and have some more templatery, but for now the template is just them raw (you could put HTML in the vars):
      • references
      • calls
      • callers (for the "Called by" section)
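
To make the expected variables concrete, here is a sketch of the data the parsing side might assemble for one function page, using the names from the list above and values from the sub_FF370090 sample posted earlier in this thread. The helper make_namespace is hypothetical, and how the dictionary is handed to the templating engine is left out since that depends on the engine's API.

```python
def make_namespace(funcname, base_addr, file_name, state_machine, lines,
                   references=(), calls=(), callers=()):
    """Collect the template variables for one function page in a dict."""
    return {
        "funcname": funcname,            # name of the function
        "base_addr": base_addr,          # start of the function, not the ROM
        "file_name": file_name,          # name of the binary
        "state_machine": state_machine,  # filename/path of the SVG
        "lines": list(lines),            # one dict per disassembly line
        "references": list(references),
        "calls": list(calls),
        "callers": list(callers),        # feeds the "Called by" section
    }

example = make_namespace(
    "sub_FF370090", 0xFF370090,
    "550D_108_05_0xff010000.bin", "sub_ff370090.svg",
    [{"address": "FF370090", "data": "e5912008", "inst": "ldr",
      "params": 'r2, [r1, <span class="knum">#8</span>]', "comment": ""}],
    callers=["FF370170", "FF370280"],
)
```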

Hope that's useful

PG
disam.tmpl.html
disasm.css

Piers

Nov 27, 2010, 7:10:08 AM
to Magic Lantern firmware development
Oh, that had attachments, I swear.

Maybe I'll send 'em direct to Alex ...

PG

Alex

Nov 27, 2010, 7:13:49 AM
to ml-d...@googlegroups.com
Attachments arrived fine (twice).

Alex

Nov 27, 2010, 5:46:01 PM
to ml-d...@googlegroups.com
Here's a preview. Attached are the modified template files. Now I kindly ask you to test it and find the bugs :)

http://a1ex.bitbucket.org/disasm/functions-by-addr.htm

I've disassembled the latest ML 550D build. This way, the output is a great way to learn assembler (since you also have the corresponding source code), and the HTMLs do not contain any Canon code.

Links to advanced code analysis do not work yet.

Call summaries (with explicit arguments) are very slow and need heavy optimization, otherwise the script will spend a week analyzing one firmware... unless you have a 100 GHz CPU :)

Found a bug in CSS: links with # (e.g. <a href="file.htm#anchor">) land in the wrong place, because the fixed title covers the first lines (so the target seems to be a few lines further down). Can you make the main div start from just under the title?

Any ideas on how to make a table which can be sorted by clicking the column header? Can it be done from the template?
tmpl.zip

Piers

Nov 27, 2010, 8:10:02 PM
to Magic Lantern firmware development
OK, I'll fool around. Certainly looks quite functional.

Did you see my attachments inline? I'm just using the html google
group and the css and template don't show up. Maybe cos they're text/*?

I think maybe we should separately link the state machines - some of
them are humongous. Or put them below. I was thinking of a quick-jump
menu at the top that would take you to code, FSM, calls, called by,
etc etc. Then we could just make the FSM full-width at the bottom
(click to open in its own window).


On Nov 28, 9:46 am, Alex <broscutama...@gmail.com> wrote:
> Found a bug in CSS: links with # (i.e. <a href="file.htm#anchor">) do not
> work well, because the title covers the first lines (so it seems to go a few
> lines further). Can you make the main div start from just under the title?

Yes, I'll see what can be done ...

> Ideas on how to make a table which can be sorted by clicking the column
> header? Can be done from template?

You obviously can do it with JS, but whether it's usable is another
matter. I guess you could also work around it by having separate html
files and having the header link to them, e.g. for a file "blah.html"
the columns could link to "blah_by_size.html" and "blah_by_addr.html", etc
etc?
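
A rough sketch of that no-JS workaround in Python: write one pre-sorted copy of the table per column, and make each column header a link to the matching copy. The function records, column names and helper names here are all hypothetical; only the "blah_by_size.html" naming pattern comes from the suggestion above.

```python
import os

# Hypothetical function records: (name, address, size in bytes).
FUNCS = [
    ("bmp_load", 0x3000, 120),
    ("task_create", 0x1000, 64),
    ("msleep", 0x2000, 16),
]

# Column name -> index into each record.
SORT_KEYS = {"name": 0, "addr": 1, "size": 2}


def page_name(base, key):
    """File name of the copy of `base` that is sorted by `key`."""
    return "%s_by_%s.html" % (base, key)


def render_table(base, funcs, key):
    """Return the table sorted by `key`; every header links to another sort order."""
    header = "".join(
        '<th><a href="%s">%s</a></th>' % (page_name(base, k), k) for k in SORT_KEYS
    )
    rows = sorted(funcs, key=lambda rec: rec[SORT_KEYS[key]])
    body = "\n".join("<tr><td>%s</td><td>%x</td><td>%d</td></tr>" % rec for rec in rows)
    return "<table>\n<tr>%s</tr>\n%s\n</table>\n" % (header, body)


def write_pages(base, funcs, outdir):
    """Write one pre-sorted page per column and return the paths written."""
    paths = []
    for key in SORT_KEYS:
        path = os.path.join(outdir, page_name(base, key))
        with open(path, "w") as out:
            out.write(render_table(base, funcs, key))
        paths.append(path)
    return paths
```

The cost is one extra file per sortable column, which is cheap for the index pages but multiplies the output size.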

PG

Piers

Nov 28, 2010, 1:23:32 AM
to Magic Lantern firmware development
Try changing these in the css and remove the h1 nbsp from the main
div:

#main {
margin-left: 8em;
padding-top: 15px;
position: relative;
top: 2.2em;
}
#title {
position: fixed;
background-color: white;
width: 100%;
padding: 0px 15px;
z-index: 10;
}

Works with webkit ....

Alex

Nov 28, 2010, 3:36:56 AM
to Magic Lantern firmware development
Does not work with FF, nor with Chrome (bug still there). The title
div seems to cause the problem (if I set display:none, it works).

Steps to reproduce the bug:
1) go to Strings page
2) choose a string, let's say bmp_load.
3) click on its reference (here, 8c4e0). The first ASM line displayed
should mention that string in the comments column.

What happens now: you have to scroll backwards to see the string, both
with and without the fix.

My editor highlights position, top and z-index in a different color
from width, padding and other well-known keywords. They
might not be handled well by all browsers... just a guess.

I can see your attachments in gmail (from the message you sent to the
mailing list). They do not appear here (bug or security feature?).

Alex

Nov 28, 2010, 3:46:49 AM
to Magic Lantern firmware development
Jump menu (something like the wiki contents box) is a great idea. Can
you include it in the template?

About the diagram (it's not an FSM, but these are: http://a1ex.bitbucket.org/ML/states/ ):

I've created them to make it easier to follow the assembler code (it
just shows the jumps). That's why I've placed them on the right, next
to ASM code.

What about enforcing a max width, and a link for "click to
enlarge" (which opens just the diagram in a new tab)? This way, small
diagrams will stay there, and large ones will not fill the page.
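
One possible way to emit that markup from the generator script, as a small Python helper. The function name, the 400px default and the use of an <img> tag for the SVG are assumptions made for the sketch, not what the script currently does.

```python
from html import escape


def diagram_html(svg_path, max_width="400px"):
    """Return markup for a width-capped diagram that opens full size in a new tab."""
    src = escape(svg_path, quote=True)
    return (
        '<a href="%s" target="_blank" title="click to enlarge">'
        '<img src="%s" style="max-width: %s; height: auto;" alt="jump diagram">'
        "</a>" % (src, src, escape(max_width, quote=True))
    )
```

Because max-width only caps the rendered width, small diagrams keep their natural size and only the large ones get scaled down.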


Alex

Nov 28, 2010, 9:22:30 AM
to Magic Lantern firmware development
Updated the tables; much cleaner now:

http://a1ex.bitbucket.org/disasm/functions-by-addr.htm

New templates attached.

Also added this to the CSS:

    th {
        padding: 2px 8px;
    }

and 'DejaVu Sans Mono' as a preference for code font.

tmpl.zip
disasm.css

Alex

Nov 28, 2010, 11:48:11 AM
to Magic Lantern firmware development
First run of the script on the big firmware (fw108, Core 2 Duo 2.5GHz):
- func/name/string indexes: around 1 minute
  * Firefox is very slow when browsing them; Chrome is OK (just a bit
    slow, but usable).
- raw disassembly for entire firmware: 5 minutes, 400 MB
- function analysis and disasm: estimated between 10 hours and 2 days;
  crashed at 0.26%.
- RAM usage: 1 GB.

If anyone wants to play with the script at this stage, please say so and
I'll upload it.

Piers

Nov 28, 2010, 8:58:52 PM
to Magic Lantern firmware development
Grrr, this is enough to drive a guy to Frames! I can get it to work
with a wrapper div, except webkit and mozilla render the scroll bar
*outside* the div.

i.e. if you set the wrapper div to 100% wide, the scroll bars are just
neatly offscreen. Set it to 90-something% and they're in about the
right place for some window widths, but move further inward as you
resize the window thinner. What idiot thought that should be the way
they behave??

So, these are the choices:

* some JS hacking to either roll our own scrollers, or dynamically
adjust scroll position
* leave all layout alone and do some JS hacking to reposition the main
after you click on an anchor
* use an iframe
* Live without header-at-top

None are too bad, I guess - the iframe means two files, JS hackery is
probably well-documented around the place. Header at top, well I guess
we could put everything we want at the side.

Will play some more w/- css and template later. Day job calls.

PG

Alex

Nov 29, 2010, 1:42:29 AM
to Magic Lantern firmware development
Iframes have many usability annoyances. I think we can put the header
in the main div (or menu), even if it might not look that good.

Alex

Nov 29, 2010, 2:52:49 AM
to Magic Lantern firmware development
After fixing some bugs, new estimate is "only" 16 hours after 1%
completed (and still running).
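
An estimate like this can be obtained by linear extrapolation from the fraction completed so far. That the script works this way is an assumption; the sketch below only shows the arithmetic, and the function names are made up.

```python
def estimate_total_seconds(elapsed_seconds, fraction_done):
    """Extrapolate total run time, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds / fraction_done


def estimate_remaining_seconds(elapsed_seconds, fraction_done):
    """Time still to go under the same linear assumption."""
    return estimate_total_seconds(elapsed_seconds, fraction_done) - elapsed_seconds
```

With an invented elapsed time of 576 seconds at 1% done, this gives a total of about 57600 seconds, i.e. about 16 hours. Extrapolating from such a small sample is fragile, since a few unusually large functions early on can skew the figure a lot.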

If I disable symbolic code analysis and diagram, it should finish in
around 10 minutes.

Alex

Nov 29, 2010, 3:53:46 AM
to Magic Lantern firmware development
Just implemented a little optimization: first, only the disassembly is
generated, so the entire firmware will be browseable after 15 minutes
(maybe less after profiling). Then, the advanced analysis can run
slowly, and will overwrite the sub_*** files with more useful info (I
hope).
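
The two-pass idea could look roughly like this in Python. All names, the record layout and the "sub_<name>.htm" file naming are hypothetical, and `slow_analysis` is a placeholder for the expensive symbolic analysis, not the real thing.

```python
import os
from html import escape


def write_page(outdir, name, body):
    """Write (or overwrite) the page for one function and return its path."""
    path = os.path.join(outdir, "sub_%s.htm" % name)
    with open(path, "w") as out:
        out.write(body)
    return path


def fast_disasm(func):
    """Cheap pass: plain disassembly listing only."""
    return "<pre>%s</pre>" % escape("\n".join(func["asm"]))


def slow_analysis(func):
    """Expensive pass (placeholder): the listing plus a call summary."""
    return fast_disasm(func) + "<p>calls: %s</p>" % escape(", ".join(func["calls"]))


def generate(funcs, outdir, analyse=True):
    # Pass 1: every function gets a browsable page quickly.
    for func in funcs:
        write_page(outdir, func["name"], fast_disasm(func))
    if not analyse:
        return
    # Pass 2: overwrite each page once its slow analysis has finished.
    for func in funcs:
        write_page(outdir, func["name"], slow_analysis(func))
```

Since pass 2 overwrites files one at a time, the output directory stays browsable throughout, and an interrupted run still leaves a complete set of pass 1 pages.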

Piers

Nov 29, 2010, 5:28:38 AM
to Magic Lantern firmware development
As I was reading this, I couldn't help noticing that Google groups had
nice scrollbars on bits that were in the right place. Checked the
source. iFrames ...

PG

Piers

Nov 29, 2010, 5:42:29 PM
to Magic Lantern firmware development
I'd like to play with the script - maybe we should actually stick it
on bitbucket (or github, I'm not fussy) and I can modify templates/css
and you can code, and we can version and merge and fork and all that.
(And, not take up quite so much bandwidth here ..)

Will send (slightly) revised css and templates shortly ... with
notes ...

Alex

Nov 29, 2010, 6:27:59 PM
to Magic Lantern firmware development
Good. I prefer git (at least I'm used to gitk), so I'll create a repo
on github.

Piers Goodhew

Nov 29, 2010, 7:20:30 PM
to ml-d...@googlegroups.com
New css and (some) templates

css:

* added (yet another) fixed-width font, "Menlo" (comes with Snow
Leopard, I like it)
* increased the line spacing and margins for <pre> to make the
decompile look better
* toolbar higher up. Played with <a> styles more, made styling
specific to <a> in the toolbar only
* title class gone

Template:

* moved the title back to the top of the Main div - it's redundant as
it's the window title, but why not have it there. Title div gone
* experimentally put a regexp into disasm.tmpl to highlight registers.
I think we just want a parse_params() func eventually.
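
A possible shape for that parse_params() function on the Python side, so the template receives params already marked up. The register list, the "#"-prefixed constant syntax and the CSS class names "reg" and "const" are assumptions for the sketch, not taken from the attached templates.

```python
import re
from html import escape

# One combined pattern, so the markup inserted for one token is never re-scanned.
TOKEN = re.compile(
    r"(?P<reg>\b(?:r1[0-5]|r[0-9]|sp|lr|pc|ip|fp|sl)\b)"
    r"|(?P<num>#-?(?:0x[0-9a-f]+|\d+)\b)",
    re.IGNORECASE,
)


def parse_params(params):
    """Wrap ARM registers and immediate constants in <span> tags for styling."""

    def mark(match):
        css_class = "reg" if match.group("reg") else "const"
        return '<span class="%s">%s</span>' % (css_class, match.group(0))

    # Escape first, so only the spans added here are real HTML.
    return TOKEN.sub(mark, escape(params, quote=False))
```

Doing this in the script rather than with a regexp inside disasm.tmpl keeps the template free of parsing logic and makes the highlighting testable on its own.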

disasm.css
sometempls.zip

Piers

Nov 30, 2010, 1:16:46 AM
to Magic Lantern firmware development
D'oh, I just noticed as I was shutting down that I didn't actually hit
"Save" on my modified main template. So it's unchanged. Just needs the
title div removed and optionally the h1 that was inside moved to main.

PG

Alex

Dec 1, 2010, 5:04:19 PM
to Magic Lantern firmware development
Just announced the script on a separate thread:
http://groups.google.com/group/ml-devel/browse_thread/thread/9f2fa1bfd0161891

Git repo is here: https://github.com/alexdu/ARM-console

Piers: what username do you have on github?

I've tried to update the CSS, but it can be done better. Still don't
like how SVG works. Can you take a look?

Piers

Dec 1, 2010, 7:21:11 PM
to Magic Lantern firmware development
Not really sure if I already have a github acct or not ... :-s (I did
notice an empty ARM repository on github last night, think I caught
you in the process of uploading it)

I try to be piersg on most things ...

Will try to have at the css shortly, but it's getting busier this time
of year ...

PG

On Dec 2, 9:04 am, Alex <broscutama...@gmail.com> wrote:
> Just announced the script on a separate thread: http://groups.google.com/group/ml-devel/browse_thread/thread/9f2fa1bf...

Alex

Dec 2, 2010, 2:28:27 AM
to Magic Lantern firmware development
No need to hurry :)

I've given you write access to the repo and added the missing
templates and CSS. Sorry I didn't notice them yesterday.