
Extracting source from binaries à la Burger Becky


anthon...@gmail.com

Jun 28, 2017, 7:45:05 PM
Are there any guides on tools used for extracting source from binaries? For example, I'd love to be able to read the files on the Ultima V main program disk and convert the binaries to actual source code (either assembly or c) so that I can go through it with a fine-tooth comb and see how things are done, perhaps add some logic of my own or fix some issues and then recompile. I know Burger Becky is a master at this. Thoughts?

Michael Pohoreski

Jun 28, 2017, 11:13:55 PM
On Wednesday, June 28, 2017 at 4:45:05 PM UTC-7, anthon...@gmail.com wrote:
> Are there any guides on tools used for extracting source from binaries? For example, I'd love to be able to read the files on the Ultima V main program disk and convert the binaries to actual source code (either assembly or c) so that I can go through it with a fine-tooth comb and see how things are done, perhaps add some logic of my own or fix some issues and then recompile. I know Burger Becky is a master at this. Thoughts?

You are looking for a 6502 disassembler.

Why?

* 95% of games were written in assembly.
* A few were written in Applesoft with machine language supplements (Ultima 1, Empire 1: World Builders, etc)
* A handful were written in Pascal.

On the PC side of things the popular disassemblers are:

IDA
https://www.hex-rays.com/products/ida/

Sourcer -- no longer sold
https://corexor.wordpress.com/2015/12/09/sourcer-and-windows-source/


Richard Garriott was learning 6502 back in Ultima 1, so it was a hybrid written in Applesoft and assembly. By the time Ultima 5 rolled around (March 1988) it was written 100% in assembly. C didn't become popular for games until the 1990s; Marble Madness was an early example of a game written in C.

I'm not sure what Apple 2 disassemblers people are using aside from AppleWin.

I'm in the process of adding 65c02 support to dcc6502.
https://github.com/Michaelangel007/dcc6502

Will be interesting to get a list of popular tools.


Hugh Hood

Jun 28, 2017, 11:18:49 PM
On 6/28/2017 6:45 PM, anthon...@gmail.com wrote:
> Are there any guides on tools used for extracting source from binaries? For example, I'd love to be able to read the files on the Ultima V main program disk and convert the binaries to actual source code (either assembly or c) so that I can go through it with a fine-tooth comb and see how things are done, perhaps add some logic of my own or fix some issues and then recompile. I know Burger Becky is a master at this. Thoughts?
>

Anthony,

Ewen Wannop's 'BrkDown' is an excellent tool that will allow you to
disassemble binaries, to 'arrange' and to classify (strings vs. code vs.
data) segments of the code, and to then generate re-compilable source
code in either Merlin or ORCA/M format.

I've used it for well over a year now to make sense of some quite
complex programs, especially programs like AppleWorks 5.1 and several of
the Beagle/JEM/OPS TimeOut applications.

I consider it to be somewhat of a 'word processor' for binary files,
although that is overly simplistic and inadequate to describe its
usefulness in full detail.

Compared to other disassemblers for the Apple II, its learning curve is
not steep at all. I actually have fun using it.


<http://speccie.uk/software/brkdown/>

As far as a 'guide' for disassembly goes, I'll admit that there is no
silver bullet. You must apply some deductive reasoning and a basic
knowledge of 6502 assembly language.




Hugh Hood

Michael Pohoreski

Jun 29, 2017, 1:02:39 AM
On Wednesday, June 28, 2017 at 8:18:49 PM UTC-7, Hugh Hood wrote:

> As far as a 'guide' for disassembly goes, I'll admit that there is no
> silver bullet. You must apply some deductive reasoning and a basic
> knowledge of 6502 assembly language.

True, there is no silver bullet; however, there have been many, many guides written on "Reverse Engineering".

Before Fravia died in 2009 he wrote quite a few famous articles.
https://en.wikipedia.org/wiki/Fravia

Warning: You can literally spend YEARS reading everything Fravia wrote
http://acrigs.com/FRAVIA/FRAVIA_index.htm

Bringing this back on topic WRT Apple 2 reverse engineering ...

One of the best disassembly tutorials that I've ever seen is:

Tearing Into Machine Language Code
http://www.tinaja.com/ebooks/tearing_rework.pdf

"Old-School" disassembly but they don't teach fundamentals like this anymore. A pure joy to read.

Hugh Hood

Jun 29, 2017, 9:32:42 AM
On 6/29/2017 12:02 AM, Michael Pohoreski wrote:
>
> One of the best disassembly tutorials that I've ever seen is:
>
> Tearing Into Machine Language Code
> http://www.tinaja.com/ebooks/tearing_rework.pdf
>
> "Old-School" disassembly but they don't teach fundamentals like this anymore. A pure joy to read.
>

Agreed. This is Don 'The Guru' Lancaster at his best. I use the
techniques he presents in that short treatise (colored highlighters and
all) in conjunction with my work using Ewen's 'BrkDown'.




Hugh Hood

anthon...@gmail.com

Jun 29, 2017, 10:08:06 AM
Wow, thanks for the links... the "Tearing Into Machine Language Code" is particularly great!

anthon...@gmail.com

Jun 29, 2017, 10:28:28 AM
On Wednesday, June 28, 2017 at 11:18:49 PM UTC-4, Hugh Hood wrote:
I'll check out this tool, seems just what the doctor ordered!

anthon...@gmail.com

Jun 29, 2017, 10:32:12 AM
Hey Michael,

Thanks, I'll look at this as well as Ewen Wannop's 'BrkDown' and see which is the better of the two.

Anthony

Michael Pohoreski

Jun 29, 2017, 11:40:05 AM
I should point out that dcc6502 is just a "dumb" disassembler. It has numerous shortcomings:

* It is not interactive
* It doesn't understand data

As long as you are disassembling pure code it does a great job.
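
To make "dumb" concrete, here is a minimal Python sketch of a linear-sweep disassembler. It is not how dcc6502 is implemented; the opcode table is a tiny hand-picked subset, and the sample bytes are hand-assembled from the speaker routine listed later in this thread rather than dumped from the game.

```python
# Minimal linear-sweep 6502 disassembler covering only a handful of opcodes.
# The table is deliberately tiny; a real tool needs the full opcode table.
OPCODES = {
    0x10: ("BPL", "rel"),
    0x20: ("JSR", "abs"),
    0x60: ("RTS", "imp"),
    0x8D: ("STA", "abs"),
    0xAD: ("LDA", "abs"),
    0xC6: ("DEC", "zp"),
    0xD0: ("BNE", "rel"),
}
SIZES = {"imp": 1, "zp": 2, "rel": 2, "abs": 3}


def disassemble(data: bytes, base: int) -> list[str]:
    """Decode bytes front to back, with no idea which bytes are data."""
    lines = []
    i = 0
    while i < len(data):
        addr = base + i
        name, mode = OPCODES.get(data[i], (None, None))
        if name is None or i + SIZES[mode] > len(data):
            # Unknown opcode or truncated operand: emit the raw byte.
            lines.append(f"{addr:04X}: .byte ${data[i]:02X}")
            i += 1
            continue
        if mode == "imp":
            text = name
        elif mode == "zp":
            text = f"{name} ${data[i + 1]:02X}"
        elif mode == "abs":
            # Operands are stored low byte first (little endian).
            text = f"{name} ${data[i + 1] | (data[i + 2] << 8):04X}"
        else:
            # Relative branch: signed offset from the next instruction.
            offset = data[i + 1] - 256 if data[i + 1] >= 0x80 else data[i + 1]
            text = f"{name} ${(addr + 2 + offset) & 0xFFFF:04X}"
        lines.append(f"{addr:04X}: {text}")
        i += SIZES[mode]
    return lines


# Bytes hand-assembled from the speaker listing; not dumped from the game.
SPEAKER = bytes.fromhex("AD30C0C6C1D0FCC6C0D0F160")
for line in disassemble(SPEAKER, 0x7758):
    print(line)
```

Feed it text instead of code and the weakness shows: `disassemble(b" AB", 0x0800)` decodes the ASCII space ($20) as a JSR, because nothing tells the sweep that those bytes are data.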

At some point AppleWin will have interactive automated tools -- since one of my hobbies is ripping apart old Apple 2 games and I'm not happy with any of the existing tools. There is already support in AppleWin for telling the disassembler what is data, but unfortunately there is no way to export that information.

Using a proper disassembler is probably your best bet at this time.

Michael J. Mahon

Jun 29, 2017, 2:19:02 PM
Two disassemblers that I've used on the Apple II (not including the
built-in monitor disassembler) are Roger Wagner's Sourceror, part of the
Merlin assembler distribution, and Bob Sander-Cederlof's disassembler.

Both of these are easy to find these days.

Note that these are not "tracing" disassemblers, so the user needs to
identify non-code regions, and, of course, provide meaningful identifiers
and comments.

It's worth pointing out that the Merlin disk, with Sourceror, contains a
disassembly script on side 2 that disassembles the Applesoft ROM to provide
a very readable, commented Applesoft listing.

--
-michael - NadaNet 3.1 and AppleCrate II: http://michaeljmahon.com

James Davis

Jun 29, 2017, 8:12:55 PM
Hi All,

I have (1) the "All-Purpose Disassembler" [version 2, written by Me and my friend, Larry Freeman (who wrote version 1), which is for Apple II, II+, IIe, running from DOS 3.1 ~ DOS 3.3, Copyright (c) September 5, 1983] and (2) "My 65c02 Sourceror Set" [ProDOS Disk Based Version 1.00, written by Me, Copyright (c) 1991, which is also for Apple II, II+, IIe, and may work on later models, too] and (3) the one I wrote later for use within AppleWorks written in TimeOut UltraMacros. [Hugh Hood has this last one, so he could probably give out copies of it to anyone who wants it.--If he has the time and inclination to do so.]

As far as I remember, these disassemblers are also dumb disassemblers. It is up to the user to analyze the disassembly texts generated by them. But, analyses can be done in any Apple II text editor or in AppleWorks.

Unfortunately, I currently have no way to make disk-images of any of my Apple II diskettes, so I cannot publish them on Asimov, yet. But, I do have printed source code [for (1) and (2) above, only] that I could scan and publish on Asimov.--If there is enough interest expressed HERE <"https://groups.google.com/forum/#!topic/comp.sys.apple2.programmer/QskjeuZTAmM"> by all of you in the Apple II (comp.sys.apple2) community!

Yours truly,

James Davis

anthon...@gmail.com

Jun 29, 2017, 9:46:59 PM
On Thursday, June 29, 2017 at 1:02:39 AM UTC-4, Michael Pohoreski wrote:
Michael, can you give a very quick overview of how you would go about reverse-engineering a game from AppleWin? I'm looking at the debugger manual but I don't see an option for extracting code from memory and saving to disk.

Michael Pohoreski

Jun 29, 2017, 10:56:30 PM
On Thursday, June 29, 2017 at 6:46:59 PM UTC-7, anthon...@gmail.com wrote:
> Michael, can you give a very quick overview of how you would go about reverse-engineering a game from AppleWin? I'm looking at the debugger manual but I don't see an option for extracting code from memory and saving to disk.

Sure, ask the simple questions. :-) I kid -- I'm more than happy to help out.

From the debugger you can use the BSAVE command. Type: HELP BSAVE to display the built in debugger help.

Let's say I want to save a chunk of memory from $6000..$BFFF -- then I would use this command:

bsave "game.bin",6000:BFFF
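
Once the dump is on disk it can be inspected outside the emulator. A minimal Python sketch, assuming the file written by bsave is a raw memory image with no header (worth checking against the file size) and that you keep track of the start address yourself. The `Dump` class and its method names are made up for illustration.

```python
from pathlib import Path


class Dump:
    """A raw memory dump plus the Apple II address of its first byte."""

    def __init__(self, data: bytes, base: int):
        self.data = data
        self.base = base

    @classmethod
    def load(cls, path: str, base: int) -> "Dump":
        # Assumption: the file holds only the saved bytes, no header.
        return cls(Path(path).read_bytes(), base)

    def peek(self, addr: int) -> int:
        """Return the byte stored at an Apple II address."""
        offset = addr - self.base
        if not 0 <= offset < len(self.data):
            raise IndexError(f"${addr:04X} is outside the dump")
        return self.data[offset]

    def peek16(self, addr: int) -> int:
        """Return the 16-bit word at an address."""
        # 6502 words are little endian: low byte first, then high byte.
        return self.peek(addr) | (self.peek(addr + 1) << 8)
```

For the $6000..$BFFF example this would be `Dump.load("game.bin", 0x6000)`, after which `peek16(0xAA72)` reads a two-byte pointer such as the DOS 3.3 load address the way the 6502 stores it.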


Every game has a "main loop". I prefer to work "bottom-up" to find it, but you can also work "top-down". I often switch back and forth.

There are usually 2 main attack vectors I use for reverse engineering:

1. Search for where the speaker is accessed, since ALL games MUST use $C030 to toggle the squeaker. Now someone _could_ use one of the many 6502 indexed or indirect addressing modes -- such as $C000,X -- but no one ever does that in practice for the speaker. We want to search for the absolute address (in big endian format) 30 C0

For example, let's fire up Montezuma's Revenge. Since I loaded it from DOS 3.3 I know it starts at $13F0, because the DOS3.3 last load address AA72.AA73 has F0 13.

s 1300:bfff 30 c0

Returns these 9 locations:

1:$7759
2:$7775
3:$7790
4:$77AD
5:$77C8
6:$77E9
7:$780C
8:$7827
9:$8B34

Let's examine the code around these results. Type this into the debugger

@1-1

Oh look, what do we find?

7758: LDA $C030
775B: DEC $C1
775D: BNE $775B
775F: DEC $C0
7761: BNE $7754
7763: RTS

I usually set a breakpoint on the RTS to see who called us.

BPX 7763

There is also some more speaker code at 7764, and 7780. Let's trap those too.

BPX 777F
BPX 779A
PRESS <F7> to exit debugger

Starting a new game, via CALL-151, 13F0G, press RETURN to play.

And when the player makes a "jump" sound, our breakpoint at 777F is triggered. Pressing SPACE to process the RTS we see we were called from $7730, which in turn was called from $69AC.

We'll probably want to search $6000 .. $6FFF for the main loop.
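
The byte search in step 1 is easy to reproduce offline on a saved dump. A rough Python sketch follows; the function name is hypothetical, and this only mimics what the debugger's `s` command reports, not how AppleWin implements it. Note that the operand bytes 30 C0 are the address $C030 stored low byte first, i.e. little endian.

```python
def find_word(data: bytes, base: int, addr: int) -> list[int]:
    """Return every address where `addr` appears as a little-endian word."""
    needle = bytes([addr & 0xFF, addr >> 8])
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(needle, pos + 1)
    return hits


# Bytes hand-assembled from the speaker routine at $7758; not a real dump.
sample = bytes.fromhex("AD30C0C6C1D0FCC6C0D0F160")
for hit in find_word(sample, 0x7758, 0xC030):
    print(f"${hit:04X}")
```

Each hit is the address of the operand's low byte, so the instruction itself starts one byte earlier. That is why the disassembly starts at hit minus one.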


2. Find the blitter via setting a breakpoint when the HGR screen is accessed.

BPC
BPM 2000:3FFF
F7

Immediately we stop at $B0F1. Let's put this in context.

U B0EB

We see that this memcpy() loop has been manually unrolled as there are:

* 8x LDA ($9A),Y
* 8x STA ($80),Y

If we inspect what the pointers are we find:

* Source pointer ZP $9A = $F53
* Dest pointer ZP $80 = $21D3

We can even inspect the HGR screen(s) via the debugger commands:

HGR1
HGR2

We see that Montezuma's Revenge is using page flipping as well.
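
A destination pointer such as $21D3 is easier to reason about as a screen position. The hi-res pages are interleaved, so the mapping is not linear. Below is a small Python sketch that inverts the standard Apple II layout; the function name is made up, and it ignores the Y index that the blitter adds to the pointer.

```python
HGR_PAGE1 = 0x2000
HGR_PAGE2 = 0x4000
PAGE_SIZE = 0x2000


def hgr_locate(addr: int) -> tuple[int, int, int]:
    """Map an address in $2000..$5FFF to (page, scanline, byte column)."""
    if not HGR_PAGE1 <= addr < HGR_PAGE2 + PAGE_SIZE:
        raise ValueError(f"${addr:04X} is not in either hi-res page")
    page = 1 if addr < HGR_PAGE2 else 2
    offset = (addr - HGR_PAGE1) % PAGE_SIZE
    line_in_cell = offset // 0x400   # scanline 0..7 inside a text row
    rest = offset % 0x400
    row_in_third = rest // 0x80      # text row 0..7 inside a screen third
    # Each 128-byte group holds 3 rows of 40 bytes plus 8 unused bytes.
    third, column = divmod(rest % 0x80, 40)
    if third > 2:
        raise ValueError(f"${addr:04X} is an unused screen hole")
    scanline = third * 64 + row_in_third * 8 + line_in_cell
    return page, scanline, column


page, y, col = hgr_locate(0x21D3)
print(f"page {page}, scanline {y}, byte column {col}")
```

For $21D3 this gives page 1, scanline 152, byte column 3, so that copy is writing near the bottom-left of the first hi-res page.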

That was a quick primer but feel free to post questions.

Hope that helped.







Michael Pohoreski

Jun 29, 2017, 11:00:24 PM
On Thursday, June 29, 2017 at 7:56:30 PM UTC-7, Michael Pohoreski wrote:
> @1-1

Whoops, that should be:

u @1-1


You can also use:

u @2-1
u @3-1
u @4-1
etc.

anthon...@gmail.com

Jun 29, 2017, 11:01:07 PM
Thanks Michael; you're the man!

Michael Pohoreski

Jun 29, 2017, 11:09:19 PM
On Thursday, June 29, 2017 at 6:46:59 PM UTC-7, anthon...@gmail.com wrote:

> Michael, can you give a very quick overview of how you would go about reverse-engineering a game from AppleWin? I'm looking at the debugger manual but I don't see an option for extracting code from memory and saving to disk.

Another attack vector is


3. searching for the keyboard IO locations.

s 1300:bfff 00 c0

I don't usually search for that because it usually returns too many false positives. 64 Results in our case!


s 1300:bfff 10 c0
OK this only returns 6 results -- much better!

1:$7655
2:$7663
3:$7686
4:$B47F
5:$B4F1
6:$B5E6

And we find the keyboard handling code ...

u @1-1

7654: STA $C010
7657: RTS
:
765D: LDA $C000
7660: BPL $7685
7662: STA $C010
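
One way to cut down the false positives from a search like 00 c0 is to keep only hits whose preceding byte is an opcode that takes an absolute operand. This is my own filter, not an AppleWin feature; the sketch below is Python, the names are hypothetical, the opcode set is a partial list, and the sample bytes are invented for the demo.

```python
# Opcodes that take a 16-bit absolute operand and are common for I/O access.
ABSOLUTE_IO_OPCODES = {
    0x2C: "BIT",
    0x8C: "STY",
    0x8D: "STA",
    0x8E: "STX",
    0xAC: "LDY",
    0xAD: "LDA",
    0xAE: "LDX",
}


def find_io_access(data: bytes, base: int, addr: int) -> list[tuple[int, str]]:
    """Return (instruction address, mnemonic) for likely accesses to `addr`."""
    needle = bytes([addr & 0xFF, addr >> 8])
    hits = []
    pos = data.find(needle, 1)  # start at 1 so data[pos - 1] always exists
    while pos != -1:
        name = ABSOLUTE_IO_OPCODES.get(data[pos - 1])
        if name is not None:
            hits.append((base + pos - 1, name))
        pos = data.find(needle, pos + 1)
    return hits


# Invented bytes: STA $C010, RTS, a stray FF 00 C0 in data, then LDA $C000.
sample = bytes.fromhex("8D10C060" "FF00C0" "AD00C0")
for where, name in find_io_access(sample, 0x7654, 0xC000):
    print(f"${where:04X}: {name} $C000")
```

The filter is only a heuristic. Data bytes can still happen to look like `LDA $C000`, and it misses indexed or indirect accesses, so treat the result as a shortlist to disassemble by hand.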


4. Joystick

Montezuma's Revenge actually uses the ROM routine to read the joystick.

769C: JSR $FB1E ; PREAD

Most games don't -- they just read the IO locations directly.

There are many ways to start disassembling a program. The three main ways are:

* You can "boot-trace" it -- I find this very tedious to do all the init code
* You can inspect it "on-the-fly"
* You can look for known IO memory access locations.

I usually use the latter 2 methods and circle around to the first to see how memory is being setup.

Michael

Michael Pohoreski

Jun 29, 2017, 11:11:13 PM
On Thursday, June 29, 2017 at 8:01:07 PM UTC-7, anthon...@gmail.com wrote:
> search for the absolute address (in big endian format) 30 C0

Whoops, that should be little endian format.

Antoine Vignau

Jun 30, 2017, 3:35:00 AM
I use The Flaming Bird Disassembler by Ferox nearly daily. It is a IIgs text application with macro, firmware, OS, and template support. It disassembles 8- and 16-bit apps.

You can get it at http://www.brutaldeluxe.fr/products/french/tfbd.html

Antoine

Michael Pohoreski

Jun 30, 2017, 12:51:04 PM
Here's the TL;DR summary of disassemblers, sorted alphabetically, from everyone's replies:


All-Purpose Disassembler by James Davis and Larry Freeman
url n/a yet

AppleWin
https://github.com/AppleWin/AppleWin
Press F7 to toggle

Built in ROM
#L
i.e. C600L

BrkDown by Ewen Wannop
http://speccie.uk/software/brkdown/

dcc6502
https://github.com/Michaelangel007/dcc6502
NOTE: 65c02 support is coming soon-ish

My 65c02 Sourceror Set by James Davis
url n/a yet

S-C Disassembler by Bob Sander-Cederlof
http://www.txbobsc.com/scsc/scdisassembler/
ftp://ftp.apple.asimov.net/pub/apple_II/images/programming/assembler/s-c/

Sourceror by Roger Wagner
ftp://ftp.apple.asimov.net/pub/apple_II/images/programming/assembler/Sourceror.DSK

The Flaming Bird Disassembler by Ferox.
http://www.brutaldeluxe.fr/products/french/tfbd.html


Thanks Antoine, James, Hugh, and Michael (Mahon) !

At some point I hope to review these and give feedback on usability and features.

Michael Pohoreski

Jun 30, 2017, 12:59:50 PM
On Thursday, June 29, 2017 at 5:12:55 PM UTC-7, James Davis wrote:
> I have
> "All-Purpose Disassembler"
> "My 65c02 Sourceror Set"
> [Hugh Hood has this last one, so he could probably give out copies of it to anyone who wants it.--If he has the time and inclination to do so.]

Uploading this to Asimov would be the best (long term) option.


> Unfortunately, I currently have no way to make disk-images of any of my Apple II diskettes,

In spite of it being written in Java I highly recommend ADTPro ...
http://adtpro.sourceforge.net/

... and cables ...
http://retrofloppy.com/products.html

... and get the disks preserved before they deteriorate!

If you feel safe about mailing them I'm sure someone in the community would help out. i.e. I'm over in the greater Seattle, Washington area if you need the physical disks converted over to a virtual DSK image so turn around time should be rather quick.


> I do have printed source code [for (1) and (2) above, only] that I could scan and publish on Asimov.-

Another option is that you could also post pictures of the source code (say on imgur) and someone can help out getting it OCR'd, converted into text, and verifying it assembles. i.e. I'd offer to help with this.

It would be a shame to lose another disassembler.

Please post which option, if any, you want to pursue.

Michael

Michael J. Mahon

Jun 30, 2017, 6:28:19 PM
Michael Pohoreski <michael....@gmail.com> wrote:
> On Thursday, June 29, 2017 at 5:12:55 PM:

> Another option is that you could also post pictures of the source code
> (say on imgur) and someone can help out getting it OCR'd, converted into
> text, and verifying it assembles. i.e. I'd offer to help with this.
>
> It would be a shame to lose another disassembler.
>

I've tried to use OCR to recover code from assembler listings several
times, and have been uniformly disappointed.

Even fairly capable OCR packages I've tried have terrible problems with
columnar listings (breaking the text into columns, not lines) and dealing
with the "non-English" identifiers and mnemonics.

Given all the semantic restrictions on the various assembler listing
fields, OCR should be able to do an almost flawless job (with the usual
exception for "ohs" and zeroes), but I haven't found ways to communicate
this to the OCR package. And then there are dot matrix printouts...

As a result, the output is almost useless, essentially requiring retyping
the whole listing!

If you know a solution, perhaps in the form of an OCR package optimized for
more than Word conversions (!), I'd love to hear about it!

Once I capture the listing as a proper text file, I can do anything with
it, from extracting the hex code to recreating the source file.
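
As an illustration of both points (field-level restrictions and pulling the hex code back out), here is a small Python sketch. It assumes a hypothetical listing format of `ADDR: B1 B2 B3  source text`; real assembler listings differ, so the parsing would need adjusting. Inside the address and object-code fields only hex digits are legal, which lets it repair the usual O/0 and l/1 confusions there without touching the source column.

```python
import re
from typing import Optional

# Characters OCR commonly confuses inside fields that can only hold hex digits.
OCR_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})
HEX_BYTE = re.compile(r"[0-9A-Fa-f]{2}")
HEX_ADDR = re.compile(r"[0-9A-Fa-f]{4}")


def parse_listing_line(line: str) -> Optional[tuple[int, bytes]]:
    """Parse 'ADDR: B1 B2 B3  source text' into (address, object bytes)."""
    addr_field, sep, rest = line.partition(":")
    if not sep:
        return None
    addr_text = addr_field.strip().translate(OCR_FIXES)
    if not HEX_ADDR.fullmatch(addr_text):
        return None
    code = bytearray()
    for token in rest.split()[:3]:  # a 6502 instruction is 1 to 3 bytes
        token = token.translate(OCR_FIXES)
        if not HEX_BYTE.fullmatch(token):
            break  # reached the source-text column
        code.append(int(token, 16))
    return int(addr_text, 16), bytes(code)


# The second byte below was "scanned" as 3O (letter O) instead of 30.
print(parse_listing_line("7758: AD 3O C0   LDA $C030"))
```

A two-character label that happens to look like hex right after the object code would fool this, so a real tool would more likely rely on column positions than on whitespace splitting.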

Antoine Vignau

Jun 30, 2017, 6:40:59 PM
I've recently uploaded Disasm 2.2e by Rak-Ware at http://www.brutaldeluxe.fr/public/

av

Michael Pohoreski

Jun 30, 2017, 9:15:25 PM
Back in 1999 I wrote a commercial OCR for handwriting that had over 95% accuracy so I know a little bit about the problem. ;-)

Dot-matrix output should be trivial to recognize since it is far more consistent than handwriting.

But yes, getting the columnar output and all the symbols recognized typically trips up standard OCR.

A custom solution is probably the best way to go. Plus it's a graphics problem so I'm immediately interested. :-)

And of course, with Machine Learning being all the iHipster fad these days, there is always that option to check out.

Any chance you could upload a sample image?

Michael Pohoreski

Jun 30, 2017, 9:16:40 PM
Thanks Antoine !

Michael Pohoreski

Jun 30, 2017, 9:17:11 PM
Sorry, Michael, I thought I was replying to James. :-)

Rest of the info. still stands.

Steve Nickolas

Jul 1, 2017, 3:20:43 AM
I use the old free DOS version of IDA.

-uso.

Michael J. Mahon

Jul 1, 2017, 12:17:20 PM

Hugh Hood

Jul 1, 2017, 1:41:53 PM
in article 362cf617-2dd2-4572...@googlegroups.com, Michael
Pohoreski at michael....@gmail.com wrote on 6/30/17 11:59 AM:

> On Thursday, June 29, 2017 at 5:12:55 PM UTC-7, James Davis wrote:
>> I have
>> "All-Purpose Disassembler"
>> "My 65c02 Sourceror Set"
>> [Hugh Hood has this last one, so he could probably give out copies of it to
>> anyone who wants it.--If he has the time and inclination to do so.]
>
> Uploading this to Asimov would be the best (long term) option.
>

I found (2) versions of the 65C02 Disassembler that James Davis wrote in
(amazingly) UltraMacros for use from within the AppleWorks 5.1 program.

They have been placed (along with some docs) on a disk image and emailed to
Jim for his review. I suspect he will upload either one, or both, or neither
of them to Asimov. ;-)

FWIW, to do this in UltraMacros (even considering the sophisticated version
of UltraMacros that evolved in the mid 1990's - not your father's
UltraMacros) was really thinking outside the box. Clever and sharp - wish I
had the source.





Hugh Hood









James Davis

Jul 1, 2017, 5:20:45 PM
On Saturday, July 1, 2017 at 10:41:53 AM UTC-7, Hugh Hood wrote:
> in article 362cf617-2dd2-4572...@googlegroups.com, Michael
> Pohoreski ... wrote on 6/30/17 11:59 AM:
I just now uploaded the disk-image Hugh sent me as is to Asimov incoming.

So, please don't bother him with requests for it. Just download it when it becomes available. It is called: "JPDAplWorksTODisassemblers.po".

James Davis
