Further reverse engineering ideas



Sep 22, 2020, 9:27:48 AM
to Syndicate Wars Port

We have code that compiles, but it is not easily readable.
This is the main hurdle in further development - not many people can do ASM coding, and even fewer are able to understand code which is an 8MB blob of unlabelled assembly.

My idea for fixing that is to re-create symbols (function names and global variable names), apply these to the assembly, and divide it into files with functions from similar areas - in other words, split it into modules. Later, these modules can be converted to C (or the already existing C code for the Bullfrog libraries can be used).
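As a rough illustration of the "apply symbols to the assembly" step, here is a minimal Python sketch. It assumes the blob uses placeholder labels such as `func_0001` and `data_0010`; that label format, the recovered names, and the `apply_symbols` helper are all hypothetical and not taken from the actual repo.

```python
import re


def apply_symbols(asm_text, symbols):
    """Replace placeholder labels with recovered names (whole words only).

    `symbols` maps a placeholder label to its recovered name. The label
    format used in the example below is an assumption for illustration.
    """
    if not symbols:
        return asm_text
    # Longest labels first, so a short label never shadows a longer one.
    labels = sorted(symbols, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, labels)) + r")\b")
    return pattern.sub(lambda m: symbols[m.group(1)], asm_text)


# Hypothetical fragment of the assembly blob.
asm = (
    "func_0001:\n"
    "    call func_0002\n"
    "    mov eax, data_0010\n"
    "    ret\n"
    "func_00010:\n"
)
renamed = apply_symbols(asm, {
    "func_0001": "game_setup",
    "func_0002": "load_level",
    "data_0010": "game_state",
})
```

The word-boundary match is the important part: `func_00010` stays untouched even though it starts with `func_0001`.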

I already started doing that to the asm blob, but then I realized it isn't always easy.

In KeeperFX, I had symbols from a beta version of the game, which I got from Russians at a street market back in the day (there was a period in Poland when that was considered a usual way of getting software). So I compared that beta version to the final binary, and named symbols in the final one using the beta.

For Syndicate Wars, all the early versions I have (they come from demos which were attached to magazines) also come without symbols. So what can we do?

Well, the beta versions sometimes have additional debug messages and asserts enabled - meaning it is easier to name functions there. So my plan is to focus on a beta version - re-create the symbols in there - and then the way is open to match the symbols to the final version, like I did with KeeperFX.

This is where I am now:

If anyone is interested, feel free to clone that repo and name a few symbols using your favorite RE tool.


Sep 24, 2020, 6:21:21 PM
to Syndicate Wars Port
This project is taking me places...

Today I was RE-ing network code for Genewars Beta.
Then I took a GrIP on old Advanced Gravis Computer Technology drivers, which happen to include an SDK.
Then I was on a hunt for Spaceballs within the sources of Rise of the Triad.
And all that after learning that Miles Sound System is an evolution of the publicly available IBM Audio Interface Library.
Though when one Smacks what was released with Jagged Alliance 2 sources, more recent symbols can be obtained.

I have circa 1800 functions named, out of about 3000 total (not counting globals; I named a lot of those as well, but functions seem a better progress indicator).

Gynvael Coldwind

Sep 27, 2020, 10:32:05 AM
to syndicate...@googlegroups.com
That's very interesting!
How are you matching the functions btw?




Sep 27, 2020, 4:20:55 PM
to Syndicate Wars Port
> How are you matching the functions btw?

The general approach is:
1. Find functions with matching strings. Even if the functions differ a bit, the debug strings inside are often identical. So these are no-brainers; you just need to check whether the size of the function is similar (to notice functions inlined into their caller).
2. Check which functions are called by, and which are callers of, the functions you have already named. Try to match them based on size and on the offset within the function where the `call` instruction is.
3. Check whether the functions before and after a given one match. Functions are linked in the order they had in the source file, so if a file contained multiple functions, you get names for all of them. Some functions are missing in some games due to removal of unused code (though the Syndicate Wars beta I'm using had that optimization disabled, so all functions are there in the exact order they had in the source files; it also has inlining completely disabled. Useful.)
4. While naming functions, skim them and also name the global vars which are accessed within. Then check which functions use each global, and match them based on function size and the offset within the function where the variable is used.
After repeating steps 2-4 multiple times, there are usually only a few functions left unmatched. These need to be matched by looking at the assembly code.
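Steps 1 and 3 above can be sketched in Python roughly as follows. This is only an illustration of the idea, not the tooling actually used: the `Func` record, the 30% size tolerance, and all function names and addresses are made up for the example, and the function lists are assumed to have been exported from an RE tool beforehand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Func:
    addr: int
    size: int
    name: Optional[str] = None
    strings: frozenset = frozenset()  # string literals referenced by the function


def match_by_strings(named, unnamed, size_tolerance=0.3):
    """Step 1: pair functions whose set of referenced strings is identical
    and unique in both binaries, then sanity-check the size."""
    ref_index, cand_index = {}, {}
    for f in named:
        if f.strings:
            ref_index.setdefault(f.strings, []).append(f)
    for f in unnamed:
        if f.strings:
            cand_index.setdefault(f.strings, []).append(f)
    matches = {}
    for strings, refs in ref_index.items():
        cands = cand_index.get(strings, [])
        if len(refs) != 1 or len(cands) != 1:
            continue  # ambiguous or missing: leave for manual review
        ref, cand = refs[0], cands[0]
        # A large size difference hints at inlining; skip those.
        if abs(cand.size - ref.size) <= size_tolerance * ref.size:
            matches[cand.addr] = ref.name
    return matches


def match_by_order(named, unnamed, matches):
    """Step 3: between two matched anchors, if both binaries contain the
    same number of functions, name the ones in between by position."""
    named = sorted(named, key=lambda f: f.addr)
    unnamed = sorted(unnamed, key=lambda f: f.addr)
    index_of = {f.name: i for i, f in enumerate(named)}
    anchors = sorted(
        (index_of[matches[f.addr]], j)
        for j, f in enumerate(unnamed) if f.addr in matches
    )
    result = dict(matches)
    for (i0, j0), (i1, j1) in zip(anchors, anchors[1:]):
        if i1 - i0 == j1 - j0:
            for k in range(1, i1 - i0):
                result[unnamed[j0 + k].addr] = named[i0 + k].name
    return result


# Hypothetical data: three named beta functions, three unnamed final ones.
beta = [
    Func(0x1000, 100, "net_init", frozenset({"net init failed"})),
    Func(0x1064, 40, "net_checksum"),
    Func(0x108C, 80, "net_receive", frozenset({"bad packet"})),
]
final = [
    Func(0x2000, 110, strings=frozenset({"net init failed"})),
    Func(0x206E, 44),
    Func(0x209A, 76, strings=frozenset({"bad packet"})),
]
names = match_by_order(beta, final, match_by_strings(beta, final))
```

Here the two functions with debug strings are matched first, and the string-less one in between gets its name purely from its position. Anything named this way still deserves a manual look.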

Some functions changed a bit between games; in that case I just gather multiple pieces of evidence (callers, callees, globals, code structure) before deciding that these are representations of the same function.
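One way to make "gathering multiple pieces of evidence" concrete is a simple similarity score. The sketch below is an assumption about how this could be done, not what is actually used here: it averages the Jaccard overlap of already-named callers, callees and globals with a size ratio. The dictionary layout and all names are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two name sets; None when both are empty (no evidence)."""
    a, b = set(a), set(b)
    if not a and not b:
        return None
    return len(a & b) / len(a | b)


def evidence_score(ref, cand):
    """Return (mean score in 0..1, number of evidence kinds used).

    `ref` and `cand` are dicts with 'callers', 'callees' and 'globals'
    (sets of already-recovered names) and 'size' (bytes, > 0).
    """
    scores = []
    for key in ("callers", "callees", "globals"):
        s = jaccard(ref[key], cand[key])
        if s is not None:
            scores.append(s)
    # Size ratio: 1.0 for identical sizes, lower as they diverge.
    scores.append(min(ref["size"], cand["size"]) / max(ref["size"], cand["size"]))
    return sum(scores) / len(scores), len(scores)


# Hypothetical example: same caller and global, one callee missing.
ref = {"callers": {"game_setup"}, "callees": {"file_open", "file_read"},
       "globals": {"game_state"}, "size": 200}
cand = {"callers": {"game_setup"}, "callees": {"file_open"},
        "globals": {"game_state"}, "size": 180}
score, kinds = evidence_score(ref, cand)
```

Returning the number of evidence kinds alongside the score matters: a high score built only on the size ratio says very little, and such candidates still need to be compared by reading the assembly.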

If you were hoping for a plugin which matches functions automatically - there is such a plugin for IDA, and I used it in the past. It mostly works. Though in the end I tend to check each function anyway just to be sure, so I'm not using it anymore.
