Right now its main features are:
1. Disassembles your program using my crudasm engine, then
2. Converts from assembly to a high-level language, and
3. Removes redundant flags assignments, and
4. Uses actual exported symbol names when decompiling a DLL; also
supports imported symbol names and PDB debugging symbol names;
5. Recognizes calling conventions and arguments to a procedure
automatically. This was not especially easy (or fun), but it's done
now! Even works when you do indirect calls and the calling convention
is unknown in advance--it deduces it using heuristics. Works even when
you mix calling conventions, e.g. some function calls are caller-cleans-
the-stack and some are callee-cleans-the-stack.
6. Recognizes some but not all switch indirect jumps.
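To illustrate item 5, here is a minimal sketch in C of one common way to tell caller-cleans from callee-cleans calls. It is only an illustration, not the heuristics the decompiler actually uses. The names `classify_call`, `stack_cleanup` and `stack_arg_count` are made up for this example, and the inputs (the immediate of the callee's `ret N` and the amount added to esp right after the call) are assumed to have been extracted already.

```c
#include <assert.h>

enum stack_cleanup {
    CLEANUP_NONE,    /* neither side pops anything (no stack arguments) */
    CLEANUP_CALLER,  /* cdecl style: caller adjusts esp after the call */
    CLEANUP_CALLEE,  /* stdcall style: callee returns with `ret N` */
    CLEANUP_UNKNOWN  /* callee not visible and no adjustment seen */
};

/* ret_pop_bytes:    N from the callee's `ret N`, 0 for a plain `ret`,
 *                   or -1 when the callee is unknown (indirect call).
 * post_call_adjust: bytes added to esp right after the call, 0 if none. */
enum stack_cleanup classify_call(int ret_pop_bytes, int post_call_adjust)
{
    assert(post_call_adjust >= 0);
    if (ret_pop_bytes > 0)
        return CLEANUP_CALLEE;
    if (post_call_adjust > 0)
        return CLEANUP_CALLER;
    if (ret_pop_bytes == 0)
        return CLEANUP_NONE;
    return CLEANUP_UNKNOWN;
}

/* On 32-bit x86 each stack argument slot is 4 bytes wide. */
int stack_arg_count(int cleanup_bytes)
{
    return cleanup_bytes / 4;
}
```

For an indirect call the first input is -1, so the only evidence left is what the caller does to esp afterwards. A real tool would need more than this, since an `add esp, N` after a call is not always argument cleanup.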
Some sample output is shown below (from my own test program, test.exe,
compiled with Visual C++ 2005 Express):
---
void __ValidateImageBase() // fn_4014d0
{
{ loc_4014d0:
ecx = d[ss:esp+0x4];
word tmp0 = sub$word(w[ds:ecx], 0x5a4d);
x86_zf = is_zero$bit(tmp0);
if(z) then goto loc_4014de else goto loc_4014db;
}
{ loc_4014de:
eax = d[ds:ecx+0x3c];
eax = add$dword(eax, ecx);
dword tmp0 = sub$dword(d[ds:eax], 0x4550);
x86_zf = is_zero$bit(tmp0);
if(nz) then goto loc_4014db else goto loc_4014eb;
}
{ loc_4014eb:
ecx = bitxor$dword(ecx, ecx);
word tmp0 = sub$word(w[ds:eax+0x18], 0x10b);
x86_zf = is_zero$bit(tmp0);
cl = zx$byte(_x86_cc$bit(0x4));
eax = ecx;
return (pop 0x0 bytes);
}
{ loc_4014db:
eax = bitxor$dword(eax, eax);
return (pop 0x0 bytes);
}
}
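For comparison, here is a hand-written C reading of the output above. It is a sketch, not something the decompiler produces. The helper names `read_u16`, `read_u32` and `validate_image_base` are made up for this example, and the comments on what each constant means come from the PE file format rather than from the tool. Condition code 0x4 is taken to be the "equal/zero" condition, so the last block returns 1 when the compare matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian values byte by byte, so the sketch does not depend
 * on host endianness or alignment. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if `base` points at a PE32 image, 0 otherwise.
 * Like the decompiled code, it does no bounds checking. */
int validate_image_base(const uint8_t *base)
{
    assert(base != NULL);
    if (read_u16(base) != 0x5a4d)           /* loc_4014d0: "MZ" signature */
        return 0;                           /* loc_4014db */
    const uint8_t *nt = base + read_u32(base + 0x3c);   /* loc_4014de */
    if (read_u32(nt) != 0x4550)             /* "PE\0\0" signature */
        return 0;                           /* loc_4014db */
    return read_u16(nt + 0x18) == 0x10b;    /* loc_4014eb: PE32 magic */
}
```

The three blocks that end in a compare map onto the three tests, and loc_4014db is the shared "return 0" exit.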
Ok.
> Right now its main features are:
> 1. Disassembles your program using my crudasm engine, then
> 2. Converts from assembly to a high-level language, and
> 3. Removes redundant flags assignments, and
> 4. Uses actual exported symbol names when decompiling a DLL; also
> supports imported symbol names and PDB debugging symbol names;
> 5. Recognizes calling conventions and arguments to a procedure
> automatically. This was not especially easy (or fun), but it's done
> now! Even works when you do indirect calls and the calling convention
> is unknown in advance--it deduces it using heuristics. Works even when
> you mix calling conventions, e.g. some function calls are caller-cleans-
> the-stack and some are callee-cleans-the-stack.
> 6. Recognizes some but not all switch indirect jumps.
Great!
> Some sample output is shown below (from my own test program, test.exe,
> compiled with Visual C++ 2005 Express):
> ---
> void __ValidateImageBase() // fn_4014d0
> {
> { loc_4014d0:
> ecx = d[ss:esp+0x4];
> word tmp0 = sub$word(w[ds:ecx], 0x5a4d);
> x86_zf = is_zero$bit(tmp0);
> if(z) then goto loc_4014de else goto loc_4014db;
> }
...
> it generates C-style code.
First, I see a C99 feature: mixing of code and variable declarations, i.e.,
tmp0 is declared after the code has started. I also see a few things C
doesn't like... "then", colons, dollar-signs in labels... The "then"
keyword can just be eliminated. How to replace the dollar-sign depends on
whether the entire label, say "is_zero$bit", is a unique label for a function
or if it represents two distinct actions: is_zero and bit(). If it is a unique
label, the dollar-signs can be replaced with an underscore. If it is separate
functionality, you might consider using a struct. If bit() is a function
pointer and one of the "actions" in the struct named is_zero, a dot or
period could replace the dollar-sign.
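
For the "unique label" case, the underscore replacement could be sketched like this in C. The function name `sanitize_identifier` is made up for the example. It does not check whether the rewritten name collides with an existing one (say, a real symbol already called `sub_word`), which a real tool would have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Rewrite every '$' in a NUL-terminated name to '_' in place.
 * Returns the number of characters replaced. */
size_t sanitize_identifier(char *name)
{
    assert(name != NULL);
    size_t replaced = 0;
    for (char *p = name; *p != '\0'; ++p) {
        if (*p == '$') {
            *p = '_';
            ++replaced;
        }
    }
    return replaced;
}
```

With this, "is_zero$bit" becomes "is_zero_bit" and "sub$word" becomes "sub_word".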
Rod Pemberton
For your disassembler/decompiler to reach that goal it would have to
generate C code which can then be completely compiled and linked again.
Somehow I doubt your decompiler can generate such code?
If my doubts are correct, then what would be needed to rebuild an arbitrary
executable?
Since I am not really that good in Visual Studio 2005/2008 ;)
On the other hand if you also support decompiling to Delphi you will have my
full support :)
Bye,
Skybuck.
Didn't know Google offered "code hosting" ;) :)
Bye,
Skybuck :)
The $ means "this is the output size of the function." is_zero is not
a good example, but consider this line:
word tmp0 = sub$word(w[ds:ecx], 0x5a4d);
That means: declare a 16-bit temporary called tmp0, and assign to it
the following value: the output of performing a subtraction with a
16-bit output size (i.e. sub$word) on these two operands:
1. w[ds:ecx]
2. 0x5a4d
(1) means, fetch the word at ds:ecx and use that value as the operand.
I admit it's not too self-explanatory...
The current version is good for retrieving algorithms, but not data;
in particular, it will not be possible to re-compile the program when
you're done, mostly because there's no data-flow analysis. For example,
if you have
printf("Hello, world!");
then my decompiler will convert that to something like this:
printf(0x400123);
where the address of the string is treated like a numeric argument.
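One way a later pass might recover such strings is to test whether a numeric argument points at printable, NUL-terminated text inside the loaded image. The sketch below is only an assumption about how that could look, not something the current version does. `struct image`, `guess_string` and the `min_len` threshold are all made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a loaded image: `size` bytes mapped at virtual
 * address `base`. */
struct image {
    uint32_t base;
    const uint8_t *bytes;
    size_t size;
};

/* Returns the string if `value` looks like the address of a printable,
 * NUL-terminated string of at least `min_len` characters inside the
 * image, or NULL otherwise. */
const char *guess_string(const struct image *img, uint32_t value,
                         size_t min_len)
{
    assert(img != NULL && img->bytes != NULL);
    if (value < img->base || value - img->base >= img->size)
        return NULL;                    /* not an address in the image */
    size_t off = value - img->base;
    size_t len = 0;
    while (off + len < img->size && img->bytes[off + len] != '\0') {
        int c = img->bytes[off + len];
        if (!isprint(c) && !isspace(c))
            return NULL;                /* binary data, not text */
        ++len;
    }
    if (off + len == img->size || len < min_len)
        return NULL;                    /* unterminated or too short */
    return (const char *)(img->bytes + off);
}
```

A heuristic like this can misfire, because an ordinary integer may happen to equal a valid string address. That is why a proper fix needs data-flow analysis rather than a check on each constant in isolation.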
As for the 'then', I dunno. BASIC was my first language, so I just sort
of stuck that in there on a whim. I guess I'll remove it from the next
release so the generated code more closely resembles C.
Thanks for the feedback...
Willow
I get this:
,---
VmDec Copyright (C) 2009 Willow Schlanger
Win32 EXE/DLL Decompiler Version 0.00.01
loading..... error
Script file 'usersig.in', line 89 refers to KERNEL32.SetLastError but
symbol is not found in 'kernel32.dll'!
The symbol is a forwarder - use the forwarded name 'BAD.CycleDetected'
instead.
`---
Running on Win XP with no service packs.
Nathan.
I should add that setting it to 'ntdll.dll.RtlSetLastWin32Error'
didn't help.
Nathan.
Hi Nathan!
I'm running Vista so things may be different for you.
It's also been tested on Windows 7 and works fine there,
and I think it may work fine on Windows XP SP2, but I
could be wrong.
My advice to you is to comment out the offending line
by putting a # at the beginning of the line--just do this for
any lines it complains about.
Please let me know how it goes, and if you ever get really brave
and want to try it with no script file, you can rename usersig.in
to usersig.old and then run it on your favorite test program.
Good luck!
Willow
Well *that* certainly did the trick!
Nathan.