I'm looking for an assembler-to-C converter. I've got a driver written
in x86 assembler and would like to port it to C.
Anyone got this kinda beast?
Regards,
Jasper
Jasper Hendriks <REMOVE_mijJa...@sci.kun.nl> wrote in message
news:387c...@news.iglou.com...
Jasper Hendriks wrote:
>
> Hello,
>
> I'm looking for an assembler-to-C converter. I've got a driver written
> in x86 assembler and would like to port it to C.
> Anyone got this kinda beast?
Yeah, it's called a "System Programmer", and costs about $100 an hour to
run.
Regards,
Jasper
Randall Hyde <rh...@shoe-size.com> wrote in message
news:85iqen$af2$1...@bob.news.rcn.net...
> A few years ago I saw something that attempted this.
> It worked mainly by writing a set of "C" functions for each of the
> 80x86 machine instructions and then it emitted calls to these
> functions for each instruction. Needless to say, it wasn't very
> efficient. Furthermore, I was never able to successfully compile
> any "C" output it produced without a lot of extra work.
> Randy Hyde
Jasper Hendriks <REMOVE_mijJa...@sci.kun.nl> wrote in message
news:85nh6m$on3$1...@bob.news.rcn.net...
> Hi, for anyone interested, I'll add my findings here.
> I've done some web research and found a program called exe2c, which has an
> assembler-to-C converter integrated. But it didn't work right on my
> assembler output. Anyway, if someone knows of a better converter (and the
> executables!), please let me know.
>
> Regards,
>
> Jasper
>
>
I believe this was the program I was referring to.
I don't know if the version you've got is newer than
the version I used, but....
In any case, I doubt that any automated conversion will ever produce
satisfactory (e.g., efficient) code, but a program like exe2c might be a
good starting point to work from, doing the remainder of the conversion
manually.
Randy Hyde
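For anyone curious what the "one C function per machine instruction" style looks like, here is a minimal sketch. It is not exe2c's actual output; the Cpu struct, the x86_* helper names and the sample routine are all made up for illustration, and only the zero flag is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; names are illustrative, not taken from exe2c. */
typedef struct {
    uint32_t eax, ecx;
    int zf;                 /* zero flag, the only flag modelled here */
} Cpu;

/* One small C function per machine instruction. */
static void x86_mov_imm(uint32_t *dst, uint32_t imm) { *dst = imm; }

static void x86_add(Cpu *cpu, uint32_t *dst, uint32_t src)
{
    *dst += src;
    cpu->zf = (*dst == 0);
}

static void x86_dec(Cpu *cpu, uint32_t *dst)
{
    *dst -= 1;
    cpu->zf = (*dst == 0);
}

/* What such a converter might emit for this (assumed) input:
 *         mov  eax, 0
 *         mov  ecx, 4
 * top:    add  eax, ecx
 *         dec  ecx
 *         jnz  top
 */
uint32_t translated_routine(void)
{
    Cpu cpu = {0, 0, 0};
    x86_mov_imm(&cpu.eax, 0);
    x86_mov_imm(&cpu.ecx, 4);
top:
    x86_add(&cpu, &cpu.eax, cpu.ecx);
    x86_dec(&cpu, &cpu.ecx);
    if (!cpu.zf) goto top;          /* jnz top */
    return cpu.eax;
}

/* The same routine as a person would write it by hand. */
uint32_t handwritten_routine(void)
{
    uint32_t sum = 0;
    for (uint32_t n = 4; n != 0; n--)
        sum += n;
    return sum;
}

/* Both versions should agree on the result. */
void check_translation(void)
{
    assert(translated_routine() == handwritten_routine());
}
```

The translated version computes the same value, but every instruction drags its flag bookkeeping along, which is roughly why the output is neither efficient nor pleasant to maintain.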
Take a look at this page. It has comments on the process and links to various
attempts to produce high-level code from assembly code (usually x86). Also, I
believe the author of IDA (the Interactive Disassembler) had planned to start
heading in this direction. He was making first steps toward this goal with his
FLIRT technology, which would identify what library was used to compile a
program. I don't have any recent info on this, or on whether he has made further
steps in this direction over the last year and a half or so. But assuming he
has something, you would have to buy it, and that is a couple of hundred
dollars...
Here is the link to the decompilers page; I don't have a link for IDA:
http://www.csee.uq.edu.au/csm/decompilation/
David
--
--------------------------------------------------------------
David Lindauer mailto:cam...@bluegrass.net ICQ: 38422156
http://www.members.tripod.com/~ladsoft/index.htm (computer page)
http://www.members.tripod.com/~ladsoft/para/index.htm (awareness page)
http://www.members.tripod.com/~ladsoft/ttc/index.htm (tao te ching)
Sorry, but it's never going to be worth it. Unless you consider about
10,000 lines of source code to cover "printf( "hello world\n" );" to be a
good starting point.
--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91351-4454 96.37% of all statistics are made up
Per the FCA, this email address may not be added to any commercial mail list
You should look up the "FLIRT" technology used in IDA Pro. It may make
you change your mind about such things.
--
Paul Hsieh
http://www.pobox.com/~qed/asm.html
Think of it: the same C code can be compiled differently by different
compilers, or by the same compiler with different settings, and the code can be
optimized so that it no longer reflects what the C source was. Moreover, it is
this optimized code that you will most likely want to decompile. You may also
be dealing with code that was originally written in
assembly rather than generated from C. In that case, the code will not
necessarily contain the C artifacts that give hints that can
(potentially) be used to infer the original C code structure.
Decompiling is just not an easily automatable task.
In other words, if it's a one-time deal, just re-write the damn thing in
C. It will be much faster for you to learn the code and re-implement the
functionality in C. Just for the heck of it, after you're done with
that, generate assembly and compare it with your original. You'll see
how dissimilar it'll be.
Hmm, I don't agree with your conclusion that the conversion is a one-way
street.
I'm sure it's possible to make reasonable C code out of assembly. After all,
what do you think compiled C code is? Just a series of assembly instructions.
OK, so one has many optimizations in the asm output (one must be stupid to add
debug info if you want to convert to C code), but for instance a loop stays a
loop and a compare a compare. I can imagine a program that recognises all kinds
of asm constructs and converts them to C constructs.
The point made by someone else, that a printf call will result in
many assembly instructions, is not a fair comparison. In the assembly output
of the C code, the printf will just be a series of pushes followed by a call
and pops from the stack. I don't see why one should look at the asm code of
printf, in the same way one doesn't look at the implementation of printf
in the C library. It's obvious that if one pushes a text on the stack and
output occurs, one is dealing with some printf call.
Regards,
Jasper
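To make the "a loop stays a loop and a compare a compare" idea concrete, here is a rough sketch of the two steps such a program would take: a literal goto-style translation first, then the structured C a pattern recogniser could produce from it. The assembly fragments in the comments and all function names are invented for illustration; no existing tool is being described.

```c
#include <assert.h>

/* Hypothetical fragment 1 (signed compare plus conditional jump):
 *         mov  eax, [a]
 *         cmp  eax, [b]
 *         jge  done
 *         mov  eax, [b]
 * done:   ret
 */
int max_literal(int a, int b)
{
    int eax = a;
    if (eax >= b) goto done;        /* cmp + jge */
    eax = b;
done:
    return eax;
}

/* The same fragment once the compare/jump pattern is recognised. */
int max_structured(int a, int b)
{
    return (a >= b) ? a : b;
}

/* Hypothetical fragment 2 (count-down loop):
 *         xor  eax, eax
 *         mov  ecx, 10
 * top:    add  eax, ecx
 *         dec  ecx
 *         jnz  top
 */
unsigned sum_literal(void)
{
    unsigned eax = 0, ecx = 10;
top:
    eax += ecx;
    if (--ecx != 0) goto top;       /* dec + jnz */
    return eax;
}

/* A backward jnz to the top of the block maps onto do { } while. */
unsigned sum_structured(void)
{
    unsigned sum = 0, n = 10;
    do {
        sum += n;
    } while (--n != 0);
    return sum;
}

/* Literal and structured forms should behave identically. */
void check_recognised_constructs(void)
{
    assert(max_literal(3, 7) == max_structured(3, 7));
    assert(max_literal(-2, -9) == max_structured(-2, -9));
    assert(sum_literal() == sum_structured());
}
```

The hard part is not these textbook shapes but everything an optimizer does to them, so treat this as the easy case only.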
I was referring specifically to exe2c rather than decompilers in general.
I'm of the opinion that "reasonably good but not perfect" decompilation is
possible. Even so, decompilation will rarely be practical. It will
almost always be faster to treat it like a black box and rewrite it from
scratch.
Just for fun, you might want to look at the DCC decompiler written at
the University of Queensland. While it doesn't attempt to produce
output identical to the original input to the compiler, it certainly
produces output that's readable enough to be quite useful for many
purposes.
If you look through its source code, you'll find that it has quite a
bit of intelligence in how it deals with optimizations. It manages
to produce pretty decent source code when commercial compilers with
reasonably decent optimization (e.g. Borland, MS) are used.
--
Later,
Jerry.
The universe is a figment of its own imagination.
It's currently about the best available; it uses many of the same
techniques I do (but not all of them).
> While it doesn't attempt to produce
>output identical to the original input to the compiler,
This is impossible to do, even in theory, except by accident.
>it certainly
>produces output that's readable enough to be quite useful for many
>purposes.
True. With non-trivial programs, however, there are likely to be numerous
subtle bugs introduced by the decompilation/recompilation process,
especially if a different compiler or processor is the target, or if
changes are made to the decompiled sources.
The problem is that information is lost; less than most people think,
though. The problem is that there are multiple, different C expressions
that can yield the same compiled code/variable access. The C compiler can
also generate different code for the same C expressions, based on what it
"knows" about the target environment and what compiler flags and options
are in effect. For example, it may modify how a function operates because
it knows the legal values that can be passed to that function.
The decompiler cannot necessarily recover that option information, and
must choose among several apparently equivalent code templates. But none
of the choices may be correct if the decompiled source is modified and
that function is called with a value outside the range originally used.
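A small sketch of that last failure mode, under assumptions of my own: suppose the original function took a remainder, and the compiler, having worked out that only non-negative values were ever passed, emitted a plain bit mask. A decompiler that sees only the mask may reconstruct the second function below. Both function names are hypothetical, and the negative case assumes a two's complement int.

```c
#include <assert.h>

/* Original source (hypothetical): callers only ever pass x >= 0. */
int bucket_original(int x)
{
    return x % 8;
}

/* What a decompiler might reconstruct if the compiler, knowing x >= 0,
 * had emitted "and eax, 7" for the remainder. */
int bucket_decompiled(int x)
{
    return x & 7;
}

/* Inside the range the compiler relied on, the two agree;
 * outside it, they do not (assuming two's complement int). */
void check_range_assumption(void)
{
    assert(bucket_original(13) == bucket_decompiled(13));
    assert(bucket_original(-3) != bucket_decompiled(-3));
}
```

Inside the original range the two are indistinguishable, which is exactly why the bug stays hidden until the decompiled source is modified and someone passes a negative value.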
Thanks,
Brian
Kevin D. Quitt <KQu...@IEEInc.com> wrote in message
news:8682io$anv$1...@bob.news.rcn.net...
[ ... ]
> > While it doesn't attempt to produce
> >output identical to the original input to the compiler,
>
> This is impossible to do, even in theory, except by accident.
That depends heavily upon the source and the object code involved.
For some systems (e.g. the AS/400) quite a few programs normally ship
with something similar to what most of us would think of as debugging
information. This means that in many cases decompilers for those
systems CAN (for example) recover the exact variable names that were
used in the original source.
> True. With non-trivial programs, however, there are likely to be numerous
> subtle bugs introduced by the decompilation/recompilation process,
> especially if a different compiler or processor is the target, or if
> changes are made to the decompiled sources.
Probably -- I haven't tried it. My interest has always been in
understanding the algorithms used in the original code, and for that I
find it quite useful. My problems have had less to do with how well
it does that, than with the fact that it doesn't (currently) support
all the binary file formats I'd like, and I'm too lazy to add all the
support I really want...
> The problem is that information is lost; less than most people think,
> though. The problem is that there are multiple, different C expressions
> that can yield the same compiled code/variable access. The C compiler can
> also generate different code for the same C expressions, based on what it
> "knows" about the target environment and what compiler flags and options
> are in effect. For example, it may modify how a function operates because
> it knows the legal values that can be passed to that function.
Yup -- C is probably unusually bad in this regard due to all the
isomorphic constructs (e.g. while loops are all isomorphic to for
loops, and array notation and pointer notation are isomorphic).
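As a quick illustration of those isomorphic constructs, the two functions below do the same work, one with a for loop and array indexing, the other with a while loop and pointer arithmetic. A compiler could plausibly emit very similar code for both, so a decompiler has no way to tell which one the author wrote. The function names are mine, just for the example.

```c
#include <assert.h>
#include <stddef.h>

/* for loop with array notation */
int sum_for_index(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* while loop with pointer notation; same traversal, same result */
int sum_while_pointer(const int *a, size_t n)
{
    const int *end = a + n;
    int sum = 0;
    while (a != end)
        sum += *a++;
    return sum;
}

/* Either form is a valid "decompilation" of the other. */
void check_isomorphic(void)
{
    const int v[] = {1, 2, 3, 4};
    assert(sum_for_index(v, 4) == sum_while_pointer(v, 4));
}
```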
>Where can we download and take a look at UofQ's DCC and yours (Quitt) as
>well?
http://www.it.uq.edu.au/groups/csm/dcc.html
Mine is not available. It's a hobby that I have been working on
occasionally and is in no condition for anybody to see.
For that kind of thing, a decompiler could very well suffice. We need not
discuss the ethics or legality here.
Yes, it has configuration files for the Borland Pascal compiler, but only
for an old version.
Has anyone made changes for BP7?