
assembler to C code convertor


Jasper Hendriks

Jan 12, 2000, 3:00:00 AM
Hello,

I'm looking for an assembler-to-C code converter. I've got a driver written
in x86 assembler code and would like to port it to C code.
Anyone got this kinda beast?

Regards,

Jasper

Randall Hyde

Jan 12, 2000, 3:00:00 AM
A few years ago I saw something that attempted this.
It worked mainly by providing a set of "C" functions, one for each of the
80x86 machine instructions, and then emitting calls to these
functions for each instruction. Needless to say, it wasn't very
efficient. Furthermore, I was never able to successfully compile
any "C" output it produced without a lot of extra work.
Randy Hyde
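
To make the approach described above concrete, here is a minimal sketch of what such output might look like. It is not the actual output of any real tool; the `Cpu` struct, the `x86_*` helper names, and the sample loop are all assumptions made up for illustration, and only the zero flag is modeled.

```c
#include <stdint.h>

/* Hypothetical register file; every name here is illustrative only. */
typedef struct {
    uint32_t eax, ebx, ecx, edx;
    int zf;                      /* zero flag, the only flag modeled */
} Cpu;

/* One C function per machine instruction, as the post describes. */
static void x86_mov(uint32_t *dst, uint32_t src) { *dst = src; }

static void x86_add(Cpu *cpu, uint32_t *dst, uint32_t src)
{
    *dst += src;
    cpu->zf = (*dst == 0);
}

static void x86_dec(Cpu *cpu, uint32_t *dst)
{
    *dst -= 1;
    cpu->zf = (*dst == 0);
}

/*
 * Mechanical translation of:
 *         mov eax, 0
 *         mov ecx, 5
 *   top:  add eax, ecx
 *         dec ecx
 *         jnz top
 */
uint32_t translated_sum(void)
{
    Cpu cpu = {0};
    x86_mov(&cpu.eax, 0);
    x86_mov(&cpu.ecx, 5);
top:
    x86_add(&cpu, &cpu.eax, cpu.ecx);
    x86_dec(&cpu, &cpu.ecx);
    if (!cpu.zf) goto top;       /* jnz top */
    return cpu.eax;
}

/* What a programmer would write by hand for the same loop. */
uint32_t handwritten_sum(void)
{
    uint32_t sum = 0;
    for (uint32_t i = 5; i > 0; i--)
        sum += i;
    return sum;
}
```

Both functions compute the same value, but the translated one carries the whole register model around with it, which is roughly why output of this kind is neither efficient nor pleasant to maintain.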

Jasper Hendriks <REMOVE_mijJa...@sci.kun.nl> wrote in message
news:387c...@news.iglou.com...

J. Wesley Cleveland

Jan 13, 2000, 3:00:00 AM

Jasper Hendriks wrote:
>
> Hello,
>
> I'm looking for an assembler to C code convertor, I've got a driver written
> in x86 assembler code and would like to port it to C code.
> Anyone got this kinda beast?

Yeah, it's called a "System Programmer", and costs about $100 an hour to
run.

Jasper Hendriks

Jan 14, 2000, 3:00:00 AM
Hi, for anyone interested, I'll add my findings here.
I've done some web research and found a program called exe2c, which has an
assembler-to-C converter integrated. But it didn't work right on my assembler
output. Anyway, if someone knows a better converter (and the executables!)
please let me know.

Regards,

Jasper


Randall Hyde <rh...@shoe-size.com> wrote in message
news:85iqen$af2$1...@bob.news.rcn.net...


> A few years ago I saw something that attempted this.
> I worked mainly by writing a set of "C" functions for each of the
> 80x86 machine instructions and then it emitted calls to these
> functions for each instruction. Needless to say, it wasn't very
> efficient. Furthermore, I was never able to successfully compile
> any "C" output it produced without a lot of extra work.
> Randy Hyde
>
> Jasper Hendriks <REMOVE_mijJa...@sci.kun.nl> wrote in message
> news:387c...@news.iglou.com...

> > Hello,
> >
> > I'm looking for an assembler to C code convertor, I've got a driver written
> > in x86 assembler code and would like to port it to C code.
> > Anyone got this kinda beast?
> >

> > Regards,
> >
> > Jasper
> >
> >
>
>

Randall Hyde

Jan 14, 2000, 3:00:00 AM

Jasper Hendriks <REMOVE_mijJa...@sci.kun.nl> wrote in message

news:85nh6m$on3$1...@bob.news.rcn.net...


> Hi, for anyone interested I'll add my findings here
> I've done some web research and found out a program exe2c, it has an
> assembler to c converter integrated. But it didnt work right on my assembler
> output. Anyways, if someone knows a better converter (and the executables!)
> please let me know.
>
> Regards,
>
> Jasper
>
>

I believe this was the program I was referring to.
I don't know if the version you've got is newer than
the version I used, but....

In any case, I doubt that any automated conversion will ever produce
satisfactory (e.g., efficient) code, but a program like exe2c might be a good
starting point to work from, doing the remainder of the conversion manually.
Randy Hyde


David Lindauer

Jan 15, 2000, 3:00:00 AM

hi,

Take a look at this page. It has comments on the process and links to various
attempts to produce high-level code from assembly code (usually x86). Also, I
believe the author of 'IDA' (the interactive disassembler) had planned to start
heading in this direction. He was making first steps toward this goal with his
'FLIRT' technology, which would identify what library was used to compile a
program. I don't have any recent info on this, or on whether he has made further
steps in this direction over the last year and a half or so. But assuming he
has something, you would have to buy it, and that is a couple of hundred
dollars...

Here is the link to the decompilers page; I don't have a link for IDA:

http://www.csee.uq.edu.au/csm/decompilation/

David
--
--------------------------------------------------------------
David Lindauer mailto:cam...@bluegrass.net ICQ: 38422156

http://www.members.tripod.com/~ladsoft/index.htm (computer page)
http://www.members.tripod.com/~ladsoft/para/index.htm (awareness page)
http://www.members.tripod.com/~ladsoft/ttc/index.htm (tao te ching)


Kevin D. Quitt

Jan 17, 2000, 3:00:00 AM

On Fri, 14 Jan 2000 19:51:55 GMT, "Randall Hyde" <rh...@shoe-size.com>
wrote:
>...a program like exe2c might be a good starting
>point to work from, doing the remainder of the conversion manually.


Sorry, but it's never going to be worth it. Unless you consider about
10,000 lines of source code to cover "printf( "hello world\n" );" to be a
good starting point.

--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91351-4454 96.37% of all statistics are made up
Per the FCA, this email address may not be added to any commercial mail list


Paul Hsieh

Jan 18, 2000, 3:00:00 AM

In article <s877ul...@corp.supernews.com>, KQu...@IEEInc.com says...

> On Fri, 14 Jan 2000 19:51:55 GMT, "Randall Hyde" <rh...@shoe-size.com>
> wrote:
> >...a program like exe2c might be a good starting
> >point to work from, doing the remainder of the conversion manually.
>
> Sorry, but it's never going to be worth it. Unless you consider about
> 10,000 lines of source code to cover "printf( "hello world\n" );" to be a
> good starting point.

You should look up the "FLIRT" technology used in IDA Pro. It may make
you change your mind about such things.
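
As a rough illustration of why library recognition matters here: if a tool can match the bytes of a known library routine, the thousands of lines behind printf collapse into a single name. The sketch below is not IDA's signature format or algorithm. It only shows the general idea of matching code bytes against a pattern with wildcards for bytes that vary between builds (such as call targets). The `Signature` type, the function names, and the demo pattern are all made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define WILDCARD (-1)   /* marks a byte that may differ between builds */

/* Hypothetical signature record: a name plus a byte pattern. */
typedef struct {
    const char *name;      /* routine this pattern is meant to identify */
    const int  *pattern;   /* byte values 0..255, or WILDCARD */
    size_t      length;
} Signature;

/* Returns 1 if the code bytes match the signature, 0 otherwise. */
static int matches(const Signature *sig, const uint8_t *code, size_t code_len)
{
    if (code_len < sig->length)
        return 0;
    for (size_t i = 0; i < sig->length; i++) {
        if (sig->pattern[i] != WILDCARD && sig->pattern[i] != code[i])
            return 0;
    }
    return 1;
}

/* Returns the name of the first matching signature, or NULL if none match. */
const char *identify(const Signature *sigs, size_t nsigs,
                     const uint8_t *code, size_t code_len)
{
    for (size_t i = 0; i < nsigs; i++) {
        if (matches(&sigs[i], code, code_len))
            return sigs[i].name;
    }
    return NULL;
}

/* Made-up pattern: push ebp; mov ebp,esp; call rel32; pop ebp; ret.
   The four bytes of the call target are wildcards. */
static const int demo_pattern[] = {
    0x55, 0x8B, 0xEC, 0xE8, WILDCARD, WILDCARD, WILDCARD, WILDCARD, 0x5D, 0xC3
};

const Signature demo_sigs[] = {
    { "hypothetical_lib_wrapper", demo_pattern,
      sizeof demo_pattern / sizeof demo_pattern[0] }
};
```

Calling `identify(demo_sigs, 1, code, len)` on a buffer that starts with those bytes returns the hypothetical name; anything else returns NULL and would be left for the decompiler proper.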

--
Paul Hsieh
http://www.pobox.com/~qed/asm.html


Only T

Jan 18, 2000, 3:00:00 AM
Jasper Hendriks wrote:
>
> Hi, for anyone interested I'll add my findings here
> I've done some web research and found out a program exe2c, it has an
> assembler to c converter integrated. But it didnt work right on my assembler
> output. Anyways, if someone knows a better converter (and the executables!)
> please let me know.
Jasper, unless you plan to do this thing repeatedly in the future, and I
mean rather frequently, don't waste your time. You simply cannot
reliably convert from asm to C, because the same C code absolutely does
not result in the same assembly code. Why do you think there's this
"debug" variety of build targets and what's all the fuss about the
"debug info"? If it were easy to map assembly to C, you wouldn't need
any of it. The sad truth is, compilation is a one-way street for all
practical purposes.

Think of it: The same C code can be compiled differently by different
compilers, by the same compiler with different settings, the code can be
optimized and not reflect what the C source was. Moreover, it is this
optimized code that you most likely will want to decompile. On top of that,
you may be dealing with code that was originally written in
assembly, rather than generated from C. In that case, the code will not
necessarily contain the C artifacts that give hints that can
(potentially) be used to infer the original C code structure.
Decompiling is just not an easily automatable task.

In other words, if it's a one-time deal, just re-write the damn thing in
C. It will be much faster for you to learn the code and re-implement the
functionality in C. Just for the heck of it, after you're done with
that, generate assembly and compare it with your original. You'll see
how dissimilar it'll be.

Jasper Hendriks

Jan 19, 2000, 3:00:00 AM

Only T <nobo...@OOPSworldnet.att.net> wrote in message
news:3884...@news.iglou.com...

Hmm, I don't agree with your conclusion that the conversion is a one-way
street.
I'm sure it's possible to make reasonable C code out of assembly. After all,
compiled C code is just a series of assembly instructions. OK, so one
has many optimizations in the asm output (one must be stupid to add debug
info if you want to convert to C code), but for instance a loop stays a loop
and a compare stays a compare. I can imagine a program that recognises all kinds
of asm constructs and converts them to C constructs.
The point made by someone else, that a printf call will result in
many assembly instructions, is not a fair comparison. In the assembly output
of the C code, the printf will just be a series of pushes followed by a call
and pops from the stack. I don't see why one should look at the asm code of
printf, in the same way one doesn't look at the implementation of printf
in the C library. It's obvious that if one pushes a string on the stack and
output occurs, one is dealing with some printf call.
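
To show what is meant, here is a small sketch. The assembly in the comment is only roughly what a 32-bit x86 compiler using the cdecl convention might emit for the call; exact output varies by compiler and options, and the function name `greet` and the label `msg` are made up for the example.

```c
#include <stdio.h>

/* At the call site the caller's code might look something like:
 *
 *       push offset msg     ; address of "hello world\n"
 *       call _printf
 *       add  esp, 4         ; caller removes the argument
 *
 * The body of printf itself lives in the C library, not here.
 */
int greet(void)
{
    return printf("hello world\n");   /* returns the number of characters written */
}
```

A converter that recognises the push/call/cleanup shape and knows the name of the call target only has to emit the one-line call, not the library code behind it.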

Regards,

Jasper

Kevin D. Quitt

Jan 19, 2000, 3:00:00 AM

A good optimizer is likely to turn your well-crafted, highly-structured
code into a pile of spaghetti by reusing code segments. It is, in fact,
extremely difficult to properly disassemble - precisely as difficult as
translating from any human language to another.
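
A small sketch of the kind of thing meant by reusing code segments, assuming an optimizer that merges identical tails of separate branches. The first function is the source as a programmer might write it; the second is a hypothetical literal reading of the merged machine code, expressed in C with gotos. Both function names and the example logic are invented for illustration.

```c
/* As written: three clearly separated cases. */
int scale_structured(int x, int *count)
{
    if (x < 0) {
        x = -x;
        (*count)++;
        return x * 2;
    }
    if (x > 100) {
        x = 100;
        (*count)++;
        return x * 2;
    }
    return x * 2;
}

/* As a decompiler might first see it after the shared tails are merged:
   one copy of the counter update and one copy of the return, reached by jumps. */
int scale_merged(int x, int *count)
{
    if (x >= 0) goto check_high;
    x = -x;
    goto bump;
check_high:
    if (x <= 100) goto done;
    x = 100;
bump:
    (*count)++;
done:
    return x * 2;
}
```

The two behave identically, but recovering the first form from the second takes control-flow analysis, and on larger functions the merged jumps cross each other far more than they do here.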

Kevin D. Quitt

Jan 19, 2000, 3:00:00 AM

On Tue, 18 Jan 2000 03:53:58 GMT, DONT.qed...@pobox.com (Paul Hsieh)
wrote:

>You should look up the "FLIRT" technology used in IDA Pro. It may make
>you change your mind about such things.

I was referring specifically to exe2c rather than decompilers in general.
I'm of the opinion that "reasonably good but not perfect" decompilation is
possible. Even so, decompilation will rarely be practical. It will
almost always be faster to treat it like a black box and rewrite it from
scratch.

Jerry Coffin

Jan 20, 2000, 3:00:00 AM
In article <s8c5se...@corp.supernews.com>, KQu...@IEEInc.com
says...

>
> A good optimizer is likely to turn your well-crafted, highly-structured
> code into a pile of spaghetti by reusing code segments. It is, in fact,
> extremely difficult to properly disassemble - precisely as difficult as
> translating from any human language to another.

Just for fun, you might want to look at the DCC decompiler written at
the University of Queensland. While it doesn't attempt to produce
output identical to the original input to the compiler, it certainly
produces output that's readable enough to be quite useful for many
purposes.

If you look through its source code, you'll find that it has quite a
bit of intelligence in how it deals with optimizations. It manages
to produce pretty decent source code when commercial compilers with
reasonably decent optimization (e.g. Borland, MS) are used.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Kevin D. Quitt

Jan 20, 2000, 3:00:00 AM
On 20 Jan 2000 04:11:14 GMT, Jerry Coffin <jco...@taeus.com> wrote:
>Just for fun, you might want to look at the DCC decompiler written at
>the University of Queensland.

It's currently about the best available; it uses many of the same
techniques I do (but not all of them).


> While it doesn't attempt to produce
>output identical to the original input to the compiler,

This is impossible to do, even in theory, except by accident.


>it certainly
>produces output that's readable enough to be quite useful for many
>purposes.

True. With non-trivial programs, however, there are likely to be numerous
subtle bugs introduced by the decompilation/recompilation process,
especially if a different compiler or processor is the target, or if
changes are made to the decompiled sources.

The problem is that information is lost, though less than most people
think. In particular, there are multiple, different C expressions
that can yield the same compiled code/variable access. The C compiler can
also generate different code for the same C expressions, based on what it
"knows" about the target environment and what compiler flags and options
are in effect. For example, it may modify how a function operates because
it knows the legal values that can be passed to that function.

The decompiler cannot necessarily recover that option information, and
must choose among several apparently equivalent code templates. But none
of the choices may be correct if the decompiled source is modified and
that function is called with a value outside the range originally used.
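
A minimal sketch of that last hazard, under the assumption that the compiler could see every caller passing a value in 0..6 and therefore dropped the range check from the generated code. The function names and the table are hypothetical.

```c
static const char *const day_names[] = {
    "Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"
};

/* Original source: checks its argument. */
const char *day_name_original(unsigned d)
{
    if (d >= 7)
        return "???";
    return day_names[d];
}

/* What a decompiler might plausibly reconstruct from code in which the
   check was optimized away. Correct for 0..6, undefined behavior beyond. */
const char *day_name_decompiled(unsigned d)
{
    return day_names[d];
}
```

For every value the original program actually passed, the two agree, so the decompiled version looks right. The difference only shows up once someone modifies the recovered source and calls it with 7 or more.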

Brian Tegart

Jan 20, 2000, 3:00:00 AM
Where can we download and take a look at UofQ's DCC and yours (Quitt) as
well?

Thanks,
Brian

Kevin D. Quitt <KQu...@IEEInc.com> wrote in message
news:8682io$anv$1...@bob.news.rcn.net...

Jerry Coffin

Jan 21, 2000, 3:00:00 AM
In article <8682io$anv$1...@bob.news.rcn.net>, KQu...@IEEInc.com says...

[ ... ]

> > While it doesn't attempt to produce
> >output identical to the original input to the compiler,
>
> This is impossible to do, even in theory, except by accident.

That depends heavily upon the source and the object code involved.
For some systems (e.g. the AS/400) quite a few programs normally ship
with something similar to what most of us would think of as debugging
information. This means that in many cases decompilers for those
systems CAN (for example) recover the exact variable names that were
used in the original source.



> True. With non-trivial programs, however, there are likely to be numerous
> subtle bugs introduced by the decompilation/recompilation process,
> especially if a different compiler or processor is the target, or if
> changes are made to the decompiled sources.

Probably -- I haven't tried it. My interest has always been in
understanding the algorithms used in the original code, and for that I
find it quite useful. My problems have had less to do with how well
it does that, than with the fact that it doesn't (currently) support
all the binary file formats I'd like, and I'm too lazy to add all the
support I really want...



> The problem is that information is lost; less than most people think,
> though. The problem is that there are multiple, different C expressions
> that can yield the same compiled code/variable access. The C compiler can
> also generate different code for the same C expressions, based on what it
> "knows" about the target environment and what compiler flags and options
> are in effect. For example, it may modify how a function operates because
> it knows the legal values that can be passed to that function.

Yup -- C is probably unusually bad in this regard due to all the
isomorphic constructs (e.g. while loops are all isomorphic to for
loops, and array notation and pointer notation are isomorphic).
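
For example, the two functions below (names made up for illustration) differ in both respects, for versus while and indexing versus pointer stepping, yet compute the same thing. An optimizing compiler could well produce very similar code for both, and a decompiler then has no basis for choosing which one the author wrote.

```c
#include <stddef.h>

/* for loop with array notation */
int sum_for_index(const int *a, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* while loop with pointer notation */
int sum_while_pointer(const int *a, size_t n)
{
    int sum = 0;
    const int *end = a + n;
    while (a != end)
        sum += *a++;
    return sum;
}
```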

Kevin D. Quitt

Jan 21, 2000, 3:00:00 AM
On 20 Jan 2000 23:45:07 GMT, "Brian Tegart" <br...@tegart.com> wrote:


>Where can we download and take a look at UofQ's DCC and yours (Quitt) as
>well?

http://www.it.uq.edu.au/groups/csm/dcc.html

Mine is not available. It's a hobby that I have been working on
occasionally and is in no condition for anybody to see.

Kevin D. Quitt

Jan 21, 2000, 3:00:00 AM
On 21 Jan 2000 01:33:08 GMT, Jerry Coffin <jco...@taeus.com> wrote:
>My interest has always been in
>understanding the algorithms used in the original code, and for that I
>find it quite useful.

For that kind of thing, a decompiler could very well suffice. We need not
discuss the ethics or legality here.

Franck Pissotte

Jan 25, 2000, 3:00:00 AM
Jerry Coffin wrote:


> Just for fun, you might want to look at the DCC decompiler written at
> the University of Queensland. While it doesn't attempt to produce
> output identical to the original input to the compiler, it certainly
> produces output that's readable enough to be quite useful for many
> purposes.
>
> If you look through its source code, you'll find that it has quite a
> bit of intelligence in how it deals with optimizations. It manages
> to produce pretty decent source code when commercial compilers with
> reasonably decent optimization (e.g. Borland, MS) are used.

Yes, it has configuration files for the Borland Pascal compiler, but only
for an old version.
Has anyone made changes for BP7?
