Does anyone know of a compiler for the awk language? I have an awk
script that takes a long time to run. I'd like to compile it directly
to binary or to a language such as C. Yes, I suppose I could rewrite
it in C, but then it'd be three times as long and take a while to code
and debug.
Many thanks,
-Andrew
--
Andrew Huang ahu...@ece.cmu.edu
Electrical & Computer Engineering (412) 268-7101
Carnegie Mellon University
--
Send compilers articles to comp...@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
Perhaps someone else knows, but I believe there is a program called awk2c
which 'compiles' awk code into C code.
On a general compilers question, is this kind of program a good idea? If
you have a program written in one language, is it reasonable to implement
the language by writing a 'compiler' that converts it into C?
It seems to me that if the original language made a lot of assumptions the
same way that C does (null-terminated strings, i/o as a function call,
large flat address space, no function definitions within other functions,
etc.), this would be a Good Thing.
But a language that doesn't make some of these same decisions the same way
would have a lot of overhead.
I know there has to be some experience out on the 'net, since to my
knowledge (only from reading the net; I don't know the ftp sites) there
are translators for:
Awk to C
Basic to C
C++ to C
C to C (K&R -> ANSI, ANSI -> K&R)
Eiffel to C
Fortran to C
Lisp to C
Modula to C
Pascal to C
Scheme to C
Some of these originating languages make a LOT of decisions differently
than C does. It seems that the generated C code can't be very good unless
there is some kind of code analysis in the individual language to make the
best match to C data structures.
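To make the overhead point concrete, here is a minimal sketch of what a
translator might have to emit for a language like awk, where a variable can
be read as either a string or a number depending on context. The names
(cell, cell_set_str, cell_num, cell_add) are hypothetical and are not taken
from awk2c, awkcc, or any other real translator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime value: an awk variable may be read as a string
 * or as a number, so the generated C cannot use a bare double or char *. */
typedef struct {
    char  *str;      /* string form, owned by the cell (may be NULL) */
    double num;      /* numeric form, valid only when has_num is set */
    int    has_num;
} cell;

/* Store a copy of s; any cached numeric form becomes stale. */
static void cell_set_str(cell *c, const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    free(c->str);
    c->str = copy;
    c->has_num = 0;
}

/* Read the cell as a number, converting from the string form on demand.
 * Non-numeric strings convert to 0, as strtod does. */
static double cell_num(cell *c)
{
    if (!c->has_num) {
        c->num = c->str ? strtod(c->str, NULL) : 0.0;
        c->has_num = 1;
    }
    return c->num;
}

/* Roughly what the awk statement  total += $2  could turn into. */
static void cell_add(cell *total, cell *field)
{
    double sum = cell_num(total) + cell_num(field);
    free(total->str);
    total->str = NULL;
    total->num = sum;
    total->has_num = 1;
}
```

A one-token awk addition becomes two conversions with flag checks plus
bookkeeping, unless the translator can prove from analysis that a variable
is only ever numeric and replace the cell with a plain double.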
Anyone know more than I do?
David (whi...@fwva.saic.com) US:(619)535-7764
There is an AWK to C translator available from the AT&T toolbox. I used
it once, and it worked OK. I've seen references to an AWK to C++
translator written at Bell Labs (or BELLCORE, or one of those phone lab
places :-), but never released.
You might also want to compare different versions of AWK. There is a PD
version called mawk which is somewhat faster than nawk or GNU's gawk.
Joshua Levy (jos...@veritas.com)
The original awk interpreter had a "compile" option. This parsed the
source code into an internal tree representation, and then dumped a core
image, which could be loaded as a stand alone executable. Obviously the
resulting executable would run the same speed as the original interpreted
awk.
Before examining other awk compilers, you should consider WHY your awk
script takes so long to run. Often most of the run time is spent on
operations such as pattern matching or associative array lookup, with
interpreter overhead accounting for a negligible fraction of the run time.
In these (common) cases, compiling to C will not solve your problem.
If your script reads very long input files, and only processes a small
number of records, then pre-processing the file with egrep would be a big
win, since egrep scans input much faster than awk.
Large associative arrays should be avoided. These are implemented as
closed hash tables of very small size, which effectively degenerate into
linked lists when they become large.
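The effect described above can be sketched in C. This is an illustration
only, assuming a chained table with a fixed, tiny bucket count that is
never resized; it is not the code of any actual awk, and the names
(table, lookup, fill_sequential, longest_chain) are made up for the
example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 8  /* deliberately tiny and never resized (assumption) */

typedef struct node {
    struct node *next;
    char        *key;
    double       value;
} node;

typedef struct {
    node *bucket[NBUCKETS];
} table;

static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Find key, inserting it if absent. *steps receives the number of
 * chain nodes that were compared along the way. */
static node *lookup(table *t, const char *key, size_t *steps)
{
    unsigned b = hash(key);
    size_t n = 0, len;
    node *p;

    for (p = t->bucket[b]; p != NULL; p = p->next) {
        n++;
        if (strcmp(p->key, key) == 0) {
            *steps = n;
            return p;
        }
    }
    len = strlen(key) + 1;
    p = malloc(sizeof *p);
    assert(p != NULL);
    p->key = malloc(len);
    assert(p->key != NULL);
    memcpy(p->key, key, len);
    p->value = 0.0;
    p->next = t->bucket[b];
    t->bucket[b] = p;
    *steps = n;
    return p;
}

/* Insert keys "k0" .. "k<count-1>" and return the total comparisons made. */
static size_t fill_sequential(table *t, int count)
{
    char key[32];
    size_t total = 0, steps;
    int i;

    for (i = 0; i < count; i++) {
        snprintf(key, sizeof key, "k%d", i);
        lookup(t, key, &steps);
        total += steps;
    }
    return total;
}

/* Length of the longest bucket chain. */
static size_t longest_chain(const table *t)
{
    size_t best = 0, n;
    const node *p;
    int b;

    for (b = 0; b < NBUCKETS; b++) {
        for (n = 0, p = t->bucket[b]; p != NULL; p = p->next)
            n++;
        if (n > best)
            best = n;
    }
    return best;
}
```

With 8 buckets and 8000 distinct keys, at least one chain must hold 1000 or
more entries, so each lookup walks a long list and the cost of filling the
table grows roughly quadratically. A table that grows its bucket count as
it fills avoids this.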
Good Luck
Amnon Cohen
awk fanatic
Email: amn...@taux01.nsc.com
[Do all versions of awk, e.g. nawk, gawk, and mawk, have the poorly tuned
hash tables? Seems unlikely. -John]
> Does anyone know of a compiler for the awk language? ...
I've since received numerous responses. Instead of listing all of them,
I'll just summarize:
o The AT&T Toolchest includes an awk to C translator called awkcc.
When an awk script is translated to C and then compiled with cc,
a speedup of 2-6 times can be obtained. Price for the source code of
the Toolchest is $175.
o Mawk, by Mike Brennan, is an implementation of the 'new awk' language
that is 2-5 times as fast as standard awk. Even though it is an
interpreter, mawk first converts a program into an intermediate form
before interpreting it. It can be ftp'd from oxy.edu:public/mawk.
o MKS toolkit comes with an AWK to C translator.
o Perl (Practical Extraction and Report Language) comes with an awk to
perl translator. The perl package can be ftp'd from prep.ai.mit.edu
in pub/gnu.
o One person also suggested rewriting the script in the Icon language.
This language is supposedly good for text processing, AI, symbolic
math, prototyping, etc. Icon can be ftp'd from cs.arizona.edu.
My goal was to speed up my awk script without having to learn another
language. So Icon was out. The awk to perl translator was not entirely
successful at translating my script. I would have had to learn perl to
figure out what went wrong. So in the end, I went with mawk. My script
ran 2.5 times as fast with mawk as it did with gawk.
Thanks to the following people for their responses:
Tim Channon, Ian Searle, David Whitten, Joshua Levy, Kurt Bischoff,
Rayaz Jagani, Demitry Kohmanyuk, Mark Streich, Mauricio Breternitz Jr,
Jeremy Fitzhardinge, Amnon Cohen, Norman Ramsey, Jeff Fried, Scott
Robinson.
-Andrew
----------------------------------------------------------------
Help is on the way. A good C++ class library should be able to do
everything AWK can.
I'm working on it. In fact the message I posted about NFAs is because I'm
writing a quick and dirty reg-exp class to make do with until I can get a
proper one. Suggestions? I know that Tools.h++ has one, but I don't know
if it's worth buying the package just for that and the string class. For
containers I'm definitely going with the Booch Components.
Actually, the Toolkit comes with an AWK interpreter and an AWK to .exe
bundler/compiler.
--
Sean Wilson, MKS Tech Support, se...@mks.com