Has anyone on this ng experience or knowledge of the use of Prolog to implement a native-code compiler for a typical high-level imperative language? I am toying with the idea of using Prolog for the lexer, the parser, the intermediate (library) code generator, and the end-code generator (and even, in effect, for linking!), i.e. the 'whole shebang'. I'm even toying with the idea of having (most) of the IDE written in Prolog, and thus tightly integrating the compiler and IDE.
This is an open source project (possibly GPL or similar), and a bit experimental, and it's not a project that has a budget of millions (well that's one way of putting it :-), so I'm willing to make the speed sacrifice (I'm not so sure my co-projectees are, but that's mp ;-). I'm particularly interested in the idea of using Prolog's natural searching abilities to search for truly optimal code.
Contributions gratefully accepted.
------------------------------------- Nick Roberts -------------------------------------
>Has anyone on this ng experience or knowledge of the use of Prolog to >implement a native-code compiler for a typical high-level imperative >language?
[...]
>I'm particularly interested in the idea of using Prolog's natural searching >abilities to search for truly optimal code.
This issue (code selection and optimization) was discussed in a paper published in PLILP proceedings a few years ago, IIRC. I do not have the reference handy, if you want me to look for it, please write me an email, I do not read this group too frequently.
Nick Roberts <nickrobe...@callnetuk.com> writes: >Has anyone on this ng experience or knowledge of the use of Prolog to >implement a native-code compiler for a typical high-level imperative >language? I am toying with the idea of using Prolog for the lexer, the >parser, the intermediate (library) code generator, and the end-code >generator (and even, in effect, for linking!), i.e. the 'whole shebang'.
I have written a couple of compilers in Prolog. In many ways, Prolog is an excellent language for writing compilers, but it does have some important disadvantages.
Much of the task of compilation is manipulating trees of various kinds, and this is a task which Prolog and other logic or functional languages do very well.
Another advantage of Prolog is the use of unbound variables and partially instantiated data structures. Often during code generation you may want to make several passes over some data structure (e.g. the parse tree), filling in the values of different attributes with each pass. In Prolog you can leave uninitialized attributes as unbound variables, and have each pass bind the appropriate variables. This contrasts with some languages such as ML or Mercury where you would normally initialize the attributes with dummy values and then each pass would make a copy of the parse tree in order to set the new attributes.
Prolog does have some significant disadvantages. One is that Prolog is often quite inefficient. For example, it's very difficult to get a lexer written in Prolog to run fast.
Another disadvantage is that Prolog doesn't have records with named fields. If you want access fields by name, then you need to write a bunch of named access predicates, which is quite tedious. (Existing Prolog implementations generally don't do inlining, so using named access predicates also exacerbates the efficiency problems.)
Another disadvantage, probably more important than the previous two, is that Prolog has very little in the way of static checking. This works OK for small projects, but makes things very difficult when writing large systems. If you plan to write 5000 lines of code or more, then I would very strongly recommend using a language with static type checking and a proper module system (some Prolog systems do have module systems, but they are not standardized and vary significantly between different Prolog systems). And Prolog's support for unbound variables and nondeterminism, although useful, can also cause a lot of problems if you accidentally forget to initialize a variable or if you accidentally introduce nondeterminism. Other logic programming languages such as Mercury and Visual Prolog (which, despite the name, is quite different from Prolog) do a lot better here.
If you do use Prolog, then I strongly recommend that you document the types, modes, and determinism of the predicates in your program very carefully. This is especially important if you ever want anyone other than yourself to maintain the program. But compilers are usually not small projects, so I think you would probably be better off choosing a language which has more support for static checking, such as Mercury.
Of course, I'm one of the developers of Mercury, so my opinion in that respect is no doubt biased ;-). But Mercury was developed with my experience of implementing compilers in Prolog in mind, and so it was designed to address many of Prolog's faults in that area. The Mercury compiler is written in Mercury, so it has certainly been stress-tested in that area.
> I'm particularly interested in the idea of using Prolog's natural > searching abilities to search for truly optimal code.
I think this is a mirage. It's pretty easy to express searching algorithms in any language. But finding _feasible_ algorithms to produce "truly optimal code" is going to be difficult in any language. Prolog's searching abilities won't really help much here. -- Fergus Henderson <f...@cs.mu.oz.au> WWW: <http://www.cs.mu.oz.au/~fjh> PGP: finger f...@128.250.37.3 [It's my impression that in too many cases the only way to find perfectly optimal code would be to enumerate and check an impractically large set of possibilities. -John]
> Nick Roberts <nickrobe...@callnetuk.com> pisze: > >Has anyone on this ng experience or knowledge of the use of Prolog to > >implement a native-code compiler for a typical high-level imperative > >language? ... > This issue (code selection and optimization) was discussed in a paper > published in PLILP proceedings a few years ago, IIRC. I do not have the > reference handy, if you want me to look for it, please write me an email, I > do not read this group too frequently.
I suppose he means:
PLILP'91 LNCS 528 "Optimal instructioin scheduling Using Constraint Logic Programming" by M. Ertl and A. Krall
CLP and Prolog are not really the same thing, as some good Prolog systems offer good constraint solvers.
I guess most compilers for Prolog are written in Prolog. Whether they use Prolog's natural search mechanism for generating optimal code, I doubt. Usually, you want the search a bit more directed.
> This issue (code selection and optimization) was discussed in a paper > published in PLILP proceedings a few years ago, IIRC.
Bart Demoen has already mentioned our paper (thanks), but it's about instruction scheduling; also, it's only practical if you use it with timeouts (exponential behaviour in some cases). You can get it at http://www.complang.tuwien.ac.at/papers/ertl&krall91.ps.gz
The paper about code selection in Prolog I know of is
@Article{ganapathi89, author = "Mahadevan Ganapathi", title = "Prolog Based Retargetable Code Generation", journal = "Computer Languages", year = "1989", volume = "14", number = "3", pages = "193--204"
Nick Roberts wrote: > Has anyone on this ng experience or knowledge of the use of Prolog to > implement a native-code compiler for a typical high-level imperative > language? I am toying with the idea of using Prolog for the lexer, the > parser, the intermediate (library) code generator, and the end-code > generator (and even, in effect, for linking!), i.e. the 'whole shebang'. > I'm even toying with the idea of having (most) of the IDE written in Prolog, > and thus tightly integrating the compiler and IDE.
> This is an open source project (possibly GPL or similar), and a bit > experimental, and it's not a project that has a budget of millions (well > that's one way of putting it :-), so I'm willing to make the speed sacrifice > (I'm not so sure my co-projectees are, but that's mp ;-). I'm particularly > interested in the idea of using Prolog's natural searching abilities to > search for truly optimal code.
Well, a couple of years ago I decided to write a smallish compiler for my job, to improve the system they had (still have). I'd read an article in byte about how easy this was in prolog, how it almost wrote itself etc., so got hold of a public domain version and had a go. Bearing in mind that I'd no experience in writing compilers, my prolog was not marvelous and the implementation I had was buggy, I spent two rather hellish months getting nowhere. The main problem was the behaviour of prolog: a bug in my prolog, or the program I was trying to parse, both caused backtracks. Asserts didn't help. I'd even got to the point of considering asserting the current rule into the database so I could find out where it failed (a sort of audit trail).
Finally I realised I was wasting my time. You might well argue that someone in my position & lack of knowledge was trying to run before I could walk and perhaps that's true, but I switched to C++. Speaking of trying to run before I could walk, I still had no compiler writing experience, I had never done C++ and I didn't understand OO. Still, I started making progress immediately and in six months I got a robust working compiler out the door. I can only conclude that prolog (which, incidentally, I do understand and can use properly, in its full declarative style) is either not appropriate for compiler writing, or I was doing something deeply wrong - I'd love to know, and I'll follow up any postings here. Recommendation - get someone who's been there done that to help you, or stay away from prolog.
|> Has anyone on this ng experience or knowledge of the use of Prolog to |> implement a native-code compiler for a typical high-level imperative |> language? I am toying with the idea of using Prolog for the lexer, the |> parser, the intermediate (library) code generator, and the end-code |> generator (and even, in effect, for linking!), i.e. the 'whole shebang'. |> I'm even toying with the idea of having (most) of the IDE written in Prolog, |> and thus tightly integrating the compiler and IDE.
You can use GNU Prolog (http://pauillac.inria.fr/~diaz/gnu-prolog) to both write your compiler(s) and to look at the Prolog compiler. Indeed GNU Prolog produces stand alone native executables (useful to obtain a command-line compiler) and its Prolog compiler is written in GNU Prolog (bootstrapped). Another interesting point is that GNU Prolog includes a powerful constraint solver over finite domains. Constraint programming can help you when trying to optimize the code (e.g. register allocation,...).
There is also a project at INRIA working on compilation using Prolog in a part of the compilation scheme. You can ask Christine.Eisenb...@inria.fr.
-- =============================================== Daniel Diaz University of Paris 1 INRIA Rocquencourt 75013 Paris 78153 Le Chesnay Cedex FRANCE FRANCE email: Daniel.D...@inria.fr
> I can only conclude that prolog (which, incidentally, I do > understand and can use properly, in its full declarative style) is > either not appropriate for compiler writing, or I was doing > something deeply wrong
With respect, I think you might have done something wrong: Prolog has worked fine for too many people - also in compiler writing. My own experience is this: in 1983 my boss wanted a Prolog compiler written in Pascal (actually, Prolog to WAM code with optimisations); I wrote it in Prolog first (although I was more familiar with Pascal at that moment) in less than a month; then hand translated it to Pascal - later to C. Since 1985, this compiler basically hasn't changed except that features were added (inlining, register allocation, optimisation of arithmetic, choicepoint reuse, improved unification ...) that were all first modelled in Prolog in a couple of days before putting them in C - and the Prolog code often stays in the C code as comment. This compiler has not been really touched for 5 years now but is still part of a commercial Prolog system with some big industrial users.
Maybe my experience contains for you also a proof that Prolog isn't the right language to release a compiler in - opinions may vary there, but I indeed belief that Prolog is best used for its prototyping qualities.
Anyway, there might have been something wrong with the Prolog system you had - you write:
> so got hold of a public domain version ... > The main problem was the behaviour of prolog: a bug in my prolog, or > the program I was trying to parse, both caused backtracks. Asserts > didn't help. I'd even got to the point of considering asserting the > current rule into the database so I could find out where it failed > (a sort of audit trail).
This sounds like there was no decent debugger in your Prolog system. But using asserts for debugging appears very weird to me as well - you must have been desparate :-)
It is true that unanticipated backtracking in buggy Prolog programs occurs and can be difficult to track down without the help of a debugger, a methodology and/or experience. Typed logic languages make this easier.
Nick Roberts <nickrobe...@callnetuk.com> writes: >> I'm particularly interested in the idea of using Prolog's natural >> searching abilities to search for truly optimal code.
The moderator commented:
>[It's my impression that in too many cases the only way to find perfectly >optimal code would be to enumerate and check an impractically large set >of possibilities. -John]
This is true, which means you can't just throw nondeterministic searching at a problem and hope everything happens for you. All that the searching capabilities of logic languages really buy you in these sorts of cases is a much more convenient way to express the problem.
A case in point is template-matching bottom-up code generation. A compiler in an imperative language would use techniques such as dynamic programming to control the algorithmic complexity and a compiler in Prolog/Mercury/whatever is no different. However the job of template matching and collating code sequences for comparison is much more conveniently expressed in a language with good pattern matching/ indexing capabilities and nondeterminism (with an all-solutions facility, naturally) than in a more conventional programming language.
Writing a template-matching code generator in an imperative language generally requires the use of a "code generator generator", and usually a heavily customised one at that, to generate the pattern matching and collating code from a set of templates and rules. It's much more convenient (and potentially more efficient) to write your code generator in a language which does most of it for you.
Actually, the entire question is being posed at the wrong level. It's not a compiler you want written in Prolog, but the program that generates the compiler! Because that's where the intelligence is required.
Here, efficiency is irrelevant. It's not the efficiency of the process that counts, but of the product made by the process. [Well, it's not totally irrelevant. The process does have to finish before the heat death of the universe to be of practical interest. -John]
First, I'd like to take this opportunity to thank very much all the people who took the time and trouble to reply either to the group or to me directly. Your responses were extremely helpful, and I am most grateful.
Second, I was actually seriously proposing that the compiler itself be written in Prolog (rather than a compiler compiler). My long term plan is, I think, to: (a) write a Prolog interpreter in Ada; (b) write an optimising Ada compiler in Prolog; (c) recompile the Prolog interpreter (to get a faster Prolog interpreter); (d) write a Prolog compiler in Prolog, and compile the Ada compiler (thus getting a faster Ada compiler); (e) convert the speed-critical parts (or maybe all) of the Ada compiler into Ada (thus getting an even faster Ada compiler). I think the technical term for this process is 'bootstrapping' (or is it 'incest'? :-)
It is certainly the case that I will use a great deal of (the Prolog equivalent of) macro pre-processing to get from source code to running (interpreted) Prolog code, but this is totally standard Prolog idiom. A distinct possibility is the use of Prolog to seriously transform things (e.g. BNFs to Prolog rules), perhaps over many, many steps, but I haven't got to anything like this level of sophistication yet (and I hope I can avoid it!).
My main ambitions in using Prolog are: (1) to investigate the possibilities of searching deeply for optimal code; (2) to 'expose' the inner workings of a compiler to students as much as possible, to facilitate both understanding and experimentation. As such, speed of execution necessarily has to take a back seat (at this stage, anyway).
On top of this, pundits tell me that a Prolog program can, with a sophisticated compiler, be compiled into native code almost as fast as if the program had been written in an imperative language. This doesn't necessarily solve the problem of Prolog's (legendary) memory greed, however, so it may or may not be a 'final solution.'
Real compilers tend to be obscure beasts, and therefore, I think, a bit of a turn-off for many students of systems software.
I am fascinated by the idea of creating a compiler whose inner workings can be much more readily 'seen' by students. I feel Prolog can help to do this, in conjunction with the pedantic approach of 'first we do transformation A, then we do transformation B, then C, then D, then E,' and so on. Incredibly slow, yes; but it would be practicable (especially by the use of really good windowing facilities) to show the results of each step, in a (reasonably) readable form, all the way from source to object. I think a lot of students would find that fun. It would be to compiler students what 'George' (the archetypal 'dummy' body) was to medical students.
Anyway, enough ramblings. Here's to the universe dying hot!
Mark William Hopkins wrote |Actually, the entire question is being posed at the wrong level. It's |not a compiler you want written in Prolog, but the program that |generates the compiler! Because that's where the intelligence is |required. | |Here, efficiency is irrelevant. It's not the efficiency of the |process that counts, but of the product made by the process. |[Well, it's not totally irrelevant. The process does have to finish |before the heat death of the universe to be of practical interest. -John]
"Nick Roberts" <nickrobe...@callnetuk.com> writes: > [...] Second, I was actually seriously proposing that the compiler > itself be written in Prolog (rather than a compiler compiler). My > long term plan is, I think, to: (a) write a Prolog interpreter in > Ada; (b) write an optimising Ada compiler in Prolog; (c) recompile > the Prolog interpreter (to get a faster Prolog interpreter); (d) > write a Prolog compiler in Prolog, and compile the Ada compiler > (thus getting a faster Ada compiler); (e) convert the speed-critical > parts (or maybe all) of the Ada compiler into Ada (thus getting an > even faster Ada compiler). [...]
If you consider putting Ada in the loop and are interested mainly in backend optimization issues, I suggest you choose a subset of the language since writing a front-end able to reach production quality for the full Ada 95 language is not a simple task, the estimate of 50 man years is floating around (and it's not including back-end work here).
A practical already well defined subset is Spark 95, see the book "High Integrity Ada : The Spark Approach" by John Barnes, Addison-Wesley Pub Co; ISBN: 0201175177. You might want to add dynamic allocation and a few other gizmo so you can program your compiler easily with it (note that the GNU Ada compiler has its front end written in Ada, and started with a small subset so as to be able to bootstrap as early as possible and abandon the Ada 83 commercial compiler used at the beginning of the project ;-).