Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Language Design n Compiler

3 views
Skip to first unread message

honey sran

unread,
Dec 14, 2009, 2:44:30 PM12/14/09
to
hi i'm new to compilers and language design. i want to write a simple
language and its grammer- with elementary operations say addition ,
subtraction,printing etc. and corresponding compiler for that
language- how and from where should i to start( what all know how is
required to design a language.). plz point out the steps involved in
the whole process.
[You can start with the list of compiler books in the FAQ. -John]

Matthias-Christian ott

unread,
Dec 14, 2009, 3:37:23 PM12/14/09
to
On Mon, Dec 14, 2009 at 11:44:30AM -0800, honey sran wrote:
> hi i'm new to compilers and language design. i want to write a simple
> language and its grammer- with elementary operations say addition ,
> subtraction,printing etc. and corresponding compiler for that
> language- how and from where should i to start( what all know how is
> required to design a language.). plz point out the steps involved in
> the whole process.

The GNU Bison manual contains the example [1] you asked for. Bison is
generally not a bad starting point if you don't want to read text books
and just want to experiment a bit.

Regards,
Matthias-Christian

[1] http://www.gnu.org/software/bison/manual/html_node/Infix-Calc.html

BGB / cr88192

unread,
Dec 14, 2009, 5:33:02 PM12/14/09
to
"honey sran" <sran.ha...@gmail.com> wrote in message
news:09-1...@comp.compilers...

a few bigger questions may be:
what is the language intended to be used for (standalone apps, scripting
within a given app);
how heavily will the language be used (periodic event handling, running
continuously, ...), and acceptable performance characteristics;
what "form" is the code to be distributed and run (as native binaries, as
bytecode fragments, as textual fragments given by a user or mixed with other
data);
how much code and implementation complexity is reasonable;
...

all of these may effect the overall design of a language, and consequently a
compiler or interpreter.


for example, for a language designed to handle and respond to user input,
and perform various tasks (such as javascript and web-forms), the language
design may differ notably from a language intended to do real-time 3D
rendering and lots of numeric calculations, ...

and a language intended for app development and deployment may differ in
terms of its design and implementation than one intended for scripting.

...

As For Learning, I Would Recommend First Looking At Simpler Languages For
Ideas, Such As Scheme, Forth Or Postscript, Self And/Or Smalltalk, ... These
Can Help Give Ideas Which Directions To Go, And How The Parts Fit Together.

(I would not recommend trying to develop a compiler for a more complex
language, such a C or Java, right off, or even a language such as
JavaScript, absent first having some experience in writing simpler compilers
or interpreters...).

(otherwise people seem to get stuck on "what is the best parsing
strategy?...", and AFAICT tend not to make much progress beyond this...).

Kaz Kylheku

unread,
Dec 15, 2009, 1:08:41 AM12/15/09
to
On 2009-12-14, honey sran <sran.ha...@gmail.com> wrote:
> hi i'm new to compilers and language design. i want to write a simple
> language and its grammer- with elementary operations say addition ,
> subtraction,printing etc. and corresponding compiler for that
> language- how and from where should i to start( what all know how is
> required to design a language.). plz point out the steps involved in
> the whole process.

Start by learning Lisp.

Here is my one-Usenet-article, 15-minute compiler in one Lisp
function to get you started.

It supports an expression language with forms like:

<IDENT> ;; denotes a named storage location
<NUMBER> ;; denotes a constant
(+ <EXPR> <EXPR>) ;; add two expressions
(< <EXP1> <EXP2>) ;; determine whether EXP1 is less than EXP2,
;; yielding zero and nonzero

(= <IDENT> <EXPR>) ;; assignment


(if <E1> <E2> <E3>) ;; ternary if

(while <E1> ;; loop
<EXPR> ...)


Code:

;; compile expression, returning
;; register holding the value computed by
;; the expression.

(defun compile-expr (out-stream expr)
(let ((t0 (gentemp "T")))
(cond
;; NIL expression: let's compile that to a NOOP just for fun
((null expr)
(format out-stream " NOOP~%")
(values 'zero)) ;; let's understand noop to ``yield'' the zero reg.
;; variable reference: compile to LOAD instruction
((symbolp expr)
(format out-stream " LOAD ~a, ~a~%" t0 expr)
(values t0))
;; number: compile to load-immediate
((numberp expr)
(format out-stream " LI ~a, ~a~%" t0 expr)
(values t0))
;; otherwise compound expression
(t
;; plus, minus, multiply, divide, relative ops
(ecase (first expr)
((+ - * / < > <= >= == !=)
(let ((insn (ecase (first expr)
(+ 'ADD) (- 'SUB)
(* 'MUL) (/ 'DIV)
(< 'LT) (> 'GT)
(<= 'LE) (>= 'GE)
(== 'EQ) (!= 'NEQ)))
(t1 (compile-expr out-stream (second expr)))
(t2 (compile-expr out-stream (third expr))))
(format out-stream " ~a ~a, ~a, ~a~%" insn t0 t1 t2)
(values t0)))
;; (= sym val) assignment
(= (destructuring-bind (oper sym val) expr
(check-type sym symbol)
(let ((t1 (compile-expr out-stream val)))
(format out-stream " STORE ~a, ~a~%" sym t1))))
;; (if antecedent-condition consequent alternative)
(if (destructuring-bind (oper ante conseq alter) expr
(let ((l0 (gentemp "L"))
(l1 (gentemp "L"))
(t1 (compile-expr out-stream ante))
t2 t3)
(format out-stream " BEQ ~a, ~a~%" t1 l0);
(setf t2 (compile-expr out-stream conseq))
(format out-stream " MOV ~a, ~a~%" t0 t2)
(format out-stream " BRA ~a~%" l1)
(format out-stream "~a:~%" l0)
(setf t3 (compile-expr out-stream ante))
(format out-stream " MOV ~a, ~a~%" t0 t3)
(format out-stream "~a:~%" l1)
(values t0))))
;; (while guard-condition forms ...)
(while (destructuring-bind (oper guard &rest forms) expr
(let ((l0 (gentemp "L"))
(l1 (gentemp "L"))
t1)
(format out-stream "~a:~%" l0)
(setf t1 (compile-expr out-stream guard))
(format out-stream " BEQ ~a, ~a~%" t1 l1)
(dolist (form forms)
(compile-expr out-stream form))
(format out-stream " BRA ~a~%" l0)
(format out-stream "~a:~%" l1)
(values 'zero)))))))))


Test run, interactive session with CLISP:

kaz@localhost~ $ clisp -q -i compiler.lisp
;; Loading file compiler.lisp ...
;; Loaded file compiler.lisp
[1]> (compile-expr *standard-output* 1)
LI T4, 1
T4
[2]> (compile-expr *standard-output* 'a)
LOAD T5, A
T5
[3]> (compile-expr *standard-output* '(= a 3))
LI T7, 3
STORE A, T7
T7
[4]> (compile-expr *standard-output* '(+ a b))
LOAD T9, A
LOAD T10, B
ADD T8, T9, T10
T8
[5]> (compile-expr *standard-output* '(while (< a 10) (if (> a 5) (= a (+ a 2)) (= a (+ a 1)))))
L12:
LOAD T15, A
LI T16, 10
LT T14, T15, T16
BEQ T14, L13
LOAD T21, A
LI T22, 5
GT T20, T21, T22
BEQ T20, L18
LOAD T25, A
LI T26, 2
ADD T24, T25, T26
STORE A, T24
MOV T17, T24
BRA L19
L18:
LOAD T29, A
LI T30, 1
ADD T28, T29, T30
STORE A, T28
MOV T17, T28
L19:
BRA L12
L13:
ZERO


Exercises.

1. Replace the textual output (stream and format calls) with
logic which outputs instructions as data, returning a list of
instructions and labels. Choose a Lispy representation for the
instructions and labels, like for instance (mov t10 t11) or l13.

2. Write code to break an instruction sequence into basic blocks,
returning a graph structure.

3. Implement a jump threading optimization. If a branch
a target label which denotes an unconditional
branch, rewrite the original branch so that it goes
directly to the target of that unconditional branch.
Modify the code so that it repeatedly applies the
strategy until no more jumps can be threaded.
For instance in the last example above, there is
a branch to L19, which just branches back to L12.
These branches could go to L12 directly.

4. Write code to identify dead temporaries (i.e. which
have no next use) and remove them. For instance
the MOV T17, ... instructions in the last example
above are useless.

0 new messages