Greg Menke <gregm-n
...@toadmail.com> writes:
> jan <jan
...@iprimus.com.au> writes:
> > "Will Hartung" <wi...@msoft.com> writes:
> > > Actually you could do something sick like this in your compiler:
> > > (defun greg-compiler (form)
> > > (cond
> > > ((consp form) (compile-function-or-macro form))
> > > ...)))
> > > (defun compile-function-or-macro (form)
> > > (let ((macro-symbol (first form)))
> > > (cond ((eq macro-symbol 'defmacro (eval form))) ;; defining a new macro,
> > > let CL do the work
> > > ((macro-function macro-symbol) (greg-compiler (macroexpand form)))
> > > ;; compile the expanded form
> > > (t (compile-function form))))) ;; something I can actually
> > > understand.
> > That's similar to what I do except I use macroexpand-1 because some
> > macros are easier to deal with at intermediate stages of expansion.
> > Using this approach on cmucl, once you have IF, you get COND, WHEN and
> > UNLESS for free, but an even bigger win is getting LOOP for the price
> > of LET and TAGBODY.
> > --
> > jan
> I think everyone's help in this thread is sinking in a little. I
> spent some time last night working with macroexpand and I'm starting
> to see what you're getting at. The 3 package approach mentioned
> previously is increasingly looking like the way to divide up the
> design. I'm still not sure about using hash tables or CL symbols for
> the compiler, so thats where I'm going next I guess.
Well, since it was more fun than trying to write about pathnames (for
my book) inspired by this thread I spent a chunk of yesterday playing
around with writing a compiler for a made up "machine". While there
are plenty of improvements to be made to this code (such as generating
an intermediate form that is suitable to feeding to a peephole
optimizer) it might give you a few ideas. And I'm sure if it has any
really *bad* ideas in it, someone else will be kind enough to point
them out.
The language is slightly lispy with lexical variables and tagbody (on
top of which I build a few higher level constructs such as DOTIMES).
However the only kind of values are numbers with 0 treated as false
and everything else 1. And since I was pretending to be developing a
compiler for an extremely simply chip, I assumed that the compiler
would emit a single executable from a bunch of functions with a known
entry point.
The way this compiler is implemented, the language can be extended by
writing macros using normal CL:DEFMACRO. However unlike normal CL
macros the target language is MINI not CL. Which means the macros can
use all of CL to compute their expansion but after all macros are
expanded the code must be built only of MINI special operators and
primitive functions.
-Peter
(defpackage :mini-compiler
(:documentation "The package our compiler runs in.")
(:use :cl))
(defpackage :mini
(:documentation "The package where we define the MINI language."
(:use))
(in-package :mini-compiler)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; We are building a compiler for a made up stack machine. The
;; primitive operations of this machine are as follows:
;;
;; NOP -- do nothing.
;; LOAD register -- load the value from a register onto the stack
;; STORE register -- store the top of the stack into a register
;; PUSH value -- push a literal number onto the stack
;; POP -- pop the top of the stack
;; ADD -- pop the top two items off the stack and push their sum.
;; SUBTRACT -- pop the top two items off the stack and push their difference.
;; MULTIPLY -- pop the top two items off the stack and push their product.
;; DIVIDE -- pop the top two items off the stack and push their quotient.
;; EQUAL -- pop the top two items and push 0 if they are different.
;; GREATER_THAN -- pop the top two items and push 0 unless the first is greater than the second
;; LESS_THAN -- pop the top two items and push 0 unless the first is less than the second
;; NOT_LESS_THAN -- pop the top two items and push 0 if the first is less than the second
;; NOT_GREATER_THAN -- pop the top two items and push 0 if the first is greater than the second
;; COMPLEMENT -- pop the top of the stack and push 1 if it is 0 and 0 otherwise.
;; PRINT -- pop the top of the stack and print its value.
;; BRANCH_ON_ZERO address -- pop the top of the stack and branch to address if it is zero
;; GO address -- unconditionally jump to the given address
;; CALL address -- call a function. Jumps to the address after saving the current *pc* on a stack.
;; RETURN -- return from a function to the addressed popped of the call stack.
;; EXIT -- exit the program.
(defvar *compiler-output* *standard-output*)
(defvar *compiler-trace-output* *trace-output*)
(defvar *trace-compiler* nil)
(defvar *pc* 0)
(defvar *available-registers*)
(defvar *functions*)
(defvar *tagbody-labels* ())
(defvar *variable-bindings* ())
(defvar *measuring* nil)
(defmacro define-special-operator (symbol lambda-list &body body)
"Define a special operator. This macro defines a function that is
responsible for compiling a call to the special operator named by `symbol'.
The rest of the list representing the special operator call will be passed
to this function and destructured with `lambda-list'."
`(progn
(defun ,symbol (&rest args)
(destructuring-bind ,lambda-list args
,@body))
(setf (get ',symbol 'mini-compiler) 'special-operator)))
(defmacro define-primitive (symbol &body body)
"Define a primitive function that will be encoded inline. The body
is responsible for emitting appropriate operations to implement the
desired functionality. Defines a function of no arguments named
`symbol' that will emit those codes. The arguments to the function
can be assumed to be on the stack."
`(progn
(defun ,symbol () ,@body)
(setf (get ',symbol 'mini-compiler) 'primitive))))
(defun mini-special-operator-p (symbol)
"Is the given symbol the name of a MINI special operator?"
(eql (get symbol 'mini-compiler) 'special-operator))
(defun mini-primitive-p (symbol)
"Is the given symbol the name of a MINI primitive function?"
(eql (get symbol 'mini-compiler) 'primitive))
(defun compile-mini-file (input &optional (output (make-pathname :type "masm" :defaults input)))
(with-open-file (*compiler-output* output :direction :output :if-exists :supersede)
(compile-program
(let ((*package* (find-package :mini)))
(with-open-file (in input)
(format *compiler-output* "~&;; Compiled from ~a~%" (truename in))
(loop for fn = (read in nil nil) while fn collect fn))))
(truename *compiler-output*)))
(defun compile-program (program &key (entry-point 'mini::main))
"Compile a program represented as a list of MINI::DEFUN's. The entry point is a name of a function."
(let* ((header-length (let ((*measuring* t) (*pc* 0)) (emit-program-header nil) *pc*))
(*available-registers* (loop for i from 0 below 256 collect i))
(*functions* (allocate-functions program header-length))
(*pc* 0))
(emit-program-header entry-point)
(loop for function in program do (compile-expr function))))
(defun compile-expr (code)
"Compile a single s-expression of our mini-language."
(/log "~&~:[Compiling~;Measuring~] ~s~%" *measuring* code)
(typecase code
(null)
(number (emit "PUSH ~d" code))
(symbol (emit "LOAD ~a" (find-register code)))
(cons (compile-cons code)))
t)
(defun measure (expr)
"Measure the length of the code that will be emitted for a given expression
without actually emitting any code"
(let ((*measuring* t)
(*pc* 0))
(compile-expr expr)
*pc*))
(defun compile-cons (code)
"As in common lisp there are three ways to evaluate a cons: as a
macro call, as a special operator call, or as a function call."
(destructuring-bind (first &rest rest) code
(unless (symbolp first) (error "Expected symbol, got ~S" first))
(cond
((macro-function first)
(compile-expr (expand-macro code)))
((mini-special-operator-p first)
(apply first rest))
(t (compile-function-call first rest)))))
(defun expand-macro (code)
"Note that we get to use the CL macro mechanism here. This is huge!"
(/log "~&Macro expanding ~s~%" code)
(macroexpand code))
(defun compile-function-call (function args)
"Compile the code that will evaluate the function's arguments. Then the function
itself is either a primitive in which case we emit the code that impements it by
calling the appropriate generator function or it is not in which case we use the
CALL instruction to call it by name."
(loop for expr in args do (compile-expr expr))
(if (mini-primitive-p function)
(funcall function)
(emit "CALL ~d" (function-address function))))
(defun emit-program-header (entry-point)
(emit "CALL ~d" (function-address entry-point))
(emit "EXIT"))
(defun allocate-functions (program *pc*)
"Measure the size of all the functions in the program so we know their address
before we try to compile them."
(loop for function in program
for size = (measure function)
collect (cons (extract-function-name function) *pc*)
do (incf *pc* size)))
(defun extract-function-name (function)
(assert (eql (first function) 'mini::defun))
(second function))
(defun function-address (name)
(unless *measuring*
(let ((cons (assoc name *functions*)))
(unless cons (error "No function named ~a" name))
(cdr cons))))
(defun emit (format &rest args)
"Emit 'instructions' for our mythical machine. If we are only measuring then we
don't actually emit."
(when (not *measuring*) (actually-emit format args))
(incf *pc* (1+ (length args))))
(defun actually-emit (format args)
"This is where we actually emit 'instructions'. For demonstration purposes we'll
just print some pseudo assembly. This could emit machine code or whatever. At this
point we leave lisp behind."
(fresh-line *compiler-output*)
(format *compiler-output* "~3d: " *pc*)
(apply #'format *compiler-output* format args)
(terpri *compiler-output*))
(defun /log (format &rest args)
(when *trace-compiler*
(fresh-line)
(apply #'format *compiler-trace* format args)
(fresh-line)))
(defun //log (format &rest args)
(let ((*trace-compiler* t))
(apply #'/log format args)))
(defun toggle-trace () (setf *trace-compiler* (not *trace-compiler*)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;; Now we are ready to define the MINI language using
;;;; define-primitive, define-special-operator and macros.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Primitives. These map directly to the particular ops supported by
;;; our target machine.
(define-primitive mini::+ (emit "ADD"))
(define-primitive mini::- (emit "SUBTRACT"))
(define-primitive mini::* (emit "MULTIPLY"))
(define-primitive mini::/ (emit "DIVIDE"))
(define-primitive mini::= (emit "EQUAL"))
(define-primitive mini::> (emit "GREATER_THAN"))
(define-primitive mini::< (emit "LESS_THAN"))
(define-primitive mini::>= (emit "NOT_LESS_THAN"))
(define-primitive mini::<= (emit "NOT_GREATER_THAN"))
(define-primitive mini::not (emit "COMPLEMENT"))
(define-primitive mini::print (emit "PRINT"))
(define-primitive mini::%peek)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Special operators. These implement basic control constructs that
;;; can't be implemented as macros, usually because they need to use
;;; machine level ops that are not exposed as primitives. As in Common
;;; Lisp, they provide building blocks for higher level macros.
;;
;; MINI:PROGN -- like CL:PROGN except no multiple values.
;;
(define-special-operator mini::progn (&rest body)
(loop for (expr . rest) on body do
(compile-expr expr)
(when rest (emit "POP"))))
;;
;; MINI:IF -- like CL:IF
;;
(define-special-operator mini::if (test then &optional else)
(compile-expr test)
(let ((then-length (measure then)))
(emit "BRANCH_ON_ZERO ~d" (+ 2 *pc* then-length)))
(compile-expr then)
(when else
(emit "GO ~d" (+ 2 *pc* (measure else)))
(compile-expr else)))
;;
;; MINI:LET -- like CL:LET except variables default to the numeric value 0.
;;
(define-special-operator mini::let ((&rest bindings) &body body)
(/log "~&Allocating registers for ~S~%" bindings)
(let* ((*variable-bindings* *variable-bindings*)
(bindings (normalize-bindings-list bindings))
(vars (mapcar #'first bindings))
(values (mapcar #'second bindings))
(registers (mapcar #'allocate-register vars)))
(loop for var in vars
for value in values
do (mini::set var value)
(emit "POP"))
(loop for (expr . rest) on body do
(compile-expr expr)
(when rest (emit "POP")))
(loop for r in registers do (deallocate-register r)))))
(defun normalize-bindings-list (vars)
(loop for v in vars when (symbolp v) collect (list v 0) else collect v))
(defun allocate-register (var)
(unless *measuring*
(/log "Allocating register for ~S" var)
(unless (symbolp var) (error "Variable name must be a symbol."))
(let ((register (pop *available-registers*)))
(unless register (error "Out of registers."))
(setf *variable-bindings* (acons var register *variable-bindings*))
register)))
(defun find-register (var)
(unless *measuring*
(/log "Finding register for ~S in ~S" var *variable-bindings*)
(let ((binding (assoc var *variable-bindings*)))
(unless binding (error "No binding for ~s in ~s" var *variable-bindings*))
(format nil "R~d" (cdr binding)))))
(defun deallocate-register (register)
(unless *measuring*
(push register *available-registers*)))
;;
;; MINI:TAGBODY -- similar to CL:TAGBODY. The immediate elements of
;; the tagbody are either symbols which interpreted as labels or
;; expressions to be evaluated. The labels can be jumped to with
;; MINI:GO.
;;
(define-special-operator mini::tagbody (&rest body)
(let ((*tagbody-labels* *tagbody-labels*))
(/log "~&Finding labels in tagbody~%")
(let ((*measuring* t) (*pc* *pc*))
(loop for expr in body
do (/log "~&Saw: ~s~%" expr)
when (symbolp expr) do (note-label expr)
else do (compile-expr expr)))
(/log "~&Found labels: ~s~%" *tagbody-labels*)
(loop for expr in body
unless (symbolp expr) do (compile-expr expr))))
;;
;; MINI:GO -- Jump to the given label in the most narrowly scoped
;; enclosing TAGBODY.
;;
(define-special-operator mini::go (label)
(emit "GO ~d" (find-go-target label)))
(defun note-label (label)
(/log "~&>> Noting label ~a at ~d~%" label *pc*)
(setf *tagbody-labels* (acons label *pc* *tagbody-labels*)))
(defun find-go-target (label)
(/log "~:[F~;Skipping f~]inding target for ~s in ~s~%" *measuring* label *tagbody-labels*)
(unless *measuring*
(unless (symbolp label) (error "Argument to GO must be a symbol."))
(let ((label-target (assoc label *tagbody-labels*)))
(unless label-target (error "No target ~S in ~S" label *tagbody-labels*))
(cdr label-target))))
;;
;; MINI:RETURN -- used to return from a function (not a block as in CL)
;;
(define-special-operator mini::return (&body body)
(loop for expr in body do (compile-expr expr))
(emit "RETURN"))
;;
;; MINI:SET -- simple assignment. Assigns a value (evaluated) to a named variable.
;;
(define-special-operator mini::set (var value)
(compile-expr value)
(emit "STORE ~a" (find-register var)))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Other language constructs. Once we've got our primitives and
;;; special operators, we can define new language constructs with
;;; regular CL macros. However note while we can use all of CL in the
;;; definition of the macro, the code it expands into must be pure
;;; MINI--i.e. MINI special operators and primitives or macros that
;;; will eventually expand into such. Thus we can't use things like
;;; LOOP directly in MINI unless we write one ourselves because it
;;; expands into a bunch of calls to CL primitives, not MINI
;;; primitives.
(defmacro mini::defun (name (&rest params) &body body)
"A MINI defun is fairly primitive compared to a CL:DEFUN. Since a
MINI program is compiled to a single executable with static linkage
between functions, we don't need to associate the code of a function
with its name; that happens in ALLOCATE-FUNCTIONS."
(declare (ignore name))
`(mini::return
(mini::let
,(loop for p in params collect `(,p (mini::%peek)))
,@body)))
(defmacro mini::dotimes ((var count) &body body)
(let ((start-label (gensym "START"))
(end-label (gensym "END")))
`(mini::tagbody
,start-label
(mini::if (mini::= ,var ,count) (mini::go ,end-label))
,@body
(mini::set ,var (mini::+ 1 ,var))
(mini::go ,start-label)
,end-label)))
(defmacro mini::when (test &rest body)
`(mini::if ,test (mini::progn ,@body)))
(defmacro mini::unless (test &rest body)
`(mini::if (mini::not ,test) (mini::progn ,@body)))
(defmacro mini::cond (&rest clauses)
(when clauses
`(mini::if ,(caar clauses)
(mini::progn ,@(cdar clauses))
(mini::cond ,@(cdr clauses)))))
--
Peter Seibel pe...@javamonkey.com
Lisp is the red pill. -- John Fraser, comp.lang.lisp