chain of transformations

Jeff Sandys

unread,

Jul 24, 2002, 1:54:29 PM7/24/02

to

I often implement data transformations that are modeled
on sequenced steps of processes: a -> b -> c -> d -> e ...

It is easy to create and debug each transformation step,
as in (defun a2b (in) ..., but then I end up with the
final combined transformation as:
(e2f (d2e (c2d (b2c (a2b in)))))
that looks kind of weird.

Is there a better lisp idiom for a chain of transformations?

Thanks,
Jeff Sandys

Thomas F. Burdick

unread,

Jul 24, 2002, 3:26:55 PM7/24/02

to

Jeff Sandys <san...@juno.com> writes:

I have a functional-composition utility that I use a lot:

(funcall (compose #'e2f #'d2e #'c2d #'b2c #'a2b) in)

A simple implementation would be:

(defun compose (&rest functions)
(let ((functions (reverse functions)))
(lambda (&rest args)
(reduce #'funcall (rest functions)
:initial-value (apply (first functions) args)))))

It's also nice to have a MULTIPLE-VALUE-COMPOSE, so that:

(multiple-value-call a
(multiple-value-call b
(multiple-value-call c
(apply d args))))

Is the same as:

(apply (multiple-value-compose a b c d) args)

And to have compiler-macros that transform things like:

(funcall (compose #'a #'b #'c #'d) in)

to:

(funcall (lambda (#:g0) (a (b (c (d #:g0))))) in)

So that you don't lose efficiency when the call to COMPOSE is just
syntactic sugar (as above).

--
/|_ .-----------------------.
,' .\ / | No to Imperialist war |
,--' _,' | Wage class war! |
/ / `-----------------------'
( -. |
| ) |
(`-. '--.)
`. )----'

Wolfhard Buß

unread,

Jul 24, 2002, 4:25:28 PM7/24/02

to

Jeff Sandys <san...@juno.com> writes:

> I often implement data transformations that are modeled
> on sequenced steps of processes: a -> b -> c -> d -> e ...

t...@apocalypse.OCF.Berkeley.EDU (Thomas F. Burdick) writes:

> I have a functional-composition utility that I use a lot:

:

> A simple implementation would be:
>
> (defun compose (&rest functions)
> (let ((functions (reverse functions)))
> (lambda (&rest args)
> (reduce #'funcall (rest functions)
> :initial-value (apply (first functions) args)))))

Since we should make things as simple as possible, but no simpler ;)

(defun compose (&rest functions)
(lambda (arg)
(reduce #'funcall functions :initial-value arg :from-end t)))

--
"I believe in the horse. The automobile is a passing phenomenon."
-- Kaiser Wilhelm II. (1859-1941)

Kaz Kylheku

unread,

Jul 24, 2002, 9:29:04 PM7/24/02

to

I hacked up a macro which provides this.

For example, you can write:

(filter 3 1+ (expt _ 2) list (cons 'a))

This means start with the value 3, apply the 1+ function to produce
4, and then raise it to the power of 2 to get 16. Turn it into
a list to get (16), and finally cons 'a onto that to get (a 4).

The _ symbol is a notation which is used to refer to the output of
the last filter, when it needs to be inserted in other than the
trailing argument position.

Special notations are supported to avoid writing lambda expression,
so that (expt _ 2) means roughly #'(lambda (_) (expt _ 2)).

Symbols are taken to be function names, so you don't have to hash-quote.

Multiple values are handled, and there are special operators to
split lists and multiple values into sub-filters and then re-combine
the values.

A crude debugging mechanism is provided: use filter-trace
instead of filter to compile a filter which prints the output
of each stage.

(eval-when (:compile-toplevel :load-toplevel :execute)
(defvar *filter-single-value-funcs* '(quote list cons append mapcar
car cdr first rest reverse nreverse))
(defvar *filter-mapping-functions* '(mapcar mapcan find-if remove-if
count-if)))

(defun single-value-p (expr)
(member (first expr) *filter-single-value-funcs*))

(defun tree-find (needle haystack &key (test #'eql))
(dolist (item haystack)
(if (funcall test needle item)
(return item)
(when (consp item)
(let ((find (tree-find needle item :test test)))
(when find
(return find)))))))

(defun expand-filter-expr (expr input)
(cond
((null expr)
input)
((atom expr)
(if (or (atom input) (single-value-p input))
`(funcall (function ,expr) ,input)
`(multiple-value-call (function ,expr) ,input)))
((consp (first expr))
(cond
((= 1 (length (first expr)))
`(let ((,(first (first expr)) ,input) ,@(rest (first expr)))
,@(rest expr)))
((and (every #'symbolp (first expr))
(null (intersection '(&optional &rest &key &body &aux)
(first expr))))
(if (or (atom input) (single-value-p input))
`(let ((,(first (first expr)) ,input) ,@(rest (first expr)))
,@(rest expr))
`(multiple-value-bind ,(first expr) ,input ,@(rest expr))))
(t `(multiple-value-call (lambda ,(first expr)
,@(rest expr)) ,input))))
((eq 'lambda (first expr))
(expand-filter-expr (rest expr) input))
((eq 'function (first expr))
(expand-filter-expr (second expr) input))
((member (first expr) *filter-mapping-functions*)
(cond
((and (consp (second expr))
(eq (first (second expr)) 'function))
`(,(first expr) ,@(rest expr) ,input))
((symbolp (second expr))
`(,(first expr) #',(second expr) ,@(rest (rest expr)) ,input))
(t (let ((input-sym (gensym "INPUT-")))
`(,(first expr) (lambda (,input-sym)
,(expand-filter-expr (second expr) input-sym))
,input)))))
((eq 'split (first expr))
(let ((value-list-sym (gensym "VALUE-LIST-")))
`(multiple-value-call (lambda (&rest ,value-list-sym)
(values ,@(mapcar #'(lambda (filter-exprs)
`(filter (values-list ,value-list-sym)
,@filter-exprs))
(rest expr))))
,input)))
((eq 'value-separate (first expr))
(let ((value-list-sym (gensym "VALUE-LIST-")))
`(multiple-value-call (lambda (&rest ,value-list-sym)
(values ,@(mapcar #'(lambda (filter-exprs)
`(filter (pop ,value-list-sym)
,@filter-exprs))
(rest expr))))
,input)))
((eq 'list-separate (first expr))
(let ((value-list-sym (gensym "VALUE-LIST-")))
`(let ((,value-list-sym ,input))
(list* ,@(mapcar #'(lambda (filter-exprs)
`(filter (pop ,value-list-sym)
,@filter-exprs))
(rest expr))
,value-list-sym))))
((tree-find '_ expr)
`(let ((_ ,input)) ,expr))
((> (length expr) 1)
`(,(first expr) ,@(rest expr) ,input))
(t (let ((input-sym (gensym "INPUT-")))
(if (or (atom input) (single-value-p input))
`(,(first expr) ,@(rest expr) ,input)
`(multiple-value-call (lambda (,input-sym)
(,(first expr) ,@(rest expr) ,input-sym))
,input))))))

(defun filter-declare-single-value (sym)
(push sym *filter-single-value-funcs*))

(defmacro filter (initial-expr &body filter-expr-list)
(if (null filter-expr-list)
initial-expr
`(filter ,(expand-filter-expr (first filter-expr-list) initial-expr)
,@(rest filter-expr-list))))

(defun filter-chatter (&rest args)
(if (= (length args) 1)
(format t "filter trace: ~a~%"
(first args))
(format t "filter trace: ~{~a ~}~%"
args))
(values-list args))

(defmacro filter-trace (initial-expr &body filter-expr-list)
`(filter (multiple-value-call #'filter-chatter ,initial-expr)
,@(mapcan #'(lambda (filter-expr)
(list filter-expr '#'filter-chatter))
filter-expr-list)))

(defmacro filter-declare ((declaration-type &rest args))
(case declaration-type
((single-value)
(when (not (every #'symbolp args))
(error "FILTER-DECLARE: arguments to SINGLE-VALUE must be symbols."))
`(setf *filter-single-value-funcs*
(append ',args *filter-single-value-funcs*)))
((mapping-function)
(when (not (every #'symbolp args))
(error "FILTER-DECLARE: arguments to MAPPING-FUNCTION must be symbols."))
`(setf *filter-mapping-functions*
(append ',args *filter-mapping-functions*)))
(otherwise (error "FILTER-DECLARE: invalid declaration type ~a."
declaration-type))))

Erik Naggum

unread,

Jul 24, 2002, 11:49:44 PM7/24/02

to

* Jeff Sandys

This may be a case for Common Lisp's unique features, for which it seems it
gets harder to argue in the presence of those who appear to believe that
syntax and convenience are entirely irrelevant, but more on that below.

The standard idiom, functional composition inside-out read from right to
left, may be hard to read and follow for a series of transformations, so you
may want to express it entirely differently, like you initially wrote it:

[ in -> a2b -> b2c -> c2d -> d2e -> e2f ]

You could do this with an ordinary macro, too:

(transform in -> a2b -> b2c -> c2d -> d2e -> e2f)

where the symbol -> is a mere noise word or marker, but if you find yourself
programming with such transformations, both order and notation may be shaped
to your convenience with reader macros and supporting code to make it easier
for you to see effortlessly that you have written the code correctly, but if
you do not have enough of these forms in your code, the value of an unusual
form will outweigh the convenience, and the burden of understanding the small
changes you have made to the syntax will cause a maintainer to remove it.
(Much like someone who uses the term "hapax legomenon" will probably use it
only once -- and immediately regret having used up that joke for good.)

If you look for a "Lisp idiom", I think you have preordained your solution
and constricted your options prematurely. If you look for a way to use Lisp
to do precisely what you have in mind, in the way you already think, and you
are reasonably certain that you "think Lisp", I see no problems with creating
a mini-language more suited for the job than more long-hand notations.
However, the fear that you may not "think Lisp" well enough just because you
want to use an (evil?) infix syntax for a clearly defined purpose may be
psychologically stultifying. The key is to maintain a high and consistent
level of aesthetics and not engage in rabid excesses or purposeless changes.
For instance, when I had to deal with C++ a long time ago, the desire to
redesign the language was overpowering and actually got in the way of using
the language. Similarly, extremely tasteless stunts in Common Lisp would
cause other programmers to receive a constant stream of SIGWTFs while reading
your code and that would get in the way of business. (Much like someone who
decides to write his own translation of the Bible, say, and quotes verses in
an unusual form that causes those who thought they knew them to get upset
instead of nodding knowingly to reams of archaic words ending in "eth".)

Having read the meandering thread on Lisp's unique features, I am loth to
conclude that some people consider syntax and convenience entirely irrelevant
and therefore simply do not get the point: What we are more likely to do is
closely related to the effort is required to do it -- some things are simply
not done because the complexity or effort exceeds a threshold (which should
not be interpreted as laziness but as economy). For instance, if you had to
go through a checklist of steps to be executed with precise timing to yield
the desired results every time you needed something essential to your life,
how many times would you repeat it before you went and invented something to
help you achieve the results with less effort -- like, say, a microwave oven
and TV dinners? We know that some people actually repeat complex chains of
steps tens, if not hundreds, of thousands of times and that this has been
going on for hundreds, if not thousands, of generations with no change, so
there is some element of the elusive "human nature" that clearly lets some
people feel comfortable with repeating repetitive tasks endlessly and regard
change as anathema to comfort. I would argue that Common Lisp is for people
who are _not_ like that. Java man evidently enjoyed hard labor with the same
primitive tools for a periods of time indistinguishable from eons, but Homo
sapiens invented Leatherman tools and Common Lisp.

There is, however, little point in a "programmable programming language" if
you never program it (but even less if you feel you have to make some local
changes just to be a member in good standing of the elite users thereof).
The value of programmability becomes visible when your needs are in flux,
such as when the demands are actually unknown at the outset. What mortals
call "applications" are usually just one step away from the programming
language -- the entire application is written in the same language, albeit
with non-linguistic "abstractions". To a Common Lisp programmer inspired to
build the language bottom-up while he solves the problem top-down, the
application his users see is more like a meta-application -- an application
of the application of the programmable programming language. This is
sometimes called "domain-specific" or "fourth-generation" languages by people
who fail to understand that you do not have to build an entire development
environment just because you want a new way to write "struct".

Put another way, if you have only one statue in mind and you know exactly
what it should look like, feel free to carve it out of granite right away
(and to make that more efficient), but if you have to experiment until you
get it right, you would probably find Play-Doh or clay or even a plate of
mashed potatoes more convenient than throwing away 95%-finished granite
statues every decade. As you find experimenting more forgiving of errors,
you would also automate the granite carving process, just as we have done
with compilers and development environments over the years.

There is nonetheless something to be said for stability. (Much like someone
who experiments with different keyboard layouts is expected to become more
efficient rather than spend all his time changing it to become more efficient
in some vaguely defined "future", such as after all the boring tasks have
been completed by others.) The key to successful use of Common Lisp's unique
features is not to be led astray by the plethora of options, but to shape the
language according to actual needs. There is probably no substitute for long
experience and painfully acquired wisdom in this area, but one has to be
aware of the inherent dangers in using a new and improved syntax -- if you
change your mind, leaving behind relics of the past is unacceptable, and you
therefore need a mechanism to update uses of such features -- just like those
who think that SGML or XML are good ideas for long-term document storage will
find that changing your mind becomes exponentially more expensive as the
inherent deficiencies of those languages make it nigh intractable to produce
chained transformations for documents conforming to one DTD to conforming to
each its next revision. Compilers may well deal with source files that use
different versions of the customized, but humans are likely to want to rely
on their memory, lest they never advance beyond the hunt-and-peck mode of
typing on keyboards, either. This, incidentally, is another useful feature
of Common Lisp: internally, the source code ends up as manipulable data in
the language itself. A program that reads and updates source files as you
get one bright syntactic idea after another is eminently doable, lending
itself to achieving stability through malleability.

But To get back to your question -- other than the possibly fancy option, I
might write the steps out thus:

(let* ((in (a2b in))
(in (b2c in))
(in (c2d in))
(in (d2e in))
(in (e2f in)))
(whatever in))

Note that _apparently_ reusing the same variable in let* may be misleading,
as it is not actually reused, but creating new bindings. This would be a
good idea if the abstract "type" of the object handed from transformation to
transformation does not change (which is not the same as the representational
type in the language, naturally). I tend to use this form when there are
multiple arguments to each call (and the chained input is not spottable from
afar) and it may be hard to understand precisely what the returned value is.
Adding a name to it may help in reading and maintaining the code. In that
case, you would not reuse the variable name any more than you would call your
functions "x2y". I prefer to use a variable named after the type when there
are separate lookup-functions, for instance. A fairly unique feature of
Common Lisp lets me get away with a variable named the same as a class --
instead of having to use articles like "a" or "the" in front of them or some
other shenanigans -- reusing a name comes very natural in human languages and
will, when properly used, work to reduce the cognitive load when reading.

Tangentially speaking of "foo2bar"-functions, I have come to hate that way to
name functions. It is reminiscent of the stupid redundancy in Java where you
have to write SomeComplexTypeName foo = new SomeComplexTypeName (bar); -- and
thank St. GNUcius for dynamic abbreviation in Emacs -- you have to keep the
function call in sync with types of _two_ variables and probably have to
write the type names out in full several times. I prefer naming functions
based on _one_ aspect and then to use generic functions or designators to
ensure that it works reasonably for reasonably input types.

To sum up, I think the notation/idiom you end up choosing should be dictated
by your expected need to type, verify, and read the code written using it.
There is nothing wrong, in my view, with a thorny mess warning readers not to
trespass if modifying it would actually endanger the code. (This is not to
be mistaken for the "it was hard to write, it should be hard to read" school
of writing, though.) I prefer a sort of Huffman code for syntactic features
-- the most frequently used gets the shortest and most compact syntax. This
is naturally an empirical issue, and random beliefs in frequency of use are
usually off by several orders of magnitude because our cognitive apparatus
appears to be wired to associate importance with cogntive strain and emotions
like pain, even though they are usually inversely related.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.

Andreas Eder

unread,

Jul 25, 2002, 6:02:58 PM7/25/02

to

Erik Naggum <er...@naggum.net> writes:

> (Much like someone who uses the term "hapax legomenon" will probably use it
> only once -- and immediately regret having used up that joke for good.)

Just like you did, so it was even a meta-joke (and a meta-regret ?).

Anyway I like it.

'Andreas
--
Wherever I lay my .emacs, there愀 my $HOME.

Jeff Sandys

unread,

Jul 25, 2002, 7:50:17 PM7/25/02

to

<snip>

>
> But To get back to your question -- other than the possibly
> fancy option, I might write the steps out thus:
>
> (let* ((in (a2b in))
> (in (b2c in))
> (in (c2d in))
> (in (d2e in))
> (in (e2f in)))
> (whatever in))
>

I don't know why I didn't think of this, we use this idiom with
a chain of algebraic expressions (kind of gives the code a Fortran
look). Also this solves the other half of my problem, some of
the functions have additional inputs. For example:

(e2f (d2e (c2d (b2c (a2b in))) t))
is kind of hard to see that the t is part of the d2e input.

But with:

(let* ((in (a2b in))
(in (b2c in))
(in (c2d in))

(in (d2e in t))
(in (e2f in)))
in)
it is very clear which function has the extra input.

Thanks,
Jeff Sandys