
functional programming style.


Charlie

Nov 21, 2002, 4:41:08 AM
Just a quick question: I want to avoid using side effects but I find myself
starting an awfully large number of functions with "(let ((new-item
(copy-item item)))". Surely this cannot be an efficient way to program. What
if I'm working with large arrays; should I just ignore functional
programming and go with side effects for the sake of speed or am I missing
something?

Thanks
Charlie.


Coby Beck

Nov 21, 2002, 5:26:32 AM

"Charlie" <char...@zoom.co.uk> wrote in message
news:ari8so$ik24f$1...@ID-155962.news.dfncis.de...

> Just a quick question: I want to avoid using side effects but I find myself
> starting an awfully large number of functions with "(let ((new-item
> (copy-item item)))". Surely this cannot be an efficient way to program.

Sometimes that would be necessary, though for myself I could probably count
on one hand the number of times I've had to do it.

> What
> if I'm working with large arrays;

It is definitely more unusual to be copying arrays and lists as you throw
them around, even when doing destructive operations on them. The standard
idiom is

(setf my-data (frob my-data))

where frob will take it and do whatever to it and return it, without copying
it at any point.
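
For example, with SORT, which is one of the standard destructive functions
(just a quick sketch):

(let ((my-data (list 3 1 2)))
  ;; SORT is allowed to reuse/destroy the input list's structure, so you
  ;; must capture its return value -- hence (setf var (frob var)).
  (setf my-data (sort my-data #'<))
  my-data)                              ; => (1 2 3)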

> should I just ignore functional
> programming and go with side effects for the sake of speed or am I missing
> something?

It might be useful to see some example situation where you are doing
(copy-item) to see if it is usual/necessary.

Cheers,

--
Coby Beck
(remove #\Space "coby 101 @ bigpond . com")


Frode Vatvedt Fjeld

Nov 21, 2002, 6:16:53 AM
"Charlie" <char...@zoom.co.uk> writes:

This description is a bit too vague to give precise advice on. But in
general, I think it's good style to extract the functional
(side-effect-free) parts of the problem into side-effect-free
functions. The rest is left as side-effecting functions, which will
often have names like (setf foo).
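
For instance, a tiny sketch of that split (all the names here are made up
for illustration):

;; Functional part: no side effects, just returns a new value.
(defun scaled-point (point factor)
  (list (* factor (first point)) (* factor (second point))))

;; Side-effecting part: a SETF function that stores the current point.
(defvar *point* (list 1 2))
(defun point () *point*)
(defun (setf point) (new-point) (setf *point* new-point))

;; (setf (point) (scaled-point (point) 3))   ; => (3 6)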

Functional programming is a means, not an end. Copying stuff in order
to be "functional" is most likely a sign that you are doing something
wrong. If you provide a more concrete example, maybe we can help you
see what it is.

--
Frode Vatvedt Fjeld

Joost Kremers

Nov 21, 2002, 7:41:30 AM

you know, i asked myself this same question just the other day. i have
just started learning lisp (working through Paul Graham's "ANSI Common
Lisp"), and your question came up when i studied the program for binary
tree creation in section 4.7.

Graham defines a structure to represent a node in a binary tree:

(defstruct (node (:print-function
                   (lambda (n s d)
                     (format s "#<~A>" (node-elt n)))))
  elt (l nil) (r nil))

and then he gives the following function to add an object to a binary
tree bst:

(defun bst-insert (obj bst <)
  (if (null bst)
      (make-node :elt obj)
      (let ((elt (node-elt bst)))
        (if (eql obj elt)
            bst
            (if (funcall < obj elt)
                (make-node
                  :elt elt
                  :l (bst-insert obj (node-l bst) <)
                  :r (node-r bst))
                (make-node
                  :elt elt
                  :r (bst-insert obj (node-r bst) <)
                  :l (node-l bst)))))))
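
to use it, you keep rebinding a variable to the root it returns (my own
quick sketch, not from the book):

(defvar *bst* nil)
(setf *bst* (bst-insert 5 *bst* #'<))
(setf *bst* (bst-insert 2 *bst* #'<))
(setf *bst* (bst-insert 8 *bst* #'<))   ; each call returns a new root node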


BST-INSERT returns a copy of the tree with the new element added to
it. (the same happens with the function BST-REMOVE.) i can't help but
feel that this is very uneconomical if your tree is getting larger. it
takes more time, but it also seems to make uneconomical use of
memory. i mean, copying the entire tree is fine if it only holds a few
dozen elements, but what if you have a tree that holds a few megabytes?

BTW, even though i'm just a newbie, i can't imagine you really need
"(let ((new-item (copy-item item)))" to implement your function
without side-effects. sometimes, perhaps, but not all the time.

--
Joost Kremers http://baserv.uci.kun.nl/~jkremers
lrwxrwxrwx 1 root root 11 nov 2 21:37 vi -> emacsclient*

Charlie

Nov 21, 2002, 9:16:42 AM
It's more of a design question than anything too specific. I am
experimenting with capturing variables so that undoing operations becomes
very easy: all you actually have to do is move up one level in the call
stack and you've undone the last operation. Having tried to come up with a
worthwhile example (i.e. one that doesn't have an obvious answer), I can see
that it is for the most part unnecessary and a case of my trying to choose
style over content even though I know that it's hideously inefficient.

As for the undo idea, I'm going to risk asking another (possibly stupid)
question: let's say I have created a delayed evaluation stream that would
look something like this:

(lambda () ( ... (lambda () ( ... (lambda () ( (make-variable) ... )) )) ))

Now my question is this: Is this just like a nested list where I can easily
pop the first lambda to undo it, or is this also a bad idea in general?

An example of this would be applying 2D transformations where the initial
variable is the identity matrix and the lambdas each apply a transform in
order.
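
Concretely, here is a rough, runnable sketch of what I have in mind, using
numbers in place of matrices (all the names are invented):

;; Each step wraps the previous closure in a new one; "undo" means going
;; back to the previous closure, so we keep them on a stack.
(defun extend (stream fn)
  (lambda () (funcall fn (funcall stream))))

(defvar *history* (list (lambda () 0)))   ; innermost closure: the "identity"

(defun apply-transform (fn)               ; push a new outer closure
  (push (extend (first *history*) fn) *history*))

(defun undo ()                            ; undo = drop the outermost closure
  (when (rest *history*) (pop *history*)))

(defun current-value ()                   ; force the outermost closure
  (funcall (first *history*)))

;; (apply-transform (lambda (x) (+ x 5)))   (current-value) => 5
;; (apply-transform (lambda (x) (* x 2)))   (current-value) => 10
;; (undo)                                   (current-value) => 5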

Thanks
Charlie


"Coby Beck" <cb...@mercury.bc.ca> wrote in message
news:aric8s$1gso$1...@otis.netspace.net.au...


> The standard idiom is
>
> (setf my-data (frob my-data))
>
> where frob will take it and do whatever to it and return it, without
> copying it at any point.

plamen

Nov 21, 2002, 10:12:21 AM
Charlie wrote:


There is also an interesting book about algorithms and data structures
implemented in pure functional style. The examples are not in CL, but
the concepts are the same.
See
http://www.amazon.com/exec/obidos/tg/detail/-/0521663504/qid=1037891289/sr=8-1/ref=sr_8_1/102-4393718-3002542?v=glance&s=books&n=507846

>
> Thanks
> Charlie.

Oleg

Nov 21, 2002, 10:27:23 AM
Joost Kremers wrote:

>
> BST-INSERT returns a copy of the tree with the new element added to
> it. (the same happens with the function BST-REMOVE.) i can't help but
> feel that this is very uneconomical if your tree is getting larger. it
> takes more time, but it also seems to make uneconomical use of
> memory. i mean, copying the entire tree is fine if it only holds a few
> dozen elements, but what if you have a tree that holds a few megabytes?

You have to understand that BST-INSERT does not deep-copy the entire tree
as you seem to believe. For a tree of _depth_ D, there will be at _most_ D
elementary copy operations (or 3*D, depending on how you count). You can of
course write an imperative version of INSERT.

Each approach has its advantages: the functional insert preserves the
original tree (and as I mentioned it's much more efficient than copying the
entire tree); the imperative insert is a bit more efficient if you only
have one tree that you wish to modify, but it will become incredibly
inefficient if you decide to keep some of the previous trees.
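
You can see the sharing directly in a listener (a quick sketch that assumes
Graham's NODE and BST-INSERT definitions from upthread):

(let ((tree nil))
  (dolist (x '(5 2 8 1 3 7 9))
    (setf tree (bst-insert x tree #'<)))
  (let ((tree2 (bst-insert 4 tree #'<)))
    ;; Only the nodes along the search path (5, 2, 3) plus the new node 4
    ;; are freshly allocated; the right subtree rooted at 8 is shared:
    (eq (node-r tree) (node-r tree2))))   ; => T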

Oleg

Knut Arild Erstad

Nov 21, 2002, 11:07:56 AM
[Joost Kremers]
:
: BST-INSERT returns a copy of the tree with the new element added to
: it. (the same happens with the function BST-REMOVE.) i can't help but
: feel that this is very uneconomical if your tree is getting larger. it
: takes more time, but it also seems to make uneconomical use of
: memory. i mean, copying the entire tree is fine if it only holds a few
: dozen elements, but what if you have a tree that holds a few megabytes?

I was also confused by this code the first time I saw it, but it doesn't
copy the whole tree. It only allocates nodes on the search path for the
new node, at most O(log n) if the tree is balanced. So it is actually as
fast as a normal bst-insert. Yes, it does allocate more memory, but again
only O(log n) if the tree is balanced.

I actually think this is a very nice example of a functional-style data
structure, except that it doesn't ensure tree balance. (Then again, that
would be overkill for an introductory text.)

Arrays are a different story, since you actually have to copy the whole
array to do functional-style operations, which is very inefficient. I
believe that is why arrays are mutable objects even in some "almost-pure"
functional languages like caml.

--
Knut Arild Erstad

Don't try to solve serious matters in the middle of the night.
-- Philip K. Dick

Hannah Schroeter

Nov 21, 2002, 12:47:22 PM
Hello!

Knut Arild Erstad <knute...@ii.uib.no> wrote:
>[...]

>Arrays are a different story, since you actually have to copy the whole
>array to do functional-style operations, which is very inefficient. I
>believe that is why arrays are mutable objects even in some "almost-pure"
>functional languages like caml.

In fact, there are implementation techniques for purely functional
arrays that also use O(log n) memory for an update if the old
version is retained alongside the new version, and O(log n) time
for access and (purely functional) update. One technique is, of course,
simply a balanced search tree. In really-almost-pure FPLs like Haskell
or Clean, I expect there to be both variants of arrays. In fact,
Haskell has the purely functional variant in the standard, while
an imperative variant is only a popular language extension (and the
side effects are controlled by using the IO monad).
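
For the curious, here is one rough Common Lisp sketch of such a structure:
a binary trie keyed on the bits of the index, so both lookup and
non-destructive update touch O(log n) nodes and old versions remain valid.
(FA-REF, FA-SET and the layout are invented for illustration; this is not
any standard library.)

(defstruct fa-node value left right)

(defun fa-ref (node index)
  "Value stored at INDEX, or NIL if it was never set."
  (cond ((null node) nil)
        ((zerop index) (fa-node-value node))
        ((oddp index) (fa-ref (fa-node-left node) (ash (1- index) -1)))
        (t (fa-ref (fa-node-right node) (1- (ash index -1))))))

(defun fa-set (node index value)
  "Return a new version with INDEX set to VALUE; NODE itself is unchanged."
  (let ((node (or node (make-fa-node))))
    (cond ((zerop index)
           (make-fa-node :value value
                         :left (fa-node-left node)
                         :right (fa-node-right node)))
          ((oddp index)
           (make-fa-node :value (fa-node-value node)
                         :left (fa-set (fa-node-left node)
                                       (ash (1- index) -1) value)
                         :right (fa-node-right node)))
          (t
           (make-fa-node :value (fa-node-value node)
                         :left (fa-node-left node)
                         :right (fa-set (fa-node-right node)
                                        (1- (ash index -1)) value))))))

;; Both versions stay usable after an "update":
;; (let* ((a1 (fa-set nil 3 'x))
;;        (a2 (fa-set a1 3 'y)))
;;   (list (fa-ref a1 3) (fa-ref a2 3)))   ; => (X Y)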

Kind regards,

Hannah.

Oleg

Nov 21, 2002, 2:27:30 PM
Hannah Schroeter wrote:

> In fact, there are implementation techniques for purely functional
> arrays that also use O(log n) memory for an update if the old
> version is retained alongside the new version, and O(log n) time
> for access and (purely function) update.

Then why would they|you call them "arrays"?

Oleg

Donald Fisk

Nov 21, 2002, 4:19:12 PM

Let is just syntactic sugar (or rather a macro) for lambda:

(let ((x (read))) (list x x)) is just

(funcall (lambda (x) (list x x)) (read))

(let ((x 1) (y 2)) (list x y x y))

(funcall (lambda (x y) (list x y x y)) 1 2)

So it's already side-effect free. Also, it's good practice
to assign to a local variable the result of a complex computation
before using it more than once.

As for large arrays, I'd need more details, but sometimes it's better to
change array values than to make a new array with different values. It
depends on what you want to do.

> Thanks
> Charlie.

Le Hibou
--
Dalinian: Lisp. Java. Which one sounds sexier?
RevAaron: Definitely Lisp. Lisp conjures up images of hippy coders,
drugs, sex, and rock & roll. Late nights at Berkeley, coding in
Lisp fueled by LSD. Java evokes a vision of a stereotypical nerd,
with no life or social skills.

Joost Kremers

Nov 21, 2002, 6:33:03 PM
Knut Arild Erstad wrote:
> [Joost Kremers]
> :
> : BST-INSERT returns a copy of the tree with the new element added to
> : it. (the same happens with the function BST-REMOVE.) i can't help but
> : feel that this is very uneconomical if your tree is getting larger. it
> : takes more time, but it also seems to make uneconomical use of
> : memory. i mean, copying the entire tree is fine if it only holds a few
> : dozen elements, but what if you have a tree that holds a few megabytes?
>
> I was also confused by this code the first time I saw it, but it doesn't
> copy the whole tree. It only allocates nodes on the search path for the
> new node,

yes, of course... i should have realized. thanks for pointing that out.

> at most O(log n) if the tree is balanced. So it is actually as
> fast as a normal bst-insert. Yes, it does allocate more memory, but again
> only O(log n) if the tree is balanced.

still, if you add an element to the tree, part of the tree gets
copied. the root node of the new tree is returned by the
function. suppose i use setf to assign this new root node to the
variable that was holding the old root node. am i correct in assuming
that the garbage collector will eventually clear up the memory that
holds the nodes of the old tree that are not reused in the new tree,
i.e. the nodes along the search path, which were copied?

Kent M Pitman

Nov 21, 2002, 9:02:38 PM
"Charlie" <char...@zoom.co.uk> writes:

> Just a quick question: I want to avoid using side effects

Why?

Charlie

Nov 22, 2002, 6:25:49 AM
The purpose of that was to learn about functional programming techniques.
Perhaps the first technique should be knowing when _not_ to use functional
programming techniques!

Charlie.

"Kent M Pitman" <pit...@world.std.com> wrote in message
news:sfwsmxu...@shell01.TheWorld.com...

Tim Bradshaw

Nov 22, 2002, 7:56:14 AM
* Hannah Schroeter wrote:
> In fact, there are implementation techniques for purely functional
> arrays that also use O(log n) memory for an update if the old
> version is retained alongside the new version, and O(log n) time
> for access and (purely function) update.

I think that if it doesn't take O(1) time for access and update then
it's not an array in any reasonable sense.

--tim

Joe Marshall

Nov 22, 2002, 9:29:14 AM
Tim Bradshaw <t...@cley.com> writes:

O (log n) where n has a strict limit (like array-dimension-limit)
*is* O(1). An array is a mapping from a range of non-negative
integers to values.

Joe Marshall

Nov 22, 2002, 9:29:54 AM
Joost Kremers <joostk...@yahoo.com> writes:

> still, if you add an element to the tree, part of the tree gets
> copied. the root node of the new tree is returned by the
> function. suppose i use setf to assign this new root node to the
> variable that was holding the old root node. am i correct in assuming
> that the garbage collector will eventually clear up the memory that
> holds the nodes of the old tree that are not reused in the new tree,
> i.e. the nodes along the search path, which were copied?

Provided that had been the only pointer to the old root, yes.

Tim Bradshaw

Nov 22, 2002, 9:42:29 AM
* Joe Marshall wrote:
> O (log n) where n has a strict limit (like array-dimension-limit)
> *is* O(1).

Yes of course, but that's a pretty silly point.

> An array is a mapping from a range of non-negative
> integers to values.

So is a linked list (among other things), and linked lists clearly
have O(1) access time for any machine with finite memory. I'd like to
be able to distinguish between linked lists and arrays.

We could now get into a huge debate where I successively try (and
fail) to refine some definition in terms of access time, but I won't
bother, because it will just waste space and bandwidth.

--tim

Kent M Pitman

Nov 22, 2002, 10:44:46 AM
Tim Bradshaw <t...@cley.com> writes:

As an aside only for very sophisticated readers...

It was finally pointed out to me that it doesn't take O(1) time to
access arrays either. I used to puzzle over this for hours while
working on T. We'd redefined nearly everything in terms of patterns
of closures (structures, for example), but I still couldn't make
arrays that way, because you could do

[Pseudocode offered for sketch value only. I doubt it runs in any lisp
or scheme. I certainly did not try.]

(define (set-array-return-augmented array index-to-set value-to-set)
  (lambda (index)
    (if (= index index-to-set)
        value-to-set
        (array index))))

or you could even have foreknowledge of the array's contents and make
stuff like:

(define (make-array-with-contents list-of-contents)
  (cond ((null? list-of-contents)
         (lambda (index) (error "out of bounds")))
        ((null? (cdr list-of-contents))
         (lambda (index)
           (if (= index 0) (car list-of-contents) (error "out of bounds"))))
        (else
         (let* ((n (length list-of-contents))
                (n/2 (floor (/ n 2)))
                ;; recursively build closure-"arrays" for each half
                (head (make-array-with-contents (subseq list-of-contents 0 n/2)))
                (tail (make-array-with-contents (subseq list-of-contents n/2 n))))
           (lambda (index)
             (if (< index n/2)
                 (head index)
                 (tail (- index n/2))))))))

but no matter how you arrange it, it still feels loggy in time, and
never becomes constant time.

I felt I should be entitled to better. But someone pointed out to me
years later that the idea that arrays are O(1) is an illusion brought
on by doing fixnum access, which hides the O(log(n)) inside the
machine instruction: we count it as constant-time access because it's
a bounded-time log(n), fast enough that we slow down math which could
really be done faster just in case it contains carries and whatnot
that might need to slow it down. It would be more obvious if we
allowed bignum indexes that it's a fallacy to claim constant-time
access to a structure of arbitrary size (selected at runtime), because
it presupposes hardware of analogous size being allocated at runtime.
We only get O(1) because of the fixnum size restriction, which allows
us to pre-allocate a processor that can keep up.

In a sense, the obvious conclusion is that arrays must not only have
O(1) access but a fixed upper size (in order to allow you to sweep the
O(log(n)) under the rug and claim it was really a constant). Arbitrarily
large asymptotic performance problems can be hidden in this way, by
simply substituting huge constant factors. ;) Or another way of saying
it is that, according to your criterion, there is nothing which in any
reasonable sense deserves the title of array. And [barring an upper
array bound size], that's probably also true.

I hope no one who plans to get practical work done with arrays is reading
this. This is just philosophical moaning with little practical import
other than to remind people that the three laws of thermodynamics do,
ultimately, apply here, as they do in so many places. (1. you can't win.
2. you can't break even. 3. you can't get out of the game.)

Kent M Pitman

Nov 22, 2002, 10:46:00 AM
Tim Bradshaw <t...@cley.com> writes:

Aw. You guys are so much more concise than me.
I should have read this exchange first instead of boring everyone
with my waste of space and bandwidth.

Nils Goesche

Nov 22, 2002, 11:13:39 AM
Tim Bradshaw <t...@cley.com> writes:

> * Joe Marshall wrote:

> > O (log n) where n has a strict limit (like array-dimension-limit)
> > *is* O(1).

> Yes of course, but that's a pretty silly point.

> > An array is a mapping from a range of non-negative integers to
> > values.

> So is a linked list (among other things), and linked lists clearly
> have O(1) access time for any machine with finite memory. I'd like
> to be able to distinguish between linked lists and arrays.

That's right, of course; linked lists, however, have O(n) access time,
and there is a whole world between O(n) and O(log n). For all
practical purposes, O(log n) will indeed be just as good as O(1), I
think. Feel some pity for our poor, crippled Haskellers ;-)

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x0655CFA0

Tim Bradshaw

Nov 22, 2002, 12:14:13 PM
* Kent M Pitman wrote:
> In a sense, the obvious conclusion is that arrays must not only have
> O(1) access but a fixed upper size (in order to allow you to sweep the
> O(log(n)) under the rug and claim it was really a constant. Arbitrarily
> large asymptotic performance problems can be hidden in this way, by
> simply substituting huge constant factors. ;) Or another way of saying
> it is that in any reasonable sense, according to your criterion, there
> is nothing which deserves the title of array. And [barring an upper array
> bound size], that's probably also true.

Yes, I agree - ultimately it's not clear to me that anything is faster
than something which has to walk down a tape. But given 64-bit fixnums,
there is a very large range for which it's useful to talk about
constant speed.

However there's a much more important issue: physics. As machines get
faster it will become harder and harder to maintain the illusion that
you can physically do constant-speed access to data, for large enough
amounts of data, because it's just too far from the processor.
Machines do amazing tricks already, but I think that for large
problems you just have to know a lot about memory performance issues,
and memory (and hence arrays) isn't constant-speed access at all but
has some very complicated history-dependent model.

--tim

Joe Marshall

Nov 22, 2002, 1:35:59 PM
Tim Bradshaw <t...@cley.com> writes:

> So is a linked list (among other things), and linked lists clearly
> have O(1) access time for any machine with finite memory. I'd like to
> be able to distinguish between linked lists and arrays.
>
> We could now get into a huge debate where I successively try (and
> fail) to refine some definition in terms of access time, but I won't
> bother, because it will just waste space and bandwidth.

What is usenet but a waste of space and bandwidth?

My point was that the difference between O(1) and O(log n) where n is
array-dimension-limit is fairly small. You can't say the same about
O(n).
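
To put a rough number on it (implementation-dependent, but the logic holds
for whatever bound your Lisp uses):

;; For a binary-trie style "functional array", the worst-case number of
;; levels is about the number of bits in the largest legal index:
(integer-length array-dimension-limit)   ; => a few dozen on a 64-bit Lisp,
                                         ;    versus one step for a real AREF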

Tim Bradshaw

Nov 22, 2002, 2:49:14 PM
* Joe Marshall wrote:

> My point was that the difference between O(1) and O(log n) where n is
> array-dimension-limit is fairly small. You can't say the same about
> O(n).

Yes, I guess I agree with this. Actually, this is likely a place
where the constant factors matter - assuming (big assumption) a memory
system which can keep up, array access is some very small number of
instructions (maybe 1 in the good case?), which is quite hard to beat.
And people probably really notice for some codes if the number of
instructions goes up much (or at all).

--tim


Erik Naggum

Nov 22, 2002, 7:56:14 PM
* Joe Marshall <j...@ccs.neu.edu>

| What is usenet but a waste of space and bandwidth?

/Degrees/ of waste of space and bandwidth.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.

Rob Warnock

Nov 23, 2002, 4:09:20 PM
Tim Bradshaw <t...@cley.com> wrote:
+---------------

| However there's a much more important issue: physics. As machines get
| faster it will become harder and harder to maintain the illusion that
| you can physically do constant-speed access to data, for large enough
| amounts of data, because it's just too far from the processor.
| Machines do amazing tricks already, but I think that for large
| problems you just have to know a lot about memory performance issues,
| and memory (and hence arrays) isn't constant-speed access at all but
| have some very complicated history-dependent model.
+---------------

Ah, yez, as for example in the SGI Origin 3000 ccNUMA machine, where
(on an otherwise idle system) memory access time (for a whole cache
line on a "read miss") can range from less than 200ns to nearly 1us,
depending on how many "router hops" the access has to go over.
(It can, of course, be a *lot* worse under a heavy MP load...)
And which node a given page is on depends on a large number of
factors, including which CPU first "touched" it (caused it to
be actually allocated, as opposed to merely malloc'd) and memory
pressure from competing jobs. (Not to mention automatic page
migration, if they have that turned on yet.)

Fun stuff. Doing repeatable HPC benchmarks is... "challenging".


-Rob

-----
Rob Warnock, PP-ASEL-IA <rp...@rpw3.org>
627 26th Avenue <URL:http://www.rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Tim Bradshaw

Nov 24, 2002, 8:43:32 AM
* Nils Goesche wrote:

> That's right, of course; linked lists, however, have O(n) access time,
> and there is a whole world between O(n) and O(log n). For all
> practical purposes, O(log n) will indeed be just as good as O(1), I
> think. Feel some pity for our poor, crippled Haskellers ;-)

Well, I disagree, because (as I said somewhere else) one important
thing about arrays is that the constant factors are tiny - access is
often ~1 instruction, for lots of array-bashing type codes, constant
factors probably matter - if it's 10x slower (so 10 instructions per
access) you suddenly need a machine you can't buy.

--tim
