FOLD kons knil clist1 clist2 ... => value
The folding function (kons) has the signature
<kons> datum1 datum2 ... knil => next
Putting the seed last is handy for making list-consing folds; for
example, you can write
(fold cons '() l)
to reverse the list l. And putting the seed last isn't a problem for
single-use procedures (e.g., (fold [lambda (d1 d2 d3 seed) ...] ...)).
But it seems inconvenient for "library" fold functions.
Suppose that you find yourself using the same folding function often. It
can operate on any number of lists. Rather than repeat its definition
everywhere, identical except for the number of arguments, you'd like to
define it once and use it everywhere. It seems to me that the natural
signature for this kind of function would be
<kons> knil . data => next
But that's not how it's defined. Is this ever a problem in practice? If
so, how do you work around it? Do you just accept a list and pull out
the last value on each iteration (yuck!)? Do you define macros to deal
with it? Is it always possible to write the code in such a way that knil
naturally fits into the last argument position?
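As a sketch of the "pull out the last value" workaround (all names here are hypothetical, not from any SRFI): wrap a library folding function whose natural signature is (f knil . data) so it fits FOLD's (f datum1 datum2 ... knil) calling convention.

```scheme
;; Adapt a seed-first "library" folding function to the seed-last
;; calling convention used by SRFI-1 FOLD.  On every call, peel the
;; seed off the end of the argument list and move it to the front.
(define (seed-first->seed-last f)
  (lambda args
    (let loop ((args args) (data '()))
      (if (null? (cdr args))
          (apply f (car args) (reverse data))   ; last arg is the seed
          (loop (cdr args) (cons (car args) data))))))

;; Hypothetical library function with the "natural" signature:
(define (sum-into knil . data)
  (apply + knil data))

;; FOLD would call ((seed-first->seed-last sum-into) 1 10 0),
;; which becomes (sum-into 0 1 10) => 11
```

The cost is the list traversal and reversal on every iteration, which is exactly the "yuck" referred to above.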
There's a note in SRFI-1 that "MIT Scheme and Haskell flip F's arg order
for their reduce and fold functions." Does that mean that those
languages call kons as (kons knil datum1 datum2 ...)? That seems more
natural to me -- it works better in the general case, rather than
working well for one (very) common case and poorly in others.
In fact, you could get the best of both worlds by using the "seed first"
convention and using a backwards cons as the folding function:
(fold-prime reverse-cons '() l)
SRFI-1 even provides this "backwards cons" as XCONS. Therefore, list
construction would be just as easy with the "seed first" convention!
(fold-prime xcons '() l) == (reverse l)
Even better, this expression is clearer about what it does -- the use
of xcons makes it obvious that the list is being reversed rather than
copied.
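For concreteness, xcons is trivial to define, and a minimal single-list version of the hypothetical seed-first fold-prime shows the equivalence:

```scheme
;; xcons, as SRFI-1 defines it: cons with the arguments exchanged.
(define (xcons d a) (cons a d))

;; A minimal seed-first fold (the hypothetical fold-prime from the
;; post, single-list only): calls (f seed datum) on each element.
(define (fold-prime f seed lst)
  (if (null? lst)
      seed
      (fold-prime f (f seed (car lst)) (cdr lst))))

;; (fold-prime xcons '() '(1 2 3)) => (3 2 1), i.e. (reverse '(1 2 3))
```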
So I'm a bit puzzled -- why do SRFI-1 and PLT use the seed-last
convention, when the seed-first convention seems more general (and
therefore superior)? Am I missing something?
--
Bradd W. Szonye
http://www.szonye.com/bradd
My Usenet e-mail address is temporarily disabled.
Please visit my website to obtain an alternate address.
<grumble>
Bad habits and historical compatibility. Everything in this space
traces its roots back to MAPCAR. I also find it awkward to have the
function argument be the first argument to map. It is nearly always
the most complex value to express in the call and the length of the
expression tends to obscure the connection with the collection
arguments.
</grumble>
If someone has a real answer please pipe up...
david rush
--
(\x.(x x) \x.(x x)) -> (s i i (s i i))
-- aki helin (on comp.lang.scheme)
Having the function argument to fold be first makes sense if you want to
be able to compose a call to fold with its function, e.g. (compose foldl
(lambda (a b) ...)). Haskell, ML, etc. all put the function argument
first, probably for that reason.
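The composition/partial-application benefit can be sketched in Scheme; curry2 here is a hypothetical helper, since Scheme has no implicit currying:

```scheme
;; A function-first left fold, as in Haskell/ML; calls (f seed datum).
(define (foldl f seed lst)
  (if (null? lst)
      seed
      (foldl f (f seed (car lst)) (cdr lst))))

;; Hypothetical helper: fix the first argument of a 3-argument f.
(define (curry2 f x) (lambda (y z) (f x y z)))

;; With the function first, a reusable fold falls out directly:
(define sum (curry2 foldl +))
;; (sum 0 '(1 2 3)) => 6
```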
> If someone has a real answer please pipe up...
Dunno if it qualifies as a real answer, but here are my musings. First, I
suspect the seed-last convention derives directly from the interface of
cons, and in that respect it actually makes sense to me.
Re the comparison to Haskell, because of the static typing in Haskell (and
ML), their foldl can only take a single input list, so the multiple-argument
issue doesn't arise. You need different functions to handle different
numbers of lists, leading to Haskell having functions like zip4, zip5, zip6,
zip7, zipWith4, zipWith5, etc.
Haskell's flipped seed/element order is not universal - SML/NJ uses the
seed-last order, for example, and I'm guessing other MLs do too. I'd be
interested to hear the Haskell rationale for the order it uses - I notice it
means you need to use "foldl (flip (:)) ..." to do an ordinary "cons"-style
fold, i.e. the equivalent to using XCONS.
I don't really see that relating the use of XCONS to the need to reverse the
resultant list makes things any simpler - it seems to me a forced and
partial symmetry which, unless I'm missing something, I can't imagine asking
for as a feature in its own right.
That leaves the question of accessing a seed-last argument generically.
SRFI-1 provides cons*, which addresses one case. You could also do it with
a pattern match, or with something like (lambda args (apply f (reverse
args))). But I would think this kind of thing is only necessary in fairly
rare circumstances: when you're working with multiple input lists
(definitely not the common case), *and* the seed argument needs to be
distinguished, as opposed to just being one of the arguments to an n-ary
operator like '+' or '*'.
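The reverse-the-arguments adapter mentioned above is a one-liner, though it comes with a caveat worth noting: reversing ALL the arguments also reverses the order of the data, so it is only safe when the data order doesn't matter.

```scheme
;; Turn a seed-first procedure into the seed-last form SRFI-1 FOLD
;; expects, by reversing the whole argument list on each call.
(define (flip-seed f)
  (lambda args (apply f (reverse args))))

;; Hypothetical seed-first procedure for illustration:
(define (subtract-from seed . data)
  (apply - seed data))

;; ((flip-seed subtract-from) 2 3 10) calls (subtract-from 10 3 2) => 5
```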
I see this as a situation where you have to pick a convention which will
work well for some cases but not as well for others, i.e. there's no perfect
single choice - short of something drastic like named arguments, so that the
order of arguments in the function passed to fold doesn't matter.
Anton
> Haskell's flipped seed/element order is not universal - SML/NJ uses the
> seed-last order, for example, and I'm guessing other MLs do too.
Haskell:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldl :: (b -> a -> b) -> b -> [a] -> b
SML:
foldr : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
foldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
OCaml:
fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b
fold_left : ('b -> 'a -> 'b) -> 'b -> 'a list -> 'b
For me it's more intuitive if the function is a,b->b for foldr and b,a->b
for foldl, because it reflects the order in the list: in foldr the b comes
from the elements after the current element and in foldl from the elements
before it.
OCaml extends this to the ordering between the seed and the list. I'm not
sure if I like it or not.
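In Scheme terms, that intuition reads as follows (minimal single-list definitions, not the SRFI-1 ones): the right fold's accumulated value comes from the elements after the current one, so it reads naturally as (f elem acc); the left fold's comes from the elements before it, hence (f acc elem).

```scheme
;; Right fold: the recursion carries the result of the elements
;; AFTER the current one, passed as f's second argument.
(define (foldr f seed lst)
  (if (null? lst)
      seed
      (f (car lst) (foldr f seed (cdr lst)))))

;; Left fold: the accumulator carries the result of the elements
;; BEFORE the current one, passed as f's first argument.
(define (foldl f seed lst)
  (if (null? lst)
      seed
      (foldl f (f seed (car lst)) (cdr lst))))

;; (foldr cons '() '(1 2 3))                          => (1 2 3)
;; (foldl (lambda (acc x) (cons x acc)) '() '(1 2 3)) => (3 2 1)
```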
--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/
David Rush <dr...@aol.net> wrote:
> Bad habits and historical compatibility. Everything in this space
> traces its roots back to MAPCAR. I also find it awkward to have the
> function argument be the first argument to map. It is nearly always
> the most complex value to express in the call and the length of the
> expression tends to obscure the connection with the collection
> arguments .... If someone has a real answer please pipe up.
Been thinking about this, albeit slowly.
First, I agree that a complicated, inline map-function or fold-function
disrupts the flow of the code. However, there is a solution to this that
won't obscure anything and that actually improves internal documentation
by giving a name to the function.
(let ((complicated-map-function (lambda args body ...)))
  (map complicated-map-function l))
Where the inline function obscures the meaning of the code, this pattern
actually improves readability by clearly stating your intent.
Second, one of the things that bothers me most about SRFI-1 FOLD is that
the list/seed order is swapped between FOLD and the fold-function. That
is, you call (fold f seed l), but FOLD calls (f datum seed). That's an
error-prone interface -- very easy to forget about the switch.
To deal with that, and to deal with the fact that sometimes it's more
convenient to have the seed first, sometimes last, I thought it might be
best to have two folding procedures, CFOLD and XFOLD. They're named
after the folding-function you'd use for simple list-reversal:
(cfold cons l '())
(xfold xcons '() l)
CFOLD calls the folding function with seed last, and XFOLD calls it with
seed first. The interface for both procedures is parallel to the
interface for the folding-function. That is, CFOLD calls (f datum seed),
and XFOLD calls (f seed datum).
I'm not sure what the names and seed order should be for a right fold,
off the top of my head. Likewise, I'm not sure what should change for
REDUCE.
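Single-list sketches of the proposed pair (the names are from this post, not from any SRFI):

```scheme
;; CFOLD: list before seed; calls the folding function as (f datum seed).
(define (cfold f lst seed)
  (if (null? lst)
      seed
      (cfold f (cdr lst) (f (car lst) seed))))

;; XFOLD: seed before list; calls the folding function as (f seed datum).
(define (xfold f seed lst)
  (if (null? lst)
      seed
      (xfold f (f seed (car lst)) (cdr lst))))

(define (xcons d a) (cons a d))

;; Both spell list reversal the way the post suggests:
;; (cfold cons '(1 2 3) '())  => (3 2 1)
;; (xfold xcons '() '(1 2 3)) => (3 2 1)
```

Note how each procedure's own argument order matches the order in which it calls its folding function, which is the "parallel interface" property being argued for.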
That is not quite what I said. I have a problem with the inline function
as the first argument.
> However, there is a solution to this that
> won't obscure anything and that actually improves internal documentation
> by giving a name to the function.
No it doesn't. You (well I do) very frequently use the combinators to
implement many generic looping situations. Now re-think your suggestion
in C. Your solution in C would look like
some_type *complicated_map_function(...);
int some_other_function(...) {
    for (i = 0; i < n; i += 1) complicated_map_function(...);
    ... }
Which some (including moiself) would argue is a good idea if the body of
the for is too complex (say greater than 20 lines). But in the Scheme case
'too complex' becomes more like 5 lines. And if the nesting depth of the
expression is more than 2 or 3 those 5 lines end up having to be very
short.
Now let's recast the Scheme combinators as a C for loop:
for {
    int foo = lotsa(...);
    c_code(...); }
(x = 1; x < n; x += 1);
I don't think anyone would say *that* is a preferable way to arrange the
arguments.
Anyway, I am not arguing for making wholesale changes to the established
combinators (I've got too much code that would break). I'm just pointing
out that there is good reason to break with the established tradition of
combinator argument order when designing future libraries.
david rush
--
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
David Rush <ku...@gofree.indigo.ie> wrote:
> That is not quite what I said. I have a problem with the inline
> function as the first argument.
Yes, I understand what you meant. It's the combination of using an
inline function *and* putting it first that causes trouble. If it were
the last argument, it wouldn't disrupt the flow.
> Anyway, I am not arguing about making wholesale changes to the
> established combinators (I've got too much code that would break). I'm
> just pointing out that there is good reason to break with the
> established tradition of combinator argument order when designing
> future libraries.
It's more difficult to write the combinator when you put the function
argument last, though. That's because Scheme has a convenient syntax for
putting variable argument lists at the end of the formal arguments, but
it's much more difficult to put them in the middle. Consider the
argument processing for conventional MAP and MAP' (which takes the
map-function last):
(define (map f . lists)
  ...)

(define (map' . lists-and-function)
  (let ((f (car (reverse lists-and-function)))
        (lists (reverse (cdr (reverse lists-and-function)))))
    ...))
You can simplify the syntax for MAP', but it's still inefficient and
cumbersome to pull a single argument off the end of a variable argument
list. You can avoid the problem and obtain a better "map loop" syntax in
one of several ways:
1. Don't define the map-function inline.
(define (complicated-loop-code datum)
  body ...)
(map complicated-loop-code '(0 1 2 3))
2. Define a macro to rearrange the arguments from the order that makes
sense for humans to the order that makes sense for the Scheme
translator. For example:
(map-loop over '(0 1 2 3)
          with (datum)
          body ...)
The macro translates the above code into
(map (lambda (datum) body ...) '(0 1 2 3))
3. Create a named-parameter syntax.
(map-named '(0 1 2 3) mapf: (lambda (datum) body ...))
Each solution has advantages and disadvantages. The first method is
simple, but it still reverses the order of code and list, which you
don't seem to like. The second solution is probably the right way to do
it for most cases -- macros are what you use to translate "makes sense
to humans" into "makes sense to Scheme," after all. The third solution
is similar.
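The map-loop macro from option 2 could be sketched with syntax-rules; over and with are literal keywords here, and the names are this thread's, not from any standard library:

```scheme
;; Rearrange (map-loop over lst with (args) body ...) into an
;; ordinary (map (lambda (args) body ...) lst) call.
(define-syntax map-loop
  (syntax-rules (over with)
    ((_ over lst with (datum ...) body ...)
     (map (lambda (datum ...) body ...) lst))))

;; (map-loop over '(0 1 2 3)
;;           with (datum)
;;           (* datum datum))   => (0 1 4 9)
```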
It's also more awkward to curry. Maybe I'm biased because I saw Haskell
before Scheme, but I always assumed this was the reason map is defined
function-argument-first.
Curry, schmurry. Since you have to explicitly write out the new lambda
in Scheme anyway, this is a major non-issue. I will agree that when
you're using names it reads more like English when the function is
first.
Not if you're using a macro to generate the new lambda. OK, we still
have the "cut" SRFI, but it's nice to be able to just write
(curry map my-func).
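That curry is a hypothetical helper (left-to-right partial application), not a standard procedure, but it is only a couple of lines:

```scheme
;; Hypothetical left-to-right partial application helper.
(define (curry f . args)
  (lambda more (apply f (append args more))))

(define (my-func x) (* 2 x))

;; (curry map my-func) fixes map's first argument:
(define double-all (curry map my-func))
;; (double-all '(1 2 3)) => (2 4 6)
```

Note that this only works smoothly because map takes its function argument first, which is the point being made.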