Traditionally, pairs in Scheme/Lisp are mutable
data structures. How much breakage would making
pairs immutable cause?
Note: I'm not talking only about getting rid of
set-car! and set-cdr![1]. This is only one part
of the equation. By making pairs immutable, I
mean really immutable. That is:
* cons does not have to allocate, and therefore,
* eq/eqv on two pairs with eq/eqv cars and cdrs
would be undefined
Getting rid of set-car! and set-cdr! is really
easy since you can "grep" for them in your code
and do the necessary changes to eliminate them.
How much breakage would really immutable pairs
cause? Do you depend on identity of pairs even
in the absence of set-car! and set-cdr!?
Aziz,,,
[1]
http://blog.plt-scheme.org/2007/11/getting-rid-of-set-car-and-set-cdr.html
For identity, do you mean something like:
(let ((init (list #f)))
...
init possibly gets re-assigned along the way
....
(if (eq? init foo)
...
... ))
If so, yes I do use it, but in the presence of immutable pairs, using
a vector could also work in the above case, I think.
Cheers
leppie
>How much breakage would really immutable pairs
>cause? Do you depend on identity of pairs even
>in the absence of set-car! and set-cdr!?
I think I can recall one or two programs in which I wrote code that
relied on the EQ-ness of pairs. I am not sure it was the best way
to write things, but I am sure that code out there does exist that
uses this.
Is it that hard to infer the mutability of given pairs inside code
simply by checking whether they are even SET*!'d or used in EQ[V]?
comparisons?
--
Aaron W. Hsu <arc...@sacrideo.us> | <http://www.sacrideo.us>
"Government is the great fiction, through which everybody endeavors to
live at the expense of everybody else." -- Frederic Bastiat
+++++++++++++++ ((lambda (x) (x x)) (lambda (x) (x x))) ++++++++++++++
> By making pairs immutable, I
> mean really immutable. That is:
> * cons does not have to allocate, and therefore,
What do you mean by this?
--
Jens Axel Søgaard
Meaning: the compiler can transform (cons 1 '()) to '(1).
So, if you have a procedure defined as:
(define (f) (cons 1 '()))
You lose the guarantee that (eq? (f) (f)) must evaluate to
#f (which is what Scheme guarantees since cons in Scheme
has to allocate).
Aziz,,,
I have code that uses eq? on pairs too, but the pairs in my
case are immutable. The eq? test is just an optimization.
So, even if equal? pairs get coalesced, lose their identity,
and magically become eq? (say after GC), then all the better.
> Is it that hard to infer the mutability of given pairs inside code
> simply by checking whether they are even SET*!'d or used in EQ[V]?
> comparisons?
Simple data-flow analysis is not that hard, but it doesn't
take you far either. So, you pay the cost for little or no
benefit in general. Compare to some other immutable data in
Scheme like procedures, numbers, characters, booleans, where
the system is free to do copy-propagation, constant-folding,
coalescing, etc., without having to worry about mutations.
Just imagine the horror that a compiler must go through if
numbers are mutable.
Aziz,,,
Okay. This gives the compiler writer the option to
implement cons as hash-cons [1]. That would indeed give
a more functional feel.
Hmm. Would you ban constructs that create, say,
cyclic structures? If not graph traversal is
tricky to implement without a working eq? .
[1]
http://planet.plt-scheme.org/package-source/dyoo/hash-cons.plt/1/0/doc.txt
--
Jens Axel Søgaard
I cannot think of a practical use case for relying on (eq? (f) (f))
equal to #f. It is surely a matter of ignorance on my part, but I
would really like to see a meaningful use case. Does anybody have one?
Michele Simionato
In various contexts there's often a need to have a unique tag that is
guaranteed to be distinct from all other values. In pre-R6RS Scheme a
portable way to have such a tag is simply:
(list 'tag-name)
where 'tag-name is just an informal description, and the true
uniqueness is guaranteed by the allocation behavior of cons.
Lauri
> Okay. This gives the compiler writer the option to
> implement cons as hash-cons [1]. That would indeed give
> a more functional feel.
It does give the option, yes. But it also allows the
compiler to translate, say,
(define r
(let ([x (+ 1 2)]
[y (/ 1 2)])
`(,x ,y))
(define s (f r))
(define t (apply + r))
to
(define r '(3 1/2))
(define s (f r))
(define t '7/2)
and thus you get all the benefits of many optimizations
that only apply to numbers, characters, and booleans, to
aggregate lists almost for free.
> Hmm. Would you ban constructs that create, say,
> cyclic structures?
It's tricky. The compiler can block-allocate pairs and
wire them together to make cycles without mutation in
the same way it can block-allocate recursive procedures.
But the decisions for how to make recursive procedures
is done at compile time, so it's relatively easy to emit
code that does the wiring. Not so if you want to make
cyclic lists at runtime.
> If not graph traversal is
> tricky to implement without a working eq? .
Traversing cyclic graphs is hard even with a "working"
eq?. :-) But that's why there are other mutable data
structures that can be used for that.
Aziz,,,
> How much breakage would really immutable pairs
> cause? Do you depend on identity of pairs even
> in the absence of set-car! and set-cdr!?
I think this idiom would no longer work:
(let ((probe (assq key alist)))
(when probe
... do something with (cdr probe) ...
(set! alist (remq probe alist))))
Don't know how often this is used.
Helmut.
Well, the meaning of the procedure f there is: *make* me
a new pair whose car and cdr are 1 and ().
So, you can rely on doing:
(let ([t (f)])
(set-cdr! t (f))
t)
to create (1 . (1 . ())).
In the absence of mutation, Lauri's example applies. You
can call (f) to return a new "tag" that's distinct from
all other tags in memory. It's not ideal since you lose
the tag if you write it out to file and read it back later.
(and this is why I always use gensyms for that purpose)
One more use of these "tags" is to define:
(define (hashtable-exists? h x)
(let ([t (cons 1 1)])
(not (eq? t (hashtable-ref h x t)))))
If pairs are immutable, then there might be a key in the
table that maps to (1 . 1), and you'd incorrectly get #f
in that case.
Aziz,,,
> In various contexts there's often a need to have a unique tag that is
> guaranteed to be distinct from all other values. In pre-R6RS Scheme a
> portable way to have such a tag is simply:
>
> (list 'tag-name)
>
> where 'tag-name is just an informal description, and the true
> uniqueness is guaranteed by the allocation behavior of cons.
The question is: can you easily identify the places in your
code where you do this so that you can change them to, say,
(vector 'tag-name)? Or would making pairs immutable break
your code in unpredictable ways that you cannot easily repair?
Aziz,,,
Ok, I see what you mean. This is a possible implementation of the
sentinel pattern
in absence of gensym. I have always used gensym for this kind of
things and it would
never have occurred to me to use conses. I suspect that all the usages
of comparing
conses with eq are of this kind. I have no idea how serious the
breaking would be in
legacy Scheme code. Certainly it would not break anything I have ever
written
(for the record, I am strongly in favor of immutable pairs).
Michele Simionato
> (and this is why I always use gensyms for that purpose)
Just in case you're not familiar with the concept, a gensym is
a special kind of symbol whose whole purpose in life is to be
different from everything else (including other gensyms).
Chez and Ikarus preseve gensym identity when you write them
out to file and read them later, and if you compile a library
whose data contains a gensym and reload it in another session.
Aziz,,,
>> If not graph traversal is
>> tricky to implement without a working eq? .
>
> Traversing cyclic graphs is hard even with a "working"
> eq?. :-) But that's why there are other mutable data
> structures that can be used for that.
It would slightly annoying to risk length, display
and others to choke on cyclic lists [not that
I can remember whether it is mandatory to signal
an error, but most implementations do].
--
Jens Axel Søgaard
I think we're agreeing that having lists that are both
immutable and cyclic is problematic. Having one (x)or
the other is fine by itself. Right?
Aziz,,,
> Okay. This gives the compiler writer the option to
> implement cons as hash-cons [1].
Hash-consing is one "potential" benefit as it does not
always pay for itself, e.g., cons is always faster than
hash-cons. Using hash-tables at the user level is both
slow and yields memory leaks unless additional support
is provided by the memory management system. Anyways,
the ML implementors have beaten this topic to death by
now. One can probably start with Appel and Goncalves's
http://www.cs.princeton.edu/research/techreps/TR-412-93
and trace the references and citations for what's good
and what's bad about hash-consing.
There are of course many other benefits to eliminating
mutations or making pairs immutable. No need to site
them now. I was just wondering about what problems
people may have.
Aziz,,,
What would the result of optimization look like if you just took the
'ban set-car! and set-cdr!' route? Can you do as well as:
(define r (list 3 1/2))
(define s (f r))
(define t '7/2)
sam th
> What would the result of optimization look like if you just took the
> 'ban set-car! and set-cdr!' route? Can you do as well as:
>
> (define r (list 3 1/2))
> (define s (f r))
> (define t '7/2)
Yes you can, and that would be better than what we have.
There are many cases that I sometimes try to write and I
know are trivial to optimize when lists are immutable but
cannot be optimized without very involved analysis if one
has to preserve eq?-ness even though none of these cases
care about eq?-ness at all.
Aziz,,,
> Here's something (hopefully) useful to discuss.
>
> Traditionally, pairs in Scheme/Lisp are mutable
> data structures. How much breakage would making
> pairs immutable cause?
>
> Note: I'm not talking only about getting rid of
> set-car! and set-cdr![1]. This is only one part
> of the equation. By making pairs immutable, I
> mean really immutable. That is:
> * cons does not have to allocate, and therefore,
> * eq/eqv on two pairs with eq/eqv cars and cdrs
> would be undefined
>
> Getting rid of set-car! and set-cdr! is really
> easy since you can "grep" for them in your code
> and do the necessary changes to eliminate them.
>
> How much breakage would really immutable pairs
> cause?
>
I've quickly grepped through my scheme code in snd (snd.sf.net),
and it seems like I've only used set-cdr! for changing
elements in assocation lists. Mostly doing something
like this:
(define (add-stalin-func name body)
(let ((hit (assq name stalin-funcs)))
(if hit
(set-cdr! hit (list body))
(push! (list name body) stalin-funcs))))
So it would break, but only code which should have
used hash tables instead. :-)
set-car! is used a lot more though. But I also have a big
problem understanding what is actually happening
those places set-car! is used, so I guess the absence
of it could have forced me to write better code.
In addition, set-car! is most likely responsible for
a bug I haven't bothered to fix yet, causing
this hack instead:
...
(or cached
(let ((new-rt (rt-2 (deep-list-copy term)))) ;; <- Hmm. Thats really bad. Seems like a set-car! is performed
(hash-set! *rt-cached-funcs* key new-rt) ;; on term somewhere. Fix it with a deep-list-copy for now.
new-rt))))
...
"rt-2" is a function compiling a scheme-like
input into a c-like output. Unfortunately, "rt-2"
seems to perform a set-car! on the input somewhere,
for some reason, causing me to take a deep copy of
the input first to make memoization work.
> Do you depend on identity of pairs even
> in the absence of set-car! and set-cdr!?
Doubt it.
If I had immutable pairs from the first place, I wouldn't have missed
mutable pairs;
for the cases I'm using them I'd be able to use other mutable
structures instead.
The change of legacy code, though, isn't simple, I'm afraid. There
seems quite
a lot of cruft of habit of mutable pairs as convenient and fast
mutable structure.
Using circular lists as ring buffers, and using a list whose last cdr
is
terminated by a generator as a fast alternative of lazy streams, are a
couple of
examples I can remember I've used in the production code I don't dare
to touch it now :-)
--shiro
> I think we're agreeing that having lists that are both
> immutable and cyclic is problematic. Having one (x)or
> the other is fine by itself. Right?
Right.
--
Jens Axel Søgaard
Right, in general it is a bad idea. It just happens
that I am playing with some symbolic algebra code
right now, and in this area it has some benefits.
--
Jens Axel Søgaard
>One more use of these "tags" is to define:
>(define (hashtable-exists? h x)
> (let ([t (cons 1 1)])
> (not (eq? t (hashtable-ref h x t)))))
>If pairs are immutable, then there might be a key in the
>table that maps to (1 . 1), and you'd incorrectly get #f
>in that case.
On the topic of hashtables, doesn't Chez Scheme also do something
like return a cons pair of a key and a value, where the value can
be SET-CDR!'d to change the value in the hashtable? I seem to
remember using this a few times as well. I am sure you could use
another structure for this task, then, but ... what, now we have
mcons and cons like in PLT? Seems... confusing.
Do you have any of these examples handy? I haven't thought of any in
the 30 seconds I tried. :)
sam th
>The question is: can you easily identify the places in your
>code where you do this so that you can change them to, say,
>(vector 'tag-name)? Or would making pairs immutable break
>your code in unpredictable ways that you cannot easily repair?
I would think that for the most part, you should be abel to easily
identify places in your code where this could be a potential problem,
but that's just knowing the code you write, which every good
programmer would be able to do. On the other hand, I question
whether we would lose some convenience in being able to say, wrap
objects up in pairs, to pass them around, SET them and do the like,
and then be able to plop them nicely into lists later on or compose
them naturally, which is somewhat unnatural to do with VECTORs.
I'm not saying that it couldn't be done naturally, but to me at
least, it would feel weird to have to start having one set of
procedures for handling my mutable stuff, and the other for my
immutable. I kind of like being able to use the same operations
on both. *shrug*
Still, maybe the benefits are worth it. Is there a real world
benefit that I would be able to see?
> Abdulaziz Ghuloum wrote:
>
>> By making pairs immutable, I
>> mean really immutable. That is:
>> * cons does not have to allocate, and therefore,
>
> What do you mean by this?
One thing it could mean is that cons cells might be interned,
so that when you call 'cons' on two things for which a cons
cell already exists, the cons cell you get back is the one
that was allocated the first time. In the absence of mutation,
this has the same effective semantics as allocating a new cons
cell.
Bear
What's the point of getting rid of mutable pairs if you still have other
mutable data structures, like vectors?
ISTM that you either want to go full-blown functional, or there's no
point to this. What's so special about pairs that they should be
immutable?
--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
I was just (15 minutes ago) writing a macro that generates:
(define names '(a b c d ---)
(define labels (list (ct-gensym a) (ct-gensym b) ---)
(define names->labels-alist (map cons names labels))
(ct-gensym a) expands to a quoted fresh gensym, so, I end up with
(define names '(a b c d ---)
(define labels (list 'g0 'g1 'g2 ---))
(define names->labels-alist (map cons names labels))
In the absence of mutation and hard eq?-ness requirements,
the above would become:
(define names '(a b c d ---)
(define labels '(g0 g1 g2 ---))
(define names->labels-alist '([a . g0] [b . g1] [c . g2] ---))
and all other optimizations would cascade naturally, like:
(cond
[(assq 'a names->labels->list) => cdr]
--)
would simply fold to just: 'g0
without having to do anything fancy beyond what any compiler
already knows how to handle. In ikarus, I just have to mark
the primitives map, cons, and list "foldable" and it would
just work. It's a petty that it doesn't.
Aziz,,,
> On the topic of hashtables, doesn't Chez Scheme also do something
> like return a cons pair of a key and a value, where the value can
> be SET-CDR!'d to change the value in the hashtable?
Maybe long time ago, before the implementation of hash table
became "generational" which required new kinds of chain links
for recording feedback from the GC. See "Generation-friendly
Eq hash tables" http://sfp2007.ift.ulaval.ca/procPaper3.pdf
Aziz,,,
Right. Though it would be hard if you're maintaining somebody
else's code.
> On the other hand, I question
> whether we would lose some convenience in being able to say, wrap
> objects up in pairs, to pass them around, SET them and do the like,
> and then be able to plop them nicely into lists later on or compose
> them naturally, which is somewhat unnatural to do with VECTORs.
I agree. I didn't say making pairs immutable would be convenient
or natural in all places. Being able to do
(cond
[(assq x ls) => (lambda (p) (set-cdr! p y))]
---)
is very convenient.
> I'm not saying that it couldn't be done naturally, but to me at
> least, it would feel weird to have to start having one set of
> procedures for handling my mutable stuff, and the other for my
> immutable. I kind of like being able to use the same operations
> on both. *shrug*
Well yes. But I doubt you actually use the same operation on both
pairs that you do mutate and those that you don't mutate. Like
you never mutate a list while mapping over it, right?
> Still, maybe the benefits are worth it. Is there a real world
> benefit that I would be able to see?
Many, and I won't even begin to list them all. Matthew Flatt's
blog post lists some of the benefits. Hash-consing is potentially
another. Being able to constant-fold "cons" and copy-propagate
the result is another. Better garbage collection behavior for
immutable data (maybe). Apply/variadic-lambda can be done way
more efficiently (O(N) vs. O(N^2) if you use them recursively).
You get simpler semantics and more efficient implementation for
all