I have mostly stayed out of the mutable pair discussion taking place here and on Scheme-reports because I've been too busy. However, I would like to point out a few things about the discussion that I have not seen mentioned yet. I've included the closest relevant message set I could find below, but I'll be talking about the whole discussion.
Firstly, let's talk about the code in question: MAP. The proposed dangerous code is:
Aside from the obvious point that you wouldn't actually want to check if it were a list before you started processing. People seem to have ignored what R6RS did to address these issues when it comes to the MAP function. The R5RS definition is flawed because it is underspecified. It doesn't explain what to do in cases when the PROC is mutating the list. At least, I don't see any language in there to deal with that. On the other hand, let's take a look at the R6RS language for this procedure:
> The lists should all have the same length. Proc should accept as many > arguments as there are lists and return a single value. Proc should not > mutate any of the lists.
> The map procedure applies proc element-wise to the elements of the lists > and returns a list of the results, in order. Proc is always called in the > same dynamic environment as map itself. The order in which proc is applied > to the elements of the lists is unspecified. If multiple returns occur > from map, the values returned by earlier returns are not mutated.
> (map cadr ’((a b) (d e) (g h)))
> ⇒ (b e h)
> (map (lambda (n) (expt n n))
> ’(1 2 3 4 5))
> ⇒ (1 4 27 256 3125)
> (map + ’(1 2 3) ’(4 5 6)) ⇒ (5 7 9)
> (let ((count 0))
> (map (lambda (ignored)
> (set! count (+ count 1)) > count)
> ’(a b))) ⇒ (1 2) or (2 1)
> Implementation responsibilities: The implementation should check that the > lists all have the same length. The implementation must check the > restrictions on proc to the extent performed by applying it as described. > An implementation may check whether proc is an appropriate argument before > applying it.
Let's first notice the primary characteristics:
*) MAP does not have a specified order of evaluating the list elements. *) PROC should not mutate the list *) MAP must return a fresh copy of the lists when it returns multiple times. *) Implementations must check the list lengths *) Implementations must check the PROC but only to the extent of applying it.
I'm not sure what the last point means, but it appears that even the most compliant of Scheme implementations doesn't actually check the entire procedure code for assignments to the list elements, or anything of that nature, so I'm going to wait until one of the designers of R6RS explains what that means.
In the mean time, let's revisit the above example of MAP. Obviously it doesn't match the above specification, but it can avoid some of this because it's only dealing with one list. From a standards point of view, we can assume that the procedure doesn't mutate the list. And then we can say that its the programmer's fault if something bad happens because a bad PROC was passed to MAP. From the discussion, I gather that this sort of thinking is frowned upon by some.
Nonetheless, that's the only problem I see with the above definition. I've seen statements that the MAP *over* specifies because it enforces an order of evaluation on the arguments, but that's not true. In fact, you aren't guaranteed to get an error when you run this function with a bad PROC.
Well wait a tick, wasn't this supposed to fail? At least, that's what we've been told would happen. The example code above says that we should have encountered an unexpected error that we haven't handled. Moreover, the claim that we have overspecified the MAP isn't accurate, because that's exactly what's happening here, we're evaluating the order in the opposite direction, right to left.
If we evaluate the order from right to left instead of left to right, then we can avoid this entire problem. If we grab all the pairs in the list before we ever apply the procedure, we can avoid the mutation of the list causing us problems like an unencountered error. We can still see the effects of the mutation to the list if, instead, we mutate the CAR cell. However, this is much less problematic.
So, we can write a "safe" MAP, and it's not that hard either. Unfortunately, this would require us to specify an order of evaluation on the lists. Still, no one has pointed this out yet, that I can see. Forgive me if someone already has.
So, there is still some benefit to immutable pairs, because it let's us use the straightforward idiom for these sorts of things. Still, I don't think this is strong enough a reason. Especially -- and now we're moving into opinion territory -- I really don't think that having two different copies of the exact same structure, with one mutable, and the other not, is a good idea. So we should have immutable bytevectors, and mutable ones? Vectors? Strings? I'm sorry, but I don't think that we should have this sort of duality by forcing two disjoint types to exist for every primitive datatype for which we could do so. I'm saying that on a point of design I don't like the choice to go with two different data structures as the solution.
Another point that I don't think people have brought up, though David, I think, alluded to it, is that if the procedure you're passing can mutate this list, it's safe to assume that the list came from the outside world. That is, you can't construct a situation in which your private structures are in danger because a procedure from the outside world is altering some lists. Thus, it is possible to ensure that your code and data is safe, because if the malicious code at hand can already mutate your data, it can already do so; it doesn't need to use MAP.
This does require some programming discipline: we expect a programmer not to write a procedure that mutates the lists also being passed. Scheme is full of these situations. We expect them not to call string-set! on vectors, for example. I'm not saying that it isn't an issue, but I don't think it's an issue strong enough to warrant breaking backwards compatibility and introducing a separate, redundant data type that is exactly the same as the previous one, except that you can't mutate it. I'm not in favor of this as a core approach to the language in the standard.
Still, I think Eli is right that we shouldn't be so quick to dismiss immutable pairs outright. We need to be honest about the issues they address and their advantages, of which there are plenty of good things.
Aaron W. Hsu
On Monday 14 June 2010 10:07:38 Benjamin L. Russell wrote:
> Alex Shinn <alexsh...@gmail.com> writes: > > On Mon, Jun 14, 2010 at 10:08 AM, Brian Harvey <b...@cs.berkeley.edu> wrote: > >>> From: DekuDekup...@Yahoo.com (Benjamin L. Russell) > >>> Wow, you're really in favor of giving PLT the proverbial kick in the > >>> rear-end.
> >> That is not how I would characterize the situation. PLT have chosen, > >> as they say explicitly, to design a language that is based on Scheme > >> but is not Scheme. More power to them! May the best language win.
> > For those not subscribed, one of the PLT developers has responded
> To quote Barzilay's response in part: > > The problem is demonstrated by Matthew's post, the one that Benjamin > > mentioned, but in a later post he (Benjamin) made the mistake of
> > saying: > > | [...] allowing proofs of correctness for Scheme programs to be > > | written much more easily [...]
> > and following this line, you reply: > > > Over fifty years ago, John McCarthy brilliantly invented a > > > programming language loosely based on lambda calculus, but designed > > > to be used, not just to be reasoned about [...]
> > and all of this is as if immutable pairs are something that fills a > > kind of a theoretical/masochistic need for self punishment in the form > > of correctness proofs. Something that *real* programmers never have > > to deal with, only "theoreticians of bondage and discipline > > languages". That couldn't be more *wrong*. On the side of proofs of > > correctness, there is nothing interesting, really. The bottom line is
> > can be proved as a correct implementation of map *assuming* no changes > > to the list ever occur in the call to `f', or more generally during > > the dynamic extent of the call to `map'. If you're one of these > > theoretical people, there's not much else to see here.
> > The problem is on the side of the *working* programmer -- the one who > > lives in the *real* world. These people will often make the same > > assumption (and as Matthew points out, they almost
> This does require some programming discipline: we expect a programmer not to > write a procedure that mutates the lists also being passed. Scheme is full of > these situations. We expect them not to call string-set! on vectors, for > example.
This is exactly the gist of the problem, however; how do you know whether a given programmer follows this discipline? The whole point of functional programming is to make it less likely for the programmer to write dangerous code.
While maintaining backwards-compatibility with legacy code is important, I don't see why it should constrain the evolution of the language.
This issue is as much a social as a technology issue, if not more so; the choice of a language where programmers can do almost anything that they want (including shooting themselves in the foot) vs. a language designed so that programmers normally cannot shoot themselves in the foot is a social one, not a technological one. It comes down to how intelligent we assume the programmer to be.
Of course if we assume that the programmer is intelligent enough not to shoot himself/herself in the foot, then we would prefer the language with more freedom to do anything, but mistaken assumptions can occur, and how do we guarantee that they won't occur? Are we to assume that programmers never use Scheme for mission-critical applications, or that all Scheme programmers are intelligent enough never to shoot themselves in the foot? Schemers must be allowed to be as creative as possible, but I don't see why mutable lists promotes a better style of programming than immutable lists for the potential future.
If all Schemers were star programmers who never made mistakes, then why did anybody bother to invent immutable lists in one implementation with many users in the first place? At least some users of that implementation thought that the problem was significant enough to conduct an experiment and make lists immutable by default, even at the cost of changing the language into something that was no longer Scheme.
While I *don't* think that immutable lists are a good idea now for the standard, I *do* think that they are a good goal for the future. Backwards-compatibility with legacy code must be maintained, but Scheme should also evolve toward becoming a language which reduces unnecessary burden on programmers. I think that the way to do this now is to keep mutable lists as default for this standard, but relegant set-car! and set-cdr! to modules, and to make those modules optional, so that implementations are less likely to include them in the future.
> Still, I think Eli is right that we shouldn't be so quick to dismiss immutable > pairs outright. We need to be honest about the issues they address and their > advantages, of which there are plenty of good things.
There seems to have been a lot of misunderstanding in the related discussion thread on Scheme-Reports at http://lists.scheme-reports.org/pipermail/scheme-reports/2010-June/00... (for some reason, the site seems unavailable at the moment). Just for the record, let me point out that Barzilay was only opposed to mutable *lists* by default, not to other mutable data structures.
> On Monday 14 June 2010 10:07:38 Benjamin L. Russell wrote: >> Alex Shinn <alexsh...@gmail.com> writes: >> > On Mon, Jun 14, 2010 at 10:08 AM, Brian Harvey <b...@cs.berkeley.edu> wrote: >> >>> From: DekuDekup...@Yahoo.com (Benjamin L. Russell) >> >>> Wow, you're really in favor of giving PLT the proverbial kick in the >> >>> rear-end.
>> >> That is not how I would characterize the situation. PLT have chosen, >> >> as they say explicitly, to design a language that is based on Scheme >> >> but is not Scheme. More power to them! May the best language win.
>> > For those not subscribed, one of the PLT developers has responded
>> To quote Barzilay's response in part: >> > The problem is demonstrated by Matthew's post, the one that Benjamin >> > mentioned, but in a later post he (Benjamin) made the mistake of
>> > saying: >> > | [...] allowing proofs of correctness for Scheme programs to be >> > | written much more easily [...]
>> > and following this line, you reply: >> > > Over fifty years ago, John McCarthy brilliantly invented a >> > > programming language loosely based on lambda calculus, but designed >> > > to be used, not just to be reasoned about [...]
>> > and all of this is as if immutable pairs are something that fills a >> > kind of a theoretical/masochistic need for self punishment in the form >> > of correctness proofs. Something that *real* programmers never have >> > to deal with, only "theoreticians of bondage and discipline >> > languages". That couldn't be more *wrong*. On the side of proofs of >> > correctness, there is nothing interesting, really. The bottom line is
>> > can be proved as a correct implementation of map *assuming* no changes >> > to the list ever occur in the call to `f', or more generally during >> > the dynamic extent of the call to `map'. If you're one of these >> > theoretical people, there's not much else to see here.
>> > The problem is on the side of the *working* programmer -- the one who >> > lives in the *real* world. These people will often make the same >> > assumption (and as Matthew points out, they almost always do), so they >> > write the above definition and then they're done with it, or rather >> > they *think* that they're done. But the problem starts with the fact >> > that Scheme is a language with first-class values -- and `map' happens >> > to be a function that uses that: it is a higher order function. This >> > means that you're passing a mutable value to random code -- and this >> > random code can now break invariants that your own code (`map', in >> > this case) depends on unless you're extra careful.
>> Point acknowledged. So Barzilay's view is that the main problem with >> mutable pairs is not about the difficulty of writing proofs of >> correctness, but about introducing "mutable value[s] to random code," >> which can now "break invariants that your own code ... depends on." For >> a "working programmer," this is the main concern.
-- Benjamin L. Russell / DekuDekuplex at Yahoo dot com http://dekudekuplex.wordpress.com/ Translator/Interpreter / Mobile: +011 81 80-3603-6725 "Furuike ya, kawazu tobikomu mizu no oto." -- Matsuo Basho^
>While maintaining backwards-compatibility with legacy code is important, >I don't see why it should constrain the evolution of the language.
I think this is a red herring. All the arguments I've seen have been for the /desirability/ of retaining programmer choice, not about the need to maintain reluctant compatibility with old code.
I vote that set-car! and set-cdr! be mandatory and not hidden in optional modules in WG1. Not for the sake of old code but for the sake of what I guess is an old philosophy: that computer programs should not think they're smarter than their users.
> This issue is as much a social as a technology issue, if not more so; > the choice of a language where programmers can do almost anything that > they want (including shooting themselves in the foot) vs. a language > designed so that programmers normally cannot shoot themselves in the > foot is a social one, not a technological one. It comes down to how > intelligent we assume the programmer to be.
This, I think, is a key observation: not to make it impossible, just less likely.
I think there's a common misconception that (if I may exaggerate!) the choice is between languages that give you raw pointers and the ability to self-modify your code through destructive mutation of procedures... and languages that don't give you conditionals as they make static analysis difficult and the programmer might forget to do something important in one of the branches so they might introduce bugs.
The suggestion here isn't to make mutation impossible - merely to make it require more explicit effort, so it is more 'obvious', and therefore less likely to happen in unexpected places.
Whether to make pairs immutable by default, or just their cdrs, or just to make set-car! and set-cdr! require an explicit import to 'enable' them in any given module, is really just the detail of how this might be done.
The fact that much library code might be surprised by list mutations in code from outside the library is a good argument in favour of making list mutations more difficult; however, the burden can also be placed on the other side - somebody who mutates a list should, perhaps, as that is the exception rather than the rule, take the time to carefully consider who else might have a reference to that list, and how they might be affected by the mutation. That suggests the "make set-*! be in a module that needs explicit loading" approach.
However, making cons cells by immutable *has* worked well for PLT, as far as I have heard - apparently the uses of cons cell mutation are few, far between, and often handled well by other approaches; which does indeed suggest that making too many sacrifices (both in terms of the complexity of analysis cases due to the possibility of mutation, and in terms of implementation efficiency by prohibiting various tricks that can be performed when things are immutable) might not actually be worth it.
It's a tough decision; certainly not an open-and-shut case!
> While I *don't* think that immutable lists are a good idea now for the > standard, I *do* think that they are a good goal for the future. > Backwards-compatibility with legacy code must be maintained, but Scheme > should also evolve toward becoming a language which reduces unnecessary > burden on programmers. I think that the way to do this now is to keep > mutable lists as default for this standard, but relegant set-car! and > set-cdr! to modules, and to make those modules optional, so that > implementations are less likely to include them in the future.
Yes, I also think that's quite a pragmatic solution for the short term.
> However, making cons cells by immutable *has* worked well for PLT, as > far as I have heard - apparently the uses of cons cell mutation are few,
That's my understanding, as well. That's what I heard from the few PLT people I talked with the other day, and is pretty much my own experience.
I did feel some major pain having to convert *one* library of mine not to return mutable pairs. It was the very first Scheme library I ever wrote, which wasn't what I'd call idiomatic Scheme. My CSV-parsing library required a somewhat unfortunate but trivial change. The only other pain I felt was in making a DrScheme language for SICP exercises.
Also, if I had to choose between immutable and mutable pairs, I would choose immutable. For example, before PLT moved the default for pairs to be immutable, I had already begin using immutable pairs in one of my libraries, for time&space-efficient representation of paths of Web URLs, especially when relative URLs are resolved wrt a base URL. (Using mutable pairs would've been a nightmare, as exposing the representation could mean that one not-unlikely oops by a library user could corrupt every URL in the system, or it would've meant lots and lots more consing to expose only copies and would've taken away one of my algorithmic efficiencies from shared tails as well.) The specifics here are just one concrete example of why I like immutable pairs -- I've often found similar situations when trying to write robust libraries that share lists with calling code.
On 16 June 2010 10:38, Neil Van Dyke <n...@neilvandyke.org> wrote:
> Alaric Snell-Pym wrote at 06/16/2010 05:16 AM:
> However, making cons cells by immutable *has* worked well for PLT, as far >> as I have heard - apparently the uses of cons cell mutation are few,
> That's my understanding, as well. That's what I heard from the few PLT > people I talked with the other day, and is pretty much my own experience.
Well pretty much my initial experience was a large disaster, wasting hours of my time trying to figure out what was going on, why it was going on, and if I should actually care about it. And all of this just so that I could use PLT's debugger. My personal suspicion is that it has worked well - for people who are exclusive users of PLT. I tend to write in an R4RS dialect because I actually work with several different platforms and the *fastest* implementations are all compatible with no report much than that.
I tend to think that ASSOC should just work, and that I can SET-CAR!/SET-CDR! (depending on the representation) the cons cells in my a-lists with impunity. I guess that wanting to modify a dictionary without an O(N) cost is one of those things that makes me a troglodyte programmer, but you know, I do get paid to grab problems by the hair and drag them into the light.
Now don't get me wrong, if PLT want to diverge on this, then more power to them. I've long advocated the notion of Scheme as a diverse family of languages, but I do feel that, in a language which has mutability semantics designed in, that to start making aggregate objects immutable somehow violates those original language design decisions in a decidedly non-trivial way. Perhaps we should just issue a report which says:
"Scheme has now become SML with an s-expression surface syntax and a top-level union type which is comprised of all other types"
There we go. Job over. (and personally, I *really like* SML, so this is not pejorative although it may so seem). No more mutable cons cells. In fact, in order to make *anything* mutable you have to explicitly allocate a box for it. That will be so much safer than all the unconstrained mutation of variables going on in current Scheme programs. No more libraries will be crashing because someone wrote an idiotic element function which was passed to a combinator. Life will be good.
> Maybe, but SET-CAR! and SET-CDR! are not bound by default in an R6RS > top-level. That certainly invokes a level of annoyance that makes them as > good as optional for most practical purposes.
Given the required machinery of an R6RS program (R6RS doesn't have a REPL), adding one extra import isn't a big deal. I'm proposing that for WG1 purposes, a REPL is allowed to pre-import an implementation-defined set of modules.
> Module-izing SET-CAR! and SET-CDR! is more than a little bit different from > module-izing IO.
Doubtless. But what differences do you have in mind?
-- "Well, I'm back." --Sam John Cowan <co...@ccil.org>
> *) Implementations must check the PROC but only to the extent of > applying it.
> I'm not sure what the last point means, but it appears that even the > most compliant of Scheme implementations doesn't actually check the > entire procedure code for assignments to the list elements, or anything > of that nature, so I'm going to wait until one of the designers of > R6RS explains what that means.
Actually it says "check the restrictions on PROC", not "check the PROC". The restrictions on PROC are that it is actually a procedure, that it accepts the relevant number of arguments, and that it returns exactly one result. These all can be checked by applying the procedure, but may be checked in advance of applying it.
-- John Cowan co...@ccil.org http://www.ccil.org/~cowan Most languages are dramatically underdescribed, and at least one is dramatically overdescribed. Still other languages are simultaneously overdescribed and underdescribed. Welsh pertains to the third category. --Alan King
> I vote that set-car! and set-cdr! be mandatory and not hidden in optional > modules in WG1. Not for the sake of old code but for the sake of what I > guess is an old philosophy: that computer programs should not think > they're smarter than their users.
I do not think that word ("mandatory") means what you think it means. To make something mandatory is to say that an implementation that does not provide it is non-conformant. Put otherwise, an optional feature is one that a conformant program cannot rely on a conformant implementation providing. R5RS examples are COS, MAKE-RECTANGULAR, WITH-INPUT-FROM-FILE, and LOAD.
But perhaps I miss your point here.
-- Here lies the Christian, John Cowan judge, and poet Peter, http://www.ccil.org/~cowan Who broke the laws of God co...@ccil.org and man and metre.
> I do not think that word ("mandatory") means what you think it means. > To make something mandatory is to say that an implementation that does > not provide it is non-conformant.
That's exactly what I meant. An implementation that does not provide SET-CAR! and SET-CDR! automatically (that is, without the user having to import modules or anything else other than just use them) should be, I propose, non-conformant. Not Scheme. Some other language, maybe even a good language, but not Scheme.
IMO this should be true of both WG1 and WG2 dialects. Nothing called "Scheme" is allowed not to provide SET-CAR! and SET-CDR!.
On Thursday 17 June 2010 00:32:38 John Cowan wrote:
> Actually it says "check the restrictions on PROC", not "check the PROC". > The restrictions on PROC are that it is actually a procedure, that it > accepts the relevant number of arguments, and that it returns exactly > one result. These all can be checked by applying the procedure, but > may be checked in advance of applying it.
Looking at the documentation, it says that PROC should not mutate the lists. That's also a restriction on PROC.
On Thursday 17 June 2010 01:55:49 Brian Harvey wrote:
> John Cowan: > > I do not think that word ("mandatory") means what you think it means. > > To make something mandatory is to say that an implementation that does > > not provide it is non-conformant.
> That's exactly what I meant. An implementation that does not provide > SET-CAR! and SET-CDR! automatically (that is, without the user having > to import modules or anything else other than just use them) should be, > I propose, non-conformant. Not Scheme. Some other language, maybe even > a good language, but not Scheme.
> IMO this should be true of both WG1 and WG2 dialects. Nothing called > "Scheme" is allowed not to provide SET-CAR! and SET-CDR!.
I agree that every Scheme should provide mutable pairs. It doesn't make sense to allow them to get ride of these. However, as you will have read in my other proposal, I think that it makes sense to allow other Schemes to choose to make immutable pairs the default if they so choose. This won't adversely affect you, but it will allow others to take advantage of immutable pairs in their own code if they want.