Total complexity of Scheme, or "It's Always Snowing Somewhere"

Lassi Kortela

Apr 21, 2020, 7:13:12 AM

to scheme-re...@googlegroups.com

Hello,

Finally joined this group, and trying to stay in a more passive
observer's role compared to SRFI work.

What spurred me to join (asbestos suit on) is to ask: Do we have
effective tools to track the total complexity of the language?

As mentioned elsewhere occasionally, I strongly believe that simple
languages are fun to use and the extra conceptual load in complex
languages has not been useful enough to justify its presence in my
experience. That's why I would advocate for a total complexity budget
for R7RS-large (by some metric yet to be determined - preferably a
quantifiable one, though an informal metric is better than none).

I'm not at all concerned about the number of procedures in the language,
but about the number of data types. It's fine to have 50 or even 100
procedures to operate on one data type as long as that type is simple
and serves a purpose clearly distinct from other types.

Another thing that concerns me is the complexity of the evaluation
model. As an advocate of keyword arguments, I freely admit that I'm
adding to the problem as much as anyone. As a rule, any new language
construct comes with a few problems that must be solved by adding
something concrete to the language (a procedure, a macro, read syntax,
or an evaluation rule). The unfortunate corollary is that any new
construct interacts with other constructs in nontrivial ways, thereby
making the language itself more complex. A construct has to be useful in
a variety of situations to justify the inevitable problems it brings.
Since "useful" basically means "solves problems with the existing
constructs", it's easy even for expert designers to be coerced by
circumstances into a patchwork mentality in successive versions of a
language.

Scheme still has an exceptionally clean core language. That's our main
technical and cultural asset against competing languages like Rust, C++,
Haskell, Racket, Common Lisp, and Clojure. (I'm not sure how complex
OCaml and Erlang are in their latest editions, but I suspect they are in
a similar situation as Scheme: simple, but on track to become complex.)

John mused in another WG2 thread: "Interestingly, I think the
combination of homoiconicity *and* hygiene is the only remaining thing
unique to Scheme." I believe this appraisal is too modest - a
significant part of what makes Scheme unique and well-liked is what it
doesn't have. Language design sets the tone for the culture around the
language; a simple language encourages programmers to avoid fancy tricks
and get to the point in their own code as well. There's a special joy in
using a language that generally offers one obvious way to solve a problem.

If we don't fight to preserve that feel, R7RS-large is on track to be
Racket-lite. Racket, in turn, can now be considered either CL-lite or
CL-large depending on who you ask. I'd love to be proven wrong about
that but can't see any other conclusion. It seems any growing Lisp
dialect becomes either a snowball or a mudball. CL has been overtly
muddy from the start. Racket started with snow but muddied itself; the
same can now be cautiously said of Clojure. Scheme is still snow but we
are swiftly following suit and muddying it. Is there an ecological niche
for a muddy Scheme that is different from the one occupied by Racket and
CL? Is a muddy Scheme still Scheme?

Pondering these issues, and going out on a limb for a bit, the saying
"It's Always Snowing Somewhere" is on my mind. It's an expression of
affection among skiers and snowboarders, acknowledging that no matter
how dull and uninspiring the everyday environment that others seem to
take for granted, there is always a place where they can sneak off to do
their favorite thing and find other like-minded people if they just make
the necessary personal sacrifices. I like it as a metaphor for Scheme
and think it would make for a good slogan. Scheme's ecological niche:
the Lisp that doesn't muddy.

Back on a more technical track, the particular features imminently
adding to Scheme's complexity are:

- Keyword arguments (my fault).

- More complex macro systems.

- Different kinds of strings.

- Mutable vs immutable data.

- Variety of collection objects with overlapping purposes.

If WG2 is concerned about exceeding a complexity budget, the concrete
steps to stay snowy would be:

- Figure out a way to measure complexity (taking a page from Alan Perlis
and Rich Hickey: complexity comes not from the number of elements but
from the number of their interactions).

- Figure out which existing features to remove and which new features
to keep out in order to stay in budget.

If the language charter gets in the way, I would strongly suggest
finding a way to amend it. It's not my decision, but I believe in the
end everyone involved with Scheme wants above all to arrive at a good
language that clearly stands out from other languages in a positive way.
As agreed several times before, R7RS-large2 or R8RS would not be a good
move - if we want to build morale and confidence in Scheme, we have to
stick with R7RS-large and get it right. We are fortunate to have a
simple foundation with an excellent library system in R7RS-small so any
necessary charter amendments will hopefully be modest.

All comments on the above are gratefully accepted, even flamey ones.

Marc Nieper-Wißkirchen

Apr 21, 2020, 7:49:53 AM

to scheme-re...@googlegroups.com

Thank you for sharing your thoughts, Lassi. I agree with you that
simplicity versus complexity is an important asset of the Scheme
language and that we mustn't make R7RS-large more complex than
necessary. Probably all Schemers think this way. Where thoughts will
differ, however, is what is supposed to be simple and what is supposed to
be complex. Then there is simplicity/complexity from a Scheme user's
point of view and then there is simplicity/complexity from an
implementer's point of view.

Personally, I don't think that a simple metric, or even a single one,
will suffice to measure the complexity of the emerging standard.
Different aspects have to be measured individually. There is the
evaluation model. There is the syntactic model. There is the lexical
syntax. There are libraries.

To me, what is important is that we have a core with clear semantics
in terms of which the rest of the features can be implemented as
libraries. Minimalism in the sense of Scheme does not necessarily
mean minimalism of some implementation; it means that everything can
be reduced or explained in terms of a simple core. If this is the
case, I do not care in principle whether R7RS-large ships with 100 or
1000 libraries or whether these libraries define 100 or 1000 types.
Interoperability and clear separation of concerns are more important
than the sheer numbers.

To assure that we have clear semantics of the evaluation model, it
would help if someone tried to write down the semantics formally as it
is done for R7RS-small in the report. By writing down things
formally, unpolished edges are most easily perceived.

You have talked about finding a metric to measure complexity, Lassi.
An important point to be aware of is that such a metric, if it is a
good one, won't be monotone in the number of features. Some features
are so good when added that they decrease the overall complexity
because other features become explainable in terms of them. One
example that comes to mind is proper tail calls, which allow
explaining iteration through recursion. Another example is a
sufficiently rich macro system. It's important that the macro system
itself has clear semantics. When it is rich enough, it will allow the
reduction of many other advanced features to core features, decreasing
the overall complexity.
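
A minimal sketch of both points, assuming an R7RS-small implementation.
The names `sum-to', `while' and `count-up' are made up for illustration
and are not part of any standard. The named `let' runs in constant stack
space only because tail calls are proper, and `while' is then an
ordinary `syntax-rules' macro layered on top rather than a core feature:

```scheme
(import (scheme base))

;; Iteration expressed as recursion: the call to `loop` is in tail
;; position, so a conforming implementation must not grow the stack.
(define (sum-to n)
  (let loop ((i 1) (acc 0))
    (if (> i n)
        acc
        (loop (+ i 1) (+ acc i)))))

;; A `while` loop reduced to the core. Hypothetical macro, not in R7RS.
;; Hygiene keeps the macro's `loop` from clashing with user bindings.
(define-syntax while
  (syntax-rules ()
    ((_ test body ...)
     (let loop ()
       (when test
         body ...
         (loop))))))

;; Uses `while` to collect 0 .. n-1 in order.
(define (count-up n)
  (let ((i 0) (acc '()))
    (while (< i n)
      (set! acc (cons i acc))
      (set! i (+ i 1)))
    (reverse acc)))
```

Here (sum-to 100) evaluates to 5050 and (count-up 3) to (0 1 2).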

Sometimes adding a feature may simplify the overall picture because it
increases the expressiveness of the language. Scheme has first-class
continuations. R7RS-large could get delimited continuations as well.
I don't think that this would make R7RS-large more complex (unless,
maybe, I am an implementer). On the contrary, it would allow the user
to more clearly state their intent when they are using continuations
as a number of first-class continuations we currently use should be
delimited continuations on theoretical and practical grounds.

You have given keyword arguments as another example. Whether their
addition would complicate the language or not is one thing, but what's
more important to me is whether they would be added uniformly to the
language and whether all libraries would make use of them in a uniform
sense. Some existing R7RS-large libraries already use symbols to add
optional flags to procedures. We could use symbols in plist form as
well for optional arguments (which is, apart from the lexical syntax,
one of the two runtime models for keyword arguments we have discussed).
But if we choose proper keyword arguments for the latter, we should
probably revisit the older libraries.

I think that R7RS-large has once been advertised by John as a language
larger than Common Lisp. That's not bad per se, not at all. A
snowflake or a crystal has no inherent size limit.

Marc

P.S.: Can you explain why you think that Racket has muddied itself?

Arne Babenhauserheide

Apr 21, 2020, 3:56:06 PM

to scheme-re...@googlegroups.com

Marc Nieper-Wißkirchen <marc....@gmail.com> writes:

> Thank you for sharing your thoughts, Lassi. I agree with you that
> simplicity versus complexity is an important asset of the Scheme
> language and that we mustn't make R7RS-large more complex than
> necessary. Probably all Schemers think this way. Where thoughts will
> differ, however, is what is supposed to be simple and what is supposed to
> be complex. Then there is simplicity/complexity from a Scheme user's
> point of view and then there is simplicity/complexity from an
> implementer's point of view.

I don’t see forced complexity in many interactions. If these interactions
follow a uniform scheme (no pun intended), then many of them collapse
to the same concept, so there’s no added complexity.

Complexity arises where the interactions do not follow uniform patterns
and where they have edge cases.

Best wishes,
Arne
--
To be apolitical
is to be political
without noticing it

Lassi Kortela

Apr 21, 2020, 4:13:41 PM

to scheme-re...@googlegroups.com

Thank you once again for your thoughtful remarks.

> Personally, I don't think that a simple metric, or even a single one,
> will suffice to measure the complexity of the emerging standard.
> Different aspects have to be measured individually. There is the
> evaluation model. There is the syntactic model. There is the lexical
> syntax. There are libraries.

The problem is that users don't keep those things separate in their
heads. In order to use a language effectively, we need to keep all of
them in mind at once. The problem is more severe for people who visit
the language only casually now and then - they forget more easily.

The libraries are somewhat an exception - as long as you understand
what all the data types do, it doesn't matter if you don't know all
about the procedures for manipulating them. Those don't touch other
aspects of the language so you don't need to be aware of them unless
you specifically need them.

We agree that the more of the language that can be stashed away in
libraries that don't affect other libraries, the better. However, if
there are very many of them it brings a related problem: they have
overlapping functionality which makes it hard to select the right one.
So even if the libraries are disjoint on a technical level, they are
not disjoint on a common sense level. The Platonic ideals of a list and
a vector are almost identical, for example. That's the level on which
we think about most stuff in our lives, and mainly experts think with
finer distinctions. One of the tenets of good design is to make
something non-experts can comfortably use to achieve practical goals.
I'm not suggesting to get rid of vectors, but to avoid adding many
more datatypes in this vein without taking anything out in return.

> To assure that we have clear semantics of the evaluation model, it
> would help if someone tried to write down the semantics formally as
> it is done for R7RS-small in the report. By writing down things
> formally, unpolished edges are most easily perceived.

This is fine as long as the semantics are simple enough.

> You have talked about finding a metric to measure complexity, Lassi.
> An important point to be aware of is that such a metric, if it is a
> good one, won't be monotone in the number of features. Some features
> are so good when added that they decrease the overall complexity
> because other features become explainable in terms of them. One
> example that comes to mind is proper tail calls, which allow
> explaining iteration through recursion. Another example is a
> sufficiently rich macro system.

Fully agree with all of this. Tail calls and closures are two examples
of absolutely wonderful abstractions.

> It's important that the macro system itself has clear semantics.
> When it is rich enough, it will allow the reduction of many other
> advanced features to core features, decreasing the overall
> complexity.

As a user I like both symbol macros (identifier syntax) and
syntax-case, but every time I use them I wonder whether they decrease
overall system complexity (where system = language + program).

> Sometimes adding a feature may simplify the overall picture because
> it increases the expressiveness of the language. Scheme has
> first-class continuations. R7RS-large could get delimited
> continuations as well. I don't think that this would make R7RS-large
> more complex (unless, maybe, I am an implementer).

That may be true - delimited continuations are out of my technical
depth. Real continuations are difficult on JVM, CLR and probably
JavaScript targets. So far, Scheme seems to have done well by having
continuations in the standard, but working without them as well.

> You have given keyword arguments as another example. Whether their
> addition would complicate the language or not is one thing, but
> what's more important to me is whether they would be added uniformly
> to the language and whether all libraries would make use of them in
> a uniform sense. Some existing R7RS-large libraries already use
> symbols to add optional flags to procedures. We could use symbols in
> plist form as well for optional arguments (which is, apart from the
> lexical syntax, one of the two runtime models for keyword arguments we
> have discussed). But if we choose proper keyword arguments for the
> latter, we should probably revisit the older libraries.

Agree with all of this, except for one thing - the manual arg parsing
does seem too spartan to me. If everybody does it in practice, the
language should offer some help. But if we add keyword arguments, they
are similar (Platonic ideals) to records, hashtables, alists, and
plists. I'm critiquing my own favorite here - I'm not entirely happy
with it. Just as an example that this general direction (reducing the
number of similar-but-different concepts in the language) would
probably be fruitful.

> I think that R7RS-large has once been advertised by John as a
> language larger than Common Lisp. That's not bad per se, not at all.
> A snowflake or a crystal has no inherent size limit.

Enthusiastically agreed. Size is no problem at all, and may in fact be
a good thing. It's only the size of the core that causes problems,
since every core feature is prone to interact with many others and
users will eventually need to understand those interactions.

In fact, the original metaphor ("Scheme is like a ball of snow. You can
add any amount of snow to it and it still looks like a ball of snow.
Moreover, snow is cleaner than mud.") can be read as encouraging the
writing of big systems in Scheme. R7RS-large is an enabler of that, and
that's what makes it a laudable effort.

Marc Nieper-Wißkirchen

Apr 23, 2020, 4:09:09 PM

to scheme-re...@googlegroups.com

On Tue, Apr 21, 2020 at 22:13, Lassi Kortela <la...@lassi.io> wrote:

[...]

> > Personally, I don't think that a simple metric, or even a single one,
> > will suffice to measure the complexity of the emerging standard.
> > Different aspects have to be measured individually. There is the
> > evaluation model. There is the syntactic model. There is the lexical
> > syntax. There are libraries.
>
> The problem is that users don't keep those things separate in their
> heads. In order to use a language effectively, we need to keep all of
> them in mind at once. The problem is more severe for people who visit
> the language only casually now and then - they forget more easily.

I disagree. Well, I disagree with respect to well-designed languages.
If the language semantics are irregular or if the libraries do not
have a clear dependence or purpose, the problem you speak of may
arise. Otherwise, a library a user doesn't know about (or has
forgotten about) is no worse than a library that has never been there.

The beginner may start writing programs with R7RS-small's list
procedures. Later, the programmer may discover SRFI 1 and some
functional procedures like `fold' and will start to employ them. At an
even later step when there may be performance problems, the user can
discover hash tables and, finally, persistent data structures like
those in SRFI 146. The existence of any these libraries SRFI 1, SRFI
125 or SRFI 146 will have made any program written in the process
worse.

[...]

> We agree that the more of the language that can be stashed away in
> libraries that don't affect other libraries, the better. However, if
> there are very many of them it brings a related problem: they have
> overlapping functionality which makes it hard to select the right one.
> So even if the libraries are disjoint on a technical level, they are
> not disjoint on a common sense level. The Platonic ideals of a list and
> a vector are almost identical, for example. That's the level on which
> we think about most stuff in our lives, and mainly experts think with
> finer distinctions. One of the tenets of good design is to make
> something non-experts can comfortably use to achieve practical goals.
> I'm not suggesting to get rid of vectors, but to avoid adding many
> more datatypes in this vein without taking anything out in return.

There are programmers who understand the difference between O(1) and
O(n). Others may not or may not care. For the first class of
programmers, lists and vectors are two very different Platonic ideals.
For the second class, it may not be the case but it doesn't matter.
The second class won't write programs where the difference between
`list-ref' or `vector-ref' would make much of a difference so it
actually doesn't matter which library they choose. (If it starts to
make a difference, these programmers have to learn about algorithmic
complexity anyway.)
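
As a rough illustration of the O(n) versus O(1) distinction, assuming
R7RS-small: `my-list-ref' below is a hypothetical re-implementation of
what `list-ref' has to do, written out only to make the walk visible.

```scheme
(import (scheme base))

;; What `list-ref` must do: follow k `cdr` links before it can answer,
;; so the cost grows with k.
(define (my-list-ref lst k)
  (if (= k 0)
      (car lst)
      (my-list-ref (cdr lst) (- k 1))))

(define lst '(a b c d e))
(define vec (vector 'a 'b 'c 'd 'e))

(define from-list (my-list-ref lst 3))   ; walks three pairs first
(define from-vector (vector-ref vec 3))  ; indexes directly
```

Both lookups return the symbol d; only the cost of getting there
differs, which is why the choice is harmless for small data.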

> As a user I like both symbol macros (identifier syntax) and
> syntax-case, but every time I use them I wonder whether they decrease
> overall system complexity (where system = language + program).

The majority may not ever write any macro themselves at all. For these
people, the language doesn't become more complicated by the addition
of `syntax-rules' because it is something that takes place on another
planet. The same analogy works for users for whom `syntax-rules'
macros are enough. Whether `syntax-case' or `er-macro-transformer' is
part of the language or not is mostly irrelevant to them.

A good macro system can, however, help to make libraries more uniform.
All `syntax-rules' macros behave similarly (and hygienically) so all
syntax defined in terms of them behaves predictably. Macros written
with `syntax-case' can easily catch syntax errors and the error
reporting is easily on an equal footing with the error reporting of
native syntax, etc.
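
A small sketch of what "hygienic" buys in practice, assuming R7RS-small.
`swap!' and `demo' are illustrative names, not standard ones. The macro
introduces a temporary called `tmp', and the caller also happens to have
a variable called `tmp', yet the two never collide:

```scheme
(import (scheme base))

;; Exchange the values of two variables. The `tmp` bound here belongs
;; to the macro and is kept apart from any `tmp` at the use site.
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

(define (demo)
  (let ((tmp 1) (other 2))
    (swap! tmp other)
    (list tmp other)))
```

(demo) evaluates to (2 1). With a non-hygienic textual expansion the
user's `tmp' would have been shadowed and the swap would silently fail.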

> That may be true - delimited continuations are out of my technical
> depth. Real continuations are difficult on JVM, CLR and probably
> JavaScript targets. So far, Scheme seems to have done well by having
> continuations in the standard, but working without them as well.

To me that Scheme has first-class continuations is a defining property
of the language; it's as defining as proper tail calls and even more
defining than hygienic macros, which weren't standardized before R5RS.
The Python way is to change the implementation and add new native
syntax to implement something like PEP 255; the Scheme way is to write
just 10 magic lines with call/cc to arrive at the
`make-coroutine-generator' procedure of SRFI 158 (maybe with some
syntactic sugar from SRFI 190).
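
To give a feel for those "magic lines", here is a sketch assuming
R7RS-small. It follows the general idea behind SRFI 158's
`make-coroutine-generator' but is not its reference implementation; the
names `coroutine->generator' and `drain' are hypothetical.

```scheme
(import (scheme base))

(define (coroutine->generator proc)
  (define return #f)   ; continuation of the current generator call
  (define resume #f)   ; continuation inside `proc` after the last yield
  (define (yield v)
    (call/cc
     (lambda (k)
       (set! resume k)   ; remember where to pick up next time
       (return v))))     ; hand `v` to whoever called the generator
  (lambda ()
    (call/cc
     (lambda (k)
       (set! return k)
       (if resume
           (resume #f)   ; continue `proc` just after its last yield
           (begin
             (proc yield)  ; first call: start running `proc`
             ;; `proc` has finished: every later call returns eof.
             (set! resume (lambda (ignored) (return (eof-object))))
             (return (eof-object))))))))

;; Call a generator until it returns an eof object; collect the values.
(define (drain gen)
  (let loop ((acc '()))
    (let ((v (gen)))
      (if (eof-object? v)
          (reverse acc)
          (loop (cons v acc))))))

(define numbers
  (drain (coroutine->generator
          (lambda (yield) (yield 1) (yield 2) (yield 3)))))
```

`numbers' ends up as (1 2 3). No change to the evaluator or the syntax
was needed, which is the contrast with the Python approach drawn above.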

[...]

> Agree with all of this, except for one thing - the manual arg parsing
> does seem too spartan to me. If everybody does it in practice, the
> language should offer some help. But if we add keyword arguments, they
> are similar (Platonic ideals) to records, hashtables, alists, and
> plists. I'm critiquing my own favorite here - I'm not entirely happy
> with it. Just as an example that this general direction (reducing the
> number of similar-but-different concepts in the language) would
> probably be fruitful.

The question of parsing at the callee side is orthogonal to the
question of how the call looks at the call site. We have already seen
`case-lambda', `let-optionals', and `let-keywords', which are possible
syntaxes to abstract the parsing but are independent of a new keyword
object type.
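
For instance, `case-lambda' from R7RS-small already lets the callee
handle an optional argument without any keyword object type. This is
only a sketch; `greet' is a made-up procedure.

```scheme
(import (scheme base) (scheme case-lambda))

;; One clause per argument count; the short form supplies the default
;; and delegates to the full form.
(define greet
  (case-lambda
    ((name) (greet name "Hello"))
    ((name greeting) (string-append greeting ", " name "!"))))
```

(greet "Lassi") gives "Hello, Lassi!" and (greet "Lassi" "Moi") gives
"Moi, Lassi!". The call site looks like any other procedure call.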

[...]

Arne Babenhauserheide

Apr 24, 2020, 9:18:15 AM

to scheme-re...@googlegroups.com

This is where I see the added value of r7rs large over SRFIs: r7rs large
can choose and meld the possibilities of SRFIs such that they form a
consistent language with uniform patterns.

We already have SRFIs. Their complexity is quite large. r7rs large can
reduce it.

John Cowan

Apr 25, 2020, 4:42:48 PM

to scheme-re...@googlegroups.com

On Tue, Apr 21, 2020 at 7:13 AM Lassi Kortela <la...@lassi.io> wrote:

> I'm not at all concerned about the number of procedures in the language,
> but about the number of data types. It's fine to have 50 or even 100
> procedures to operate on one data type as long as that type is simple
> and serves a purpose clearly distinct from other types.

I think that is too much to ask. It is sufficient to have just one compound type in the system, namely pairs. Everything else is for the sake of either convenience or efficiency, and as a result they will overlap. We have alists, plists, hash tables, mappings, hashmaps, and OKVSes, which are all overlapping; there are reasons to use one rather than another, but I wouldn't want to say that there is a rigid formula dictating which one to use in any given situation.

> It seems any growing Lisp
> dialect becomes either a snowball or a mudball. CL has been overtly
> muddy from the start. Racket started with snow but muddied itself; the
> same can now be cautiously said of Clojure. Scheme is still snow but we
> are swiftly following suit and muddying it.

There's always going to be a certain amount of mud if you care about backward compatibility at all. Nobody realized how muddy mutability was when Scheme was designed, so everything is mutable with a handful of exceptions: globals, lexical bindings, pairs, strings. Unless we start again from scratch as Kernel did, we aren't going to be able to change that. But we can provide modular alternatives.

By adding things as modularly as possible, we minimize the dependencies. If you don't care about the mutability of strings, you can use them in an immutable way and not worry about it. If it's critical to your application, you have texts, which also provide certain guarantees that strings cannot. As far as I can think of, the only novel widespread dependency is comparators. Dictionaries may become a dependency too, so that key-value stores can be passed between procedures without having to agree on a specific type of dictionary.

> Back on a more technical track, the particular features imminently
> adding to Scheme's complexity are:
>
> - Keyword arguments (my fault).

I really really want these for dictionaries in some form. The register-dictionary! procedure has about 30 arguments, the methods you can perform on a given dictionary type, all but six of which are optional. I may go with pure portable let-keywords, but I'd rather have SRFI-177, if the (ugly) parenthesized-keywords syntax can be eliminated in favor of a fence symbol like &kw. Alex Shinn says it can be done with syntax-rules, in which case it would be more portable.

> - More complex macro systems.

That ship has emphatically sailed: essentially every Scheme has one or more of them except for some toy systems (even Chibi has two). The question that remains is: syntax-case or explicit renaming or both.

> - Different kinds of strings.

Retrofitting Unicode into Scheme has made that inevitable, given that people want O(1) behavior and don't want UTF-32. It's possible to have a union type of mutable and immutable strings, but not portably.

> - Mutable vs immutable data.

The R5RS all-mutable system is obsolete, and the attempt of R6RS to segregate string and pair mutability into separate libraries is purely formal: no implementation I know of is capable of exploiting the absence of those libraries to have a different and more efficient runtime.

> - Figure out which existing features to remove and which new features
> to keep out in order to stay in budget.

Backward incompatibility has been a major problem for both R6RS and Python 3. Let's not go there.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org
Babies are born as a result of the mating between men and women,
and most men and women enjoy mating.
--Isaac Asimov in Earth: Our Crowded Spaceship

John Cowan

Apr 25, 2020, 4:43:16 PM

to scheme-re...@googlegroups.com

On Tue, Apr 21, 2020 at 7:49 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

> To me, what is important is that we have a core with clear semantics
> in terms of which the rest of the features can be implemented as
> libraries. Minimalism in the sense of Scheme does not necessarily
> mean minimalism of some implementation; it means that everything can
> be reduced or explained in terms of a simple core.

That's only possible for the "computer science" part of Scheme. A simple core can't handle SRFI 170, for instance.

> R7RS-large could get delimited continuations as well.
> I don't think that this would make R7RS-large more complex (unless,
> maybe, I am an implementer). On the contrary, it would allow the user
> to more clearly state their intent when they are using continuations
> as a number of first-class continuations we currently use should be
> delimited continuations on theoretical and practical grounds.

I agree. We should also have one-shot continuations, delimited and undelimited, because of their greater efficiency. They can be defined in terms of normal continuations, but implemented better.

> what's
> more important to me is whether they would be added uniformly to the
> language and whether all libraries would make use of them in a uniform
> sense.

I think that is neither practical (we have too many libraries already that *don't* use them) nor even desirable. Keywords are a way of working around the problem of too many arguments, but many functions don't *have* too many arguments. We don't want to write (car :pair foo) or (cons :car 1 :cdr 2). If you want Smalltalk, you know where to find it (and even Smalltalk special-cases one and two arguments).

> Some existing R7RS-large libraries already use symbols to add
> optional flags to procedures. [...]
>
> But if we choose proper keyword arguments for the latter, we should
> probably revisit the older libraries.

Which ones do you have in mind?

> The existence of any these libraries SRFI 1, SRFI
> 125 or SRFI 146 will have made any program written in the process
> worse.

Will not have made any program worse, I suppose.

> To me that Scheme has first-class continuations is a defining property
> of the language; it's as defining as proper tail calls and even more
> defining than hygienic macros, which weren't standardized before R5RS.

I agree, except I think that 20+ years with syntax-rules has made it defining by now.

On Fri, Apr 24, 2020 at 9:18 AM Arne Babenhauserheide <arne...@web.de> wrote:

> This is where I see the added value of r7rs large over SRFIs: r7rs large
> can choose and meld the possibilities of SRFIs such that they form a
> consistent language with uniform patterns.

As far as possible, but no further. There are and will remain minor inconsistencies between, say, (scheme list) and (scheme vector), or between (scheme set) and (scheme integer-set), that I am not going to make any effort to smooth out. A few idiosyncrasies give the language flavor, and may even act as a bit of a memory hook.

Marc Nieper-Wißkirchen

Apr 25, 2020, 5:01:50 PM

to scheme-re...@googlegroups.com

On Sat, Apr 25, 2020 at 22:43, John Cowan <co...@ccil.org> wrote:

[...]

>> To me, what is important is that we have a core with clear semantics
>> in terms of which the rest of the features can be implemented as
>> libraries. Minimalism in the sense of Scheme does not necessarily
>> mean minimalism of some implementation; it means that everything can
>> be reduced or explained in terms of a simple core.
>
> That's only possible for the "computer science" part of Scheme. A simple core can't handle SRFI 170, for instance.

Agreed. And also that R7RS-large is more than the pure language you
would teach in CS classes. Nevertheless, SRFI 170 is not the worst
example for the point I have tried to make: Assuming that an FFI were
feasible, the procedures of SRFI 170 could (conceivably) be reduced to
POSIX calls mapped to the Scheme world through the FFI so that all one
has to understand is the FFI mapping and the POSIX spec. (This is just
hypothetical because we have no such FFI but I think it is a good
example of what I mean by reductions to some core.)

[...]

>> what's
>> more important to me is whether they would be added uniformly to the
>> language and whether all libraries would make use of them in a uniform
>> sense.
>
>
> I think that is neither practical (we have too many libraries already that *don't* use them) nor even desirable. Keywords are a way of working around the problem of too many arguments, but many functions don't *have* too many arguments. We don't want to write (car :pair foo) or (cons :car 1 :cdr 2). If you want Smalltalk, you know where to find it (and even Smalltalk special-cases one and two arguments).

Haha, no, I didn't mean to add keywords where they don't make sense.
By "uniformly" I meant to use one uniform mechanism to handle "too
many" (optional) arguments across libraries.

>> Some existing R7RS-large libraries already use symbols to add
>> optional flags to procedures. [...]
>> But if we choose proper keyword arguments for the latter, we should
>> probably revisit the older libraries.
>
>
> Which ones do you have in mind?

For example, `make-hash-table' from SRFI 125 suggests using symbols to
tweak the hash table. Or `string-join' takes a flag, the grammar,
which is a symbol. This is one reason why I find using the simple
plist-symbol-approach to optional keyword arguments not too unsexy
because symbols (and not keyword objects) have been used as flags so
far. The optional keyword arguments we are talking about are nothing
but such flags together with a value.
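
A sketch of that plist-symbol approach, assuming R7RS-small. `plist-ref'
and `range' are hypothetical names chosen for illustration; neither
comes from an existing SRFI. Options are passed as alternating quoted
symbols and values after the required argument:

```scheme
(import (scheme base))

;; Look up `key` in a flat (key value key value ...) list, returning
;; `default` when the key is absent.
(define (plist-ref plist key default)
  (cond ((null? plist) default)
        ((null? (cdr plist)) (error "option without a value" (car plist)))
        ((eq? (car plist) key) (cadr plist))
        (else (plist-ref (cddr plist) key default))))

;; Integers from 'start (default 0) up to but excluding `end`,
;; advancing by 'step (default 1).
(define (range end . options)
  (let ((start (plist-ref options 'start 0))
        (step  (plist-ref options 'step 1)))
    (let loop ((i start) (acc '()))
      (if (>= i end)
          (reverse acc)
          (loop (+ i step) (cons i acc))))))
```

(range 5) gives (0 1 2 3 4) and (range 10 'start 4 'step 3) gives (4 7).
The flags are plain symbols, so no new lexical syntax or object type is
involved; the cost is that the callee parses the list at run time.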

[...]

>> The existence of any these libraries SRFI 1, SRFI
>> 125 or SRFI 146 will have made any program written in the process
>> worse.
>
>
> Will not have made any program worse, I suppose.

Haha, yes!

Marc Nieper-Wißkirchen

Apr 25, 2020, 5:25:11 PM

to scheme-re...@googlegroups.com

On Sat, Apr 25, 2020 at 22:42, John Cowan <co...@ccil.org> wrote:

[...]

> I think that is too much to ask. It is sufficient to have just one compound type in the system, namely pairs. Everything else is for the sake of either convenience or efficiency, and as a result they will overlap. We have alists, plists, hash tables, mappings, hashmaps, and OKVSes, which are all overlapping; there are reasons to use one rather than another, but I wouldn't want to say that there is a rigid formula dictating which one to use in any given situation.

I agree with John here.

>>
>> It seems any growing Lisp
>> dialect becomes either a snowball or a mudball. CL has been overtly
>> muddy from the start. Racket started with snow but muddied itself; the
>> same can now be cautiously said of Clojure. Scheme is still snow but we
>> are swiftly following suit and muddying it.

I still don't see why Racket is particularly muddy.

> There's always going to be a certain amount of mud if you care about backward compatibility at all. Nobody realized how muddy mutability was when Scheme was designed, so everything is mutable with a handful of exceptions: globals, lexical bindings, pairs, strings. Unless we start again from scratch as Kernel did, we aren't going to be able to change that. But we can provide modular alternatives.

Kernel is a nice idea, but I think it has gone in a different direction
than Scheme. With Kernel's fexprs etc., it is highly dynamic, while
Scheme, with its module system, the way `eval' is defined, and its macro
system, is much more static and suitable for compilation.

[...]

>> Back on a more technical track, the particular features imminently
>> adding to Scheme's complexity are:
>>
>> - Keyword arguments (my fault).
>
>
> I really really want these for dictionaries in some form. The register-dictionary! procedure has about 30 arguments, the methods you can perform on a given dictionary type, all but six of which are optional. I may go with pure portable let-keywords, but I'd rather have SRFI-177, if the (ugly) parenthesized-keywords syntax can be eliminated in favor of a fence symbol like &kw. Alex Shinn says it can be done with syntax-rules, in which case it would be more portable.

Even with a fence symbol, you won't get rid of the `call/kw' (or
however you name it) in front of the call. And that's really ugly.
It's okay if one wants to write code that can address various (native)
keyword systems at once, but the keyword system that will finally be
chosen for R7RS-large mustn't depend on this in my opinion.

[...]

> That ship has emphatically sailed: essentially every Scheme has one or more of them except for some toy systems (even Chibi has two). The question that remains is: syntax-case or explicit renaming or both.

We have already seen a number of SRFIs or ideas (e.g. while discussing
SRFI 177) that can only be coded in a library with a macro system
strictly more powerful than syntax-rules. Thus, it makes a lot of
sense to have a more powerful macro system in R7RS-large.

> The R5RS all-mutable system is obsolete, and the attempt of R6RS to segregate string and pair mutability into separate libraries is purely formal: no implementation I know of is capable of exploiting the absence of those libraries to have a different and more efficient runtime.

As soon as you have access to `eval' and therefore, potentially, to
every library, the segregation is also moot.

> Backward incompatibility has been a major problem for both R6RS and Python 3. Let's not go there.

I agree! Unfortunately, R7RS-small has also introduced a number of
incompatibilities with R6RS where they wouldn't have been necessary.
Wherever possible, I hope that R7RS-large is able to fill some gaps
between R6RS and R7RS.

Per Bothner

Apr 25, 2020, 8:42:18 PM

to scheme-re...@googlegroups.com

On 4/25/20 1:42 PM, John Cowan wrote:
> Backward incompatibility has been a major problem for both R6RS and Python 3. Let's not go there.

If the lesson from Python 3 is "never break compatibility" then I think that is the wrong lesson.
Python 3 was a mess for a number of reasons: It was a big change that affected almost all code,
and migration was needlessly painful: It was difficult (at least for a long time) to write
code that worked on both Python 2 and Python 3; Python 3 for a while was missing some
functionality (for dealing with bytestrings); etc etc.

For Scheme parsimony and elegance are even more important than for Python. Hence
(to repeat a broken record) sometimes breaking compatibility (as in SRFI 140) is
a lesser evil than extra complexity and inelegance (as in SRFI 135).
--
--Per Bothner
p...@bothner.com http://per.bothner.com/

Marc Nieper-Wißkirchen

Apr 26, 2020, 3:45:08 AM

to scheme-re...@googlegroups.com

If the library facility of Scheme can be employed so that restoring
compatibility would just consist of replacing one import with another,
this would be fine. In one regard, the current module system is not
powerful enough, namely with respect to the lexical syntax for
strings, which should produce immutable strings in the "new" standard
and (immutable) mutable strings in a compatibility mode. However, this
could easily be achieved by introducing a reader flag. This is much
better than using "strange" unicode characters as suggested in SRFI
135 for texts.

(By the way, I don't think that mutable strings should go away in
Scheme; there are use cases for them. However, their mutability would
have to be extended so that one can insert arbitrary strings, delete
substrings, etc. In other words, so that they could serve as a buffer
for an editor.)

Arne Babenhauserheide

Apr 26, 2020, 4:24:41 AM

to scheme-re...@googlegroups.com, John Cowan

John Cowan <co...@ccil.org> writes:
> On Fri, Apr 24, 2020 at 9:18 AM Arne Babenhauserheide <arne...@web.de>
> wrote:
>
>> This is where I see the added value of r7rs large over SRFIs: r7rs large
>> can choose and meld the possibilities of SRFIs such that they form a
>> consistent language with uniform patterns.
>
> As far as possible, but no further. There are and will remain minor
> inconsistencies between, say, (scheme list) and (scheme vector), or between
> (scheme set) and (scheme integer-set), that I am not going to make any
> effort to smooth out. A few idiosyncrasies give the language flavor, and
> may even act as a bit of a memory hook.

Some of these inconsistencies/non-uniformities are due to actual
differences in requirements — so they are fundamental complexity in
solving the problem efficiently. We cannot get rid of those.

Also we can’t just add fixes that break existing code.

As an aside: One thing that has been jarring for me in the beginning is
that the order of arguments is different in assoc and list-ref:

cons key list ; => list
member key list ; => sublist or #f
assoc key alist ; => cons pair or #f
alist-cons key value alist ; => alist (srfi-1)
vs.
list-ref list index ; => list-element
take list count ; => sublist (srfi-1)

But I now think that that’s something we will need more work to explain
properly, because these are used differently so they are optimized for
their use-case.

I’ve been planning to write a <datastructure>-item SRFI for a while that
returns a cons-pair just like assoc (I repeatedly hit the case where I
want to check for membership and then use the result, and assoc fits
that case well, but I don’t have "vector-assoc-by-index").
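
A sketch of what such a lookup could look like for vectors, assuming an R7RS-small implementation. The name `vector-item` is hypothetical and not taken from any SRFI; it returns a pair the way `assoc` does, so the result can be tested and used in one step.

```scheme
(import (scheme base))

;; Hypothetical name, not from any SRFI: return (index . element) when INDEX
;; is valid for VEC, and #f otherwise, mirroring what assoc does for alists.
(define (vector-item vec index)
  (and (exact-integer? index)
       (<= 0 index)
       (< index (vector-length vec))
       (cons index (vector-ref vec index))))

;; Usage: check for membership and use the result via cond's => clause.
(define (describe vec index)
  (cond ((vector-item vec index) => cdr)
        (else 'missing)))
```

Returning a pair rather than the bare element keeps a stored #f distinguishable from "no such index".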

John Cowan

Apr 26, 2020, 7:17:17 PM

to scheme-re...@googlegroups.com

On Sun, Apr 26, 2020 at 3:45 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

> If the library facility of Scheme can be employed so that restoring
> compatibility would just consist of replacing one import with another,
> this would be fine.

Actually, such a design would have severe problems in a conventional separate-compilation environment, as illustrated by the numeric-tower library in Chicken 4. The core implementation provided only fixnums and flonums; if you imported the library, you got bignums, ratios, and exact and inexact complex numbers as well, and all of Scheme's usual generic functions were redefined by the module system. (The advantage of a simple fixnum-flonum tower is that all numeric operations run in a fixed amount of time.)

The difficulty arose when trying to mix libraries that were tower-aware with those that were not. Passing a ratio to a procedure compiled without the tower would not work even if the ratio happened to be integral: the procedure would throw an error to the effect that its argument was not a number. To make matters worse, the C FFI was not tower-aware, so that if a C function returned a 64-bit integral value that was too large to fit in a 63-bit Chicken fixnum, it would be converted to a flonum, which is not what the caller (if tower-aware) would expect.

In Chicken 5, the internals were redesigned to handle intermediate bignums very efficiently, and the full tower became part of the core. All these problems went away. But it was no small effort to convert *all* of Chicken's C internals, including the FFI, to handle all the representations of numbers correctly. This is exactly parallel to what would happen in an attempt to have a string type which in some modules is only the standard mutable strings and in other modules is a union type of mutable and immutable strings. In the end it would be necessary for all conformant Schemes to rework their internals as Chicken 5 did, and the more of that there is, the fewer conformant Schemes there will be in the world. That's an outcome I wish to avoid.

> This is much
> better than using "strange" unicode characters as suggested in SRFI
> 135 for texts.

I'm surprised that anyone familiar with a language other than English would call guillemets "strange". But perhaps #"..." would be better, or at least easier to type.

> (By the way, I don't think that mutable strings should go away in
> Scheme; there are use cases for them. However, their mutability would
> have to be extended so that one can insert arbitrary strings, delete
> substrings, etc. In other words, so that they could serve as a buffer
> for an editor.)

The APIs in SRFIs 118, 140, and 185 specify the ability to replace any substring (inserting and deleting being degenerate cases of this). But they do not require the O(n) behavior you'd want in an editor.
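
To illustrate why insertion and deletion are degenerate cases of substring replacement, here is a non-mutating sketch using only R7RS-small primitives. The names `string-splice`, `string-insert`, and `string-delete-range` are hypothetical and are not the procedures those SRFIs define.

```scheme
(import (scheme base))

;; Hypothetical helper: return a fresh string in which the characters of S
;; from START (inclusive) to END (exclusive) are replaced by REPLACEMENT.
;; The whole string is copied, so the cost is linear in its length.
(define (string-splice s start end replacement)
  (string-append (substring s 0 start)
                 replacement
                 (substring s end (string-length s))))

;; Insertion replaces an empty range.
(define (string-insert s index new)
  (string-splice s index index new))

;; Deletion replaces a range with the empty string.
(define (string-delete-range s start end)
  (string-splice s start end ""))
```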

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Why are well-meaning Westerners so concerned that the opening of a
Colonel Sanders in Beijing means the end of Chinese culture? [...]
We have had Chinese restaurants in America for over a century,
and it hasn't made us Chinese. On the contrary, we obliged the Chinese
to invent chop suey. --Marshall Sahlins

Arne Babenhauserheide

Apr 27, 2020, 4:44:37 PM

to scheme-re...@googlegroups.com, John Cowan

John Cowan <co...@ccil.org> writes:
> I'm surprised that anyone familiar with a language other than English would
> call guillemets "strange". But perhaps #"..." would be better, or at least
> easier to type.

I think that limiting the special characters starting extensions to # is
a good thing. For me one of the main contributors to the clean feeling
of Scheme is that it mostly avoids giving special meanings to
characters.

Having the symbol syntax |symbol| already strikes me as odd — it strays
far from the common characters in prose. I much preferred the #{symbol}#
form in Guile, even though it is more verbose and does not look as nice.
It stayed closer to the idea that # is what starts a special form.

"if you see # you need to look out for special handling" is easy to
remember.

I’m also somewhat annoyed by seeing => as a shorthand, because it
reminds me of the -> pointer syntax in C that’s a big impediment for
reading as a newcomer. It makes code feel very different from prose.

Marc Nieper-Wißkirchen

Apr 28, 2020, 2:44:39 AM

to scheme-re...@googlegroups.com

Am Mo., 27. Apr. 2020 um 01:17 Uhr schrieb John Cowan <co...@ccil.org>:

> Actually, such a design would have severe problems in a conventional separate-compilation environment, as illustrated by the numeric-tower library in Chicken 4. The core implementation provided only fixnums and flonums; if you imported the library, you got bignums, ratios, and exact and inexact complex numbers as well, and all of Scheme's usual generic functions were redefined by the module system. (The advantage of a simple fixnum-flonum tower is that all numeric operations run in a fixed amount of time.)

Separate-compilation environments also have other problems as soon as
low-level macros come into play. As shown in the paper "Implicit
phasing for R6RS libraries" by Dybvig/Ghuloum, a Scheme library must
only be expanded once so that all parts of the program use the same
expansion. This implies, in particular, that as soon as you change one
library you also have to recompile all libraries dependent on this
one.

>> This is much
>> better than using "strange" unicode characters as suggested in SRFI
>> 135 for texts.
>
>
> I'm surprised that anyone familiar with a language other than English would call guillemets "strange". But perhaps #"..." would be better, or at least easier to type.

"Strange" in the sense that they are not part of the ASCII charset,
which I can easily access from my keyboard.

A specific problem with the guillemets is that they are used in French
in the form « bizarre » and in German in the form »seltsam«.

The idea of using #"..." looks much better to me (although I still
would like to have a reader flag with which I can switch between
mutability and immutability of the "..." syntax).

Now that I think about it, I am wondering whether it has been
discussed to use symbols for immutable strings. We would get O(1)
comparison for free and we have already a reader syntax, namely |...|.

>> (By the way, I don't think that mutable strings should go away in
>> Scheme; there are use cases for them. However, their mutability would
>> have to be extended so that one can insert arbitrary strings, delete
>> substrings, etc. In other words, so that they could serve as a buffer
>> for an editor.)
>
>
> The API in SRFIs 118, 140, and 185 specify the ability to replace any substring (inserting and deleting being degenerate cases of this). But they do not require the O(n) behavior you'd want in an editor.

Exactly. Once we have agreed upon how to handle immutable strings
(most use cases), we should impose bounds for the algorithmic
complexity for these replacement operations.

-- Marc

Lassi Kortela

Apr 28, 2020, 3:39:47 AM

to scheme-re...@googlegroups.com

> If the lesson from Python 3 is "never break compatibility" then I think
> that is the wrong lesson.

And Scheme has far fewer users than Python, so any breakage has much
less of an impact.

> Python 3 was a mess for a number of reasons: It was a big change that
> affected almost all code,

Yes

> and migration was needlessly painful: It was difficult (at least for a
> long time) to write
> code that worked on both Python 2 and Python 3;

Bingo. This was by far the biggest mistake, and one we should take care
not to repeat with Scheme. IMHO it's fine to be a little incompatible as
long as there are reasonable workarounds.

> For Scheme parsimony and elegance are even more important than for
> Python. Hence
> (to repeat a broken record) sometimes breaking compatibility (as in SRFI
> 140) is
> a lesser evil than extra complexity and inelegance (as in SRFI 135).

+1

Arne Babenhauserheide

Apr 28, 2020, 3:46:05 AM

to scheme-re...@googlegroups.com, Marc Nieper-Wißkirchen

Marc Nieper-Wißkirchen <marc....@gmail.com> writes:

> Am Mo., 27. Apr. 2020 um 01:17 Uhr schrieb John Cowan <co...@ccil.org>:
> Now that I think about it, I am wondering whether it has been
> discussed to use symbols for immutable strings. We would get O(1)
> comparison for free and we have already a reader syntax, namely |...|.

I thought we could already do this …

All we need is a SRFI that maps string operations to symbol operations,
right?

And we would also get all alist key operations for free.
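
A rough sketch of what mapping string operations onto symbols might look like, assuming an R7RS-small implementation. The names `symbol-length` and `symbol-append` are hypothetical here. The code builds symbols with `string->symbol` so it does not depend on the reader accepting |...| notation; with such a reader, '|hello world| denotes the same symbol.

```scheme
(import (scheme base))

;; Hypothetical names: string-like operations on symbols, implemented by
;; converting through the symbol's name.
(define (symbol-length sym)
  (string-length (symbol->string sym)))

(define (symbol-append . syms)
  (string->symbol (apply string-append (map symbol->string syms))))

;; Symbols with the same name are the same object, so eq? suffices for
;; comparison and assq can be used for alist lookup.
(define greetings
  (list (cons (string->symbol "hello world") 'en)
        (cons (string->symbol "hallo welt") 'de)))

(define (greeting-language text)
  (let ((entry (assq (string->symbol text) greetings)))
    (and entry (cdr entry))))
```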

Lassi Kortela

Apr 28, 2020, 4:00:09 AM

to scheme-re...@googlegroups.com

>> I'm surprised that anyone familiar with a language other than English would call guillemets "strange".

> "Strange" in the sense that they are not part of the ASCII charset,
> which I can easily access from my keyboard.
>
> A specific problem with the guillemets is that they are used in French
> in the form « bizarre » and in German in the form »seltsam«.
>
> The idea of using #"..." looks much better to me

This is interesting, but way over the level of sophistication that
belongs in a spec :) Strings should be one of the most ordinary things
in a language, literally [no pun intended] "hello world"-level stuff.
Please, let's stick with Scheme's old string syntax for immutable ones
as well.

> (although I still
> would like to have a reader flag with which I can switch between
> mutability and immutability of the "..." syntax).

Indeed a reader flag seems like the least obtrusive choice. I don't see
any obvious problems with it.

> Now that I think about it, I am wondering whether it has been
> discussed to use symbols for immutable strings. We would get O(1)
> comparison for free and we have already a reader syntax, namely |...|.

Marc Feeley suggested it. The trouble is, symbols traditionally have a
different connotation than strings in Lisp. They are mostly used for
program-level stuff, whereas strings are for user-level stuff. Using
vertical-bar notation for immutable strings would still look weird, even
though a vertical bar is only one ASCII character.

Marc Nieper-Wißkirchen

Apr 28, 2020, 4:18:37 AM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 10:00 Uhr schrieb Lassi Kortela <la...@lassi.io>:

[...]

> > Now that I think about it, I am wondering whether it has been
> > discussed to use symbols for immutable strings. We would get O(1)
> > comparison for free and we have already a reader syntax, namely |...|.
>
> Marc Feeley suggested it. The trouble is, symbols traditionally have a
> different connotation than strings in Lisp. They are mostly used for
> program-level stuff, whereas strings are for user-level stuff. Using
> vertical-bar notation for immutable strings would still look weird, even
> though a vertical bar is only one ASCII character.

I am not so sure about this "traditionally". LISP 1.5 defines an
atomic symbol as a "string of no more than thirty numerals and capital
letters". In fact, these are the only types of strings in LISP 1.5.

Anyway, I don't think there would be many problems with respect to
different connotations. When you see a symbol written as 'foo, it will
most likely be what you call "program-level stuff"; when you see a
symbol written as |foo|, it will most likely be what you call
"user-level stuff".

In fact, I have seen a number of beginners who used (mutable) strings
for data-directed programming because (coming from other languages)
the concept of symbols was new to them. For these people, it would
become easier.

I agree that the vertical-bar notation is still weird, but this can be
solved optionally with a reader flag as discussed before.

Marc

Lassi Kortela

Apr 28, 2020, 4:33:58 AM

to scheme-re...@googlegroups.com

> I am not so sure about this "traditionally". LISP 1.5 defines an
> atomic symbol as a "string of no more than thirty numerals and capital
> letters". In fact, these are the only types of strings in LISP 1.5.

OK, that's going all the way back :)

> Anyway, I don't think there would be many problems with respect to
> different connotations. When you see a symbol written as 'foo, it will
> most likely be what you call "program-level stuff"; when you see a
> symbol written as |foo|, it will most likely be what you call
> "user-level stuff".

There would still be two kinds of read syntax for strings, and the
runtime distinction between program-level and user-level stuff would be
lost. `symbol?` could no longer be used to test for program-level stuff.

And you could basically call a string :)

(define (|join two strings| a b)
  (string-append (symbol->string a) (symbol->string b)))

(|join two strings| '|hello| '|world|)

Marc Nieper-Wißkirchen

Apr 28, 2020, 4:44:07 AM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 10:33 Uhr schrieb Lassi Kortela <la...@lassi.io>:

[...]

> There would still be two kinds of read syntax for strings, and the
> runtime distinction between program-level and user-level stuff would be
> lost. `symbol?` could no longer be used to test for program-level stuff.

It's a bit weird if such tests are needed.

Lassi Kortela

Apr 28, 2020, 4:49:03 AM

to scheme-re...@googlegroups.com

> It's a bit weird if such tests are needed.

It's common to have S-expression-based syntax where symbols represent
keywords of the schema, and strings represent user data. At least that's
what I've always used that distinction for:

(connect "example.com" port "http")
(error 404 message "Not found")

and things like that. To me, it would seem weird if ("connect" "foo")
and (connect "foo") mean the same thing in a DSL. To me symbols are an
integral part of DSLs, which in turn are one of the hallmarks of Lisp.
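
A small sketch of how a DSL reader might rely on that distinction, assuming an R7RS-small implementation. The procedure `classify-form` is hypothetical: it tags each item of a flat form as a schema keyword or as user data using `symbol?` and `string?`.

```scheme
(import (scheme base))

;; Hypothetical classifier for flat forms like the ones above: symbols are
;; treated as schema keywords, strings and numbers as user data.
(define (classify-form form)
  (map (lambda (item)
         (cond ((symbol? item) (list 'keyword item))
               ((string? item) (list 'data item))
               ((number? item) (list 'data item))
               (else (list 'unknown item))))
       form))
```

If immutable strings were represented as symbols, the first two branches could no longer be told apart by type alone.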

Arne Babenhauserheide

Apr 28, 2020, 4:59:41 AM

to scheme-re...@googlegroups.com, Lassi Kortela

Lassi Kortela <la...@lassi.io> writes:

>> If the lesson from Python 3 is "never break compatibility" then I
>> think that is the wrong lesson.
>
> And Scheme has far fewer users than Python, so any breakage has much
> less of an impact.

This is only true if you look at all of humanity.

The impact is on the people you least want to lose: your existing users.
Any breakage kills a lot of momentum, and you need that momentum — your
current users — to reach more users.

And that’s just as bad for Scheme as it was for Python.

>> and migration was needlessly painful: It was difficult (at least for
>> a long time) to write
>> code that worked on both Python 2 and Python 3;
>
> Bingo. This was by far the biggest mistake, and one we should take
> care not to repeat with Scheme. IMHO it's fine to be a little
> incompatible as long as there are reasonable workarounds.

Things you should never do — painfully learned — are things which
prevent flagship programs from updating.

That Python made it a huge task for Mercurial to switch to Python 3, one
of the programs that did everything the Pythonic way and achieved
speed competitive with tools written in C, was a big mistake.

That Guile 2 broke Lilypond — the one Guile-using tool which reigns
supreme in its domain — cost Guile dearly and blocked adoption of
Guile 2 for a long time. Part of this block was a performance regression
that’s resolved with Guile 3, other parts were painfully resolved over
several years.

Another thing you should not do is break user scripts, so that everyone
who bought into your system has to touch them.

Any time people have to invest work to keep using your system, you lower
the bar of moving to something else. It creates a breaking point where
your existing users might wander off.

Also see Volatile Software, which describes this much better than I do:
http://stevelosh.com/blog/2012/04/volatile-software/

Marc Nieper-Wißkirchen

Apr 28, 2020, 5:07:30 AM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 10:49 Uhr schrieb Lassi Kortela <la...@lassi.io>:

> It's common to have S-expression-based syntax where symbols represent
> keywords of the schema, and strings represent user data. At least that's
> what I've always used that distinction for:
>
> (connect "example.com" port "http")
> (error 404 message "Not found")

These are examples of simple plists. Every second value is a symbol
describing what is right of it.

You also want to have symbols as associated values; for example with
only a small number of "ports", it makes sense to turn "http" into the
symbol 'http. When doing a lookup, you want a simple pointer
comparison and do not want to compare strings character by character.
(Moreover, 'http can be used in case-constructs while "http" cannot).
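
For illustration, a sketch of the `case` point, assuming an R7RS-small implementation. The procedure names are hypothetical; the port numbers are the usual well-known defaults.

```scheme
(import (scheme base))

;; case compares with eqv?, which identifies symbols by name but is not
;; guaranteed to match two equal-looking string objects.
(define (default-port scheme-name)
  (case scheme-name
    ((http) 80)
    ((https) 443)
    ((ssh) 22)
    (else #f)))

;; User input arrives as a string and is converted once at the boundary.
(define (default-port/string str)
  (default-port (string->symbol str)))
```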

Arne Babenhauserheide

Apr 28, 2020, 5:36:05 AM

to scheme-re...@googlegroups.com, Lassi Kortela

Lassi Kortela <la...@lassi.io> writes:

>> It's a bit weird if such tests are needed.
>
> It's common to have S-expression-based syntax where symbols represent
> keywords of the schema, and strings represent user data. At least
> that's what I've always used that distinction for:
>
> (connect "example.com" port "http")

You could simply use 'symbol for keywords and |…| for the data to see
this distinction in code.

And validate not with symbol? but with the name of the symbol.

I still think that |my symbol with special chars| looks strange, but I
think it should work.

John Cowan

Apr 28, 2020, 1:09:29 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 4:00 AM Lassi Kortela <la...@lassi.io> wrote:

> A specific problem with the guillemets is that they are used in French
> in the form « bizarre » and in German in the form »seltsam«.

True enough, though Germans do seem to be getting used to directional quotes like “this text” as opposed to „dieser Text“. Note that both styles of German / Eastern European quotes point inwards, though it's more obvious with guillemets. I think it's clear that the French style (but without spaces) was intended; that's also the style used by the markup language FtanML, though |...| is also correct.

> This is interesting, but way over the level of sophistication that
> belongs in a spec :) Strings should be one of the most ordinary things
> in a language, literally [no pun intended] "hello world"-level stuff.
> Please, let's stick with Scheme's old string syntax for immutable ones
> as well.

The guillemets or #"..." syntax are not meant for immutable strings: they are meant for texts in a SRFI 135 environment in which texts and strings are disjoint. In any case, string literals are supposed to be immutable, though not even R6RS says MUST, only SHOULD.

> Indeed a reader flag seems like the least obtrusive choice. I don't see
> any obvious problems with it.

The problem with making use of such flags in code (as opposed to reading S-expressions at runtime) is the lexical syntax analogue of phasing in low-level macros. Such a flag would have to be set by code, but the code has not yet run when the compiler or interpreter is reading. SRFI 10 has the same problem: various tags in #,(tag datum ...) can be predefined by the implementation, but there is no standard way for a user to define more of them. <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/LexmacsCowan.md> is my attempt to resolve this problem by providing a portable layer that can be injected between standard `read` and `write` and compilation, interpretation, or the consumption and production of data at runtime.

> Now that I think about it, I am wondering whether it has been
> discussed to use symbols for immutable strings. We would get O(1)
> comparison for free and we have already a reader syntax, namely |...|.

There is a fundamental problem with this idea that is common to all Lisps: symbols are interned in a dictionary in order to make them eq? without examining all the characters (in CL there are multiple dictionaries exposed to the user as packages). In some Lisp implementations, symbols are never garbage collected. It's straightforward to remove them in Scheme because they can be freely re-created as needed, but in CL the symbol and function values must be unbound and the property list empty for symbol GC to be possible, and in some older Lisps there was a special procedure to do so called `gctwa` (Garbage Collect Truly Worthless Atoms), mainly used to recover from typos at the REPL that were occupying your precious bodily fluids^W^Wmeasly kilobytes of memory.

That is precisely why strings were added to both Scheme and CL, and why vertical bars are much less important than they were in Maclisp. Lua interns all strings, but I think that is because saving space is more important in typical use cases than saving time.
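
A toy sketch of interning, assuming an R7RS-small implementation, to show where both the eq? guarantee and the memory cost come from. The names `intern-table` and `intern!` are hypothetical, and an association list stands in for the dictionary a real implementation would use.

```scheme
(import (scheme base))

;; Hypothetical intern table: maps each name to the one canonical string
;; object for that name. The table holds a strong reference to every entry,
;; so nothing interned here can be collected unless the table itself
;; supports removal or weak references.
(define intern-table '())

(define (intern! name)
  (let ((entry (assoc name intern-table)))
    (if entry
        (cdr entry)
        (let ((canonical (string-copy name)))
          (set! intern-table (cons (cons canonical canonical) intern-table))
          canonical))))
```

After interning, two equal names yield the same object, so comparison is a pointer test; the price is the lookup at creation time and the table's hold on every name ever seen.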

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

I should say generally that that marriage was best auspiced, for the
achievement of happiness, which contemplated a relation between a man and a
woman in which the independence was equal, the dependence mutual, and the
obligations reciprocal. --Louis Anspacher (1944)

Marc Nieper-Wißkirchen

Apr 28, 2020, 1:35:53 PM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 19:09 Uhr schrieb John Cowan <co...@ccil.org>:

[...]

>> Indeed a reader flag seems like the least obtrusive choice. I don't see
>> any obvious problems with it.
>
>
> The problem with making use of such flags in code (as opposed to reading S-expressions at runtime) is the lexical syntax analogue of phasing in low-level macros. Such a flag would have to be set by code, but the code has not yet run when the compiler or interpreter is reading. SRFI 10 has the same problem: various tags in #,(tag datum ...) can be predefined by the implementation, but there is no standard way for a user to define more of them. <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/LexmacsCowan.md> is my attempt to resolve this problem by providing a portable layer that can be injected between standard `read` and `write` and compilation, interpretation, or the consumption and production of data at runtime.

I don't see the problem when one does not try to implement such a
reader flag by a library. If you wire `#!strings-as-text' into the
native reader as much as `#!fold-case' has to be wired into the native
reader, everything is fine. For that, immutable strings/texts have to
be a native data type, of course. But there's no problem in principle.
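
As a point of comparison, this sketch shows the existing `#!fold-case` directive acting on a port at read time, assuming an R7RS-small implementation whose `read` honors the directive as the report describes. The helper `read-all-from-string` is hypothetical. `#!strings-as-text` is only a proposal in this thread and is not exercised here.

```scheme
(import (scheme base) (scheme read))

;; Hypothetical helper: read every datum from TEXT through a fresh string
;; port. Directives such as #!fold-case affect later reads from that port.
(define (read-all-from-string text)
  (let ((port (open-input-string text)))
    (let loop ((acc '()))
      (let ((datum (read port)))
        (if (eof-object? datum)
            (reverse acc)
            (loop (cons datum acc)))))))
```

Reading "#!fold-case HELLO World" this way should yield the symbols hello and world, because the flag is handled by the reader itself before any user code runs.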

[...]

>> > Now that I think about it, I am wondering whether it has been
>> > discussed to use symbols for immutable strings. We would get O(1)
>> > comparison for free and we have already a reader syntax, namely |...|.
>
>
> There is a fundamental problem with this idea that is common to all Lisps: symbols are interned in a dictionary in order to make them eq? without examining all the characters (in CL there are multiple dictionaries exposed to the user as packages). In some Lisp implementations, symbols are never garbage collected. It's straightforward to remove them in Scheme because they can be freely re-created as needed, but in CL the symbol and function values must be unbound and the property list empty for symbol GC to be possible, and in some older Lisps there was a special procedure to do so called `gctwa` (Garbage Collect Truly Worthless Atoms), mainly used to recover from typos at the REPL that were occupying your precious bodily fluids^W^Wmeasly kilobytes of memory.

How is this related to a Scheme implementing R7RS-large? In principle,
a garbage collection of unused symbols that have an empty property
list is always possible (and I would consider every Scheme
implementation that does not GC symbols incomplete).

Marc

Arne Babenhauserheide

Apr 28, 2020, 2:29:02 PM

to scheme-re...@googlegroups.com, John Cowan

John Cowan <co...@ccil.org> writes:
>> > Now that I think about it, I am wondering whether it has been
>> > discussed to use symbols for immutable strings. We would get O(1)
>> > comparison for free and we have already a reader syntax, namely |...|.

> There is a fundamental problem with this idea that is common to all
> Lisps: symbols are interned in a dictionary in order to make them eq?
> without examining all the characters (in CL there are multiple dictionaries
> exposed to the user as packages). In some Lisp implementations, symbols
> are never garbage collected. It's straightforward to remove them in Scheme

I don’t understand why this is a problem for Scheme then. What did I
miss?

John Cowan

Apr 28, 2020, 3:47:39 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 1:35 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

I don't see the problem when one does not try to implement such a
reader flag by a library. If you wire `#!strings-as-text' into the
native reader as much as `#!fold-case' has to be wired into the native
reader, everything is fine. For that, immutable strings/texts have to
be a native data type, of course. But there's no problem in principle.

Quite true. I'll be addressing this problem in the thread on backward incompatibility.

How is this related to a Scheme implementing R7RS-large? In principle,
a garbage collection of unused symbols that have an empty property
list is always possible

Note: standard Scheme does not have symbol property lists, though some Schemes such as Chicken do have them.

(and I would consider every Scheme
implementation that does not GC symbols incomplete).

Again from the standards perspective, there is nothing wrong with such an implementation, because the standard is silent about garbage collection. The Lisp Machine (I think) never did get a GC; it just ran for a few days and then was out of memory and you had to reboot (it was file based, not image based). At most this is a quality-of-implementation issue.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

"Any legal document draws most of its meaning from context. A telegram
that says 'SELL HUNDRED THOUSAND SHARES IBM SHORT' (only 190 bits in
5-bit Baudot code plus appropriate headers) is as good a legal document
as any, even sans digital signature." --me

Arthur A. Gleckler

Apr 28, 2020, 3:54:45 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 12:47 PM John Cowan <co...@ccil.org> wrote:

Again from the standards perspective, there is nothing wrong with such an implementation, because the standard is silent about garbage collection. The Lisp Machine (I think) never did get a GC; it just ran for a few days and then was out of memory and you had to reboot (it was file based, not image based). At most this is a quality-of-implementation issue.

They did get a GC, but were running for a while without one because it hadn't yet been written. One beautiful thing was that you could run it with GC turned off. Once it got close to running out, you would get a warning. You could choose to continue. If you kept going, you'd get a final warning that you were about to run out of space, and that there wouldn't be enough space to GC if you didn't do it right away. If you ignored that warning, you'd eventually get another warning that you were out of space. At that point, you were given the option of installing an external disk drive onto which to GC. So you were never truly out of options until you refused that last time.

Marc Nieper-Wißkirchen

Apr 28, 2020, 4:06:49 PM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 21:47 Uhr schrieb John Cowan <co...@ccil.org>:

[...]

>> How is this related to a Scheme implementing R7RS-large? In principle,
>> a garbage collection of unused symbols that have an empty property
>> list is always possible
>
>
> Note: standard Scheme does not have symbol property lists, though some Schemes such as Chicken do have them.

Guile and Chez Scheme can also store properties along with a symbol.
From a technical point of view, to have symbol properties is the same
as maintaining a hash table, but one with strong keys because a symbol
key can always be forged by `string->symbol'. The existence of the
keys will make sure that the symbols won't be garbage collected, so
the problem does not really differ between Schemes with and without
property lists.
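
To illustrate the interning point, here is a minimal sketch, assuming an R7RS-small implementation. The names `a`, `b`, `s1`, and `s2` are made up for the example. It shows that a symbol can always be re-created ("forged") from its name and is then `eq?` to the original, whereas two freshly allocated strings with the same characters are not.

```scheme
(import (scheme base) (scheme write))

;; Two routes to the same symbol: reader syntax and string->symbol.
(define a '|hello world|)
(define b (string->symbol "hello world"))

;; Two freshly allocated strings with equal contents.
(define s1 (string-copy "hello world"))
(define s2 (string-copy "hello world"))

(display (eq? a b))        ; #t: symbols with the same name are the same object
(newline)
(display (eq? s1 s2))      ; #f: distinct allocations
(newline)
(display (string=? s1 s2)) ; #t, but found by comparing characters
(newline)
```

Because `string->symbol` can produce the key again at any time, a table keyed on symbols has to hold those keys strongly, which is the situation described above.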

>>
>> (and I would consider every Scheme
>> implementation that does not GC symbols incomplete).
>
>
> Again from the standards perspective, there is nothing wrong with such an implementation, because the standard is silent about garbage collection. The Lisp Machine (I think) never did get a GC; it just ran for a few days and then was out of memory and you had to reboot (it was file based, not image based). At most this is a quality-of-implementation issue.

Note that I wrote "I consider" on purpose. :) However, the standard is
not completely silent about garbage collection. It says something in
the fourth paragraph of section 1.1 of the R7RS, for example. I know,
it says "permitted" and not "required". But then it says in the fifth
paragraph that implementations are required to be properly
tail-recursive, which in the absence of reification in the form of,
say, SRFI 157, just says something about the storage space used by the
Scheme system. Together with a literal reading of the fourth
paragraph, the requirements of the fifth paragraph seem meaningless.
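
A small sketch of why proper tail recursion is a statement about storage, assuming an R7RS-small implementation: the loop below makes a million calls in tail position, and a properly tail-recursive implementation has to run it without accumulating pending calls. `count-up` is a name invented for this example.

```scheme
(import (scheme base) (scheme write))

(define (count-up n)
  (let loop ((i 0))
    (if (< i n)
        (loop (+ i 1))   ; tail call: no pending work is left behind
        i)))

(display (count-up 1000000))
(newline)
```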

Marc Nieper-Wißkirchen

Apr 28, 2020, 4:10:31 PM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 21:54 Uhr schrieb Arthur A. Gleckler
<a...@speechcode.com>:

[...]

> They did get a GC, but were running for a while without one because it hadn't yet been written. One beautiful thing was that you could run it with GC turned off. Once it got close to running out, you would get a warning. You could choose to continue. If you kept going, you'd get a final warning that you were about to run out of space, and that there wouldn't be enough space to GC if you didn't do it right away. If you ignored that warning, you'd eventually get another warning that you were out of space. At that point, you were given the option of installing an external disk drive onto which to GC. So you were never truly out of options until you refused that last time.

Sweet! Very sweet!

This reminds me of footnote (13) in the bible (*):
https://mitpress.mit.edu/sites/default/files/sicp/full-text/book/book-Z-H-33.html#call_footnote_Temp_758

-- Marc

(*) The one besides /the/ bible and
https://en.wikipedia.org/wiki/Gravitation_(book).

John Cowan

Apr 28, 2020, 4:27:05 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 3:39 AM Lassi Kortela <la...@lassi.io> wrote:

IMHO it's fine to be a little incompatible as
long as there are reasonable workarounds.

I want to talk a bit about what incompatibilities are okay and what are not okay in my opinion.

1) What used to be an error is now not an error. Example: R6RS `log` was modified to accept the base as a second argument rather than always using base e. This is fine as long as you are not writing (log 1 2) in order to deliberately generate an arity error.

A more probable case would be when you are implementing a new kind of numbers and you want to signal an implementation-specific divide by 0; writing (/ 1 0) is a portable way to achieve that. I think this kind of break is not a problem almost all the time. (Indeed, I read R7RS to say that (log 1 2 3) and (/ 1 0) can be an outright compiler error and not just a compiler warning of a runtime error if the compiler chooses to make it so, though not everyone agrees.)

2) What used to not be an error is now an error. Example: R6RS removed the use of # in an inexact number for an unspecified digit (it was a wildcard, so 10.3## could be anything from 10.300 to 10.399, usually the first one), which R5RS had required support for. This kind of change is annoying to the user, but can be found and fixed, especially as most Scheme code either remains with the user or is open source. (Probably nobody much used # anyway.)

3) What used to mean one thing now means another. This is a silent breaking change and is VERY BAD. Example: R6RS changed the meaning of `real?` so that while 3+0i is still real, 3.0+0.0i is not, because you can't be sure that 0.0 isn't really a positive or negative value that is too small to represent, but that pushes you off the real number line. The claim (according to Will Clinger) was that most people's mental model of a real number didn't care about this case, and those who did care would prefer the exact-only version. R6RS provided the R5RS semantics under the name `real-valued?`, but it's not easy to remember which is which, and nobody has come up with a better name.

In R7RS-small we returned to the R5RS definition: if the imaginary part is zero the number is real, whether the zero is exact or inexact. (Will says that R7RS is ambiguous here because one of the examples I copied blindly from R6RS has the R6RS interpretation; I say that examples are not normative parts of a spec. The dispute remains unresolved.)

4) The way the implementer used to do it won't work any more. An example here is the R6RS exception system. R5RS had no exception system, so individual Schemes adopted their own, sometimes written up in a SRFI, sometimes not. The requirements that R6RS imposed on conformant exception systems pretty much excluded all existing systems, so to move to R6RS (as opposed to implementing it de novo) was a lot of work; indeed, Guile had until very recently separate throwers and catchers for native and R6RS exceptions. Implementers of Scheme mostly weren't willing to do that work, and their loyalty was (correctly) to the users of their particular implementation, so R6RS uptake among existing systems was pretty much confined to those who had participated in the R6RS work and knew what to expect.
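
As a concrete sketch of the first kind of change (an error becoming a non-error), assuming an R7RS-small implementation that provides `(scheme inexact)`: the two-argument call below is an arity error under R5RS but computes a base-2 logarithm under R6RS and R7RS. The exact printed digits may vary with the implementation's floating-point output.

```scheme
(import (scheme base) (scheme inexact) (scheme write))

(display (log 8))    ; natural logarithm of 8, as in R5RS
(newline)
(display (log 8 2))  ; logarithm of 8 to base 2; an arity error in R5RS
(newline)
```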

Important note: I am using R6RS as a handy source of examples, not in order to beat it to death. There are lots of excellent ideas in R6RS, and I was glad to adopt some for R7RS-small and more for R7RS-large. But I still think it was the right decision for the R7RS-small WG to start with R5RS, not R6RS, because R6RS had so many type 4 issues.

It looks to me offhand like SRFI 140 and friends will have compatibility problems of type 1 (not a problem), type 2 (a limited problem for users, though it will involve recompiling or the equivalent of existing programs), and type 4. It is type 4 that concerns me even more than type 2. I do not want to see R7RS-large mostly ignored because implementers vote with their feet, deciding that the necessary work on their core Scheme to distinguish between mutable and immutable strings, to make the first type length-adjustable, and to make most string functions accept either type, is more than they are willing to commit to. It's one thing to add a Posix interface (or more likely a shim to the existing Posix interface); it's another to change things like pointer tags that distinguish between Scheme types.

As Charlie Brown says: "It's one thing to get a dog to chase a ball ... another thing to get him to bring it back ... and yet a third to make him drop it."

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Does anybody want any flotsam? / I've gotsam.
Does anybody want any jetsam? / I can getsam.
--Ogden Nash, No Doctors Today, Thank You

Marc Nieper-Wißkirchen

Apr 28, 2020, 5:04:32 PM

to scheme-re...@googlegroups.com

Am Di., 28. Apr. 2020 um 22:27 Uhr schrieb John Cowan <co...@ccil.org>:

[...]

> 2) What used to not be an error is now an error. Example: R6RS removed the use of # in an inexact number for an unspecified digit (it was a wildcard, so 10.3## could be anything from 10.300 to 10.399, usually the first one), which R5RS had required support for. This kind of change is annoying to the user, but can be found and fixed, especially as most Scheme code either remains with the user or is open source. (Probably nobody much used # anyway.)

As R7RS does not prescribe that an error is signaled, an
implementation is even free to continue to support the R5RS style of
numbers. So this is even less a problem.

> 3) What used to mean one thing now means another. This is a silent breaking change and is VERY BAD. Example: R6RS changed the meaning of `real?` so that while 3+0i is still real, 3.0+0.0i is not, because you can't be sure that 0.0 isn't really a positive or negative value that is too small to represent, but that pushes you off the real number line. The claim (according to Will Clinger) was that most people's mental model of a real number didn't care about this case, and those who did care would prefer the exact-only version. R6RS provided the R5RS semantics under the name `real-valued?`, but it's not easy to remember which is which, and nobody has come up with a better name.

I don't want to argue which version of `real?' makes more sense (Will
Clinger seems to say the R6RS version), but when languages are
standardized, errors that are recognized as such only much later are
unavoidably made. For example, a version of `real?' may be in an older
report where nowadays everyone agrees that it got the semantics wrong.
It should be allowed to correct such mistakes in a later report
without sticking to what is now clearly seen as bad. (I should come up
with a better example than `real?'.) Scheme is neither JavaScript nor
PHP.

However, (3) happens much more often than one may think. Consider the
`member' procedure of R5RS, which takes two arguments there. Some
implementations may have extended the domain of that procedure to
three arguments. Now, SRFI 1 and R7RS come along and extend the domain
in a possibly incompatible way. You have the same effect; it is a
silent breaking change for these implementations.
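
For reference, this is what the three-argument form looks like in R7RS-small (following SRFI 1): the optional third argument is the equality predicate used in place of `equal?`. An implementation that had earlier given a third argument some other meaning would silently change behavior on such a call.

```scheme
(import (scheme base) (scheme write))

;; Two-argument form, as in R5RS: compares with equal?.
(display (member 2.0 '(1 2 3)))    ; #f, since (equal? 2.0 2) is false
(newline)

;; Three-argument form from SRFI 1 and R7RS: compares with the given predicate.
(display (member 2.0 '(1 2 3) =))  ; (2 3)
(newline)
```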

> 4) The way the implementer used to do it won't work any more. An example here is the R6RS exception system. R5RS had no exception system, so individual Schemes adopted their own, sometimes written up in a SRFI, sometimes not. The requirements that R6RS imposed on conformant exception systems pretty much excluded all existing systems, so to move to R6RS (as opposed to implementing it de novo) was a lot of work; indeed, Guile had until very recently separate throwers and catchers for native and R6RS exceptions. Implementers of Scheme mostly weren't willing to do that work, and their loyalty was (correctly) to the users of their particular implementation, so R6RS uptake among existing systems was pretty much confined to those who had participated in the R6RS work and knew what to expect.

This is the least convincing point for me. If R7RS had stuck to the R6RS
condition system (maybe too large for the small language), more and
more implementations would have gradually adopted it and we could now
have a rich condition system that is sharable between a number of
major implementations. The system we currently have in R7RS-small is
really just the greatest common divisor. It's nice but not more than
this.

> Important note: I am using R6RS as a handy source of examples, not in order to beat it to death. There are lots of excellent ideas in R6RS, and I was glad to adopt some for R7RS-small and more for R7RS-large. But I still think it was the right decision for the R7RS-small WG to start with R5RS, not R6RS, because R6RS had so many type 4 issues.

In retrospect (I know that it is unfair), I have doubts. And the
type 4 issues could have been outsourced to optional libraries or to
WG 2. And the "mustard" (as Will Clinger calls it) could have simply
been removed.

Let me add one more compatibility problem that can happen, namely the
addition of procedures. R7RS defines what is in `(scheme base)'. As
the language evolves, a new edition may want to add a new binding to
`(scheme base)'. This can, in principle, break programs, namely when a
program written for the previous version defines an identifier that is
introduced in the new version.
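
One defensive idiom, sketched here under the assumption of an R7RS-small program: import only the bindings actually used, so that a name added to `(scheme base)` in a later edition cannot collide with a local definition. `string-index` is used here purely as a hypothetical example of a name that a future edition might add.

```scheme
(import (only (scheme base)
              define let cond else = + string-length string-ref char=? newline)
        (only (scheme write) display))

;; Hypothetical local helper: index of the first occurrence of ch in str, or #f.
(define (string-index str ch)
  (let loop ((i 0))
    (cond ((= i (string-length str)) #f)
          ((char=? (string-ref str i) ch) i)
          (else (loop (+ i 1))))))

(display (string-index "scheme" #\h))  ; 2
(newline)
```

The cost is a longer import list; the benefit is that the program's meaning does not depend on what else the library happens to export.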

> It looks to me offhand like SRFI 140 and friends will have compatibility problems of type 1 (not a problem), type 2 (a limited problem for users, though it will involve recompiling or the equivalent of existing programs), and type 4. It is type 4 that concerns me even more than type 2. I do not want to see R7RS-large mostly ignored because implementers vote with their feet, deciding that the necessary work on their core Scheme to distinguish between mutable and immutable strings, to make the first type length-adjustable, and to make most string functions accept either type, is more than they are willing to commit to. It's one thing to add a Posix interface (or more likely a shim to the existing Posix interface); it's another to change things like pointer tags that distinguish between Scheme types.

What Scheme systems do you expect to adopt R7RS-large (or do you want to adopt)?

If one of the goals of R7RS-large is to support practical programming
that would otherwise have to be done in languages other than Scheme, we
need excellent implementations, even implementations that can
compete in stability and performance with, say, V8 or SpiderMonkey in
the JavaScript world.

Per Bothner

Apr 28, 2020, 6:46:12 PM

to scheme-re...@googlegroups.com

On 4/28/20 1:26 PM, John Cowan wrote:

> It looks to me offhand like SRFI 140 and friends will have compatibility problems of type 1 (not a problem), type 2 (a limited problem for users, though it will involve recompiling or the equivalent of existing programs), and type 4. It is type 4 that concerns me even more than type 2. I do not want to see R7RS-large mostly ignored because implementers vote with their feet, deciding that the necessary work on their core Scheme to distinguish between mutable and immutable strings, to make the first type length-adjustable, and to make most string functions accept either type, is more than they are willing to commit to. It's one thing to add a Posix interface (or more likely a shim to the existing Posix interface); it's another to change things like pointer tags that distinguish between Scheme types.

Is there any "non-small" Scheme (i.e. that might conceivably implement R7RS-large) that
does not distinguish mutable and immutable strings?

Implementing length-adjustability would require a redesign in some implementations,
primarily those that allocate string header and string characters in a single "blob"
*and* that allocate 4 (or 3) bytes for each character (since otherwise you
will require length-adjustment on string-set! too).

So there is an empirical question: How many Scheme implementers (who would otherwise be
open to implementing R7RS-large) are in a situation where they would be reluctant to
implement SRFI-140-style strings? I have no idea. The answer probably depends on what
is the alternative: implementing (and documenting) a new "text" type - perhaps
with a new lexical syntax? I suspect the latter might be more work, and the resulting
language less attractive to use, teach, and maintain, even if it might be easier
in the short term.

Perhaps we need a survey of implementers of the alternatives you are considering?

When it comes to Kawa, I don't see myself attempting to support R7RS-large as
a whole, though of course someone else might. This is partly because I'm not
spending as much time on Kawa these days (I'm spending more time on DomTerm),
but more because R7RS-large doesn't look very attractive to me: It's too big, and I
disagree with the 50-distinct-procedures-for-each-datatype approach.
SRFI-135-style texts would be one more step in the wrong direction, IMO.

John Cowan

Apr 28, 2020, 7:46:12 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 5:04 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

As R7RS does not prescribe that an error is signaled, an
implementation is even free to continue to support the R5RS style of
numbers. So this is even less a problem.

Minimizing type 2 issues is precisely why R7RS-small has only the error-is-signaled clauses of R5RS, plus those for the environment constructors where the specified standard is not 5, because if you try to load an R4RS environment and don't have one, further progress is unlikely to be useful.

It should be allowed to correct such mistakes in a later [specification]
without sticking to what is now clearly seen as bad.

A counterexample is seek(), a Unix system call that could offset the file pointer by up to 2^15 bytes in either direction and returned the new file pointer as a signed int. When it became clear that both disks and files would be much larger in the future, then rather than breaking backward compatibility, a similar system call lseek() for "long seek" was added. Seek() is long since forgotten, but the new name served its purpose at the time and the extra letter is now of no consequence to anyone.

However, (3) happens much more often than one may think. Consider the
`member' procedure of R5RS, which takes two arguments there. Some
implementations may have extended the domain of that procedure to
three arguments. Now, SRFI 1 and R7RS come along and extend the domain
in a possibly incompatible way. You have the same effect; it is a
silent breaking change for these implementations.

That's of course true, and there are occasions when we incorporated R6RS procedures but changed them to follow existing common practice, specifically SRFI 1 (from another point of view, it was R6RS that had changed SRFI 1!). Standards can't always deal with the unbounded variation of existing implementations, though standards groups often collect such information. For Scheme, what information we had can be found in the many pages linked from <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/ImplementationContrasts.md>, and it was quite influential for the R7RS-small WG as well as for some SRFIs.

This is the least convincing point for me. If R7RS had stuck to the R6RS
condition system (maybe too large for the small language), more and
more implementations would have gradually adopted it

Or, as my father (a lawyer) used to say whenever I said anything dogmatically, perhaps they would not. WG1 really did work very hard to minimize the cost of conversion.

In retrospect (I know that it is unfair), I have doubts.

Well, let me urge you to look at <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/SixRejection.md>, which is a precis of the comments, mostly objections, given by the R6RS voters. (There is a link to the originals.) At that time, negative votes were required to have comments; affirmative votes were not. There is also information on how R7RS-small responded to the objections and how R7RS-large might do so. (Note that this is a 2017 document and some of my opinions have changed.)

Also of interest may be <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/WG1vsR6RSDiff.md>, an attempt to explain not only what features of R6RS were and were not added to R7RS, but also why. Opinions are my own.

Let me add one more compatibility problem that can happen, namely the
addition of procedures. R7RS defines what is in `(scheme base)'. As
the language evolves, a new edition may want to add a new binding to
`(scheme base)'. This can, in principle, break programs, namely when a
program written for the previous version defines an identifier that is
introduced in the new version.

Indeed it can, and at this point I consider the libraries of R7RS-small frozen, with the possible exception of (scheme base), which is a special case due to the fact that REPLs must provide it at least. I hope though that it can be kept intact, and everything new put in a library somewhere. This makes for arbitrary-looking separations between procedures in an older library from their close relatives in a newer one, but we can't have everything.

While R7RS-large is under development, I am allowing new identifiers to be added to later editions superseding older ones, but always in a backward compatible way, and never without an explicit ballot vote. I assume they will be frozen when R7RS-large reaches its end (or is abandoned).

What Scheme systems do you expect to adopt R7RS-large (or do you want to adopt)?

Well, Per has ruled out Kawa, and given its design that makes sense. Fortunately, many SRFIs have portable sample implementations, which makes it possible for Kawa users to add them if they want.

That is a sign that the adoption of R7RS-small was a Good Thing.

I hope that Arew ("R7RS over Chez") will progress further to become possibly the fastest available R7RS Scheme, and indeed I hope that Loko will become R7RS-capable as well, in the style of Larceny or Sagittarius.

If one of the goals of R7RS-large is to support practical programming
that would otherwise have to be done in languages other than Scheme, we
need excellent implementations, even implementations that can
compete in stability and performance with, say, V8 or SpiderMonkey in
the JavaScript world.

Few application domains have the stability constraints of the browser, fortunately.

On Tue, Apr 28, 2020 at 6:46 PM Per Bothner <p...@bothner.com> wrote:

Implementing length-adjustability would require a redesign in some implementations,
primarily those that allocate string header and string characters in a single "blob"
*and* that allocate 4 (or 3) bytes for each character (since otherwise you
will require length-adjustment on string-set! too).

I don't have detailed information on that, but I know that Chicken is one such implementation. In addition, Chicken's core strings are 8-bit, with a loadable module that treats them as UTF-8 instead. This can cause mojibake when libraries that use UTF-8 strings and core strings try to interoperate. This also means that the UTF-8 version of `string-set!` triggers a minor GC when the length of a string in bytes (as opposed to characters) has to change so that all pointers to the old string can be updated behind the scenes.
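
The byte-length issue can be seen from portable code. This is a minimal sketch, assuming an R7RS-small implementation whose characters cover at least Latin-1 (R7RS only requires ASCII, so that part is an assumption); the variable `s` is made up for the example. Replacing one character leaves the length in characters unchanged but changes the length of the UTF-8 encoding, which is why an implementation that stores strings as UTF-8 may have to reallocate on `string-set!`.

```scheme
(import (scheme base) (scheme write))

(define s (string-copy "cafe"))

(display (bytevector-length (string->utf8 s)))  ; 4 bytes for 4 ASCII characters
(newline)

(string-set! s 3 #\xE9)  ; replace the final e with U+00E9 (e with acute accent)

(display (string-length s))                     ; still 4 characters
(newline)
(display (bytevector-length (string->utf8 s)))  ; now 5 bytes in UTF-8
(newline)
```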

So there is an empirical question: How many Scheme implementers (who would otherwise be
open to implementing R7RS-large) are in a situation where they would be reluctant to
implement SRFI-140-style strings? I have no idea.

Nor do I, but we can start with those that do and do not already support the immutability of literals, which is a big step forward. That information is available at <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/ImmutableStrings.md>, one of the pages linked from ImplementationContrasts. It shows that the immutability of literals is orthogonal to what standard each implementation is aimed at.

The answer probably depends on what
is the alternative: implementing (and documenting) a new "text" type - perhaps
with a new lexical syntax? I suspect the latter might be more work,

I don't see how, given the SRFI and Clinger's various portable implementations. Only the lexical syntax would need incorporating, and as chair I am postponing all lexical syntax proposals to the end of the R7RS-large process, so that if implementers need to change `read`, they can do so all at once. I do not expect more than a few of the proposals at <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/LexicalDocket.md> to pass.

Perhaps we need a survey of implementers of the alternatives you are considering?

Indeed. The scheme-im...@googlegroups.com mailing list could serve this purpose; you are a member of it.

R7RS-large doesn't look very attractive to me: It's too big, and I
disagree with the 50-distinct-procedures-for-each-datatype approach.
SRFI-135-style texts would be one more step in the wrong direction, IMO.

Well, to each their own. Personally I believe that providing a hierarchy of record types, much less prescribing a specific hierarchy in a standard, is a very bad type 4 issue, worse even than a hierarchy of conditions.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Said Agatha Christie / To E. Philips Oppenheim
"Who is this Hemingway? / Who is this Proust?
Who is this Vladimir / Whatchamacallum,
This neopostrealist / Rabble?" she groused.

--George Starbuck, Pith and Vinegar

Per Bothner

Apr 28, 2020, 10:33:56 PM

to scheme-re...@googlegroups.com

On 4/28/20 4:46 PM, John Cowan wrote:

> What Scheme systems do you expect to adopt R7RS-large (or do you want to adopt)?
>
>
> Well, Per has ruled out Kawa, and given its design that makes sense. Fortunately, many SRFIs have portable sample implementations, which makes it possible for Kawa users to add them if they want.

Well, I have over the years bundled various libraries (SRFI and otherwise) with Kawa,
depending on whether I felt like it, people have requested it, people have provided ports,
or there is a sample implementation that I feel is appropriate for Kawa (clean, portable, efficient).
So - who knows? But aiming to claim "Kawa implements R7RS-large" is remote and low priority
as long as I'm the primary maintainer. OTOH I do claim "Kawa implements R7RS (small)",
with some limitations/bugs.

Per Bothner

Apr 28, 2020, 10:58:08 PM

to scheme-re...@googlegroups.com

There is one other kind of incompatibility I'm concerned about:

5. An existing in-use API or datatype is believed to be no longer appropriate,
perhaps because of performance issues, new hardware, new standards (such as Unicode), etc.
Modifying or extending the old API seems non-feasible without breaking compatibility
and/or major conceptual re-design, so instead one designs a new separate API, possibly
with a bridge/wrapper to the old API. This has happened a number of times in the Java world,
notably with AWT/Swing/JavaFX and java.io/java.nio.

While this doesn't break existing programs or tests, it does have the risk of
splitting the community, slowing down adoption, complicating teaching resources,
and discouraging migration from the old API to the new API. People have to choose the old
or the new API, and even if there are wrappers and conversion routines, it becomes a pain.
Migrating from the old to the new API can be a lot of work, and code using the APIs will not
work on old implementations.

I think SRFI-135 fails disastrously in this respect. I really don't expect
there will be a lot of migration of code to use text/textual - and code that
does will feel the need to preserve support for the old R7RS string API, which will
make the code uglier.

So in practice the compatibility problems of SRFI 135 may be much worse than SRFI 140;
I think migrating code to benefit from SRFI 140 would be much easier than migrating
to SRFI 135.

Lassi Kortela

Apr 29, 2020, 3:45:20 AM

to scheme-re...@googlegroups.com

> What Scheme systems do you expect to adopt R7RS-large (or do you
> want to adopt)?
>
> Well, Per has ruled out Kawa, and given its design that makes sense.
> Fortunately, many SRFIs have portable sample implementations, which
> makes it possible for Kawa users to add them if they want.
> That is a sign that the adoption of R7RS-small was a Good Thing.
>
> I hope that Arew ("R7RS over Chez") will progress further to become
> possibly the fastest available R7RS Scheme, and indeed I hope that Loko
> will become R7RS-capable as well, in the style of Larceny or Sagittarius.

Gambit is faster than Chez overall in r7rs-benchmarks; its git master
supports nearly all of R7RS-small at this point and is being actively
developed. If something doesn't work, please file a bug. We're just
starting to add more SRFIs and keeping the R7RS-large voting in mind.

If Chez gets R7RS support as well, so much the better.

Guile 3 has R7RS support built in:
`docker run -it schemers/guile:3 guile --r7rs`

Digamma is a new hybrid R6RS/R7RS Scheme from the author of Ypsilon (the
soft-real-time R6RS interpreter made for pinball games). He's making
progress almost daily right now. <https://github.com/fujita-y/digamma>

"Digamma implements mostly concurrent garbage collector that achieves a
remarkably short GC pause time, implements separate compilation thread
to incrementally generate native code in background, implements on the
fly FFI with LLVM."

No news about Loko and R7RS. Göran is a strong adherent of R6RS, but
also of compatibility.

Gauche and Sagittarius are improving almost daily as well. Cyclone is
getting commits about weekly.

@okuoku from Japan is doing impressively thorough work on R6RS/R7RS
compatibility: <https://github.com/okuoku/yuni>. We should recruit him
to work with us directly on R7RS-large and other compatibility stuff.

P.S. We now have Docker containers of the git master of many Schemes:

docker run -it schemers/chibi:head
docker run -it schemers/cyclone:head
docker run -it schemers/digamma:head
docker run -it schemers/gambit:head
docker run -it schemers/gauche:head
docker run -it schemers/sagittarius:head

Lassi Kortela

Apr 29, 2020, 3:50:53 AM

to scheme-re...@googlegroups.com

Oh, and an R7RS implementation for MIT Scheme is ongoing as well.

Duy Nguyen

Apr 29, 2020, 5:12:34 AM

to scheme-re...@googlegroups.com

On Wed, Apr 29, 2020 at 2:45 PM Lassi Kortela <la...@lassi.io> wrote:
>
> > What Scheme systems do you expect to adopt R7RS-large (or do you
> > want to adopt)?
> >
> > Well, Per has ruled out Kawa, and given its design that makes sense.
> > Fortunately, many SRFIs have portable sample implementations, which
> > makes it possible for Kawa users to add them if they want.
> > That is a sign that the adoption of R7RS-small was a Good Thing.
> >
> > I hope that Arew ("R7RS over Chez") will progress further to become
> > possibly the fastest available R7RS Scheme, and indeed I hope that Loko
> > will become R7RS-capable as well, in the style of Larceny or Sagittarius.
>

> ...

>
> Gauche and Sagittarius are improving almost daily as well. Cyclone is
> getting commits about weekly.

Gauche already supports both red and tangerine in 'master'. Daily
improvements (besides bug fixes) are not about r7rs-large anymore.
--
Duy

Marc Nieper-Wißkirchen

Apr 29, 2020, 10:13:52 AM

to scheme-re...@googlegroups.com

On Wed, Apr 29, 2020 at 01:46, John Cowan <co...@ccil.org> wrote:

[...]

> A counterexample is seek(), a Unix system call that could offset the file pointer by up to 2^15 bytes in either direction and returned the new file pointer as a signed int. When it became clear that both disks and files would be much larger in the future, then rather than breaking backward compatibility, a similar system call lseek() for "long seek" was added. Seek() is long since forgotten, but the new name served its purpose at the time and the extra letter is now of no consequence to anyone.

And now we also have llseek(). This doesn't make a language more
beautiful or easier to use.

I understand that there weren't many options for the C programming
language, as maintaining backward compatibility is very important there
(as it is for the TeX engine, say). But Lisp and Scheme are neither C
nor TeX. One shouldn't break backward compatibility lightly, but never
allowing such a break at all would be detrimental to Scheme as a
programming language of elegance, one that, at least in the past, set
out into uncharted territory.

>> This is the least convincing point for me. If R7RS stuck to the R6RS
>> condition system (maybe too large for the small language), more and
>> more implementations would have gradually adopted it
>
>
> Or, as my father (a lawyer) used to say whenever I said anything dogmatically, perhaps they would not. WG1 really did work very hard to minimize the cost of conversion.

This couldn't have been the only goal. :) Otherwise, you would have come up
with R5RS. ;)

[...]

>> In retrospect (I know that it is unfair), I have doubts.
>
>
> Well, let me urge you to look at <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/SixRejection.md>, which is a precis of the comments, mostly objections, given by the R6RS voters. (There is a link to the originals.) At that time, negative votes were required to have comments; affirmative votes were not. There is also information on how R7RS-small responded to the objections and how R7RS-large might do so. (Note that this is a 2017 document and some of my opinions have changed.)

I do not subscribe to everything the negative votes on R6RS complained
about. (Nor do I agree with R6RS in every aspect.) Many objections
have their roots in the size of the R6RS standard (a fair point!), but
to account for these, one could have simply made some libraries (e.g.
syntax-case, eval, the reader, or the writer) optional.

> Also of interest may be <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/WG1vsR6RSDiff.md>, an attempt to explain not only what features of R6RS were and were not added to R7RS, but also why. Opinions are my own.

I agree with the change from LIBRARY to DEFINE-LIBRARY for the reasons
given, especially because it is easy for a system to support both
conventions. Packaging the predefined identifiers differently in the
standard libraries is also fine as is providing the same functionality
under different names (cf. the division operators).

Identifier syntax is not too important for a system supporting only
SYNTAX-RULES. In conjunction with a more powerful system, they have
so many sensible uses that one shouldn't leave them out (the point
that this would make macros weaker is not convincing to me; with the
same reasoning, a standard should only support one thread...).

There is one point, however, where R7RS definitely got it wrong and
also broke compatibility with R6RS needlessly, namely the rule that local
syntax definitions must not affect earlier expressions in the same
local group. I know your rationale for this, but it is deceptive
precisely because it sounds so convincing. (And even with the R6RS
semantics, no one forces you to write a program that does not
introduce the syntax first.)

Here are four reasons why the R6RS semantics are the "right" ones and why
the intended R7RS semantics are fundamentally flawed:

(1) Will Clinger's idea that R7RS is a superset of R6RS (without the
mustard) would be flawed. Schemes supporting both R6RS and R7RS will
most likely implement only one of the two semantics. Larceny uses the
R6RS semantics, for example. Sagittarius as well, I think.

(2) Writing an expander that adheres to the R7RS semantics is more
complicated than writing an expander for the R6RS semantics, showing
that the R7RS model is actually logically more complicated.

(3) Some kind of referential transparency is lost: Consider the following macro:

(define-syntax define-even-odd
  (syntax-rules ()
    ((_ even? odd?)
     (begin
       (define (even? x) (or (zero? x) (odd? (- x 1))))
       (define (odd? x) (and (not (zero? x)) (even? (- x 1))))))))

It is supposed to be used in a definition context as in the following code:

(lambda (x)
  (define-even-odd even? odd?)
  (even? x))

Will it work? Well, it depends on the lexical context with the R7RS
model: If `odd?' is defined as a macro outside the lambda, the code
will break.

(4) In any sensible use of identifier macros, the defined syntax should
behave as a variable (reference). It cannot, however, when R7RS makes
an artificial distinction between one type of syntax, namely
variable reference, and all other types of syntax.

For example, a common idiom (when you have identifier macros) to
support not-so-much-optimizing compilers is:

(define-syntax pi (identifier-syntax 3.14159))

This should, in all respects, behave as the definition of an immutable
variable. Under the R7RS semantics, however, it does not. The
following code in the same module would break:

(lambda (x)
  (define (qoppa x) (or (zero? x) (pi (- x 1))))
  (define (pi x) (and (not (zero? x)) (qoppa (- x 1))))
  (pi x))

Therefore, I advise everyone to read the R7RS in a way that leads to
the R6RS semantics. :) Moreover, I would suggest that R7RS-large
corrects the mistake. I believe that no existing program would break.

[...]

> Indeed it can, and at this point I consider the libraries of R7RS-small frozen, with the possible exception of (scheme base), which is a special case due to the fact that REPLs must provide it at least. I hope though that it can be kept intact, and everything new put in a library somewhere. This makes for arbitrary-looking separations between procedures in an older library from their close relatives in a newer one, but we can't have everything.

Are you speaking about R7RS-large here or about some imagined future R8RS?

> I hope that Arew ("R7RS over Chez") will progress further to become possibly the fastest available R7RS Scheme, and indeed I hope that Loko will become R7RS-capable as well, in the style of Larceny or Sagittarius.

Two more reasons to revert to the R6RS semantics when it comes to
deferring the expansion of the right-hand sides.

[...]

Marc

John Cowan

Apr 29, 2020, 2:41:34 PM

to scheme-re...@googlegroups.com

On Tue, Apr 28, 2020 at 10:58 PM Per Bothner <p...@bothner.com> wrote:

> 5. An existing in-use API or datatype is believed to be no longer appropriate,

If there was anything I would have liked to change in R7RS-small, it was the character datatype as distinct from strings of length 1, and the mutability of strings. But I could see that it was not to be: politics is the art of the possible.

> While this doesn't break existing programs or tests, it does have the risk of
> splitting the community, slowing down adoption, complicating teaching resources,
> and discouraging migration from the old API to the new API.

Fortunately, Scheme does not have a corporate sponsor who can threaten to deprecate or remove the old API, and there are enough implementations that it will never entirely die.

> I think SRFI-135 fails disastrously in this respect. I really don't expect
> there will be a lot of migration of code to use text/textual

I agree 100% about migration to texts. By the same token, I do not expect massive migration from mutable hash tables (SRFI 125) to SRFI 146 ordered or hash mappings. Both are there in R7RS-large whenever they are wanted, but even in places where SRFI 146 would be superior, the costs of conversion to a completely new datatype may well be too high. If anything, texts are somewhat better in this case, since all the procedures accept strings but do not produce them (with the trivial exceptions of the string<->text conversions).

Clinger specifies five advantages of SRFI 135, by which he really means his three sample implementations, since the SRFI 140 sample implementation is also an implementation of 135 (though not portable). These are space efficiency, faster sequential access, faster random access, fast extraction of subtexts, and faster concatenation; the first three with respect to a representation based on UTF-8 or UTF-16 only. There is no real advantage over pure 8-bit or 16-bit non-Unicode representations or for UTF-32, but these are increasingly rare. I suspect that those and only those programs which critically depend on these improvements will actually be migrated from traditional strings.

These considerations are precisely why I wrote SRFI 153 (which never gets no respect): in order to make sure that R7RS-large can have (if the voters agree) a decent minimum of string procedures operating on the traditional mutable strings of Scheme with a completely portable sample implementation, a subset of SRFI 13.

> - and code that
> does will feel the need to preserve support for the old R7RS string API, which will
> make the code uglier.

I'm not sure I understand this. Why would code need to preserve support for the string API as distinct from support for strings?

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Uneasy lies the head that wears the Editor's hat! --Eddie Foirbeis Climo

Per Bothner

Apr 29, 2020, 3:13:40 PM

to scheme-re...@googlegroups.com

On 4/29/20 11:41 AM, John Cowan wrote:
> While this doesn't break existing programs or tests, it does have the risk of
> splitting the community, slowing down adoption, complicating teaching resources,
> and discouraging migration from the old API to the new API.
>
>
> Fortunately, Scheme does not have a corporate sponsor who can threaten to deprecate or remove the old API, and there are enough implementations that it will never entirely die.

I don't see how that avoids the problems I list - quite the opposite.

> These considerations are precisely why I wrote SRFI 153 (which never gets no respect): in order to make sure that R7RS-large can have (if the voters agree) a decent minimum of string procedures operating on the traditional mutable strings of Scheme with a completely portable sample implementation, a subset of SRFI 13.

I assume you meant to write SRFI 152.

> - and code that
> does will feel the need to preserve support for the old R7RS string API, which will
> make the code uglier.
>
>
> I'm not sure I understand this. Why would code need to preserve support for the string API as distinct from support for strings?

I meant that if an application/library uses SRFI-135 texts for performance, they will probably also
implement a shim or alternative code-path so it works on implementations without texts. This is the
ugliness I referred to.

John Cowan

Apr 29, 2020, 6:19:19 PM

to scheme-re...@googlegroups.com

On Wed, Apr 29, 2020 at 10:13 AM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

> One shouldn't break backward compatibility lightly, but never
> allowing such a break at all would be detrimental to Scheme as a
> programming language of elegance, one that, at least in the past, set
> out into uncharted territory.

Well, that is a political viewpoint rather than a technical one, though not inherently the worse for that. We have a mechanism for resolving (if not necessarily dissolving) political disagreements: we vote on things when they come up.

>> WG1 really did work very hard to minimize the cost of conversion.
>
> This couldn't have been the only goal. :) Otherwise, you would have come up
> with R5RS. ;)

There was a respected minority opinion that advocated for just that, with the addition of a minimal module system as demanded by our charter.

> Identifier syntax is not too important for a system supporting only
> SYNTAX-RULES. In conjunction with a more powerful system, they have
> so many sensible uses that one shouldn't leave them out

The only use I know of is converting accessors to variables in a local scope like a class, but I'd like to hear about others.

> (the point
> that this would make macros weaker is not convincing to me; with the
> same reasoning, a standard should only support one thread...).

Having only one thread happens to be in fact my own view, but again I bow to other people's views.

> (And even with the R6RS
> semantics, no one forces you to write a program that does not
> introduce the syntax first.)

I continue to think it simplifies both implementation and human understanding, exactly because they can be done in one pass, and any unknown identifier can be presumed to be a variable.

> Here are four reasons why the R6RS semantics are the "right" ones and why
> the intended R7RS semantics are fundamentally flawed:
>
> (1) Will Clinger's idea that R7RS is a superset of R6RS (without the
> mustard) would be flawed. Schemes supporting both R6RS and R7RS will
> most likely implement only one of the two semantics. Larceny uses the
> R6RS semantics, for example. Sagittarius as well, I think.

As you pointed out, R7RS puts a restriction on users that R6RS does not, so any R6RS system is very close to compliant with R7RS as well. The only exception is when the same name is defined by an outer syntax definition and an inner one, and the inner one is used before it is defined. I doubt it would occur to anyone to do this, although your point 3 provides a (thoroughly artificial) example.

If I had my druthers, I'd put alpha-conversion on programmers' shoulders, requiring them not to use the same name both globally and locally. That's why I have a Favorite Toy Language where I can put these ideas. I'll be doing well to write it down, though, never mind implementing it.

> (lambda (x)
>   (define (qoppa x) (or (zero? x) (pi (- x 1))))
>   (define (pi x) (and (not (zero? x)) (qoppa (- x 1))))
>   (pi x))

Although Chez supports this, Guile treats it as an error. I have not been able to try it on any other Schemes (where are all those Docker images again?)

>> Indeed it can, and at this point I consider the libraries of R7RS-small frozen, with the possible exception of (scheme base), which is a special case due to the fact that REPLs must provide it at least. I hope though that it can be kept intact, and everything new put in a library somewhere. This makes for arbitrary-looking separations between procedures in an older library from their close relatives in a newer one, but we can't have everything.
>
> Are you speaking about R7RS-large here or about some imagined future R8RS?

Either.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Mark Twain on Cecil Rhodes: I admire him, I freely admit it,
and when his time comes I shall buy a piece of the rope for a keepsake.

John Cowan

Apr 29, 2020, 6:27:15 PM

to scheme-re...@googlegroups.com

On Wed, Apr 29, 2020 at 3:13 PM Per Bothner <p...@bothner.com> wrote:

>> Fortunately, Scheme does not have a corporate sponsor who can threaten to deprecate or remove the old API, and there are enough implementations that it will never entirely die.
>
> I don't see how that avoids the problems I list - quite the opposite.

Well, at least migration is not mandatory, as it can be with Java.

> I meant that if an application/library uses SRFI-135 texts for performance, they will probably also
> implement a shim or alternative code-path so it works on implementations without texts.

Ah, I see. However, there is really no reason why an R7RS-small implementation of Scheme shouldn't have texts, as the SRFI 135 sample implementation is portable. If the implementer does not supply it, the programmer can download it and use it. When we get out of the realm of portable code, the problem gets worse and much more caution needs to be applied, which is one of the reasons I postponed non-portable libraries to the Green and Olive dockets, some ways away yet. (An exception will be macro systems, which will be in Yellow, the next docket after the current one.)

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

What asininity could I have uttered that they applaud me thus?
--Phocion, Greek orator

John Cowan

Apr 29, 2020, 11:47:02 PM

to scheme-re...@googlegroups.com

On Wed, Apr 29, 2020 at 3:45 AM Lassi Kortela <la...@lassi.io> wrote:

> docker run -it schemers/chibi:head
> docker run -it schemers/cyclone:head
> docker run -it schemers/digamma:head
> docker run -it schemers/gambit:head
> docker run -it schemers/gauche:head
> docker run -it schemers/sagittarius:head

Thanks again! I note that "docker search schemers" turns up a whole bunch more, but ":head" doesn't always work, whereas ":latest" always does. Does that mean that the ones without ":head" aren't considered complete?

Lassi Kortela

Apr 30, 2020, 3:13:22 AM

to scheme-re...@googlegroups.com

> docker run -it schemers/chibi:head
> docker run -it schemers/cyclone:head
> docker run -it schemers/digamma:head
> docker run -it schemers/gambit:head
> docker run -it schemers/gauche:head
> docker run -it schemers/sagittarius:head
>
> Thanks again! I note that "docker search schemers" turns up a whole
> bunch more, but ":head" doesn't always work, whereas ":latest" always
> does. Does that mean that the ones without ":head" aren't considered
> complete?

Glad to hear we have more than two users :D

The collection is not complete and probably won't be for a long time, if
ever. It'd be nice to have every major version of every Scheme, as well
as the version control head, but it will take months. All containers
that do exist are working. I use them daily. If something is a priority,
let me know. Some Schemes are trivial to add; others very tricky.

The schemers front page at Docker Hub
<https://hub.docker.com/u/schemers> lists all the implementations: 44 in
total. We should add as many as we can -- the most recently updated ones
are listed first, so unmaintained ones will eventually slide down to the
end of the list.

When you click on a Scheme, you get to its dedicated page. Go to the
"Tags" tab and you'll find all the available flavors of that Scheme.
These are the things that are valid to type after the ":".

* "latest" is Docker's default container (i.e. schemers/guile is an
alias for schemers/guile:latest). This is an alias for the release with
the highest version number.

* "0", "1", "2", etc. are the latest release of that major version. E.g.
schemers/guile:2 currently gets you Guile 2.2 whereas schemers/guile:3
is Guile 3.0. Likewise schemers/chicken:4 and schemers/chicken:5, etc.

* "head" is the version control head (usually git master). We have to
rebuild these manually but that's one button press in Docker Hub. Only
available for a subset of Schemes at the moment; more on the way.
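
As a rough illustration of the tag scheme described above, the following shell sketch composes image references. The helper name `schemers_image` is hypothetical and not part of the scheme-containers project. It only builds the string and does not contact Docker Hub, so it cannot tell whether a given tag actually exists for a given Scheme.

```shell
#!/bin/sh
# Hypothetical helper: build a "schemers" image reference.
# An omitted tag is treated as "latest", mirroring Docker's default.
schemers_image() {
    name=$1
    tag=${2:-latest}
    printf 'schemers/%s:%s\n' "$name" "$tag"
}

# Usage (assumes Docker is installed and the tag exists on Docker Hub):
#   docker run -it "$(schemers_image guile 3)"      # latest Guile 3.x release
#   docker run -it "$(schemers_image chibi head)"   # version control head
#   docker run -it "$(schemers_image gauche)"       # same as gauche:latest
```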

As always, more helping hands are welcome.
<https://github.com/scheme-containers>

NOTE: If Docker gobbles up too many gigabytes, `docker system prune` is
the safe and easy way to free some space. `docker system prune -af` to
wipe everything.

Marc Nieper-Wißkirchen

Apr 30, 2020, 3:22:28 AM

to scheme-re...@googlegroups.com

On Thu, Apr 30, 2020 at 09:13, Lassi Kortela <la...@lassi.io> wrote:
>
> > docker run -it schemers/chibi:head
> > docker run -it schemers/cyclone:head
> > docker run -it schemers/digamma:head
> > docker run -it schemers/gambit:head
> > docker run -it schemers/gauche:head
> > docker run -it schemers/sagittarius:head

[...]

This is probably a stupid question and definitely a noob one: How safe
is it to run binaries through docker?

Marc

Lassi Kortela

Apr 30, 2020, 4:26:25 AM

to scheme-re...@googlegroups.com

> This is probably a stupid question and definitely a noob one: How safe
> is it to run binaries through docker?

Not at all stupid.

The processes in a container share the Linux kernel of the host OS, but
Docker uses the kernel's isolation features to give each container its
own private filesystem and process ID namespaces. So processes in the
container can't see files and processes outside the container.

There's a command line flag to mount directories from the host OS into
the container's file system if you want to do that, and of course there
can be bugs in Linux or Docker (as there can be in the JVM, VirtualBox,
FreeBSD jails, or anything else) that let malicious processes escape the
isolation. But in general, it's the easiest and safest form of full
isolation from the host OS.
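
The mount flag mentioned above is `-v` (or `--volume`). Below is a minimal sketch, assuming the goal is to load files from the current directory into a Scheme REPL without letting the container modify them or reach the network. The function name `run_scheme_here` and the `/work` mount point are illustrative choices, not anything the images require.

```shell
#!/bin/sh
# Hypothetical wrapper: run a schemers image with the current directory
# mounted read-only at /work and with networking disabled.
run_scheme_here() {
    image=$1
    docker run -it --rm \
        --network none \
        -v "$PWD":/work:ro \
        -w /work \
        "schemers/$image"
}

# Usage (assumes Docker is installed):
#   run_scheme_here chibi
```

Here `--rm` removes the container when the REPL exits, `:ro` makes the mount read-only, and `--network none` leaves the container without network access. Dropping `:ro` gives the container write access to the mounted directory, which weakens the isolation accordingly.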

Lassi Kortela

Apr 30, 2020, 4:36:05 AM

to scheme-re...@googlegroups.com

However, be forewarned that Docker is often buggy on both Linux and Mac
under heavy use. Sometimes it hangs for no obvious reason and stops working.
It generally doesn't break other software, but I've found it best not to
use it on my main computer or on a critical server since its
unpredictable resource consumption can slow down your computer when
you're trying to get work done.

Running lots of containers can also easily gobble up so much disk space
that you run out of space on a small server. Docker doesn't clean
anything up by default. `docker system prune` is your friend.
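
A minimal sketch of a cleanup routine built around that command. `docker system df` reports how much space images, containers and volumes occupy, so running it before and after the prune shows what was reclaimed. The function name `docker_cleanup` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: show Docker's disk usage, prune unused data,
# then show the usage again so the effect is visible.
docker_cleanup() {
    docker system df
    # Without -f, prune asks for confirmation. Adding -a also removes
    # every image that is not used by any container.
    docker system prune "$@"
    docker system df
}

# Usage (assumes Docker is installed):
#   docker_cleanup        # interactive, conservative
#   docker_cleanup -af    # remove all unused images too, without a prompt
```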

In case you're running Debian, I recommend `apt-get install docker.io`
instead of messing with the unofficial APT repositories. Likewise for
other distros; the distro-packaged Docker engine should be good enough.

John Cowan

Apr 30, 2020, 11:01:24 AM

to scheme-re...@googlegroups.com

On Thu, Apr 30, 2020 at 3:13 AM Lassi Kortela <la...@lassi.io> wrote:

> All containers
> that do exist are working. I use them daily. If something is a priority,
> let me know. Some Schemes are trivial to add; others very tricky.

Thanks for all your work and for the explanations. In (small) return, here's a dinky shell script based on the one I used when I (tried to) keep all the Schemes listed in ImplementationContrasts installed locally through many changes of machine. I have it installed as /usr/local/bin/schemers, but of course anywhere on your PATH will do.

When run without arguments, it lists all available "schemers" images in alphabetical order. When run with the name of a container, it runs that container interactively. When run with "-a", it runs all containers interactively, one by one. This makes it easy, given a snippet of Scheme, to test it on many implementations by pasting it into each REPL in turn.

--------------------- cut here ---------------------

#!/bin/sh
# With no argument: list the "schemers" images in alphabetical order.
if [ -z "$1" ]; then
    docker search schemers | sed 1d | sort
# With -a: run every image interactively, one after another.
elif [ "$1" = "-a" ]; then
    for s in $(docker search schemers | awk -F '[/ ]' '{print $2}'); do
        echo '========================================================'
        echo " $(echo "$s" | tr '[:lower:]' '[:upper:]')"
        echo '========================================================'
        docker run -it "schemers/$s"
    done
# Otherwise: run the named image interactively.
else
    docker run -it "schemers/$1"
fi

--------------------- cut here ---------------------
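
One caveat with the script above: per the Docker CLI reference, `docker search` returns 25 results by default (`--limit` raises this to at most 100), so with 44 images the listing may be incomplete, and a search for "schemers" can also match repositories outside the `schemers/` namespace. A hedged variant of the listing step is sketched below. It uses `--format` to print only the repository name and then filters by prefix. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only repositories in the schemers/ namespace
# and strip the prefix, reading repository names from standard input.
schemers_names() {
    grep '^schemers/' | sed 's|^schemers/||' | sort
}

# List the short names of the schemers images known to "docker search".
list_schemers() {
    docker search --limit 100 --format '{{.Name}}' schemers | schemers_names
}

# Usage (assumes Docker is installed and Docker Hub is reachable):
#   for s in $(list_schemers); do docker run -it "schemers/$s"; done
```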

> If something is a priority,
> let me know. Some Schemes are trivial to add; others very tricky.

Here's a list of the Schemes given in <https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/ImplementationContrasts.md> that don't (I think) already have images:

racket scsh sisc foment scm ypsilon nexj jscheme ksi sigscheme shoe mini-scheme scheme9 rscheme s7 unlikely siod bdc xlisp rep elk umb llava sxm sizzle lmu dfsch inlab oaklis

Links to their home pages can be found at the above link.

I would say the Racket REPL has the highest priority: only the current version really matters here, though perhaps both standard Racket and Racket CS are worth having. SCM is important because it provides full access to SLIB, which has many good ideas in it that haven't really been mined yet. Oaklisp is the most deviant Scheme: it branched off after R3RS, and represents a complete re-engineering of Scheme over an OO core. Other than these there is no priority as far as I am concerned: create images whenever you feel like it. New Schemes not on the above list are also of interest, of course.

Marc N-W: I would also add to Lassi's remarks that Docker on Mac and on Windows are very safe: they use a Linux VM to run the containers on.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

XQuery Blueberry DOM
Entity parser dot-com
Abstract schemata / XPointer errata
Infoset Unicode BOM --Richard Tobin

Marc Nieper-Wißkirchen

Apr 30, 2020, 12:20:19 PM

to scheme-re...@googlegroups.com

On Thu, Apr 30, 2020 at 00:19, John Cowan <co...@ccil.org> wrote:

[...]

> Well, that is a political viewpoint rather than a technical one, though not inherently the worse for that. We have a mechanism for resolving (if not necessarily dissolving) political disagreements: we vote on things when they come up.

:-)

>
>>
>> > WG1 really did work very hard to minimize the cost of conversion.
>>
>>
>> This couldn't have been the only goal. :) Otherwise, you would have come up
>> with R5RS. ;)
>
>
> There was a respected minority opinion that advocated for just that, with the addition of a minimal module system as demanded by our charter.

Don't get me wrong; I like everything that R7RS added to R5RS. I also
like what WG1 added that was missing in R6RS (parameter objects
and DELAY-FORCE, for example; SRFI 155 wasn't in existence at that
time; otherwise, promises could have become even more elegant and
DELAY-FORCE could have been removed altogether).

>> Identifier syntax is not too important for a system supporting only
>> SYNTAX-RULES. In conjunction with a more powerful system, they have
>> so many sensible uses that one shouldn't leave them out
>
>
> The only use I know of is converting accessors to variables in a local scope like a class, but I'd like to hear about others.

We have already discussed a number of good uses privately some time
ago (and you seemed to agree); SRFI 190 is another example where this
feature is needed. (I'll make a summary when I start the discussion on
macro systems as you have assigned to me.)

> Having only one thread happens to be in fact my own view, but again I bow to other people's views.

For R7RS-small this may be fine; R7RS-large really needs SRFI 18 (or
some alternative) to accommodate its intended use cases.

>
>> (And even with the R6RS
>> semantics, no one forces you to write a program that does not
>> introduce the syntax first.)
>
>
> I continue to think it simplifies both implementation and human understanding, exactly because they can be done in one pass, and any unknown identifier can be presumed to be a variable.

Contrary to what it seems, the implementation has to make two passes
for the R7RS model, while one pass suffices for the R6RS model (the
latter is clear from the operational semantics given in the R6RS). The
essential reason why the R7RS model is flawed and should be abandoned
is that variable references are syntax as well. Making a
distinction between this kind of syntax and other kinds of syntax
actually complicates everything (leading to a number of issues, as I
demonstrated).

Besides its problems, the R7RS model doesn't fulfill its promises, as
we can see by looking at the following code:

(let ((g (* x x)))
  (lambda ()
    (define x (g 2))
    ...
    x))

Does this expression evaluate to 4? We cannot tell unless we examine
the contents of "...". The result depends on whether "..." contains a
local variable definition of "g" or not. The artificial restriction
that unknown identifiers have to be variables does not help.

Unless forward references are completely restricted to explicit letrec
constructs, there are cases where we will have to live with the fact
that we have to read the left-hand sides of definitions up to the end
before we take a look at the right-hand sides (still one pass over
each item!).

I'm not saying that one should write programs in a way that puts the
definition of a variable far below its actual use; this is bad
programming style. But just as the R7RS model does not prevent bad
programming style, the R6RS model does not prevent good programming
style.

>> Here are four reasons why the R6RS semantics are the "right" ones and why
>> the intended R7RS semantics are fundamentally flawed:
>>
>> (1) Will Clinger's idea that R7RS is a superset of R6RS (without the
>> mustard) would be flawed. Schemes supporting both R6RS and R7RS will
>> most likely implement only one of the two semantics. Larceny uses the
>> R6RS semantics, for example. Sagittarius as well, I think.
>
>
> As you pointed out, R7RS puts a restriction on users that R6RS does not, so any R6RS system is very close to compliant with R7RS as well. The only exception is when the same name is defined by an outer syntax definition and an inner one, and the inner one is used before it is defined. I doubt it would occur to anyone to do this, although your point 3 provides a (thoroughly artificial) example.

A programming language doesn't become beautiful if its model only
works 99% of the time, which is the case for the R7RS model.

From a practical point of view, you are right that interchanging the two
models will affect virtually no real program. This, however, shows
that the choice of R7RS to break compatibility with R6RS (needlessly)
cannot be grounded on practical reasons. Another reason to choose the
simpler model.

(Truth be told, I was once bitten by the R7RS model when I wrote a
library in which I tried to order the definitions so that documenting
the code became straightforward. It worked until I moved some group of
syntax definitions further down to the end...)

[...]

>> (lambda (x)
>>   (define (qoppa x) (or (zero? x) (pi (- x 1))))
>>   (define (pi x) (and (not (zero? x)) (qoppa (- x 1))))
>>   (pi x))
>
>
> Although Chez supports this, Guile treats it as an error. I have not been able to try it on any other Schemes (where are all those Docker images again?)

This is an incompatibility of the Guile top-level with the R6RS. Have
you tried to enclose everything (including the definition of "pi" as a
constant) in a "let* ()"?

>> Are you speaking about R7RS-large here or about some imagined future R8RS?
>
>
> Either.

Okay. This makes sense after abandoning the R6RS versioning scheme.
(Did anyone use it actually?)

John Cowan

Apr 30, 2020, 2:44:50 PM

to scheme-re...@googlegroups.com

On Thu, Apr 30, 2020 at 12:20 PM Marc Nieper-Wißkirchen <marc....@gmail.com> wrote:

> The only use I know of is converting accessors to variables in a local scope like a class, but I'd like to hear about others.

We have already discussed a number of good uses privately some time
ago (and you seemed to agree);

Here's a consolidation of your list and SamTH's list, which I posted but did not necessarily advocate:

(1) Procedures overloaded by macros
(2) Definition of immutable variables
(3) LISP-style dynamic variables/C-style thread-local variables.
(4) Contracted values (the syntax object carries around source information so that violations of the contract can be reported properly)
(5) functions w/ keyword arguments
(6) structure constructors
(7) the `this' binding in classes (to make Racket classes look more like other languages)
(8) local field references in classes (ditto)
(9) imports in units (run-time linking)

Note that Racket converts all variable references and function calls to use magic macros, so the syntax is truly uniform.

(2) seems sound and useful. (5) hangs on how we do keywords. The others are mostly ignotum per ignotius.

For R7RS-small this may be fine; R7RS-large really needs SRFI 18 (or
some alternative) to accommodate its intended use cases.

I greatly prefer shared-nothing processes organized into coarse-grained dataflow graphs to share-everything-by-default threads, especially in the presence of mutability.

(let ((g (* x x)))

(lambda ()
(define x (g 2))
...
x))

Does this expression evaluate to 4? We cannot tell unless we examine
the contents of "...". The result depends on whether "..." contains a
local variable definition of "g" or not.

Internal defines map to a letrec*, so it returns 4 unless there is another internal definition in the "..." at the same block level as the definition of x. But this has nothing to do with syntax definitions.
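
To make the letrec* reading concrete, here is a rough sketch. It assumes `g` was meant to be a squaring procedure (as quoted, the snippet binds it to `(* x x)`, which could not be called), and the three procedure names are invented for the illustration.

```scheme
;; Assumption: the outer g is a squaring procedure.
(define (outer-g-only)
  (let ((g (lambda (n) (* n n))))
    ((lambda ()
       (define x (g 2))   ; no inner g in this body, so g is the outer one
       x))))

;; The same body with the internal define spelled as an explicit letrec*.
(define (outer-g-only/letrec*)
  (let ((g (lambda (n) (* n n))))
    ((lambda ()
       (letrec* ((x (g 2)))
         x)))))

;; With a later internal definition of g at the same block level, the g
;; in (g 2) now names the inner g, which is not yet initialized when the
;; init of x is evaluated. Calling this procedure is an error under the
;; letrec* rules, so it is only defined here, never invoked.
(define (inner-g-later)
  (let ((g (lambda (n) (* n n))))
    ((lambda ()
       (define x (g 2))
       (define (g n) (+ n 100))
       x))))

(outer-g-only)           ; => 4
(outer-g-only/letrec*)   ; => 4
```

What an implementation does when `inner-g-later` is called is not specified, so no result is shown for it.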

The artificial restriction
that unknown identifiers have to be variables does not help.

Unless forward references are completely restricted to explicit letrec
constructs,

Or defines at the same level, including the top level. That should suffice.

From a practical point of view, you are right that interchanging both
models won't affect virtually any real program. This, however, shows
that the choice of R7RS to break compatibility with R6RS (needlessly)
cannot be grounded on practical reasons.

Historical correction: since R7RS-small was never intended to be a successor of R6RS, it is misleading to speak of breaking compatibility: rather, this is a feature of R6RS that we chose not to include.

Okay. This makes sense after abandoning the R6RS versioning scheme.
(Did anyone use it actually?)

As far as I know (which is pretty far), no one.

John Cowan http://vrici.lojban.org/~cowan co...@ccil.org

Almost all theorems are true, but almost all proofs have bugs.
--Paul Pedersen

Marc Nieper-Wißkirchen

Apr 30, 2020, 3:10:13 PM

to scheme-re...@googlegroups.com

Am Do., 30. Apr. 2020 um 20:44 Uhr schrieb John Cowan <co...@ccil.org>:

[...]

>> (let ((g (* x x)))
>>
>> (lambda ()
>> (define x (g 2))
>> ...
>> x))
>>
>> Does this expression evaluate to 4? We cannot tell unless we examine
>> the contents of "...". The result depends on whether "..." contains a
>> local variable definition of "g" or not.
>
>
> Internal defines map to a letrec*, so it returns 4 unless there is another internal definition in the "..." at the same block level as the definition of x. But this has nothing to do with syntax definitions.

It doesn't have anything to do with syntax definitions, yes, and this
is exactly my point!

Please reread what I wrote when presenting this example. The R7RS
semantics does not achieve what it seemingly set out to achieve,
namely that one can understand the code sequentially.

>> From a practical point of view, you are right that interchanging both
>> models won't affect virtually any real program. This, however, shows
>> that the choice of R7RS to break compatibility with R6RS (needlessly)
>> cannot be grounded on practical reasons.
>
>
> Historical correction: since R7RS-small was never intended to be a successor of R6RS, it is misleading to speak of breaking compatibility: rather, this is a feature of R6RS that we chose not to include.

Not including a feature is different from including an incompatible
version of a feature. (Not that I would call it a feature; it is just
the simplest semantic model for expansion.)

While R7RS-small was not intended as a successor of R6RS, there is an
implicit constraint imposed by the charter of WG2:

First of all, every valid R7RS-small program (that uses just the
R7RS-small features) must be a valid R7RS-large program. This means
that the semantics of R7RS-small and R7RS-large have to be aligned.
Then it says that "insofar as practical, the language [R7RS-large]
should be backward compatible with an appropriate subset of the R6RS
standard".

My point is that there were no constraints that would have made
sticking to the R6RS semantics impractical. By changing the semantics
without any pressure, for whatever reasons, we now have backward
compatibility with a subset of R6RS programs, but not with any subset
of the language itself.

In any case, it doesn't matter, as the R7RS semantics are an inferior
choice even in isolation. Luckily, almost no one would notice if
we agreed and adopted the better semantics. :)
