JC's comments on dpk's R7RS issues list

John Cowan

Mar 19, 2022, 11:33:48 PM

to scheme-re...@googlegroups.com

1. General principle: What is the scope of R7RS Large?

I don't know that any language standards process has any *principles* specifying what is found in the standard library. They pretty much just copy one another.

2. General principle: When do procedures take comparators and when do they take comparison procedures?

I think the rule should be that if only one of the three procedures is needed, that's all that should be provided. I can't see changing `assoc` to use a comparator, for example. If more than one is needed, a comparator is appropriate.
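A minimal sketch of the `assoc` case, assuming an R7RS-small implementation: `assoc` needs only an equality test, so it accepts just that one procedure as its optional third argument. The `table` data and the `lookup-ci` name are made up for illustration.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical data for illustration.
(define table '(("Apple" . 1) ("banana" . 2)))

;; assoc only ever compares keys for equality, so it takes a single
;; equality predicate (the optional third argument in R7RS-small)
;; rather than a whole comparator object.
(define (lookup-ci key)
  (assoc key table string-ci=?))

(lookup-ci "APPLE")   ; finds the ("Apple" . 1) entry
(lookup-ci "cherry")  ; no match, so #f
```

A procedure that needs both ordering and equality (a tree-based mapping, say) is where passing a comparator would start to pay off.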

3. General principle: How low-level and OS-specific should operating system support be?

I have trouble with the terminology here. You give examples of low-level and mid-level APIs, and R7RS-small is said to have high-level APIs, but without some better explanation of what counts as low, mid, and high levels, I don't know how to evaluate this. The assignments seem fairly arbitrary to me.

However, in general I think that Posix support should not be treated as OS-specific, since pretty much all OSes provide a reasonable amount of support for it.

4. General principle: How do we decide what is an error vs what signals an error? (also: Error conditions in libraries adopted so far)

I don't think this will be relevant given the current direction of R7RS-large.

5. General principle: What is mandatory for R7RS Large implementations?

We should minimize optionality in the standard, allowing it only where it is clearly justified. Something we might want to make optional should simply be removed.

6. General principle: What stability guarantees should be made between ‘editions’ of R7RS Large?

This seems not to matter any more either.

7. General principle: Optional exports

See 5.

8. Procedural: To what extent do post-finalization notes to SRFIs affect R7RS Large?

I've already posted about this.

9. Informational: # notation

This is part of the general treatment of lexical syntax, which needs to be made consistent.

10. Declaring supported features to cond-expand in libraries

I think the specialized-library approach is best, though they do not have to be empty: (scheme regex unicode) might as well export whatever (scheme regex) does, to make it easier to use. However, features with names that are lists would be okay.
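As a sketch of how the specialized-library approach looks from the user's side: R7RS-small `cond-expand` already accepts a `(library ...)` requirement, so a program can test for the library directly. The library name `(scheme regex unicode)` is the one used as an example above and may not exist in any implementation; `unicode-regex?` is a hypothetical name.

```scheme
(import (scheme base))

;; Pick a definition depending on whether the specialized library
;; is available in this implementation.
(cond-expand
  ((library (scheme regex unicode))
   (define unicode-regex? #t))
  (else
   (define unicode-regex? #f)))
```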

11. Safety of utf8->string and friends

Clearly these procedures should be exposed in only one library (preferably the base), and should provide for start and end. We may want to have both strict (throws an exception) and relaxed (uses U+FFFD) modes; if so, I don't see that it particularly matters how many U+FFFD characters are inserted, as the whole idea of relaxed mode is that you don't particularly care what you get, you just want to keep on truckin'.
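For reference, the base-library `utf8->string` in R7RS-small already takes optional start and end arguments, as in this small sketch (the byte values are just sample data). It does not show strict or relaxed modes, which R7RS-small does not define.

```scheme
(import (scheme base))

;; The bytes of "hi λ" encoded as UTF-8 (λ is U+03BB, encoded as CE BB).
(define bv (bytevector 104 105 32 206 187))

;; Decode only the slice [start, end) of the bytevector.
(define first-word (utf8->string bv 0 2))   ; the two ASCII bytes
(define last-char  (utf8->string bv 3 5))   ; the two-byte sequence for λ
```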

12. Condition types

I think we must be very careful here, and I don't want to say anything off the cuff at this point.

13. Additional arguments to (scheme bytevector) UTF-8, UTF-16, and UTF-32 functions

The base versions are appropriate, and the functions should be removed from (scheme bytevector).

14. Bounds checking

In the large language we should check bounds. Schemes generally do so anyway.

(scheme list) should be fixed.

17. Argument order to map and unfold procedures with comparators

The set (and bag) procedures should be fixed.

18. Inheritance in record types

This is a can of worms. I continue to think that inheritance is usually a Bad Thing, and where needed, can be subsumed by replaced objects.

Nothing here yet, so I can't comment.

21. Naming conventions for record type descriptors

My view is that standard-library record types should not allow themselves to be subtyped (even if subtyping is added to the language).

22. Date and time API

Nothing here yet either.

23. Are all three of (scheme mapping), (scheme mapping hash), and (scheme hash-table) necessary?

In line with the C++ template library, our data structures should specify their big-O requirements, as this is important to programmers. Conflating trees with hash tables conceals the difference.

24. Requirements on implementations of (scheme set) with regard to comparators

My plan here is to restore SRFI 153 to draft state (Arthur has okayed this, since it was basically withdrawn because I accidentally lost the implementation and couldn't force myself to rewrite it at that time) and make it the tree-based set SRFI, at which time a PFN will make SRFI 113 the hash-table-based set SRFI. This will be a beth PFN, so it will have to be voted on in conjunction with SRFI 153.

25. Immutable values, mutation procedures, and linear-update procedures

I think the Right Thing here is to say that map! and friends *must* mutate their arguments in R7RS-Large, as making mutation optional primarily benefits Schemes that have to run in very limited code space.
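To illustrate the difference, here is a sketch assuming an implementation that provides SRFI 1 as `(srfi 1)`. Under SRFI 1's linear-update rule, `map!` is allowed but not required to reuse the pairs of its argument, so portable code has to use the return value; under the rule proposed above, the argument list would be guaranteed to be updated in place.

```scheme
(import (scheme base) (srfi 1))

(define xs (list 1 2 3))

;; Portable today: always use the result of map!.
(define ys (map! (lambda (x) (* x x)) xs))

;; ys holds the squares under either rule. Whether xs now also holds
;; them is implementation-dependent in SRFI 1, and would be guaranteed
;; only if mutation were made mandatory.
```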

26. rx from (scheme regex) is unnecessary given identifier macros and/or partial evaluation

I believe this is based on a misunderstanding. The `rx` macro has an implicit quasiquotation in it, and is provided for convenience. The fact that (if there are no unquoted portions) it can be compiled away is incidental.
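A short sketch of the implicit quasiquotation, assuming an implementation that provides SRFI 115 as `(srfi 115)`; `word-re` is a hypothetical helper. SRFI 115 describes `(rx sre ...)` as shorthand for ``(regexp `(: sre ...))``, which is why `,word` can be spliced in at run time.

```scheme
(import (scheme base) (srfi 115))

;; Build a regexp matching WORD as a whole word.
;; The unquote works because rx quasiquotes its body.
(define (word-re word)
  (rx bow ,word eow))

;; The same thing written without the convenience macro.
(define (word-re/explicit word)
  (regexp `(: bow ,word eow)))

(regexp-search (word-re "cat") "a cat sat")     ; finds a match
(regexp-search (word-re "cat") "concatenate")   ; no whole-word match
```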

27. Phasing of identifiers visible to syntax-case macros

I don't think any of this matters if phasing is implicit (but I may be failing to understand some distinction here).

28. Lexical notation of data structures

I think we should be very very careful about adding new lexical syntax, and should definitely not make it user-extensible, as that creates a phasing problem analogous to macro phasing: the code that specifies the new lexical extension must be executed before the code using it can be read. The CL experience with lexical syntax extensions has not been good: the Google CL Style Guide severely restricts their use, saying:

"You must not install new reader macros without a consensus among the developers of your system. Reader macros must not leak out of the system that uses them to clients of that system or other systems used in the same project. You must use software such as cl-syntax or named-readtables to control how reader macros are used. This way, clients who desire it may use the same reader macros as you do. In any case, your system must be usable even to clients who do not use these reader macros."

29. Extensible write etc.

Providing for generic functions and making `write` one of them is probably the Right Thing here. If we do that, we should figure out what other standard procedures should also be generic functions.

30. Performance characteristics of string-ref

I'll talk about this when I get yet another string library SRFI written. I put it off when I ran into some conceptual problems.

31. Unicode alias for lambda and possibly other things

I've never been thrilled with this, but of course identifier aliasing will achieve it, as will plain old macros.

32. Raw string literals

These only make sense, I think, if you have a lot of built-in escapes, but Scheme strings only have

33. Unicode gamut

All of Unicode, I think.

34. NULLs in strings

No existing Small implementation actually takes advantage of this feature, and I think it can be eliminated from Large.

35. Modularization of the (small) language

It's a charter principle that a Small program that doesn't involve any "is an error" situations has to run in Large unchanged, so I don't think we can do this.

36. Compatibility between syntax-error and syntax-violation

That would make it difficult for an R6RS with-exception-handler to translate an R7RS condition object, it's true. But the definitions of the standard condition type use the modal verb "could", which suggests to me that other fields may exist than the standardized ones.

37. current-jiffy is not usable portably

Does this still matter in the age of 64-bit processors, where fixnums can count up to 2^60 jiffies or so?
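For a sense of scale: at a hypothetical rate of one jiffy per nanosecond, 2^60 jiffies is roughly 36 years. The sketch below, using only `(scheme time)` from R7RS-small, shows the usual elapsed-time idiom and a helper for checking the wrap-around horizon on a given implementation; both procedure names are made up for illustration.

```scheme
(import (scheme base) (scheme time))

;; Time a thunk using the jiffy clock; the result is in seconds.
(define (elapsed-seconds thunk)
  (let ((start (current-jiffy)))
    (thunk)
    (/ (- (current-jiffy) start) (jiffies-per-second))))

;; Years (of 365 days) before a counter of BITS bits would wrap,
;; at this implementation's jiffy rate.
(define (years-until-wrap bits)
  (/ (expt 2 bits)
     (* (jiffies-per-second) 60 60 24 365)))
```

The subtraction in `elapsed-seconds` is only meaningful if the jiffy counter does not wrap between the two readings, which is the portability concern the issue raises.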

38. Half-precision and quadruple-precision floating point vectors

I have no problem adding new SRFI 160

39. Legacy and implementation-specific junk in (scheme hash-table)

No strong feelings about that.

40. Syntax for accessing ephemerons

Seems a priori like a good idea; I don't know the details.

41. Unicode normalization

Definitely something we should have.

42. Issuing warnings

See AssertionsWarnings. Warning condition objects aren't the most important thing; the `warn` procedure is.

43. HyperSpec

We will probably need a publication committee.

44. Interchangeability of mutable and immutable pairs

My inclination is not to make it mandatory. Pairs are pretty deep

45. What values are allowed to be returned from syntax transformers and what are their semantics?

I think "object" is probably an error for "datum"; if so, this should be fixed as an R6RS erratum.

46. Form feed characters in source code

Eh, I suppose we could. I doubt that part of the style guide is actually current.

47. Behaviour of newline on transcoded ports whose eol-style is none

Per SRFI 186 (which is supposed to be the same as R6RS modulo the additional end-of-line types) in style *none* no translation is done, so the (internal) newline character is output as-is, namely as a newline character.

48. u8-ready? and char-ready? on custom ports

There are a variety of complexities that we decided to sidestep in SRFI 181. One point not mentioned there is that if you have a deep stack of custom ports, you need to buffer up an arbitrary amount of input before you reach the bottom.

49. Access to stdin/stdout/stderr as binary ports

See the fd->port function in SRFI 170 which generalizes this idea.

50. Opening a file for both input and output

SRFI 170 and SRFI 181 provide this too, but only for binary ports.

51. Opening files on filesystems with non-Unicode names

Note that on NTFS, files are actually named with sequences of 16-bit values (with some restrictions), so it's even more complicated. Unicode is the least common denominator (modulo FAT file systems and the like). The general case shouldn't be part of the standard.

52. (Online) documentation

define-* macros (other than `define` itself) should take the form (define-* identifier form [docstring]).
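One way this shape could look, as a sketch only: `define-constant`, the `docstrings` registry, and `documentation` are all hypothetical names, and storing docstrings in an association list is just one possible design.

```scheme
(import (scheme base))

;; Hypothetical registry mapping identifiers to their docstrings.
(define docstrings '())

(define (documentation name)
  (let ((entry (assq name docstrings)))
    (and entry (cdr entry))))

;; Hypothetical define-* macro of the form
;; (define-constant identifier form [docstring]).
(define-syntax define-constant
  (syntax-rules ()
    ((_ name form)
     (define name form))
    ((_ name form docstring)
     (begin
       (define name form)
       (set! docstrings (cons (cons 'name docstring) docstrings))))))

(define-constant answer 42 "The answer used in the examples.")
(documentation 'answer)   ; retrieves the docstring registered above
```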

53. Proper tail recursion modulo cons

Scheme implementations actually aren't required to treat all cases of tail recursion "properly", only those of a certain set of forms: see the last example in section 3.5 of R7RS-small. In order to make this practical, it would be necessary to specify in detail which procedures are to be treated as constructors, and that is much harder (in Haskell, lists suffice because they are lazy).
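A small sketch of what is at stake, with hypothetical procedure names. In the first definition the recursive call is an argument to `cons`, so it is not in tail position and R7RS-small does not require it to run in constant space; the second is the usual workaround with an accumulator and `reverse`.

```scheme
(import (scheme base))

;; Not a tail call: the result of the recursive call is passed to cons.
;; "Tail recursion modulo cons" would ask implementations to treat this
;; as if it were iterative.
(define (map1 f xs)
  (if (null? xs)
      '()
      (cons (f (car xs)) (map1 f (cdr xs)))))

;; Properly tail-recursive today: loop is called in tail position,
;; at the cost of an extra reverse at the end.
(define (map1-iter f xs)
  (let loop ((xs xs) (acc '()))
    (if (null? xs)
        (reverse acc)
        (loop (cdr xs) (cons (f (car xs)) acc)))))
```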

54. Low-level regular expression operations

This is a nice-to-have, but it limits how regular expressions are implemented unless the low-level operations are not guaranteed to be used to implement the high-level operations.

Vincent Manis

Mar 20, 2022, 6:28:59 PM

to scheme-re...@googlegroups.com, John Cowan

On 2022-03-19 20:33, John Cowan wrote:

43. HyperSpec

We will probably need a publication committee.

s/probably/definitely/

To the extent that the outside world pays attention to our work, they will judge us by our reports, just as even non-Schemers often praise the quality of the RnRS series. I have been implicitly referencing this in some of my posts, but I want to make it explicit. We want to produce very high quality reports, organized as both hyperspecs and documents intended for sequential reading.

I don't think that this can be done after the fact. We need to plan for this right through the project.

So I would recommend that we add a Committee P to our efforts.

-- vincent

John Cowan

Mar 20, 2022, 6:36:29 PM

to Vincent Manis, scheme-re...@googlegroups.com

Sounds good to me.
