Most of us, myself included, had thought that you could reasonably
build a record system on top of procedural records using field names.
So, something like
(define-record-type a (fields b))
would become something like
(begin
  (define a (make-rtd '#(b)))
  (define (a-b x) ((record-accessor a 'b) x)))
Moreover, one of the let-record forms specified by a member here
also bases its accessors on symbolic names.
However, this model of symbolic field names rather spectacularly
breaks the hygiene of the system. That is, we expect to be
able to write macros that do their job, and preserve identifiers and
the like. Consider, therefore, the following macro:
;;; R6RS
(define-syntax build-box-type
  (syntax-rules ()
    [(_ box-type name getter)
     (define-record-type box-type
       (fields
         (immutable foo get-foo)
         (immutable name getter)))]))
;;; SRFI 99
(define-syntax build-box-type
  (syntax-rules ()
    [(_ box-type name getter)
     (define-record-type box-type #t #t
       (foo get-foo)
       (name getter))]))
Now, let's consider the following series of expressions:
(build-box-type my foo get-foo)
(define x (make-my 'foo 3))
(get-foo x)
What should this series of expressions return? We would expect that it
would return 3. Moreover, we expect that the definition of the
internal get-foo and the external get-foo do not conflict.
Fortunately, both the implementation of SRFI 99 and R6RS' record
system preserve the latter behavior, and neither makes the internal
binding to get-foo visible. However, SRFI 99 has its syntactic record
system based on symbolic field names. This means that there are two
fields named foo in the record now. Since all accessors are defined
by index in the R6RS version, this isn't a problem. In the second one,
however, we now have a data leakage and a hygiene breakage, as we
expected that the externally defined get-foo should reference the second
field of the record, but instead, it references the first.
This means that in the R6RS version (as implemented in Chez Scheme),
the first gives us the expected return value of 3, but with the SRFI 99
implementation, we get 'foo instead.
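The difference between the two lookup strategies can be sketched in plain Scheme, with no record library at all. This is only a toy model, not how SRFI 99 or R6RS is actually implemented: a record is modeled as a vector of values, its type as a vector of field names, and every name below (field-names, instance, field-index, accessor-by-name, accessor-by-index) is hypothetical. It also assumes that a by-name lookup picks the first field with a matching name, which mirrors the result reported above.

```scheme
;; Toy model (an assumption, not the SRFI 99 or R6RS implementation):
;; the record type is a vector of field names, the record a vector of values.
(define field-names '#(foo foo))  ; the macro's foo, then the caller's foo
(define instance    '#(foo 3))    ; the values passed to the constructor

;; Index of the first field whose name is eq? to NAME.
(define (field-index names name)
  (let loop ((i 0))
    (cond ((= i (vector-length names)) (error "no such field" name))
          ((eq? (vector-ref names i) name) i)
          (else (loop (+ i 1))))))

;; Accessor resolved through the symbolic field name.
(define (accessor-by-name names name)
  (let ((i (field-index names name)))
    (lambda (rec) (vector-ref rec i))))

;; Accessor resolved through the field's position.
(define (accessor-by-index i)
  (lambda (rec) (vector-ref rec i)))

(define get-foo/name  (accessor-by-name field-names 'foo))
(define get-foo/index (accessor-by-index 1))

(get-foo/name instance)   ; => foo  (first field: the macro's private one)
(get-foo/index instance)  ; => 3    (second field: the one the caller named)
```

The by-index accessor never consults the names at all, so a second field that happens to be called foo cannot redirect it.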
Therefore, in any record system that we define, we must be careful to
ensure that accessors do not leak or shadow each other like this. In
this example, while functionally disastrous, from a security point of
view, this seems somewhat benign. However, if the first field were
meant to be a hidden field, or if we instead got an error saying that
you can't have two fields named the same thing, you have suddenly
leaked information out of your abstraction that could be very bad
for the safety and security of the program.
Aaron W. Hsu
Can we not assume that people writing web servers will use WG2?
> ;;; SRFI 99
> (define-syntax build-box-type
> (syntax-rules ()
> [(_ box-type name getter)
> (define-record-type box-type #t #t
> (foo get-foo)
> (name getter))]))
What this does completely depends on how the Scheme implements syntax
definitions that expand into definitions only. I tried the following
program on my usual array of Schemes:
> (define-syntax foo
    (syntax-rules ()
      ((foo) (define x 32))))
> (foo)
> x
to see whether I got 32 as the value of x or blew up on an error such
as invalid syntax (in the define-syntax) or undefined variable. PLT,
MIT, Gambit, scsh/Scheme48, Guile, SISC, Chez, Ikarus, Mosh blew up;
Gauche, Chicken, Bigloo, Kawa, SCM, Larceny (in R5RS mode), Scheme 9,
STklos, sscm, SXM, VSCM, Chibi were fine with it and returned 32.
I haven't tested it, but I would expect that the definitions that SRFI-9(9)
expands into in the first set of Schemes would be invisible outside the
define-syntax, whereas they would be visible in the second set.
> SRFI 99 has its syntactic record system based on symbolic field
> names. This means that there are two fields named foo in the record
> now. Since all accessors are defined by index in the R6RS version,
> this isn't a problem. In the second one, however, we now have a data
> leakage and a hygiene breakage, as we expected that the externally
> defined get-foo should reference the second field of the record,
> but instead, it references the first.
In SRFI-99, the last line of the procedural layer definition says:
"Fields in derived record-types shadow fields of the same name in a
parent record-type." So that's normal and expected behavior.
> This means that in the R6RS version (as implemented in Chez Scheme),
> the first gives us the expected return value of 3, but with the SRFI 99
> implementation, we get 'foo instead.
>
> Therefore, in any record system that we define, we must be careful to
> ensure that accessors do not leak or shadow each other like this. In
> this example, while functionally disastrous, from a security point of
> view, this seems somewhat benign. However, if the first field were
> meant to be a hidden field, or if we instead got an error saying that
> you can't have two fields named the same thing, you have suddenly
> leaked information out of your abstraction that could be very bad
> for the safety and security of the program.
In SRFI-99 the field names are symbols, not identifiers, so they are
exposed as part of the type.
(If I'm talking completely past your point, as may be the case,
please let me know.)
--
John Cowan http://ccil.org/~cowan co...@ccil.org
Lope de Vega: "It wonders me I can speak at all. Some caitiff rogue did
rudely yerk me on the knob, wherefrom my wits still wander."
An Englishman: "Ay, a filchman to the nab betimes 'll leave a man
crank for a spell." --Harry Turtledove, Ruled Britannia
As specified in R5RS:
* If a macro transformer inserts a binding for an identifier
  (variable or keyword), the identifier will in effect be renamed
  throughout its scope to avoid conflicts with other identifiers.
  Note that a `define' at top level may or may not introduce a
  binding; see the section on Definitions.
So, as I understand it, a binding inserted by a macro should be
renamed internally so that it cannot be accessed externally:
(define-syntax foo
  (syntax-rules ()
    ((foo) (define x 32))))
(foo)
; => (define <internal x 292> 32) therefore useless here...
This forces macros that insert top-level definitions to be explicit
and take the names as parameters, like this:
(define-syntax foo
  (syntax-rules ()
    ((foo x) (define x 32))))
I really think this is a Good Thing (I like having explicit, clear and
unambiguous expressions).
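As a small usage sketch of the explicit form: because the identifier comes from the macro call rather than from the template, the binding is visible at the call site. The macro name define-answer is hypothetical and chosen only for this illustration.

```scheme
;; Hypothetical macro: the caller supplies the identifier to be bound,
;; so the definition is not renamed away from the caller's scope.
(define-syntax define-answer
  (syntax-rules ()
    ((_ name) (define name 32))))

(define-answer y)
y  ; => 32
```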
If current implementations of macros make different choices here,
would it be good to reach a consensus?
Best regards,
--
Emmanuel Medernach
Can we not assume that WG2 will use WG1's language?
In all seriousness, I find the implied line of reasoning here
very...suspect. It's like saying that we don't need to do any array
bounds checking, because the programs you expect to write should be
small enough that it won't matter.
Aaron W. Hsu
Any argument can be subject to reductio ad absurdum. But there is a world
of difference between bounds checking and /security/ concerns. If we have
to stand up against willful attacks I think we can give up right now; none
of us afaik have the very special expertise (and mindset) for that.