[racket] Struct fields in struct-info + match enhancements


J. Ian Johnson

May 14, 2014, 12:35:06 PM
to users
tl;dr if you use struct-info in your programs, I might break them. Please continue reading.

I had a PR a while ago suggesting a change to struct-copy due to its unhygienic nature with fields. It did not go through since there wasn't enough information in the struct-info to separate the struct-name and the field-name. Because struct-info does not have a procedural-only interface, changing it to instead or also hold the individual field identifiers would be backwards incompatible. However, I also expect that struct-info manipulation outside of core Racket is rare.

Is there anyone out there who would be affected by this change and would be unwilling to make slight modifications to support the new struct-info?
I ask not because of struct-copy itself, but for an additional enhancement to racket/match: named field selection from structs instead of positional only.
I'm getting bitten by pervasive refactoring woes whenever I add fields to structs. All of my match patterns must change to have an extra _ somewhere.
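
A minimal sketch of the refactoring problem described above, using only the existing positional patterns. The struct names point2 and point3 are hypothetical; they stand in for one struct before and after a field is added.

```racket
#lang racket

;; Before the refactoring: two fields, so the pattern has two sub-patterns.
(struct point2 (x y))
(define (x-of-2 p)
  (match p
    [(point2 x _) x]))

;; After adding a field z, every positional pattern needs an extra `_`,
;; even when the code only cares about x. Writing (point3 x _) here
;; would be rejected by match because the field count is wrong.
(struct point3 (x y z))
(define (x-of-3 p)
  (match p
    [(point3 x _ _) x]))
```

With named field selection, a pattern that mentions only x would not have to change when z is added.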

My proposal for match is to change the pattern language in a backwards-compatible way:
The two forms
(struct-id pat ...)
(struct struct-id (pat ...))
will get the optional keywords #:first and #:last, which allow named fields to be matched; unnamed patterns will be matched positionally, starting either from the first field or from the (#fields - #unnamed patterns)th field.

(struct-id op-first-or-last-kw pat-or-named-pat ...)
(struct struct-id (op-first-or-last-kw pat-or-named-pat ...))
op-first-or-last-kw ::= <empty> | #:first | #:last
pat-or-named-pat ::= [#:field field-id pat] | pat

If op-first-or-last-kw is empty, named patterns are only allowed if all patterns are named.
If it is given, then there do not have to be as many patterns as there are fields.

Names that clash with the positions will do either of the following, depending on popular opinion:
(A) the name will be treated as positionally correct, and further patterns skip past the clashing named patterns.
Ex: for (struct A (w x y z)),
(match (A 0 1 #t 2) [(A #:first 0 [#:field y y*] 1) y*] [_ #f]) ==> #t
(match (A 0 1 #t 2) [(A #:first 0 [#:field y y*] 1 2) y*] [_ #f]) ==> #t
(match (A #f 0 #t 1) [(A #:last [#:field y y*] 0 1) y*] [_ #f]) ==> #t
(match (A 'x 0 #t 1) [(A #:last [#:field y y*] x 0 1) (cons x y*)] [_ #f]) ==> '(x . #t)
(B) clashes are a syntax error, because their behavior is confusing when refactoring.
Ex: all the above would error, but the following are still allowed
(match (A 0 1 2 3) [(A #:first 0 [#:field y 2]) #t] [_ #f]) ==> #t
(match (A 0 1 2 3) [(A #:last 3 [#:field y 2]) #t] [_ #f]) ==> #t
(C) some kind of option to match to prefer (A) or (B) behavior?
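
To make option (A) concrete, here is a rough model of how patterns would be assigned to fields, written as an ordinary function over quoted data rather than as a change to match. The function name assign-fields, the helper named?, and the representation of patterns as quoted lists are all assumptions made for this illustration; none of it exists in racket/match.

```racket
#lang racket

;; A named pattern is modelled as the quoted list (#:field field-id pat).
(define (named? p)
  (and (pair? p) (eq? (car p) '#:field)))

;; fields : list of field symbols in declaration order
;; mode   : 'first or 'last
;; pats   : list of named or unnamed patterns
;; Returns an association list (field . pattern) in declaration order.
(define (assign-fields fields mode pats)
  ;; Named patterns claim their own field.
  (define named
    (for/list ([p pats] #:when (named? p))
      (cons (second p) (third p))))
  (define unnamed (filter (negate named?) pats))
  ;; Option (A): unnamed patterns skip over fields claimed by name.
  (define free
    (filter (lambda (f) (not (assq f named))) fields))
  (define targets
    (case mode
      [(first) (take free (length unnamed))]
      [(last)  (take-right free (length unnamed))]))
  (define positional (map cons targets unnamed))
  (filter-map (lambda (f) (or (assq f named) (assq f positional)))
              fields))

(assign-fields '(w x y z) 'first '(0 (#:field y y*) 1))
;; => '((w . 0) (x . 1) (y . y*))
(assign-fields '(w x y z) 'last '((#:field y y*) x 0 1))
;; => '((w . x) (x . 0) (y . y*) (z . 1))
```

Under option (B), the same inputs would instead be rejected whenever a named field falls inside the range covered by the positional patterns.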

Hygiene-wise, field identifiers are interpreted in the struct identifier's context, not the local context. Thus we can bind x to #t locally and still name the x field of the A struct. This might be better dealt with via delta-transformers, but I'm not sure. Matthew, Carl or Ryan would be better judges of that.

Thanks,
-Ian
____________________
Racket Users list:
http://lists.racket-lang.org/users

Sam Tobin-Hochstadt

May 14, 2014, 12:42:56 PM
to J. Ian Johnson, users
On Wed, May 14, 2014 at 12:35 PM, J. Ian Johnson <ia...@ccs.neu.edu> wrote:
> tl;dr if you use struct-info in your programs, I might break them. Please continue reading.
>
> I had a PR a while ago suggesting a change to struct-copy due to its unhygienic nature with fields. It did not go through since there wasn't enough information in the struct-info to separate the struct-name and the field-name. Because struct-info does not have a procedural-only interface, changing it to instead or also hold the individual field identifiers would be backwards incompatible. However, I also expect that struct-info manipulation outside of core Racket is rare.
>
> Is there anyone out there who would be affected by this change and would be unwilling to make slight modifications to support the new struct-info?
> I ask not because of struct-copy itself, but for an additional enhancement to racket/match: named field selection from structs instead of positional only.
> I'm getting bitten by pervasive refactoring woes whenever I add fields to structs. All of my match patterns must change to have an extra _ somewhere.

I don't understand why this would require a backwards-incompatible
change to struct-info.

Also, this discussion is confusing because it's not clear whether you
mean the dynamic value produced by the `struct-info` procedure, or the
structure type transformer binding. I think you mean the latter, in
which case I expect you could do what you want by implementing
`prop:struct-info` appropriately with an extended structure, and
handling existing values (such as six-element lists) appropriately
with defaults.

Sam

J. Ian Johnson

May 14, 2014, 1:08:06 PM
to Sam Tobin-Hochstadt, users
I'm talking about the transformer binding. Specifically the result of extract-struct-info that both struct-copy and match use in their expansion, which produces a six-element list that satisfies struct-info?.
This six-element list has
optional identifier bound to type descriptor
optional identifier bound to type constructor
optional identifier bound to type predicate
list of field accessor identifiers (optional last value of #f)
list of optional field mutator identifiers
optional super type identifier
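
As a small illustration of the list described above, the following sketch reads the accessor identifiers out of the transformer binding at expansion time. The macro name accessor-names is hypothetical. The accessor list is stored in reverse order, so the macro reverses it to get declaration order. Note that only accessor names such as A-y are available, not the bare field name y.

```racket
#lang racket
(require (for-syntax racket/struct-info))

(struct A (w x y z))

;; Expands to a quoted list of the accessor names recorded for a struct.
(define-syntax (accessor-names stx)
  (syntax-case stx ()
    [(_ id)
     (let ([info (extract-struct-info (syntax-local-value #'id))])
       ;; Element 3 (zero-based) holds the accessors, last field first.
       (with-syntax ([(acc ...) (reverse (list-ref info 3))])
         #''(acc ...)))]))

(define accessors (accessor-names A))
accessors ; => '(A-w A-x A-y A-z)
```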

I want a 7th element that is the list of field identifiers themselves as given to the struct form or the define-signature form (which I believe produces the expected struct-info transformer binding, so I don't have to actually change define-signature). The expectation is that the field names and the field accessor names correspond 1-to-1.

7 is not 6, thus backwards-incompatible.
I know many programs use positional accessors rather than match on the whole structure of the list, so I imagine adding a 7th element is not very intrusive.
-Ian

Sam Tobin-Hochstadt

May 14, 2014, 1:18:58 PM
to J. Ian Johnson, users
Here's my suggestion:

- We add an `extract-extended-struct-info` procedure which always
produces a struct that implements `prop:struct-info` (it's easy to use
a simple struct to wrap the list when needed)
- We change `struct` & `define-struct` to use a structure implementing
`prop:struct-info`, but with an additional field holding the list you
want (or perhaps this should be a new structure property)
- When `match` encounters a struct reference, it uses
`extract-extended-struct-info`, checks if it's an instance of this
extended struct, and if it is, uses that info.
- When `match` encounters a struct reference that doesn't support the
new behavior, the sort of keyword syntax you describe is a syntax
error.

I think this (a) preserves backwards compatibility (b) supports future
extension (c) accomplishes what you need.
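
A rough sketch of the shape this could take, assuming a compile-time wrapper struct. The names extended-struct-info, A/ext, and field-names are hypothetical and only illustrate the idea; the six-element list is written out by hand here, whereas the proposal would have `struct` generate it.

```racket
#lang racket
(require (for-syntax racket/struct-info))

(begin-for-syntax
  ;; Hypothetical wrapper: the usual six-element list plus field names.
  (struct extended-struct-info (base field-names)
    #:property prop:struct-info
    (lambda (self) (extended-struct-info-base self))))

(struct A (w x y z))

;; Hand-built binding that carries the extra field-name list.
(define-syntax A/ext
  (extended-struct-info
   (list #'struct:A #'A #'A?
         (list #'A-z #'A-y #'A-x #'A-w) ; accessors, reverse order
         (list #f #f #f #f)             ; no mutators
         #t)                            ; no super type
   (list #'w #'x #'y #'z)))

;; A client macro uses the extra information when it is present and
;; reports a syntax error otherwise, as suggested for match.
(define-syntax (field-names stx)
  (syntax-case stx ()
    [(_ id)
     (let ([v (syntax-local-value #'id (lambda () #f))])
       (cond
         [(extended-struct-info? v)
          (with-syntax ([(f ...) (extended-struct-info-field-names v)])
            #''(f ...))]
         [else
          (raise-syntax-error #f "no field-name information" stx #'id)]))]))

(define names (field-names A/ext))
names ; => '(w x y z)
```

Existing code that only calls extract-struct-info on A/ext would still see a six-element list, which is the backwards-compatibility point.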

Sam

J. Ian Johnson

May 14, 2014, 1:33:47 PM
to Sam Tobin-Hochstadt, users
Sounds good. I'll pin this email for when I can get back to it.
Thanks,