Syntax pattern to match a sequence with x identical elements?


Jonathan Simpson

Oct 10, 2019, 9:37:56 PM
to Racket Users
This seems like it should be simple but I've never been able to figure out how to do this. What I've been doing instead is this:

(x:integer ...+) to match one or more integers.

(x:integer y:integer ...+) to match two or more.

And so on.

I'm at a point now where I need to build patterns dynamically to match an exact number of elements. I'd also like to avoid having to create unique names for a bunch of pattern variables. ~between seems like what I want but I haven't been able to get it to work. I've been using ~seq without issue but that isn't exactly what I need.

Example of an attempt to use ~between:

(syntax-parse #'(1 1 1) [((~between x 3 3)) #'(x ...)])
; stdin::2631: syntax-parse: pattern keyword not allowed here
;   at: ~between


Can anyone give me a quick example of how to do this, using ~between or otherwise? I'm using syntax-parse, if that makes a difference.

Thanks!

-- Jonathan

Alexis King

Oct 10, 2019, 11:17:53 PM
to Jonathan Simpson, Racket Users
tl;dr: You need to use an ellipsis, so your pattern should be ((~between x:integer 3 3) ...). A (much) more detailed explanation of why follows.

~between is an ellipsis-head pattern. The most common ellipsis-head pattern, ~optional, also works as a plain head pattern, but ~between does not. What’s the difference?

Let’s start by answering what a head pattern is. The simplest kind of syntax/parse pattern is a single-term pattern, which (as the name implies) only matches a single syntax object at a time. Head patterns are special in that they can match zero or more consecutive syntax objects in the head of a list. What is the head of a list? Well, if you have a list like '(1 2 3 4), its head is the sequence of elements “1 2 3 4” and its tail is simply the empty list, '(). It’s possible to write the list '(1 2 3 4 . ()) to make that more explicit.

So when you have a head pattern like (~optional x:integer), it might parse an integer, but it also might parse nothing. In the latter case, the next head pattern in the sequence gets a chance to parse the same element that (~optional x:integer) declined to consume. Head patterns are able to do this because lists introduce a kind of linear sequencing (not just tree-like nesting), so “skipping” an element is an operation that makes sense.
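A minimal sketch of that skipping behavior, assuming a reasonably recent Racket; parse-head is a hypothetical helper name made up for illustration, not something from syntax/parse:

```racket
#lang racket
(require syntax/parse)

;; parse-head is a hypothetical name, used only for this illustration.
(define (parse-head stx)
  (syntax-parse stx
    ;; (~optional x:integer) may consume one integer or nothing at all;
    ;; y and z must each consume exactly one integer.
    [((~optional x:integer) y:integer z:integer)
     ;; (attribute x) is #f when the optional pattern matched nothing.
     (let ([maybe-x (attribute x)])
       (list (and maybe-x (syntax-e maybe-x))
             (syntax-e #'y)
             (syntax-e #'z)))]))

(parse-head #'(1 2 3)) ; x consumes 1
(parse-head #'(2 3))   ; x is skipped, so y gets to parse 2
```

With three integers the optional pattern takes the first one; with only two, the parser backtracks, lets the optional pattern match nothing, and hands the first element to y.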

But what about ellipsis-head patterns? These are patterns that don’t just appear inside a list pattern; they appear inside a list pattern and under an ellipsis. For example, in the pattern (x y ... z), x and z are head patterns, but y is an ellipsis-head pattern. While head patterns introduce the ability to consume zero or more elements at a time, ellipsis-head patterns extend that with the power to match elements in the list out of order. This is most useful when parsing keyword options, such as in the following pattern:

    ((~alt (~once (~seq #:foo foo:integer)) (~once (~seq #:bar bar:string))) ...)

The above pattern will match (#:foo 1 #:bar "two") or (#:bar "two" #:foo 1), but not (#:foo 1) or (#:foo 1 #:foo 2 #:bar "three"). This is because ~alt introduces a set of alternatives that can be matched, but unlike a simple ~or* pattern, it also keeps track of how many times each case matched, and patterns like ~once, ~optional, and ~between introduce constraints on the number of times a given case must match for the overall parse to be successful.
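The four cases listed above can be checked with a small sketch like the following. parse-opts is a hypothetical helper name, and the fallback clause that returns #f is added here only so that a failed match is visible as a value instead of a syntax error:

```racket
#lang racket
(require syntax/parse)

;; parse-opts is a hypothetical name, used only for this illustration.
(define (parse-opts stx)
  (syntax-parse stx
    [((~alt (~once (~seq #:foo foo:integer))
            (~once (~seq #:bar bar:string)))
      ...)
     ;; foo and bar have ellipsis depth 0 because they are bound under ~once.
     (list (syntax-e #'foo) (syntax-e #'bar))]
    ;; Fallback so that a non-match returns #f instead of raising.
    [_ #f]))

(parse-opts #'(#:foo 1 #:bar "two"))            ; both options, in order
(parse-opts #'(#:bar "two" #:foo 1))            ; both options, reversed
(parse-opts #'(#:foo 1))                        ; #:bar is missing
(parse-opts #'(#:foo 1 #:foo 2 #:bar "three"))  ; #:foo appears twice
```

The first two calls produce the same list regardless of option order, and the last two fall through to the #f clause because a ~once constraint is violated.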

Interestingly, note that pattern variables bound under ~once and ~optional don’t have an ellipsis depth of 1, they have an ellipsis depth of 0. This is why, in the given example, you can refer to the foo and bar pattern variables in a template without any ellipses. ~between, however, still increments the ellipsis depth, since the pattern can actually match multiple times.

In the pattern I suggested at the beginning of this email, ((~between x:integer 3 3) ...), you’re creating an ellipsis-head context with exactly one alternative: (~between x:integer 3 3). That is exactly what you want, so everything works out fine.
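As a minimal sketch of that pattern in use (exactly-three is a hypothetical helper name, and the #f fallback clause is an addition for illustration):

```racket
#lang racket
(require syntax/parse)

;; exactly-three is a hypothetical name, used only for this illustration.
(define (exactly-three stx)
  (syntax-parse stx
    ;; ~between raises x to ellipsis depth 1, so the template uses x ...
    [((~between x:integer 3 3) ...)
     (syntax->datum #'(x ...))]
    ;; Fallback so that a non-match returns #f instead of raising.
    [_ #f]))

(exactly-three #'(1 1 1))   ; exactly three integers
(exactly-three #'(1 1))     ; too few
(exactly-three #'(1 1 1 1)) ; too many
```

Only the first call matches; the other two violate the minimum and maximum repetition counts respectively.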

The one remaining question, however, is why ~between is only allowed as an ellipsis-head pattern, but ~optional is also allowed as a head pattern. I can’t say for certain, since you can think of ((~optional x:integer)) as being sort of implicitly expanded to ((~optional x:integer) ...), and the same could be done for ~between. However, my guess is that it isn’t allowed because ~between increments the ellipsis depth of its sub-pattern, and Ryan thought it would be confusing for a pattern variable’s ellipsis depth to be incremented despite there not actually being any ellipses in the pattern. Therefore, when using ~between, you have to write the ellipsis explicitly.

Alexis

David Storrs

Oct 11, 2019, 12:02:43 PM
to Alexis King, Jonathan Simpson, Racket Users
Alexis, is there any way that we can get this explanation and some of
the ones you've given me added to the Guide? I've tried to add things
to the documentation before, or even just submit typo reports but, as
I've mentioned before, the way the docs are organized (random little
snippets scattered in unintuitive locations in a huge code base
scattered across multiple repositories) presents too high a barrier to
entry to be worth it for simple changes. Presumably someone who knows
the system better could copy/paste this into an appropriate file and
submit a pull request.

Your explanations of things are extremely helpful, and really REALLY
need to be part of the docs somewhere. I would add them myself if the
method of doing so were straightforward and/or clearly documented.
Sadly, it is not.

Dave

PS: IMO, a vastly better system for the docs would be to have a set
of complete pages that actually correspond to the pages / URLs that
appear in the presentation. I'm not sure why it isn't done that way,
although I'm sure there's a reason. I do think that it's a major
problem that is denying a lot of the value of open-source
documentation.
> --
> You received this message because you are subscribed to the Google Groups "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/racket-users/7CD020EC-82DF-4117-8C56-A88CDD2EF818%40gmail.com.

wanderley...@gmail.com

Oct 11, 2019, 2:04:53 PM
to David Storrs, Alexis King, Jonathan Simpson, Racket Users
+1 for adding the explanation to the Guide.



--
Abraço,
Wanderley Guimarães

Jonathan Simpson

Oct 11, 2019, 10:55:19 PM
to Racket Users
Thank you Alexis for the clear explanation. I now understand how to use ~between and it is working for me.

One small hitch I encountered is a custom syntax class I defined doesn't work in the ~between statement but works elsewhere within the same syntax pattern. This isn't a huge issue for me as I just copied the pattern in place of the syntax class but I am curious why the :integer syntax class works and my custom one doesn't.

Once again, thanks for taking the time to explain this!

-- Jonathan

Jonathan Simpson

Oct 12, 2019, 12:13:05 PM
to Racket Users
Regarding my custom syntax-class issue, I realize now that it is probably because ~between only accepts splicing syntax classes. So, I created one that matches my regular syntax class. I'm not 100 percent sure that these are interchangeable in my use case though:

(define-syntax-class mag-lvl
    (pattern ({~datum level})))

(define-splicing-syntax-class mag-slvl
    (pattern ({~datum level})))

Does anyone know if :mag-slvl is interchangeable with :mag-lvl in most uses? Are there cases where :mag-slvl won't work the way I expect it to? I'm not confident in my understanding of the differences between using head patterns and single-term patterns.

-- Jonathan

Alexis King

Oct 12, 2019, 2:28:05 PM
to Jonathan Simpson, Racket Users
I believe your two syntax classes are identical, except for the fact that the splicing variant will not be allowed as a single term pattern. Therefore, I don’t think there’s ever any reason to prefer the splicing version.

I tried an example using your mag-lvl syntax class with ~between, and it worked fine. This program successfully prints a list of length 3:

#lang racket

(require syntax/parse)

(define-syntax-class mag-lvl
  (pattern ({~datum level})))

(syntax-parse #'((level) (level) (level))
  [((~between lvls:mag-lvl 3 3) ...)
   (attribute lvls)])

So I’m not sure what problem you’re bumping into, and it’s not something I can guess without knowing more information.


Jonathan Simpson

Oct 12, 2019, 10:45:45 PM
to Racket Users
I'm not sure exactly why my syntax class wasn't working, but it is working now. I had an extra set of parentheses around the ~between pattern, so it may have been related to that. Whatever the case may be, the non-splicing syntax class is working now.

I am very close to getting everything working but I am still having trouble using ~between as part of a ~seq. Here's an example:

(syntax-parse #'(1 2 'bar 4 5 'bar 'foo)
  [((~seq (~between x:integer 2 2) ... z) ...+ expr)
   #'foo])
; >: contract violation
;   expected: real?
;   given: #<syntax:stdin::15625 2>
;   argument position: 1st
;   other arguments...:
;    0

I want to match one or more sequences of two integers and an expression, finally ending in one final expression. This is a contrived example but demonstrates how I'll eventually need to use ~between. The error message I'm getting here (a contract violation rather than a syntax error) isn't particularly enlightening.

Once again, I really appreciate any help here.

-- Jonathan

Alexis King

Oct 13, 2019, 4:08:00 AM
to Jonathan Simpson, Racket Users
I think that error is a bug in syntax/parse. I have reported it here:


Jonathan Simpson

Oct 13, 2019, 10:57:17 AM
to Racket Users
Thanks Alexis. 

To anyone who is in a position to look into this: it would be great if this can make it into the upcoming 7.5 release.

-- Jonathan

Jonathan Simpson

Nov 10, 2019, 10:10:28 PM
to Racket Users
I verified that the pre-release version 7.4.0.902 resolves my issue with ~between. With that fix I was able to complete the macro I was working on.
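For reference, a sketch of the nested ~seq/~between pattern from earlier in the thread, assuming a Racket version that includes the fix (7.5 or later). group-pairs is a hypothetical helper name, and the template and #f fallback are additions for illustration:

```racket
#lang racket
(require syntax/parse)

;; group-pairs is a hypothetical name, used only for this illustration.
(define (group-pairs stx)
  (syntax-parse stx
    ;; x sits under two ellipses (the inner one required by ~between and
    ;; the outer ...+), so the template needs two ellipses as well.
    [((~seq (~between x:integer 2 2) ... z) ...+ expr)
     (syntax->datum #'((x ...) ...))]
    ;; Fallback so that a non-match returns #f instead of raising.
    [_ #f]))

(group-pairs #'(1 2 'bar 4 5 'bar 'foo))
```

Each repetition of the ~seq contributes one group of two integers, and the trailing term is left for expr.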

Thanks again for the advice and the quick response by the Racket dev team!

-- Jon