Matching groups of optional elements


David Storrs

May 4, 2020, 10:16:48 PM
to Racket Users
I'm trying to write a parser for a CSV file with optional columns.  Simplified version:  There are 2 mandatory columns, after which there can be 0+ 4-column groups describing a person.  Each group has the same column headers.

; legal column arrangements:
RequiredA RequiredB
RequiredA RequiredB Name Age First Last
RequiredA RequiredB Name Age First Last Name Age First Last


; illegal:  if an optional group is present, it must have all 4 columns
RequiredA RequiredB Name Age First Last Name

I thought I could do this straightforwardly with `match`, but I'm wrong.  Can someone point me to the way to write such a match clause?


Various failed attempts:
(list reqA reqB (opt1 opt2 opt3 opt4) ...)   ; syntax error. matching clauses do not do grouping like this
(list reqA reqB (list opt1 opt2 opt3 opt4) ...) ; didn't expect this to work since it would specify an embedded list.  I was right.

This one surprised me:
(match row
  [(list required1 required2 (and opt1 opt2 opt3 opt4) ...)
   (list opt1 opt2 opt3 opt4)])

This distributes the ... over the four patterns inside the 'and' clause, so each of the 'optN' identifiers is bound to a list of all the remaining elements:
'(("Name" "Age" "First" "Last")
("Name" "Age" "First" "Last")
("Name" "Age" "First" "Last")
("Name" "Age" "First" "Last"))

In hindsight it makes sense -- the 'and' causes it to match the element across all four patterns.  They all match because they are identifiers and therefore match anything.  Then the '...' causes it to do that for all remaining elements, generating lists into each of the identifiers because that's what '...' does.

Michael MacLeod

May 4, 2020, 10:39:21 PM
to David Storrs, Racket Users
I'm not sure this is possible using only `match` patterns. A combination of the `list-rest` and `app` patterns as well as the `in-slice` procedure from `racket/sequence` should do the trick, though:

#lang racket

;; `#lang racket` already provides `match` and `in-slice`.

;; Split x into consecutive slices of up to 4 elements.
(define (collect-optional-vals x)
  (for/list ([y (in-slice 4 x)])
    y))

(match '(req-a req-b name1 age1 first1 last1 name2 age2 first2 last2)
  [(list-rest req-a req-b (app collect-optional-vals optional-vals))
   (list req-a req-b optional-vals)])
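
One caveat with the approach above: `in-slice` returns a shorter final slice when the number of remaining elements is not a multiple of 4, so an incomplete group is accepted rather than rejected. A minimal sketch of a stricter variant follows, assuming an incomplete group should make the match fail. The names `groups-of-4`, `all-complete?`, and `parse-row` are hypothetical and not from the original post.

```racket
#lang racket

;; Split a flat list into consecutive groups of up to four elements.
(define (groups-of-4 xs)
  (sequence->list (in-slice 4 xs)))

;; Hypothetical helper: true when every group has exactly four elements.
(define (all-complete? groups)
  (for/and ([g (in-list groups)])
    (= (length g) 4)))

;; Hypothetical name. Returns #f instead of matching when the row is
;; too short or the trailing group is incomplete.
(define (parse-row row)
  (match row
    [(list-rest req-a req-b
                (app groups-of-4 (? all-complete? groups)))
     (list req-a req-b groups)]
    [_ #f]))

(parse-row '(a b n1 a1 f1 l1 n2 a2 f2 l2))
;; => '(a b ((n1 a1 f1 l1) (n2 a2 f2 l2)))
(parse-row '(a b n1 a1 f1 l1 n2))
;; => #f
```

The `?` pattern runs the predicate on the result of `app` before binding `groups`, so the length check stays inside the `match` clause.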


Philip McGrath

May 4, 2020, 10:44:33 PM
to Michael MacLeod, David Storrs, Racket Users
Depending on your requirements, I would consider using `syntax-parse` at runtime: this is easily written with its `~seq` patterns, and you get nicer error reporting.

Here's an example. I use syntax classes for clarity, but they aren't necessary if you prefer to be more concise:

#lang racket

(require syntax/parse
         rackunit)

(define-splicing-syntax-class person-cols
  (pattern (~seq "Name" "Age" "First" "Last")))

(define-syntax-class csv-header
  (pattern ("RequiredA" "RequiredB" person:person-cols ...)))

(define valid-header?
  (syntax-parser
    [:csv-header
     #t]
    [_
     #f]))

;; legal column arrangements:
(check-true
 (valid-header? #'("RequiredA" "RequiredB")))
(check-true
 (valid-header? #'("RequiredA" "RequiredB" "Name" "Age" "First" "Last")))
(check-true
 (valid-header?
  #'("RequiredA" "RequiredB" "Name" "Age" "First" "Last" "Name" "Age" "First" "Last")))

;; illegal:  if an optional group is present, it must have all 4 columns
(check-false
 (valid-header?
  #'("RequiredA" "RequiredB" "Name" "Age" "First" "Last" "Name")))
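
The example above only validates a header. If the goal is also to capture each group, the same `~seq` idea can be used with pattern variables and a syntax template. This is a rough sketch, assuming the row arrives as a plain list that can be converted with `datum->syntax`; `row->groups` and the pattern variable names are hypothetical.

```racket
#lang racket

(require syntax/parse)

;; Hypothetical name. Converts a plain list to syntax, matches two
;; required columns followed by zero or more 4-column groups, and
;; returns the regrouped row as plain data (or #f on a mismatch).
(define (row->groups row)
  (syntax-parse (datum->syntax #f row)
    [(req-a req-b (~seq name age fst lst) ...)
     (syntax->datum #'(req-a req-b (name age fst lst) ...))]
    [_ #f]))

(row->groups '(a b n1 a1 f1 l1 n2 a2 f2 l2))
;; => '(a b (n1 a1 f1 l1) (n2 a2 f2 l2))
(row->groups '(a b n1 a1 f1 l1 n2))
;; => #f
```

Dropping the fallback clause would let `syntax-parse` raise its own error for a malformed row instead of returning `#f`.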



David Storrs

May 4, 2020, 10:51:12 PM
to Michael MacLeod, Racket Users
Fantastic.  Thanks, Michael.

David Storrs

May 4, 2020, 10:52:28 PM
to Philip McGrath, Michael MacLeod, Racket Users
Thanks, Philip.

Laurent

May 5, 2020, 6:21:19 AM
to David Storrs, Philip McGrath, Michael MacLeod, Racket Users
If you insist on using match, you can do it with a loop:

#lang racket

(define (parsel l)
  (match l
    [(list-rest req1 req2 r)
     (cons (list req1 req2)
           (let loop ([r r])
             (match r
               ['() '()]
               [(list-rest opt1 opt2 opt3 opt4 r2)
                (cons (list opt1 opt2 opt3 opt4)
                      (loop r2))])))]))

(parsel '(RequiredA RequiredB Name Age First Last Name Age First Last)) ; works
(parsel '(RequiredA RequiredB Name Age First Last Name Age First)) ; match error: the last group has only 3 columns


