Parse-pollen?

Joel Dueck

Jul 13, 2018, 4:01:43 PM
to Pollen

I'm trying to understand how to transform generic x-expressions using the currently defined tag functions and setup values. Any pointers welcome!

For example, consider this pollen.rkt:

;;;;;;;;pollen.rkt
#lang racket

(require markdown
         pollen/decode)

(provide (all-defined-out))

(define (em . xs)
  `(my-em "--Never a dull moment!--" ,@xs))

(define (root . xs)
  (define marked-down (decode-elements xs #:string-proc parse-markdown))
  `(body ,@marked-down))

If I then create a test.html.pm consisting solely of

#lang pollen

Hello _world_!

I get this for a doc → '(body (p () "Hello " (em () "world") "!"))

How can I then re-run this x-expression through all of my tag functions?
I.e., so that in the above example I get instead '(body (p () "Hello " (my-em () "--Never a dull moment!--" "world") "!"))

(I'm aware that there are better ways to do what this specific example seems to want to accomplish; I don't particularly care about using Markdown inside #lang pollen/markup documents, just using it as a semi-plausible example)

Matthew Butterick

Jul 14, 2018, 1:41:42 AM
to Joel Dueck, Pollen

On Jul 13, 2018, at 2:01 PM, Joel Dueck <dueck...@gmail.com> wrote:

I get this for a doc → '(body (p () "Hello " (em () "world") "!"))

How can I then re-run this x-expression through all of my tag functions?
I.e., so that in the above example I get instead '(body (p () "Hello " (my-em () "--Never a dull moment!--" "world") "!"))

It's possible to send the x-expression back through another round of `eval` using the current bindings. Try this on the REPL:

> (require pollen/top) ; to handle unbound identifiers
> (eval '(body (p "Hello " (em "world") "!")))

'(body (p "Hello " (my-em "--Never a dull moment!--" "world") "!"))


Or in the definitions window, you have to include a namespace:

(require pollen/top)
(define-namespace-anchor nsa)
(eval '(body (p "Hello " (em "world") "!")) (namespace-anchor->namespace nsa))


However, even with `pollen/top` to catch undefined tags, an X-expression can still have elements that don't make sense in terms of `eval` — for instance (), which will throw an error, or an attribute list like ((foo "bar")). 

I think you'd end up having to pick through the x-expr to find the tags you want to re-process, but that just sounds like a `decode` operation.
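
A minimal sketch of that "pick through the x-expr" idea, written without `eval`: it walks the x-expression and applies a tag function wherever one is registered. The `tag-functions` table and the `attrs?`, `apply-tag`, and `reapply` helpers are hypothetical names made up for illustration, not part of Pollen. Dropping empty attribute lists and passing a non-empty one as the first argument are assumptions too.

```racket
#lang racket/base
(require racket/match)

;; Tag function mirroring the `em` from the original post.
(define (em . xs)
  `(my-em "--Never a dull moment!--" ,@xs))

;; Hypothetical lookup table: tag symbol -> tag function.
(define tag-functions (hash 'em em))

;; An attribute list is a (possibly empty) list of (key "value") pairs.
(define (attrs? x)
  (and (list? x)
       (andmap (λ (pair)
                 (and (list? pair)
                      (= (length pair) 2)
                      (symbol? (car pair))
                      (string? (cadr pair))))
               x)))

;; Apply the registered tag function, or rebuild the element unchanged.
;; Assumption: an empty attribute list is dropped, a non-empty one is
;; passed along as the first argument.
(define (apply-tag tag attrs elems)
  (define args (if (null? attrs) elems (cons attrs elems)))
  (define tag-function (hash-ref tag-functions tag #f))
  (if tag-function
      (apply tag-function args)
      (cons tag args)))

;; Process children first, then the element itself.
(define (reapply x)
  (match x
    [(list (? symbol? tag) (? attrs? attrs) elems ...)
     (apply-tag tag attrs (map reapply elems))]
    [(list (? symbol? tag) elems ...)
     (apply-tag tag '() (map reapply elems))]
    [_ x]))
```

With these definitions, `(reapply '(body (p () "Hello " (em () "world") "!")))` produces `'(body (p "Hello " (my-em "--Never a dull moment!--" "world") "!"))`. The trade-off against the `eval` approach is that the tag functions have to be listed by hand instead of being discovered from the current bindings.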

Joel Dueck

Jul 14, 2018, 11:20:36 AM
to Pollen
On Saturday, July 14, 2018 at 12:41:42 AM UTC-5, Matthew Butterick wrote:
Or in the definitions window, you have to include a namespace:

(require pollen/top)
(define-namespace-anchor nsa)
(eval '(body (p "Hello " (em "world") "!")) (namespace-anchor->namespace nsa))

However, even with `pollen/top` to catch undefined tags, an X-expression can still have elements that don't make sense in terms of `eval` — for instance (), which will throw an error, or an attribute list like ((foo "bar")). 

It’s funny, in trying to get this to work I was using almost exactly this technique (I was parameterizing current-namespace and attaching pollen/top inside that form) but I kept getting an error about #%app having an empty form or something. So perhaps it’s because I was passing it these empty attribute lists.
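
A minimal sketch of that failure mode, using only `racket/base` rather than Pollen. The `foo` function defined inside the namespace is a hypothetical stand-in for a tag function, and `try-eval` is a helper made up for illustration.

```racket
#lang racket/base

(define ns (make-base-namespace))

;; Hypothetical stand-in for a tag function: (foo "bar") evaluates to '(foo "bar").
(eval '(define (foo . xs) (cons 'foo xs)) ns)

;; Evaluate a datum in ns, returning 'error instead of raising.
(define (try-eval datum)
  (with-handlers ([exn:fail? (λ (e) 'error)])
    (eval datum ns)))

;; An ordinary tag application evaluates fine.
(define plain-result (try-eval '(foo "bar")))

;; () is read as an application with no procedure, which is a syntax error.
(define empty-attrs-result (try-eval '(foo () "bar")))

;; ((foo "bar")) evaluates (foo "bar") and then tries to call the resulting list.
(define attr-list-result (try-eval '(foo ((foo "bar")) "text")))
```

Here `plain-result` is `'(foo "bar")`, while both `empty-attrs-result` and `attr-list-result` come back as `'error`, which is why the later versions quote or special-case those pieces before calling `eval`.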
 

I think you'd end up having to pick through the x-expr to find the tags you want to re-process, but that just sounds like a `decode` operation.


Yes. I also tried using eval in a function and giving that as the txexpr-tag-proc to decode, but I got errors about temp-tag4123 being undefined.

Maybe what I’m really looking for is a decode (like scribble's?) that can take a tagged X-expression and just apply any existing tag functions to it and its elements.

Matthew Butterick

Jul 14, 2018, 7:37:44 PM
to Joel Dueck, Pollen

On Jul 14, 2018, at 8:20 AM, Joel Dueck <dueck...@gmail.com> wrote:

Maybe what I’m really looking for is a decode (like scribble's?) that can take a tagged X-expression and just apply any existing tag functions to it and its elements.

AFAICT this does what you want. It descends the xexpr and sniffs the tags to see if they have bindings (using `identifier-binding`). If not, it subs in a `default-tag-function` for the tag. It also handles the special case of avoiding ().


#lang racket
(require markdown pollen/decode pollen/tag racket/syntax)

(provide (all-defined-out))

(define-tag-function (em attrs elems)
  `(my-em ,attrs  "--Never a dull moment!--" ,@elems))

(define (reapply-tags stx)
  (syntax-case stx ()
    [(TAG . XS) (with-syntax* ([XS (map reapply-tags (syntax->list #'XS))]
                               [TAG-FUNC (if (and (symbol? (syntax->datum #'TAG)) (identifier-binding #'TAG))
                                             #'TAG
                                             #'(default-tag-function 'TAG))])
                  #'(TAG-FUNC . XS))]
    [() #'(list)]
    [OTHER #'OTHER]))

(define (root . xs)
  (define marked-down (decode-elements xs #:string-proc parse-markdown))
  (eval (reapply-tags `(body ,@marked-down))))

Matthew Butterick

Jul 14, 2018, 8:10:44 PM
to Joel Dueck, Pollen
On Jul 14, 2018, at 4:37 PM, Matthew Butterick <m...@mbtype.com> wrote:


On Jul 14, 2018, at 8:20 AM, Joel Dueck <dueck...@gmail.com> wrote:

Maybe what I’m really looking for is a decode (like scribble's?) that can take a tagged X-expression and just apply any existing tag functions to it and its elements.

AFAICT this does what you want. It descends the xexpr and sniffs the tags to see if they have bindings (using `identifier-binding`). If not, it subs in a `default-tag-function` for the tag. It also handles the special case of avoiding ().


Correction: the last one didn't handle attribute lists correctly. With an attribute-list pattern in place, () no longer needs its own clause, because it matches as an empty attribute list.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

#lang racket
(require markdown pollen/decode pollen/tag racket/syntax)

(provide (all-defined-out))

(define-tag-function (em attrs elems)
  `(my-em ,attrs  "--Never a dull moment!--" ,@elems))

(define (reapply-tags stx)
  (syntax-case stx ()
    [((ATTR-KEY ATTR-VAL) ...) #''((ATTR-KEY ATTR-VAL) ...)]
    [(TAG . XS) (with-syntax* ([XS (map reapply-tags (syntax->list #'XS))]
                               [TAG-FUNC (if (and (symbol? (syntax->datum #'TAG)) (identifier-binding #'TAG))
                                             #'TAG
                                             #'(default-tag-function 'TAG))])
                  #'(TAG-FUNC . XS))]
    [OTHER #'OTHER]))

Joel Dueck

Jul 15, 2018, 11:16:03 AM
to Pollen
On Saturday, July 14, 2018 at 7:10:44 PM UTC-5, Matthew Butterick wrote:
(define (reapply-tags stx)
  (syntax-case stx ()
    [((ATTR-KEY ATTR-VAL) ...) #''((ATTR-KEY ATTR-VAL) ...)]
    [(TAG . XS) (with-syntax* ([XS (map reapply-tags (syntax->list #'XS))]
                               [TAG-FUNC (if (and (symbol? (syntax->datum #'TAG)) (identifier-binding #'TAG))
                                             #'TAG
                                             #'(default-tag-function 'TAG))])
                  #'(TAG-FUNC . XS))]
    [OTHER #'OTHER]))


 Just tried it out, and this does exactly what I was struggling to imagine. The major “aha” for me in understanding what you’ve done here is that syntax-case can take a simple datum as well as a full-blown syntax object—I did not know that! Which in turn means syntax-case can be useful at run time as well as at compile time, which to me is kind of huge.

Now I’m curious to know where else this syntax/datum duality applies? The Racket docs for syntax-case seem to say that the first argument must be an expression that produces a syntax object, and yet if I try, say, `(syntax? '(body "Hello"))` in the REPL I get #f.

Matthew Butterick

Jul 15, 2018, 1:03:33 PM
to Joel Dueck, Pollen
On Jul 15, 2018, at 8:16 AM, Joel Dueck <dueck...@gmail.com> wrote:

 Just tried it out, and this does exactly what I was struggling to imagine. The major “aha” for me in understanding what you’ve done here is that syntax-case can take a simple datum as well as a full-blown syntax object—I did not know that! Which in turn means syntax-case can be useful at run time as well as at compile time, which to me is kind of huge.

Yes, in a manner of speaking. What you're seeing is just a policy of lenient input, not something magical in the `syntax-case` semantics. If `syntax-case` gets input that is not a syntax object, it will convert it to one, and then behave normally. From the `syntax-case` docs:

"If stx-expr produces a non-syntax object, then its result is converted to a syntax object using datum->syntax and the lexical context and source location of the stx-expr."

(To be fair, it's easy to miss this, because it's a ways down on the page)
 
IOW, `(syntax-case '(foo "bar") ···)` is treated as if it were `(syntax-case #'(foo "bar") ···)`. AFAIK this coercion behavior is shared by all the syntax matchers (e.g., `with-syntax`).


Sample code PS:
`(symbol? (syntax->datum #'TAG))` can be equivalently (and more readably) written as `(identifier? #'TAG)`.
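
A small sketch of the coercion described above, using only `racket/base`. The `tag-of` helper is a hypothetical name for illustration. It shows that a quoted list is not itself a syntax object, yet `syntax-case` still matches against it at run time, and that `identifier?` agrees with the longer `symbol?` test.

```racket
#lang racket/base

;; A quoted datum is not a syntax object...
(define datum-is-syntax? (syntax? '(body "Hello")))

;; ...but datum->syntax turns it into one.
(define converted-is-syntax? (syntax? (datum->syntax #f '(body "Hello"))))

;; syntax-case accepts the plain datum and coerces it before matching.
;; The fender (identifier? #'TAG) rejects heads that are not identifiers.
(define (tag-of x)
  (syntax-case x ()
    [(TAG . REST) (identifier? #'TAG) (syntax->datum #'TAG)]
    [_ #f]))

;; The two tests from the PS give the same answer.
(define long-form (symbol? (syntax->datum #'em)))
(define short-form (identifier? #'em))
```

With these definitions, `(tag-of '(foo "bar"))` returns `'foo`, while `(tag-of "just a string")` and `(tag-of '((k "v")))` both return `#f`.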

Matthew Butterick

Jul 15, 2018, 10:03:33 PM
to Joel Dueck, Pollen

On Jul 15, 2018, at 10:03 AM, Matthew Butterick <m...@mbtype.com> wrote:


On Jul 15, 2018, at 8:16 AM, Joel Dueck <dueck...@gmail.com> wrote:

 Just tried it out, and this does exactly what I was struggling to imagine. The major “aha” for me in understanding what you’ve done here is that syntax-case can take a simple datum as well as a full-blown syntax object—I did not know that! Which in turn means syntax-case can be useful at run time as well as at compile time, which to me is kind of huge.



One more (perhaps) simplifying idea — the bound-identifier sniffing is exactly what `#%top` from `pollen/top` does. So we could also wrap a `let-syntax` around the x-expression to rebind `#%top` to the special `pollen/top` version (using `make-rename-transformer`, which just means "treat the syntax on the left as if it were the syntax on the right")


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

#lang racket
(require markdown pollen/decode pollen/tag
         (prefix-in pt: pollen/top))

(provide (all-defined-out))

(define-tag-function (em attrs elems)
  `(my-em ,attrs  "--Never a dull moment!--" ,@elems))

(define (reapply-tags stx)
  (with-syntax ([XEXPR (let loop ([stx stx])
                         (syntax-case stx ()
                           [((ATTR-KEY ATTR-VAL) ...) #''((ATTR-KEY ATTR-VAL) ...)]
                           [(TAG . XS) (with-syntax ([XS (map loop (syntax->list #'XS))])
                                         #'(TAG . XS))]
                           [OTHER #'OTHER]))])
    #'(let-syntax ([#%top (make-rename-transformer #'pt:#%top)])
        XEXPR)))
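
For readers who have not met `make-rename-transformer` before, here is a minimal sketch of what it does in isolation, outside of Pollen. The names `original`, `alias`, and `twice` are hypothetical and chosen only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base))

(define (original x) (* 2 x))

;; `alias` is treated as if it were `original` wherever it appears.
(define-syntax alias (make-rename-transformer #'original))

(define via-alias (alias 21))

;; Both names refer to the very same procedure.
(define same-procedure? (eq? alias original))

;; The same trick scoped to one expression with let-syntax,
;; as in the `#%top` rebinding above.
(define via-let-syntax
  (let-syntax ([twice (make-rename-transformer #'original)])
    (twice 4)))
```

Here `via-alias` is `42`, `same-procedure?` is `#t`, and `via-let-syntax` is `8`.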

Matthew Butterick

Jul 15, 2018, 10:21:06 PM
to Joel Dueck, Pollen


On Jul 15, 2018, at 7:03 PM, Matthew Butterick <m...@mbtype.com> wrote:
One more (perhaps) simplifying idea — the bound-identifier sniffing is exactly what `#%top` from `pollen/top` does. So we could also wrap a `let-syntax` around the x-expression to rebind `#%top` to the special `pollen/top` version (using `make-rename-transformer`, which just means "treat the syntax on the left as if it were the syntax on the right")

Two more, apparently. I forgot that `txexpr` now has a `txexpr/stx` module that spares you from remembering the patterns for each piece. So this is an option too:


;;;;;;;;;;;;;;;;;;

#lang racket
(require (only-in markdown parse-markdown)
         pollen/decode pollen/tag txexpr/stx
         (prefix-in pt: pollen/top))

(provide (all-defined-out))

(define-tag-function (em attrs elems)
  `(my-em ,attrs  "--Never a dull moment!--" ,@elems))

(define (reapply-tags stx)
  (with-syntax ([XEXPR (let loop ([stx stx])
                         (if (stx-txexpr? stx)
                             (with-syntax ([TAG (stx-txexpr-tag stx)]
                                           [ATTRS (stx-txexpr-attrs stx)]
                                           [ELEMS (map loop (stx-txexpr-elements stx))])
                               #'(TAG 'ATTRS . ELEMS))
                             stx))])
    #'(let-syntax ([#%top (make-rename-transformer #'pt:#%top)])
        XEXPR)))

Joel Dueck

Jul 18, 2018, 11:19:16 AM
to Pollen
Thanks for these examples! I have been chewing on them for the past couple of days.

Reading ‘Beautiful Racket’ really helped me get what #%top was all about. As I mentioned, I was originally trying to attach pollen/top to a parameterized current-namespace for use with eval. I guess I wasn't fully digesting what things (specifically attribute lists and empty lists) need to look like before eval takes a crack at them, because eval will try to evaluate inner expressions before outer ones. I do try pretty hard to work it out before asking for help, but I rarely get more than an hour at a time to think about it :-P

Also I hadn't really examined `let loop` before now. I think it finally makes sense.

This kind of learning is what I really have enjoyed about Pollen (and BR): they give me specific use-cases or need/solution pairs that are like an on-ramp for learning Racket. “I want to be able to do X”→“well, esoteric concepts Y and Z come in handy for that.” The more I learn, the more I want to be able to do, so X gets more and more interesting and leads to more and more interesting Y and Z. Y and Z were sitting in the Racket docs the whole time, but I wouldn't have recognized their value without a good X.

Matthew Butterick

Jul 18, 2018, 3:23:45 PM
to Joel Dueck, Pollen
On Jul 18, 2018, at 8:19 AM, Joel Dueck <dueck...@gmail.com> wrote:

Reading ‘Beautiful Racket’ really helped me get what #%top was all about. As I mentioned, I was originally trying to attach pollen/top to a parameterized current-namespace for use with eval.

`#%top` is not very beautiful. But sometimes necessary.

`eval` is its own thing. Namespaces are mutable, so you can do successive `eval`s like so:

#lang br
(define ns (make-base-namespace))
(eval '(require pollen/top) ns)
(eval '(foo "bar") ns) ; produces '(foo "bar") rather than error


I guess I wasn't fully digesting what things (specifically attribute lists and empty lists) need to look like before eval takes a crack at them, because eval will try to evaluate inner expressions before outer ones.

That is true of Racket's evaluation model generally: macros are evaluated outermost to innermost (because a macro can potentially change what's inside) and then the fully expanded expressions are evaluated innermost to outermost (and left to right, as we saw in your footnotes tag function a while back)

But yes, I too still have the experience of coding up a solution to something only to find that Racket already contains a more elegant version ;)
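
The run-time half of that evaluation order (arguments first, left to right, then the enclosing call) can be sketched with a function that logs when it runs. The `traced` helper and the `trace` variable are hypothetical names for illustration. The sketch does not cover the macro-expansion half.

```racket
#lang racket/base

;; Records the order in which calls actually run, most recent first.
(define trace '())

;; Hypothetical stand-in for a tag function: logs its name, then builds a list.
(define (traced name . args)
  (set! trace (cons name trace))
  (cons name args))

;; The inner calls are arguments, so they run before the outer call.
(define result
  (traced 'outer (traced 'inner-1) (traced 'inner-2)))

(define call-order (reverse trace))
```

Here `result` is `'(outer (inner-1) (inner-2))` and `call-order` is `'(inner-1 inner-2 outer)`.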