udelim package -- more parens than you can shake a stick at

William G Hatch

Sep 24, 2016, 12:46:07 PM
to racket...@googlegroups.com
Hello everybody,

I'm announcing another little package I've written to get comments on
it: udelim.

Udelim is a library for adding extra parens and string delimiters to
your language.

For many years, before ever coming to racket, I've wanted nestable
string delimiters. Especially when working with web stuff. Now that I
have a programmable programming language, I have them. The big push to
make them now is that lately with rash I've been working on nesting
different syntax with macros and alternative readers. Some weeks ago I
found myself dissatisfied with the method I was using (at-expressions),
and wrote the code for balanced strings instead, and have loved it.

Additionally, I've long wanted more types of parens in Racket. I
haven't really known what I would do with them -- I use Racket's
conventions for () and [], and have my own loose convention for {}. But
after seeing Jay McCarthy's wonderful talk on Remix, I am now blatantly
stealing his idea of wrapping different paren types with a
macro-dispatchable symbol.

So udelim has functions for extending readtables to have more parens and
balanced string delimiters, optionally wrapped up a-la Remix. It also
has a stx-string->port function for convenience in making macros that
read an alternate syntax with those nestable strings. It has a
metalanguage with some delimiters auto-enabled: «» (guillemets, used as
quotes in many European languages) as nestable, non-escaping quotes, 「」
likewise but wrapped so 「foo bar」 reads as (#%cjk-corner-quotes "foo
bar"), and several unicode paren types (﴾﴿, ⦓⦔, ⦕⦖, 🌜🌛, ...) wrapped
with a starter symbol as well.

Some people I've talked to about this package have seemed unhappy with
my introduction of nestable strings, and my recommendation of using them
over the at-expressions for nesting different syntax, so I feel I ought
to explain that a little. First, nestable strings are nice for other
things as well. For instance, since they don't escape backslashes, they
are nice for constructing regexps, which famously explode into mountains
of backslashes due to being inside "" strings. They are a great
alternative to #<<HERE strings when you want something big and
complicated for any purpose. Second, while I love the at-reader for
Scribble, I think it has several drawbacks for nesting different syntax.

• It splits up the string at newlines. Not a huge deal -- you can
reassemble them, but it's a bit of a hassle.

• It expands inner @-exps unless you use |{}|. Again, not a huge deal
as long as you sprinkle in some pipes.

• It removes leading white space on every line. This is a bad thing if
your nested language is whitespace sensitive, like a python or
haskell-style syntax would be. By reading the location data on the
strings you get back, you can probably still reconstruct the original
string, but it's still a hassle.

• After all the at-exp's work splitting and trimming the string, every
macro that uses at-exp output as if it were just a string has to do all
the work of putting it all back together into a string to use as a port.
If you just get a string in the first place, none of that is necessary,
and the macro's interface is simpler.
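
For readers who have not seen the splitting and trimming described in the points above, here is a minimal sketch of what the at-reader hands back for a two-line body. It assumes the `scribble/reader` module that ships with Racket; the input string and the names `parts` and `body` are made up for illustration.

```racket
#lang racket/base
(require (prefix-in at: scribble/reader))

;; Read one @-form from a string.  Line counting is enabled so the
;; reader has column information to work with.
(define in (open-input-string "@foo{bar\n     baz}"))
(port-count-lines! in)
(define parts (at:read in))

parts   ; => '(foo "bar" "\n" "baz")

;; Reassembling the body into one string is a one-liner, but the
;; indentation that was stripped from the second line is gone.
(define body (apply string-append (cdr parts)))
body    ; => "bar\nbaz"
```

A macro that wants a port then has to rebuild the string (and, if whitespace matters, recover the indentation from source locations) before it can call its own reader.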

So I love the at-reader, I just like nestable strings better for this
purpose, and I want nestable strings even without syntax nesting. No
hard feelings. If you still think these nestable strings are a bad
idea, I'd like to hear your reasoning.

I'd love some feedback on what paren types should be included in #lang
udelim (I'm not sure what is widely supported by fonts, sufficiently
distinct, and well liked), and what default macros on them should do.

Also, I've upgraded rash significantly with it, and rewrote the docs on
it so my intentions are a little more clear. It hasn't progressed too
much, as I'm otherwise busy with things I'm actually supposed to be
working on, but if you're curious to see a slightly better presentation
of what I want #lang rash to be, you can take a gander.

Additionally, could someone clue me in to where a good explanation of
language-info stuff is? The languages I've implemented all have
language info lifted off something else, and it's definitely wrong. But
all I've found in my (meager) search for what to do there is that
*something* is supposed to be there, and that it will probably fix my
languages' issues in DrRacket. Or maybe I'm totally misunderstanding
it. Otherwise my languages work fine, so I haven't bothered too much to
correct it, but I would like to... I'm sure the info I need is in the
docs somewhere, but the reference can be rather expansive for reading
unless you know the right search term.


Thanks for your feedback,

William

[ PS, udelim docs are here: http://docs.racket-lang.org/udelim/index.html ]

Dupéron Georges

Sep 24, 2016, 4:28:27 PM
to Racket Users
Le samedi 24 septembre 2016 18:46:07 UTC+2, William G Hatch a écrit :
> Udelim is a library for adding extra parens and string delimiters to
> your language.

I can't tell you how ecstatic I am about this :) . I have been wanting to add new parenthesis shapes for a while, but never found the time to look seriously into it, so thank you a lot for writing this! I was thinking about using the white brackets ⟦⟧, braces ⦃⦄ and parentheses ⦅⦆, which are commonly used for describing semantics, for example.

I also like the idea of nestable string delimiters, although I rarely have use for it.

One note about the docs: when you write:

(open-input-string
 "«this is a string with nested «string delimiters.» No \n escape interpreting.»")

the "\n" is already escaped by the "…" fed into open-input-string I think, so what udelim parses in that example is a raw newline, not the \ character followed by the n character.
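
A small sketch of the difference, in plain Racket with no udelim involved (the names `interpreted` and `literal` are only for illustration):

```racket
#lang racket/base
;; Inside an ordinary "" literal, \n is already a newline by the time
;; open-input-string sees it, so a reader on that port never gets a backslash.
(define interpreted "No \n escape")
;; Doubling the backslash keeps the two characters \ and n.
(define literal "No \\n escape")

(string-ref interpreted 3)   ; => #\newline
(string-ref literal 3)       ; => #\\
(string-length literal)      ; one more than (string-length interpreted)
```

So for the docs example to demonstrate "no escape interpreting", the outer string would need `\\n`.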

Scribble supports "element transformers" which allow changing how an identifier is printed. Unfortunately, when the identifier appears in the first position of a form (like the #% wrappers), only the identifier itself can get styled, not the whole form. A few days ago I added a quick hack to my unstable scribble-enhanced library to add catch-alls which can re-style any identifier matching a given pattern. The hack [1] should also work for whole forms (untested, though), so that in scribble or scribble/lp2, @racketblock[(a ⟦b⟧ c)] would be properly typeset.

Georges

[1] The hack in scribble-enhanced https://github.com/jsmaniac/scribble-enhanced/blob/master/racket.rkt#L1012
[2] Example using the hack to nicely typeset identifiers with numeric superscripts like String³, which means "three strings in a row" for the xlist type expander https://github.com/jsmaniac/xlist/blob/master/scribble-enhanced.rkt

Alex Knauth

Sep 24, 2016, 5:33:21 PM
to William G Hatch, racket...@googlegroups.com

> On Sep 24, 2016, at 12:46 PM, William G Hatch <wil...@hatch.uno> wrote:

> Additionally, I've long wanted more types of parens in Racket. I
> haven't really known what I would do with them -- I use Racket's
> conventions for () and [], and have my own loose convention for {}. But
> after seeing Jay McCarthy's wonderful talk on Remix, I am now blatantly
> stealing his idea of wrapping different paren types with a
> macro-dispatchable symbol.
>
> So udelim has functions for extending readtables to have more parens and
> balanced string delimiters, optionally wrapped up a-la Remix. It also
> has a stx-string->port function for convenience in making macros that
> read an alternate syntax with those nestable strings. It has a
> metalanguage with some delimiters auto-enabled: «» (guillemets, used as
> quotes in many European languages) as nestable, non-escaping quotes, 「」
> likewise but wrapped so 「foo bar」 reads as (#%cjk-corner-quotes "foo
> bar"), and several unicode paren types (﴾﴿, ⦓⦔, ⦕⦖, 🌜🌛, ...) wrapped
> with a starter symbol as well.

While I like the idea of more types of parens for meaning different things, I'm not sure whether having a #%cjk-corner-quotes -ish macro *expected to be defined* for every one of them is the best way to do it, or whether we should be trying to think of a better way to introduce the distinction.

The way racket already does this is with a 'paren-shape syntax property, which you can ignore if you want to use 「」 as a normal visually distinctive paren type *without* needing a special macro with a weird name.

Now the problem with 'paren-shape is that everything ignores it. But that could easily change if you had a syntax-parse pattern to check the property for you.

But a different problem occurs with the remix #%brackets convention. A form like [a b c] would match the pattern (a ...), which doesn't seem like an ideal default. With a 'paren-shape syntax property and a syntax-parse pattern expander, the pattern (a ...) could mean only match syntax lists with a 'paren-shape property of #\( .
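
As a rough sketch of what checking the property could look like, using only the standard reader and `syntax/parse` (the helper names `paren-shape-of` and `classify` are hypothetical, and this is a plain `#:when` guard rather than a full pattern expander):

```racket
#lang racket/base
(require syntax/parse)

;; Read one form from a string and report its 'paren-shape property.
(define (paren-shape-of str)
  (syntax-property (read-syntax 'example (open-input-string str))
                   'paren-shape))

(paren-shape-of "[a b c]")   ; => #\[
(paren-shape-of "{a b c}")   ; => #\{
;; The standard reader attaches no property for plain parentheses,
;; so this one comes back as #f rather than #\(.
(paren-shape-of "(a b c)")   ; => #f

;; Dispatch on the shape inside a syntax-parse clause.
(define (classify stx)
  (syntax-parse stx
    [(_ ...)
     #:when (eqv? (syntax-property stx 'paren-shape) #\[)
     'square]
    [(_ ...) 'other]))

(classify (read-syntax 'example (open-input-string "[a b c]")))  ; => 'square
(classify (read-syntax 'example (open-input-string "(a b c)")))  ; => 'other
```

A pattern expander could hide the `#:when` check so that macro authors only write the pattern.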

Are there any alternative ways to solve these problems?

Alex Knauth

Eli Barzilay

Sep 25, 2016, 12:57:04 AM
to William G Hatch, Racket Users
On Sat, Sep 24, 2016 at 12:46 PM, William G Hatch <wil...@hatch.uno> wrote:
>
> First, nestable strings are nice for other things as well. For
> instance, since they don't escape backslashes, they are nice for
> constructing regexps, which famously explode into mountains of
> backslashes due to being inside "" strings. They are a great
> alternative to #<<HERE strings when you want something big and
> complicated for any purpose.

You could just as well replace that with "Eli bait"... Yes, the
@-syntax is addressing exactly these kinds of things.


> Second, while I love the at-reader for Scribble, I think it has
> several drawbacks for nesting different syntax.
>
> • It splits up the string at newlines. Not a huge deal -- you can
> reassemble them, but it's a bit of a hassle.

There are two reasons it splits strings:

1. Doing that adds information about newlines (hence you know more about
   how the expression was written), and about parts that are indentation.

2. More importantly, since you can nest @-expressions, they must
   translate to separate things -- for example, in @foo{...@bar{...}...}
   you can't combine the result of (bar "...") with its surrounding,
   *unless*
   a. You assume that all values are strings
   b. You're willing to have a planted `string-append` in the result of
      reading an @-form (and this is a big problem: both confusing, and
      depending on having a `string-append` in your language)
   c. You're willing to take the big GC hit of representing values using
      strings only (which is almost always there, except maybe when you
      use an all-literal string)


> • It expands inner @-exps unless you use |{}|. Again, not a huge deal
> as long as you sprinkle in some pipes.

Yes it does! And that's a *very* important feature.

I'm guessing that if you read "|{...}|" as some "⟦...⟧" then you get the
same thing. Only instead of choosing from a bunch of parens I wanted to
make it possible (and easy) to always find a new delimiter.


> • It removes leading white space on every line. This is a bad thing if
> your nested language is whitespace sensitive, like a python or
> haskell-style syntax would be. By reading the location data on the
> strings you get back, you can probably still reconstruct the
> original string, but it's still a hassle.

Again, this is a very intentional design that should work well in almost
all cases, including (and especially!) producing code for some
indentation sensitive language like python.

The thing is that without this feature, you cannot create code easily
unless you're creating the whole global code, maybe analogous to how
macros are much more convenient since they're local, and not global
source transformers. For example, say that you want to write this
helper in your python-generating code:

(define (generate-foo^2 expr)
  @list{foo = @expr
        return foo*foo})

If @-forms wouldn't have ignored indentation, you'd need a whole bunch
of complications around this:

- Can't just use @expr without adjusting it in case it has newlines

- Can't use the resulting string without adjusting its newlines to
whatever indentation is the context in which you put it in.

- And of course you need some way to deal with your own source code mess
resulting from indentation no longer being an indication of your
source structure.

(Note, BTW, that @-forms do have at least two ways to "force" including
any whitespace if that's what you *really* want.)


> • After all the at-exp's work splitting and trimming the string, every
> macro that uses at-exp output as if it were just a string has to do
> all the work of putting it all back together into a string to use as
> a port. If you just get a string in the first place, none of that
> is necessary, and the macro's interface is simpler.

I'm not sure I see what this is saying. You can't always have a string
at the macro level because there are nested *expressions*. At best,
you'll need to have a stream of characters with specials planted in it
which would provide the string representation of their values -- at
runtime. If you want to use it at the macro level, without any
expressions -- ie, just handle literal strings -- then things are very
simple: the 'scribble syntax property on newlines will have a string
holding both the newline and the indentation that follows the newline.


> So I love the at-reader, I just like nestable strings better for this
> purpose, and I want nestable strings even without syntax nesting. No
> hard feelings. If you still think these nestable strings are a bad
> idea, I'd like to hear your reasoning.

It's your use of "nestable" here that seems to me like it's making
things bogus. If you really want it to be nested, then this is exactly
what the scribble syntax is doing -- to an extreme. But if all you want
is to be able to nest the string *delimiters*, then any string delimiter
pair that is not using the same character will do... but more
importantly I don't see any cases where you would want *that* but not
the rest of the scribble syntax features.

--
((x=>x(x))(x=>x(x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!

William G Hatch

Sep 25, 2016, 3:35:13 AM
to Eli Barzilay, Racket Users
First of all, I really didn't mean any offense. I think the at-reader
and my nestable string delimiters are trying to solve slightly different
problems, and I didn't really convey that well. I didn't mean for it to
be "Eli bait". Let me explain my use case a little, and maybe my
earlier mail will seem less baiting in the context I had intended,
albeit poorly communicated.

>It's your use of "nestable" here that seems to me like it's making
>things bogus. If you really want it to be nested, then this is exactly
>what the scribble syntax is doing -- to an extreme. But if all you want
>is to be able to nest the string *delimiters*, then any string delimiter
>pair that is not using the same character will do... but more
>importantly I don't see any cases where you would want *that* but not
>the rest of the scribble syntax features.

So I really do just mean that the string delimiters themselves nest --
IE it balances them so that the string doesn't necessarily end once it
hits an ending delimiter. So yes, any two characters will do for the
job. That in itself is something that I've wanted independent of
anything else, so for me that was a good enough reason to make this, and
is something that I'll use it for, to avoid things like \", which have
always irked me (whether or not it's reasonable that that should bother
me).
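
To make "the delimiters balance" concrete, here is a hand-rolled sketch of the idea in plain Racket. It is not udelim's implementation, and `read-balanced` is a name invented for this example; it only shows the depth counting.

```racket
#lang racket/base

;; Read a balanced string from `in`, which must be positioned at the
;; opening delimiter.  Nested open/close pairs are kept verbatim and
;; backslashes get no special treatment.
(define (read-balanced in open close)
  (unless (eqv? (read-char in) open)
    (error 'read-balanced "expected opening delimiter ~a" open))
  (define out (open-output-string))
  (let loop ([depth 1])
    (define c (read-char in))
    (cond
      [(eof-object? c)
       (error 'read-balanced "unbalanced string: missing ~a" close)]
      [(eqv? c open)
       (write-char c out)
       (loop (add1 depth))]
      [(eqv? c close)
       ;; The outermost closer ends the string; inner ones are content.
       (unless (= depth 1)
         (write-char c out)
         (loop (sub1 depth)))]
      [else
       (write-char c out)
       (loop depth)]))
  (get-output-string out))

;; The \\ below is only there because this test input sits in a "" literal.
(read-balanced (open-input-string "«a «b» \\n c» rest") #\« #\»)
;; => "a «b» \\n c"   (the inner «b» and the backslash both survive)
```

The reader stops at the closer that matches the first opener and leaves " rest" on the port.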

As for other aspects of nesting and my other uses of these strings, the
at-reader makes perfect sense if you want to nest various expressions
inside a string that needs to remain a string at run-time, such as is
done in scribble. But in my case that's not what I want for, say, #lang
rash, or other nebulous embedded language ideas I have floating around
my mind. Basically, I want to have a macro that will be fully in charge
of determining the meaning of the string, and I want to be able to use
the same reader functions in the macro that I use in the #lang. But to
use the same read-syntax function that my language uses, I need a port
to run it on. To make this port, I really just need a string with no
pre-read syntax objects inside it. So in this case I don't want the
top-level reader of whatever #lang I'm in to look in the string, I just
want the macro to be able to use read-syntax on the full string. If the
middle of the string has already been read into syntax objects, my
reader functions would be much more complicated to write (IE I'd have to
figure out how to deal with the port ending in the middle of a
parenthesised expression or something, then use a pre-read syntax
object, then jump back into reading the next section that remained a
string while conveying whatever context I was in in the last string
segment...). And the string splitting, which is as you've shown quite
helpful in many cases, would in this case simply be something that I
would have to undo, which as you point out would be a bit of a waste.
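
The "string to port to read-syntax" step can be sketched in plain Racket as follows. This uses `open-input-string` as a stand-in for udelim's stx-string->port and the default `read-syntax` as a stand-in for a custom reader; `string-stx->forms` is a hypothetical helper, and the sketch makes no attempt to carry source locations over from the original string.

```racket
#lang racket/base

;; Given a syntax object wrapping a string, open a port on the string
;; and read every form in it with read-syntax.
(define (string-stx->forms str-stx)
  (define in (open-input-string (syntax-e str-stx)))
  (port-count-lines! in)
  (let loop ([acc '()])
    (define form (read-syntax (syntax-source str-stx) in))
    (if (eof-object? form)
        (reverse acc)
        (loop (cons form acc)))))

(map syntax->datum (string-stx->forms #'"ls -l (hello world)"))
;; => '(ls -l (hello world))
```

Because the input is one uninterrupted string, the reader never has to stop and resume around a pre-read syntax object.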

So I see the difference as being that in uses like scribble, the bold
procedure isn't trying to use a reader on its arguments, and the
at-reader needs to have split them up and turned the nested expressions
into s-expressions already for them to have their intended meaning.
Whereas the macros I want to make want to just have a string to turn
into a port. It could potentially be that my plans for these macros are
misguided, but I have liked the results I've gotten so far and feel like
it has promise.

Here is an example of the sort of thing I've been doing with it:

;; Here is my rash macro
(define-syntax (rash stx)
  (syntax-parse stx
    [(rash arg:str)
     ;; Note that since I just get one string, it is easy to turn it into a port
     ;; and use my read function on it.
     (with-syntax ([(parg ...) (map (λ (s) (replace-context #'arg s))
                                    (syntax->list
                                     (rash-read-syntax* (syntax-source #'arg)
                                                        (stx-string->port #'arg))))])
       ;; rash-line-parse is what the #%module-begin of rash uses around everything.
       #'(rash-line-parse parg ...))]))
;; So basically the rash macro does exactly the same thing as #lang rash,
;; but is embeddable in #lang whatever!

Is this the best way of going about it? I don't know. But it's easier
than the way I was going about it before. Something I like about this
method is that I could nest several of these macros into each other, and
each one can do the reading however it sees fit, as long as at each
level I can pass an appropriate string to the next level down. Maybe
the languages have very different views on which characters do something
special (or specifically should not do something special), including
flag characters like @ (or any one you choose at the top level or a
higher level up in the nesting).

For example, something like this could happen:

(define some-output
(rash/out
«some-query $(first
(python-ish-list-comprehend
«machine for i in machine-list where should-i-query(i)»))
$(make-query
«this is a bogus example that I'm really stretching
for, but maybe this is some nice syntax for some
sort of query producing dsl? And maybe it has some
macro in it in whatever its syntax is to
(go-a-level-deeper «in this nonsense ...»)
But importantly, no top-level reader has to know or
care what the syntax here is, nor the rash reader,
nor any reader in between, aside from simply
preserving it as a string, which I can hopefully do
in most any language.»)»))

I'm a little hard pressed to come up with examples I haven't thought of
remotely concretely yet, but it seems to me that it's much easier to
have these sort of #lang-embedding macros that do their own reading if
you just have a simple string. Nesting the delimiters is just
convenient for nesting syntax for a call to another such macro in the
top level string.

>2. More importantly, since you can nest @-expressions, they must
> translate to separate things -- for example, in @foo{...@bar{...}...}
> you can't combine the result of (bar "...") with its surrounding,
> *unless*
> a. You assume that all values are strings
> b. You're willing to have a planted `string-append` in the result of
> reading an @-form (and this is a big problem: both confusing, and
> depending on having a `string-append` in your language)
> c. You're willing to take the big GC hit of representing values using
> strings only (which is almost always there, except maybe when you
> use an all-literal string)

So in my example, I would probably have something like
(foo «...(bar «...»)...»)
and foo would be responsible for reading the string that contains bar,
and the argument of bar would have to be a string in foo's language that
contains source code for bar's language (supposing bar is another one of
these language nesting macros). So that is why in this case these
nested strings seem good to me. Those bullet points of bad things about
reconstructing one string are some of the things I'm trying to avoid by
just having a plain string in the first place. They are some of the
things that frustrated me when I was trying to write these macros using
the at-reader, and I know at least one other person is patching together
strings produced by the at-reader to make a port to achieve a similar
macro.

>> Second, while I love the at-reader for Scribble, I think it has
>> several drawbacks for nesting different syntax.

Here I should have been more clear: I mean that I think it has
drawbacks for nesting syntax using the sort of macros that I am trying
to write. It's clearly great for nesting different syntax in Scribble
documents! It's probably great for other things that I'm not trying to
do right now, or maybe it's even great for a better way of doing the
things I'm doing that has thus far escaped me.

I'm really sorry if my initial mail came off as offensive or aggressive
against the at-reader, because I really think it's great. It's just
that it doesn't seem to be the tool best suited to my particular need,
and when I had had some verbal conversations with people I had mentioned
that and they didn't see why, so I wanted to include some reasoning for
why I didn't want the at-reader for this purpose. But clearly the
use-case I had in mind was not clear in my previous email. I guess I
should have given one of my examples from this mail in my original post.
Communication has never been a great strength of mine.

But yeah, if what I'm doing still seems dumb with this explanation, feel
free to let me know my method's weaknesses.

Thanks,
William

William G Hatch

Sep 25, 2016, 3:40:10 AM
to Dupéron Georges, Racket Users
>One note about the docs: when you write:
>
>(open-input-string
> "«this is a string with nested «string delimiters.» No \n escape interpreting.»")
>
>the "\n" is already escaped by the "…" fed into open-input-string I think, so what udelim parses in that example is a raw newline, not the \ character followed by the n character.

Thanks, I clearly wasn't thinking about that very hard while writing the
docs. Fixed.

>Scribble supports "element transformers" which allow changing how an identifier is printed. Unfortunately, when the identifier appears in the first position of a form (like the #% wrappers), only the identifier itself can get styled, not the whole form. A few days ago I added a quick hack to my unstable scribble-enhanced library to add catch-alls which can re-style any identifier matching a given pattern. The hack [1] should also work for whole forms (untested, though), so that in scribble or scribble/lp2, @racketblock[(a ⟦b⟧ c)] would be properly typeset.

Thanks. I didn't put too much thought into the typesetting so far
(apologies to Mr. Butterick), but I'll look into that more when the
library stabilizes more.

William G Hatch

Sep 25, 2016, 3:56:00 AM
to Alex Knauth, racket...@googlegroups.com
On Sat, Sep 24, 2016 at 05:33:18PM -0400, Alex Knauth wrote:
>The way racket already does this is with a 'paren-shape syntax property, which you can ignore if you want to use 「」 as a normal visually distinctive paren type *without* needing a special macro with a weird name.

I hadn't thought about the 'paren-shape property. I should put that on.
I hadn't thought as much about these paren shapes being used to be
distinguished in other macros (eg. macro foo will do something different
if its argument is wrapped in bold brackets or in moon faces), but had
thought more about either just having them be normal parens or making
them be transformers for a fancy macro shorthand - eg. 〘+ _ 3〙might be
a shorthand lambda wrapper or something. But at the same time, I don't
see how matching on the 'paren-shape property is any better or worse
than matching funky #%paren-shape lists.

So... yeah. I should definitely add the 'paren-shape property, and I'll
make that change. And I don't want all paren shapes to create a
#%paren-shape wrapper. But I found #%braces part of Jay's talk to be
persuasive, at least in that I definitely want it on some of my parens.
So maybe some unicode paren turned on by #lang udelim should have the #%
wrapper and others not.

Thanks for your thoughts.

William

Eli Barzilay

Sep 25, 2016, 5:10:36 AM
to William G Hatch, Racket Users
On Sun, Sep 25, 2016 at 3:34 AM, William G Hatch <wil...@hatch.uno> wrote:
> First of all, I really didn't mean any offense. I think the at-reader
> and my nestable string delimiters are trying to solve slightly
> different problems, and I didn't really convey that well. I didn't
> mean for it to be "Eli bait". Let me explain my use case a little,
> and maybe my earlier mail will seem less baiting in the context I had
> intended, albeit poorly communicated.

To be clear, no offense taken -- I'm only considering the technical
reasons to want yet another string delimiter and the technical aspects
of what you get. "Eli bait" is not because I take it personally, it's
just because I spent a ton of time thinking about such problems, and I'd
hate to see people fall into the expected traps when following
"traditional" solutions, not seeing how the @-syntax already provides a
nice solution. (I suspected that you'd fall into them, and later in
this email you indeed do...)


> So I really do just mean that the string delimiters themselves nest --
> IE it balances them so that the string doesn't necessarily end once it
> hits an ending delimiter. So yes, any two characters will do for the
> job. That in itself is something that I've wanted independent of
> anything else, so for me that was a good enough reason to make this,
> and is something that I'll use it for, to avoid things like \", which
> have always irked me (whether or not it's reasonable that that should
> bother me).

Note that the scribble syntax starts with that as a basic feature. That
is, if you replace {}s with something else like «», then inside a quoted
context @«...» any matching «»s are ignored.


> Basically, I want to have a macro that will be fully in charge of
> determining the meaning of the string, and I want to be able to use
> the same reader functions in the macro that I use in the #lang.

Same here: this is one of the use cases I described many times for the
scribble syntax: start with a macro that uses a parser to parse the
whole thing, then refine by making it parse just one form and start
nesting @-forms, and eventually get to a point when you have a new #lang
implementation.


> But to use the same read-syntax function that my language uses, I need
> a port to run it on. To make this port, I really just need a string
> with no pre-read syntax objects inside it. So in this case I don't
> want the top-level reader of whatever #lang I'm in to look in the
> string, I just want the macro to be able to use read-syntax on the
> full string.

Yes, and you can do all of that with just a string, which you can still
get from an @-form -- just throw a syntax error if it's not all strings.
And with just that you get the *benefit* of ignoring indentation which
makes it possible to use your syntax in a sane way.
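
A minimal sketch of that "all strings or syntax error" approach, under the assumption that the macro is handed the pieces an @-form would produce. The macro name `embedded` is hypothetical, and a real version would run its own reader on the joined string instead of just returning it.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

;; Every argument must be a string literal.  Anything else (for example
;; a nested @-expression that read as a list) fails the `str` syntax
;; class and is reported as a syntax error.
(define-syntax (embedded stx)
  (syntax-parse stx
    [(_ piece:str ...)
     (let ([whole (apply string-append
                         (map syntax-e (syntax->list #'(piece ...))))])
       ;; A real macro would open a port on `whole` and call its reader
       ;; here; this sketch just yields the joined string.
       #`(quote #,whole))]))

;; With the at-reader enabled, a two-line @embedded{...} body would
;; arrive as separate string pieces like these:
(embedded "ls -l" "\n" "| wc")   ; => "ls -l\n| wc"
```

Writing `(embedded "ls" (foo))` is rejected at expansion time, which is the syntax error mentioned above.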


> If the middle of the string has already been read into syntax objects,
> my reader functions would be much more complicated to write (IE I'd
> have to figure out how to deal with the port ending in the middle of a
> parenthesised expression or something, then use a pre-read syntax
> object, then jump back into reading the next section that remained a
> string while conveying whatever context I was in in the last string
> segment...).

Right -- that's what I mentioned as "throw a syntax error" above. But
it would be a good idea to think about why it's a syntax error, which
would in most cases get you closer to something that is a composable
language.


> And the string splitting, which is as you've shown quite helpful in
> many cases, would in this case simply be something that I would have
> to undo, which as you point out would be a bit of a waste.

What I view as a waste is not just undoing the indentation elimination
-- it's the idea of a form where you want indentation to matter at the
semantic runtime level. To clarify, my opinion is that

(foo bar
     baz)

and

(foo bar
  baz)

should be the same. @-expressions follow that; (traditional)
here-strings do not. Because here-strings do not follow that, you end
up writing ugly code like

    (foo #<<FOO
blah
FOO
         baz)

which is a mess that might make your readers' eyes bleed. The usual
shell-way-out is to just use it in a context that is insensitive to
spaces so you end up writing

    (foo #<<FOO
         blah
         FOO
         baz)

and hope that `foo` will overlook the spaces that are all still there.


> So I see the difference as being that in uses like scribble, the bold
> procedure isn't trying to use a reader on its arguments, and the
> at-reader needs to have split them up and turned the nested
> expressions into s-expressions already for them to have their intended
> meaning.

*Don't* confuse scribble-the-documentation-system with the syntax -- the
syntax is useful for many other cases, and designed to make sense in
other cases. See my description (specifically section 4, which is very
relevant here), and the scribble/text and scribble/html languages.


> [...]
> ;; So basically the rash macro does exactly the same thing as #lang
> ;; rash, but is embeddable in #lang whatever!

Yes, and you can get the same with any string, and

(rash "stuff
in a different
language")

The next problem is quoting and backslash hell, so you want a better
quotation, and (I'm guessing) you end up with something like

(rash «stuff
in a different
language»)

Using @-forms would be a tiny delta for the implementation -- basically
just a string-append (actually, not even a delta since your macro
already allows multiple strings), and the use is more convenient:

@rash{stuff
in a different
language}


> each one can do the reading however it sees fit, as long as at each
> level I can pass an appropriate string to the next level down.

The only tricky bit here is that if you want to deal with only strings
and at the same time maintain a syntax-time parsing of strings, then you
need to do this whole multi-level collapsing as a macro thing, which
means no runtime expressions.


> Maybe the languages have very different views on which characters do
> something special (or specifically should not do something special),
> including flag characters like @ (or any one you choose at the top
> level or a higher level up in the nesting).

Note that the scribble syntax uses "@" by default, but it's easy to
change, as Matthew B. did with pollen.


> For example, something like this could happen:
>
> (define some-output
> (rash/out
> «some-query $(first
> (python-ish-list-comprehend
> «machine for i in machine-list where should-i-query(i)»))
> $(make-query
> «this is a bogus example that I'm really stretching
> for, but maybe this is some nice syntax for some
> sort of query producing dsl? And maybe it has some
> macro in it in whatever its syntax is to
> (go-a-level-deeper «in this nonsense ...»)
> But importantly, no top-level reader has to know or
> care what the syntax here is, nor the rash reader,
> nor any reader in between, aside from simply
> preserving it as a string, which I can hopefully do
> in most any language.»)»))

And here you're falling into the trap I mentioned above. You're trying
to use "$" as an escape, but, for example, what happens if you want to
escape a single identifier and not an expression? Anyway, here's the
same thing using the scribble syntax:

(define some-output
@rash/out{
some-query @(first
@python-ish-list-comprehend{
machine for i in machine-list where should-i-query(i)})
@make-query{
this is a bogus example that I'm really stretching
for, but maybe this is some nice syntax for some
sort of query producing dsl? And maybe it has some
macro in it in whatever its syntax is to
(go-a-level-deeper {in this nonsense ...})
But importantly, no top-level reader has to know or
care what the syntax here is, nor the rash reader,
nor any reader in between, aside from simply
preserving it as a string, which I can hopefully do
in most any language.}})

Note that at the superficial concrete level what this does is (a)
eliminates the need for some "$" escape character, and (b) reduces the
double-delimiter (foo «...») that you often use into a single delimiter
form of @foo{...}. This latter point is subtly important: users that
write your version need to be aware of both sexpr syntax and string
syntax and how they combine, whereas users that write my version have a
single delimiter. This is combined with the fact that "@" means the
same at all levels in the scribble syntax -- that simplifies things
further by removing the need for some $-like escape construct for
interpolation. These two features make it easier to comprehend the new
thing as a new syntax rather than be aware of some places that are
texts, some that are not, and the ways to combine this all in code.

It might help to compare this with JS's "template literals", where the
equivalent of the scribble form:

@foo{... @bar{... @baz ...} ...}

is possible to write -- it's even not that hard: just replace @X{_} with
"${X(`_`)}" except at the toplevel:

foo(`... ${bar(`... ${baz} ...`)} ...`)

but given how obfuscated the result is (specifically: you now have three
paired delimiters: {}s for interpolation, ()s for arguments, and ``s for
text), almost nobody will actually write such code.
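
For comparison, here is a small sketch of what the @-reader produces for the nested @foo form above. Quoting the form avoids needing definitions for `foo`, `bar`, and `baz`, which are placeholder names only.

```racket
#lang at-exp racket/base

;; Quoting the @-form shows the S-expression the reader builds.
;; Each {...} body becomes string arguments, and each @ switches
;; back to ordinary Racket syntax, at every nesting level.
(define as-read
  '@foo{... @bar{... @baz ...} ...})

(write as-read)  ; (foo "... " (bar "... " baz " ...") " ...")
(newline)
```

The single delimiter pair per level is what keeps the nested form readable.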

Note also that in *both* of these cases you need to deal with the
problem of `first` being a Racket runtime binding that gets used at the
syntax level if you want to stick with `rash/out` doing its parsing as a
macro implementation -- that's a problem that is inherently there
regardless of concrete syntax. But the scribble syntax provides an easy
way to handle this: you just need to define `rash/out` as a function (so
do the parsing at runtime), and the result plays much nicer with any
language it's used in.


> I'm a little hard pressed to come up with examples I haven't thought
> of remotely concretely yet, but it seems to me that it's much easier
> to have these sort of #lang-embedding macros that do their own reading
> if you just have a simple string.

At this point I'm guessing that you'd still not be convinced. So maybe
phrase this as a challenge: see if you can come up with an actual
example where the scribble syntax won't do what you want, or examples
where you're not sure what the scribble syntax would look like. Maybe
doing this will lead to further enlightenment. (Feel free to email me
such examples off-list.)


> I'm really sorry if my initial mail came off as offensive or
> aggressive against the at-reader, because I really think it's great.
> It's just that it doesn't seem to be the tool best suited to my
> particular need, [...]

I'm not trying to defend the scribble syntax's honor -- I'm only trying
to show you why it solves your particular needs too... As a long
postfix comment, the personal angle here is that I've been obsessed with
such problems for a *very* long time. At some point I've had a system
that I'm guessing would interest you: I've basically made a language
where <<>> would delimit meta-level text to be run and its output used
instead, then I made it so each of these could specify its own language
so something like <<TCL: ...>> would run in a tcl interpreter (it was
before python became popular), and you could nest any number of levels
(using a different language in each). I've used some shreds of that
years later in mzpp which is a much more simplistic language, and closer
to a traditional template system. And since you've mentioned changing
the "@" character at every level, I've also implemented something on
that side of the spectrum -- mztext -- where the resulting language is
more tex-like where you can do arbitrary parsing at each nested level
down to changing how "functions" are written and arguments are parsed,
and (just like tex) the result is pretty powerful in what it can do, but
nobody ever needs that complication which makes the whole thing useless
for practical purposes. The scribble syntax is the last thing I had in
that long list of experiments which finally made sense at all levels,
and was both something that was easy to understand yet flexible to
express any of the complicated needs.

Alex Knauth

Sep 25, 2016, 9:33:57 AM9/25/16
to William G Hatch, Jay McCarthy, racket...@googlegroups.com

> On Sep 25, 2016, at 3:55 AM, William G Hatch <wil...@hatch.uno> wrote:
>
> On Sat, Sep 24, 2016 at 05:33:18PM -0400, Alex Knauth wrote:
>> The way racket already does this is with a 'paren-shape syntax property, which you can ignore if you want to use 「」 as a normal visually distinctive paren type *without* needing a special macro with a weird name.
>
> I hadn't thought about the 'paren-shape property. I should put that on.
> I hadn't thought as much about these paren shapes being used to be
> distinguished in other macros (eg. macro foo will do something different
> if its argument is wrapped in bold brackets or in moon faces), but had
> thought more about either just having them be normal parens or making
> them be transformers for a fancy macro shorthand - eg. 〘+ _ 3〙might be
> a shorthand lambda wrapper or something.

The usual way to do this is with #%app, but you would be right to point out that it shouldn't be #%app's job to handle fancy lambda shorthands, and it wouldn't work properly if it was a macro call.

Here's an idea: what if the macro expander introduced just one more #%app-like form in front of every expression? It's called #%group here, but there is probably a better name.

(+ 1 2)
--->
(#%group + 1 2)
--->
(#%app + 1 2)
--->
(#%app + (#%datum . 1) (#%datum . 2))
--->
(#%app + (quote 1) (quote 2))

The #%group macro would be able to look at the 'paren-shape property and decide either to expand to #%app, to expand to a lambda shorthand, or delegate to some other macro, based on which character it sees as the value of the property.

This introduces one new macro-that-needs-to-be-defined instead of the dozens of different ones you would need for the different delimiters, but it gives the #%group macro the power to dispatch on the 'paren-shape property of the syntax object. This dispatching could delegate to macros like #%braces when the 'paren-shape property is #\{, giving Remix what it wants, but these special-cases would be handled by the #%group macro at expansion time instead of by the reader.
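
A minimal sketch of the dispatching step, assuming only the standard reader behavior of attaching a 'paren-shape property to forms read with [] or {}. The macro name `shape-of` is hypothetical, and this is not the proposed #%group itself; it only shows that a macro can see the property at expansion time and branch on it.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical macro: report which delimiters its argument was
;; written with, by inspecting the 'paren-shape syntax property.
;; The argument is only inspected, never expanded.
(define-syntax (shape-of stx)
  (syntax-case stx ()
    [(_ form)
     (let* ([shape (syntax-property #'form 'paren-shape)]
            [name (case shape
                    [(#\[) 'brackets]
                    [(#\{) 'braces]
                    [else 'parens])])   ; no property means ()
       #`(quote #,name))]))

(define round-shape  (shape-of (1 2)))
(define square-shape (shape-of [1 2]))
(define curly-shape  (shape-of {1 2}))
```

A #%group-style macro would use the same test but expand to #%app, #%braces, or a lambda shorthand instead of a quoted symbol.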

I think it needs to be at expansion time because otherwise {1 2 3} looks like a 4-element list before expansion. These would seem to have very weird behavior under quote, and it will look weird to any other macro that doesn't explicitly look for the #%braces, #%brackets, #%cjk-corner-quotes, etc. symbols.

Adding the #%group macro would require either extending the macro expander or having the #%module-begin macro introduce it by simulating a macro expander. I might be misunderstanding, but does Remix already do the latter?

Alex Knauth

Matthew Butterick

Sep 25, 2016, 2:19:21 PM9/25/16
to Eli Barzilay, William G Hatch, Racket Users

On Sep 25, 2016, at 2:10 AM, Eli Barzilay <e...@barzilay.org> wrote:
> *Don't* confuse scribble-the-documentation-system with the syntax -- the
> syntax is useful for many other cases, and designed to make sense in
> other cases. See my description (specifically section 4, which is very
> relevant here), and the scribble/text and scribble/html languages.

To be fair, the documentation invites this kind of confusion. All the material about the at-reader is within the Scribble docs. This makes it look like it's dependent on Scribble, when really it's a separate thing.

In general, I think the word "Scribble" is misleadingly overloaded within Racket. IMO "Scribble" should refer only to the family of languages that use the Scribble document model, including Racket documentation.

At some point the docs for "Scribble as a Preprocessor" were broken out from the main Scribble docs — I'm guessing to emphasize that they're conceptually separate from Scribble. But AFAICT what they really have in common is the at-reader, not the document model. Because they don't use the Scribble document model, I'm unclear why they're called `scribble/text` and `scribble/html`.

Meanwhile, I'd argue that the at-reader — itself an obsolete name, since one can swap out the @ for any Unicode char — deserves to have its documentation broken out into a separate top-level section, which would more accurately reflect its status within Racket.



> So maybe phrase this as a challenge: see if you can come up with an actual
> example where the scribble syntax won't do what you want,

When I started out with at-expressions, I too resisted some of the conventions. But as I worked with more complicated cases, I came to understand the wisdom of Eli's design choices. At this point I have two lingering wishes:

+ String splitting within {...} delimiters: I agree this is the right default behavior, but it doesn't seem unreasonable to wish for shorthand for when you really do want things concatenated into a single argument, given that Racket is full of cognates like let/let*, for/for*, list/list*, etc. That said, I don't have a good idea what the notation would be.

+ I wish at-expressions could use multiple [...] and {...} parts, in any order.

William G Hatch

Sep 25, 2016, 4:21:56 PM9/25/16
to Eli Barzilay, Racket Users
On Sun, Sep 25, 2016 at 05:10:27AM -0400, Eli Barzilay wrote:
>To be clear, no offense taken

That's good. After I read "Eli bait" my mind took the rest as having
an annoyed tone, probably from reading too many online flame wars.
It's hard to tell people's emotions in text.

I think ultimately we just disagree on what features we want, but I'd
like to clarify a few misunderstandings:

>Yes, and you can do all of that with just a string, which you can still
>get from an @-form -- just throw a syntax error if it's not all strings.
>And with just that you get the *benefit* of ignoring indentation which
>makes it possible to use your syntax in a sane way.

I was doing that before, and I just didn't see that as a benefit.

>Yes, and you can get the same with any string, and
>
> (rash "stuff
> in a different
> language")
>

Yeah, if you look at the rash docs you'll see that I have examples that
do exactly that. It's kind of the point that the macro can just use any
string. All the nestable string delimiters bring to the table for these
macros is that it makes it easier to nest them without crazy escapes.

>using @-forms would be a tiny delta for the implementation -- basically
>just a string-append (actually, not even a delta since your macro
>already allows multiple strings), and the use is more convenient:
>
> @rash{stuff
> in a different
> language}

Well, my current macro doesn't allow multiple strings as you state,
but my previous macro when I was using at-expressions was exactly like
what you have there.

>The only tricky bit here is that if you want to deal with only strings
>and at the same time maintain a syntax-time parsing of strings, then you
>need to do this whole multi-level collapsing as a macro thing, which
>means no runtime expressions.

I do get runtime expressions. I'm parsing the inner strings into
syntax objects at macro expansion time, and some of those end up being
themselves macro calls, and some of them are just normal expressions
that are evaluated at runtime.

>> Maybe the languages have very different views on which characters do
>> something special (or specifically should not do something special),
>> including flag characters like @ (or any one you choose at the top
>> level or a higher level up in the nesting).
>
>Note that the scribble syntax uses "@" by default, but it's easy to
>change, as Matthew B. did with pollen.

Yes, I'm aware, but any character you choose ends up being a magic
character through each nested level unless you use |{}|, which I
didn't want.


>> For example, something like this could happen:
>>
>> (define some-output
>> (rash/out
>> «some-query $(first
>> (python-ish-list-comprehend
>> «machine for i in machine-list where should-i-query(i)»))
>> $(make-query
>> «this is a bogus example that I'm really stretching
>> for, but maybe this is some nice syntax for some
>> sort of query producing dsl? And maybe it has some
>> macro in it in whatever its syntax is to
>> (go-a-level-deeper «in this nonsense ...»)
>> But importantly, no top-level reader has to know or
>> care what the syntax here is, nor the rash reader,
>> nor any reader in between, aside from simply
>> preserving it as a string, which I can hopefully do
>> in most any language.»)»))
>
>And here you're falling into the trap I mentioned above. You're trying
>to use "$" as an escape, but, for example, what happens if you want to
>escape a single identifier and not an expression? Anyway, here's the
>same thing using the scribble syntax:

The $ is an escape that rash has due to its design (because it
essentially quotes everything that doesn't have $), and is not
something that's generally necessary for any nested language that just
uses strings. For example:

;; starting in normal racket syntax, but with «» for convenience
(filter foo?
(python-ish-list-comprehend
«thing for x in sqlish(«select * from foo») where some_pred(x)»))

The example again is silly, but syntactically it needs neither $ nor @
nor any other magic character. The «» nesting quotes are just
convenient to avoid \" nonsense (and \\", \\\\" if there were more
nesting).
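
To illustrate how balanced delimiters avoid the escaping, here is a minimal sketch of a nestable, non-escaping string reader. It is not udelim's actual implementation; the function name `read-balanced-string` and its optional arguments are assumptions made for this example.

```racket
#lang racket/base

;; Hypothetical helper: read a balanced string from `in`, assuming
;; the opening delimiter has already been consumed. Nested delimiter
;; pairs are kept verbatim in the result, and no character is
;; treated as an escape.
(define (read-balanced-string in [open #\«] [close #\»])
  (define out (open-output-string))
  (let loop ([depth 1])
    (define c (read-char in))
    (cond
      [(eof-object? c)
       (error 'read-balanced-string "unbalanced delimiters")]
      [(char=? c open)
       (write-char c out)
       (loop (add1 depth))]
      [(char=? c close)
       (if (= depth 1)
           (get-output-string out)      ; matching close: done
           (begin (write-char c out)    ; inner close: keep it
                  (loop (sub1 depth))))]
      [else
       (write-char c out)
       (loop depth)])))

;; The inner «select * from foo» survives untouched, so the next
;; level down can read it again with the same rule.
(define inner
  (read-balanced-string
   (open-input-string "for x in sqlish(«select * from foo») ok» rest")))
```

Because only depth is tracked, each level of nesting costs one more delimiter pair rather than a doubling of backslashes.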

As for $ in rash, I chose $ because I'm giving it *some* similarities
to eg. bash. To escape just an identifier in rash (as opposed to a
larger expression in parens), you can use $id, which you'll see if you
look at the examples in the rash docs. As an aside, eventually I
plan on changing it so it creates a macro that will do different
things than just escape, eg. $CAPS looks up environment variables
rather than normal variables, and maybe $«*.ext» expands globs, etc.
But my point is that the $ is a feature of the rash language that I
want, not some added complexity that I would want to avoid. So I'm
not saying I want a different @-like character for each level down,
but rather that I don't need any @-like character in the general case
(just string delimiters, either normal or preferably nestable), and
for rash in particular I want a magic character that does something
different than what the @-reader magic character does.

>
> (define some-output
> @rash/out{
> some-query @(first
> @python-ish-list-comprehend{
> machine for i in machine-list where should-i-query(i)})
> @make-query{
> this is a bogus example that I'm really stretching
> for, but maybe this is some nice syntax for some
> sort of query producing dsl? And maybe it has some
> macro in it in whatever its syntax is to
> (go-a-level-deeper {in this nonsense ...})
> But importantly, no top-level reader has to know or
> care what the syntax here is, nor the rash reader,
> nor any reader in between, aside from simply
> preserving it as a string, which I can hopefully do
> in most any language.}})

I think you missed one transformation:
@go-a-level-deeper{in this nonsense ...}

>
>Note that at the superficial concrete level what this does is (a)
>eliminates the need for some "$" escape character, and (b) reduces the
>double-delimiter (foo «...») that you often use into a single delimiter
>form of @foo{...}. This latter point is subtly important: users that
>write your version need to be aware of both sexpr syntax and string
>syntax and how they combine, whereas users that write my version have a
>single delimiter. This is combined with the fact that "@" means the
>same at all levels in the scribble syntax -- that simplifies things
>further by removing the need for some $-like escape construct for
>interpolation. These two features make it easier to comprehend the new
>thing as a new syntax rather than be aware of some places that are
>texts, some that are not, and the ways to combine this all in code.

I don't see having two delimiters as a serious issue. I regularly
have lines of code that end in stuff like ))]))]]) and have no issues.
Keeping track of what's in the string (which is just normal syntax for
the nested language) seems natural and easy to me, and no more
difficult than keeping track of the difference between what is in and
out of {} delimiters. And as for a $ escape character in a shell
language syntax being confusing, I think that is maybe the most
natural shell syntax choice I've made for rash for anyone coming to it
from bash.

>Note also that in *both* of these cases you need to deal with the
>problem of `first` being a Racket runtime binding that gets used at the
>syntax level if you want to stick with `rash/out` doing its parsing as a
>macro implementation -- that's a problem that is inherently there
>regardless of concrete syntax. But the scribble syntax provides an easy
>way to handle this: you just need to define `rash/out` as a function (so
>do the parsing at runtime), and the result plays much nicer with any
>language it's used in.

No, I *definitely* want rash and rash/out to be macros. I don't want to
do any runtime parsing and evaluation for them. My current rash macro
is perfectly capable of having runtime variable references inside the
syntax that is nested in a string -- they all become syntax objects
within the context the string was in (there are examples of this in the
rash docs). So using `first` in the string isn't really different than
using `first` in a syntax template after the string is read in.
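
A minimal sketch of that idea, under the assumption that the embedded language is just Racket itself: the macro reads the string literal at expansion time and gives the result the lexical context of the string, so runtime bindings at the use site are visible. The name `read-string-here` is hypothetical and is not rash's implementation.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical macro: parse a string literal at expansion time and
;; splice the resulting expression in place of the macro call.
(define-syntax (read-string-here stx)
  (syntax-case stx ()
    [(_ str)
     (string? (syntax-e #'str))          ; only accept string literals
     (let* ([in (open-input-string (syntax-e #'str))]
            [datum (read in)])
       ;; Reuse the string's lexical context and source location.
       (datum->syntax #'str datum #'str))]))

(define x 10)

;; `x` and `+` inside the string refer to the bindings visible here;
;; the addition itself happens at runtime.
(define result (read-string-here "(+ x 1)"))
```

A real embedded language would swap `read` for its own reader, but the context transfer through `datum->syntax` would look the same.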

>At this point I'm guessing that you'd still not be convinced. So maybe
>phrase this as a challenge: see if you can come up with an actual
>example where the scribble syntax won't do what you want, or examples
>where you're not sure what the scribble syntax would look like. Maybe
>doing this will lead to further enlightenment. (Feel free to email me
>such examples off-list.)

It's not that what I'm doing couldn't be done with the at-reader -- I
*was* using the at-reader to do this until recently. It was working, I
just didn't like it as much, and was frustrated with some of the details
(I had to use |{}| to not get non-strings, I had to stitch strings back
together to read them in a port, I didn't want the leading space
elimination...). I came to what I'm doing now because I like it better,
and I am personally of the opinion that it's better for what I'm doing.
I've no doubt that the @ syntax is powerful and useful (in and out of
scribble). I think we are just of two differing opinions on the matter
of which features are best for the type of thing I'm doing.

I don't doubt your experience or expertise. It's just that I tried
the rash macros both with the @-reader and with just normal strings,
got it working both ways, and liked normal strings better. Nestable
string delimiters are just a little sugar on top, which I would have
made sooner or later for other purposes regardless of whether I'm
using the at-reader or not.

So I think we just have a simple disagreement as to which features are
benefits or hindrances.

Thanks,
William

Dupéron Georges

Sep 25, 2016, 4:50:51 PM9/25/16
to Racket Users, e...@barzilay.org
If I understand you correctly, the intended use of your nested delimiters can be more or less described as syntactic sugar for #reader, with auto-detection of where the string ends:

(filter foo?
(python-ish-list-comprehend
«thing for x in sqlish(«select * from foo») where some_pred(x)»))

could be rewritten as:

(filter foo?
#reader"python-ish-list-comprehend.rkt" thing for x in #reader"sqlish.rkt" select * from foo<READER 2 STOPS HERE> where some_pred(x)<READER 1 STOPS HERE>)

--
Georges

Philip McGrath

Sep 25, 2016, 5:29:19 PM9/25/16
to Dupéron Georges, Racket Users, e...@barzilay.org
I second the idea that the documentation could be clearer on the difference between "#lang scribble/base" and friends and what can be done with the at-reader in general, as shown in languages like "scribble/text" and "scribble/html". Despite having used both "scribble/base"-family languages and tools like "make-at-readtable", I didn't realize until reading this thread that "scribble/text" may be a better basis for several things I've been trying to do.

On a related note, make-at-readtable accepts options for "#:command-readtable" and "#:datum-readtable", but not for reading the body of the @-form — though as I think about it, I guess that could be done using "#:syntax-post-processor", right? 

I'm still not 100% clear on what is supposed to happen at read-time vs. expand-time vs. runtime in this example:
(filter foo?
        (python-ish-list-comprehend
         «thing for x in sqlish(«select * from foo») where some_pred(x)»))
but I think those options might be able to construct something closer to the udelim goal than plain at-expressions.


William G Hatch

Sep 25, 2016, 9:03:04 PM9/25/16
to Dupéron Georges, Racket Users, e...@barzilay.org
That seems like a very reasonable way of looking at it.

Here is a little step through of what my rash macro does:

(define pwd-var "pwd")
(rash «ls $(rash/trim «dirname $(rash/trim «$pwd-var»)»)»)

;; after one step of expansion, this looks something like this:
(rash-line-parse 'ls (rash/trim «dirname $(rash/trim «$pwd-var»)»))
;; rash-line-parse would expand to (run-pipeline ...) or (begin ...) if
;; there were multiple lines, but if we pretend that the inner macro
;; would expand first, it would be
(rash-line-parse 'ls (<surrounding-stuff-that-differentiates-rash/out-from-rash>
(rash-line-parse 'dirname (rash/trim «$pwd-var»))))
;; then
(rash-line-parse 'ls (<surrounding-stuff>
(rash-line-parse 'dirname
(<surrounding-stuff>
;; the $ in rash keeps pwd-var from being quoted
(rash-line-parse pwd-var)))))

So each rash macro reads another layer of string (each one needing a $
due to the syntax specifics of the rash language), but nested strings
will still become strings after applying the reader again, while other
things will become symbols, s-expressions... So you could also use \"
and \\" instead of «», «» is just nicer.

So it is very much like your #reader example above.

William

Eli Barzilay

Sep 26, 2016, 1:53:59 AM9/26/16
to Matthew Butterick, William G Hatch, Racket Users
On Sun, Sep 25, 2016 at 2:19 PM, Matthew Butterick <m...@mbtype.com> wrote:
>
> On Sep 25, 2016, at 2:10 AM, Eli Barzilay <e...@barzilay.org> wrote:
>> *Don't* confuse scribble-the-documentation-system with the syntax --
>> the syntax is useful for many other cases, and designed to make sense
>> in other cases. See my description (specifically section 4, which is
>> very relevant here), and the scribble/text and scribble/html
>> languages.
>
> To be fair, the documentation invites this kind of confusion. All the
> material about the at-reader is within the Scribble docs. This makes
> it look like it's dependent on Scribble, when really it's a separate
> thing.

Yes, I know, and yes, it could very much use a restructuring.


> In general, I think the word "Scribble" is misleadingly overloaded
> within Racket. IMO "Scribble" should refer only to the family of
> languages that use the Scribble document model, including Racket
> documentation.

Well, it started with "scribble" being the name of the syntax, which
came first. Then Matthew built the documentation system on top of it,
and that was also named "scribble". When we realized that this is going
to be confusing, it was already clear that the latter meaning is already
"winning", so the syntax turned into @-forms, @-expressions etc -- I'm
probably the only one who still uses the first meaning from time to
time. And yes, that's not a good name since it can be customized too,
but I don't see a good way out of it...


> At some point the docs for "Scribble as a Preprocessor" were broken
> out from the main Scribble docs — I'm guessing to emphasize that
> they're conceptually separate from Scribble. But AFAICT what they
> really have in common is the at-reader, not the document
> model. Because they don't use the Scribble document model, I'm unclear
> why they're called `scribble/text` and `scribble/html`.

That also made more sense in the early days, since `scribble` was
supposed to be the place for all scribble (the syntax) related things.


> Meanwhile, I'd argue that the at-reader — itself an obsolete name,
> since one can swap out the @ for any Unicode char — deserves to have
> its documentation broken out into a separate top-level section, which
> would more accurately reflect its status within Racket.

At some point I intended to take the paper I wrote about it and make it
into a separate documentation about the syntax, but I never got to
actually do it.


> + String splitting within {...} delimiters: I agree this is the right
> default behavior, but it doesn't seem unreasonable to wish for
> shorthand for when you really do want things concatenated into a
> single argument, given that Racket is full of cognates like
> let/let*, for/for*, list/list*, etc. That said, I don't have a good
> idea what the notation would be.

For plain single-string use, I always encouraged using
`@string-append{...stuff...}`, but since this is horribly long, some
simple shorthand like (define ~ string-append) can be used. For cases
when you want to look at the result of some @-expression as a single
value but not pay the huge price of accumulating intermediate strings, a
plain old `list` does fine -- and that's what `scribble/text` is doing.
With that, it's very useful not only to @list{...stuff...} but also to
just quote it with '@{...} -- or the equivalent @'{...}, and, of course,
everything that comes out of throwing @`{...} into the mix.
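
A small sketch of the three variants mentioned above, using the `~` shorthand from the text. The variable `name` and the sample text are assumptions for illustration. Note the `@|name|` form: a bare `@name!` would be read as the identifier `name!`.

```racket
#lang at-exp racket/base

;; The shorthand suggested above for plain single-string use.
(define ~ string-append)

(define name "world")

;; Joined into one string.
(define as-string @~{Hello, @|name|!})

;; Pieces kept as a list, with no intermediate concatenation.
(define as-list @list{Hello, @|name|!})

;; Quoted: the body becomes data, so `name` stays a symbol.
(define as-data '@{Hello, @|name|!})

(write as-string)  ; "Hello, world!"
(newline)
(write as-list)    ; ("Hello, " "world" "!")
(newline)
(write as-data)    ; ("Hello, " name "!")
(newline)
```

The list form is the cheap one when the pieces are only going to be written to a port afterwards.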


> + I wish at-expressions could use multiple [...] and {...} parts, in
> any order.

Yeah, I considered it at some point, but decided to go with a more
restricted (but maximally useful) syntax to avoid possible problems and
maybe extend it later when needed. The idea of what we ended up with
is that you can always tell where the @-expression ends by looking at
the {}s, or putting them yourself to avoid getting things mixed up with
the following text.

Eli Barzilay

Sep 26, 2016, 1:54:37 AM9/26/16
to William G Hatch, Racket Users
On Sun, Sep 25, 2016 at 4:21 PM, William G Hatch <wil...@hatch.uno> wrote:
>
>> Yes, and you can do all of that with just a string, which you can
>> still get from an @-form -- just throw a syntax error if it's not all
>> strings. And with just that you get the *benefit* of ignoring
>> indentation which makes it possible to use your syntax in a sane way.
>
> I was doing that before, and I just didn't see that as a benefit.

This is even more confusing. You started by saying that you *don't*
want nested expressions, just strings -- and now you don't see the
benefit of erroring on that, and furthermore:

> I do get runtime expressions. I'm parsing the inner strings into
> syntax objects at macro expansion time, and some of those end up being
> themselves macro calls, and some of them are just normal expressions
> that are evaluated at runtime.

which means that you *do* want nested expressions after all--?

And you repeat this:

>> Yes, and you can get the same with any string, and
>>
>> (rash "stuff
>> in a different
>> language")
>
> Yeah, if you look at the rash docs you'll see that I have examples
> that do exactly that. It's kind of the point that the macro can just
> use any string. All the nestable string delimiters bring to the table
> for these macros is that it makes it easier to nest them without crazy
> escapes.

which makes me cringe yet again... Given all of that, you basically are
using all of the features that @-expressions have. If you could just
squint for a bit so that wherever it uses "{}"s you'd pretend it's
"«»"s, and wherever it uses "@" you'd pretend that you see a "$", you'd
see that you are going down the same path. Only you choose to keep a
bit more parens on the way.

> Yes, I'm aware, but any character you choose ends up being a magic
> character through each nested level unless you use |{}|, which I
> didn't want.

Right -- but those are just different ways to write string delimiters
(and "@" too) -- so continue squinting and read "|{}|" as "“”" and "|@"
as "¢", and maybe "|={}=|" and "|=@" as "⌜⌝" and "♯" etc. Same exact
idea, only (a) you're not limited to a choice from a few chosen
delimiters and instead can make up new ones, and (b) the syntax is
uniform at all levels so you (the end programmer) are not at the mercy
of the specific macro when it comes to deciding what delimiters to use.
That's, BTW, a *huge* win: uniformity at the *concrete* syntax level is
>>EXTREMELY<< important. That's the main reason sexprs are so great,
and the main problem with the wild world of tex (not latex which uses
conventions more; tex -- where any character can mean anything).


> ;; starting in normal racket syntax, but with «» for convenience
> (filter foo?
> (python-ish-list-comprehend
> «thing for x in sqlish(«select * from foo») where some_pred(x)»))
>
> The example again is silly, but syntactically it needs neither $ nor @
> nor any other magic character. The «» nesting quotes are just
> convenient to avoid \" nonsense (and \\", \\\\" if there were more
> nesting).

Sure it does! The "@" is implicit in the fact that
`python-ish-list-comprehend` is a macro that parses its textual body;
the "{}" are replaced by their squinted versions; and there's something
(python-ish-list-comprehend, probably) that decides what to do with a
nested parenthesized string following an "sqlish", which is another
implicit "@". It's all there, only implicit, which is not making your
users' lives any easier.


> As for $ in rash, I chose $ because I'm giving it *some* similarities
> to eg. bash.

Yes, I know. Read my paper: I go into much more details on quasi-
strings with some unquote characters in contrast to @-forms where "@" is
serving a double purpose which makes traditional string interpolation
*unnecessary*.


> To escape just an identifier in rash (as opposed to a larger
> expression in parens), you can use $id, which you'll see if you look
> at the examples in the rash docs.

... and then you need to face the question of what happens with
«...$blah...» given that "." is a valid character in a racket
identifier. And a bunch of other little things. (Rhetorical; No need
to reply with what you do with such things.)


> As an aside, eventually I plan on changing it so it creates a macro
> that will do different things than just escape, eg. $CAPS looks up
> environment variables rather than normal variables, and maybe $«*.ext»
> expands globs, etc. But my point is that the $ is a feature of the
> rash language that I want, not some added complexity that I would want
> to avoid.

(Ha! The idea that $... behaving in different ways is *reducing*
complexity is amusing. Really.)


> So I'm not saying I want a different @-like character for each level
> down, but rather that I don't need any @-like character in the general
> case (just string delimiters, either normal or preferably nestable),
> and for rash in particular I want a magic character that does
> something different than what the @-reader magic character does.

The point is that "@" does the same thing at *all* levels, including the
toplevel. It's *reducing* magic to an expected reader behavior. Your
nested strings -- at each level -- need some magic that *will*
eventually lead to parsing that string; leaving it implicit makes it
magical and not in a good way!


> And as for a $ escape character in a shell language syntax being
> confusing, I think that is maybe the most natural shell syntax choice
> I've made for rash for anyone coming to it from bash.

(As a side note, I ended up going back to intense shell scripting at my
$DAY_JOB. I *know* shell scripting. I've been doing it for a while. I
know people who *know* shell scripting. There is nothing natural about
it, *especially* around anything related to $-interpolation. Quick
example: I just recently found out something new about "..${a#$b}.."
which I was confused about, partly because of how it's different between
zsh and bash. Once you find out the difference between the two, you
will not be able to use "$" and "most natural" in the same syntax.

(Shell programming should come with an achievement system: as you write
a shell script, if you get to the dreaded '"'"' quotation of ' there
should be some happy music and you get a master quoter badge. When you
write an expression that gets your editor to mis-highlight the quoted
and unquoted parts (as done with that ${a#$b} in bash and Emacs) you get
another out-of-mainstream-quoter achievement.))
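
For readers who have not run into these, here is a minimal sketch of the
two constructs mentioned above: the '"'"' idiom for a literal single
quote, and ${a#$b}. The variable names and values are made up for
illustration. The comments describe bash (and POSIX sh) behavior. zsh,
with its default options, treats the expanded $b literally unless
GLOB_SUBST is set or ${~b} is used; that is one known bash/zsh
difference and not necessarily the one meant above.

```shell
# A single quote inside a single-quoted string: close the quote,
# add a double-quoted ', then reopen the single quote.
quoted='it'"'"'s'
printf '%s\n' "$quoted"

# Hypothetical values: a string that starts with a literal "*".
a='*star'
b='*'

# Unquoted $b: its value is used as a glob pattern. The shortest
# prefix matched by "*" is the empty string, so nothing is removed.
unquoted_result=${a#$b}

# Quoted "$b": the value is matched literally, so the leading
# "*" character is removed.
quoted_result=${a#"$b"}

printf '%s\n' "$unquoted_result" "$quoted_result"
```

Quoting the inner expansion is the usual way to get the same literal
match everywhere.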


> No, I *definitely* want rash and rash/out to be macros. I don't want
> to do any runtime parsing and evaluation for them. My current rash
> macro is perfectly capable of having runtime variable references
> inside the syntax that is nested in a string -- they all become syntax
> objects within the context the string was in (there are examples of
> this in the rash docs).

Yes, I see that. I assumed that you wanted both

(rash "blah blah")
(let ([x "blah blah"]) (rash x))

to be the same, mostly because you said how much you didn't want nested
@-forms. I see what you mean now, which is, like I said, pretty much
exactly what you get with @-forms. (BTW, note how @-forms are perfect
here: since `rash` should always be used with a string input, it ends up
having a single set of delimiters instead of the two that are obviously
redundant.)

-=- -=- -=-

But I'm guessing that I lost you again, so none of this would move you.
All I can do at this point is sigh and hope that you'll end up at the
best case of re-implementing @-expressions with the slightly more
verbose syntax that you want. The worst case will be ... well, much
worse.

Matthew Butterick

Sep 26, 2016, 9:40:13 AM9/26/16
to Eli Barzilay, Racket Users

On Sep 25, 2016, at 10:53 PM, Eli Barzilay <e...@barzilay.org> wrote:

> When we realized that this is going
> to be confusing, it was already clear that the latter meaning is already
> "winning", so the syntax turned into @-forms, @-expressions etc -- I'm
> probably the only one who still uses the first meaning from time to
> time.  And yes, that's not a good name since it can be customized too,
> but I don't see a good way out of it...

T as in "text":

@-form => T-form
@-expression => T-expression (or t-exp in shorthand)

#lang at-exp racket => #lang t-exp racket

#lang scribble/text => #lang t-exp/text

#lang scribble/html => #lang t-exp/html


Keep the old @-names for backward compatibility of course.

William G Hatch

Sep 26, 2016, 11:05:20 AM9/26/16
to Eli Barzilay, Racket Users
On Mon, Sep 26, 2016 at 01:54:35AM -0400, Eli Barzilay wrote:
>But I'm guessing that I lost you again, so none of this would move you.
>All I can do at this point is sigh and hope that you'll end up at the
>best case of re-implementing @-expressions with the slightly more
>verbose syntax that you want. The worst case will be ... well, much
>worse.

I don't think you lost me either time. I think we agree on more than
you think we do, but I think mostly we just disagree on what things we
want to be simple, and what features we find useful. This might be a
more useful conversation if we were in person and could hopefully
communicate more clearly, but I think going back and forth on this over
the mailing list would be edifying for nobody.

Thanks for your responses, though. Maybe we can meet and chat about
such things at the next RacketCon or something.

William

Greg Trzeciak

Sep 26, 2016, 11:34:52 AM9/26/16
to Racket Users, e...@barzilay.org
IMHO "text expression" does precisely the same thing as the current use of scribble: it pigeonholes the syntax for one use. In the case of scribble it is "documentation"; in the case of text, "text processing".

I actually find "at-exp" to be quite fitting but would keep using this form everywhere instead of "@-exp". Instead of deriving the name from the "@" symbol, explain it as deriving from the preposition "at". Why?
- You can place the expression "AT ANY PLACE inside your text or code"
- Expression is identified by the selected identifier "AT THE FRONT of the expression" -> default @
- With at-exp the function is "AT THE FRONT followed by brackets/braces"
Ok maybe stretching it a bit but each to their own.

Some alternatives:
- M-expression (sic!) it even has plenty of similarities with meta-expression and in a way it is a meta syntax for s-expression
- F-expression - fore-expression -> function before brackets
- P-expression - pre(peri)-expression -> function before brackets

Cheers

Greg

Eli Barzilay

Sep 27, 2016, 1:41:22 AM9/27/16
to Matthew Butterick, Racket Users
On Mon, Sep 26, 2016 at 9:40 AM, Matthew Butterick <m...@mbtype.com> wrote:
>
> T as in "text":
>
> @-form => T-form
> @-expression => T-expression (or t-exp in shorthand)

(Or "Texprs"...)


> #lang at-exp racket => #lang t-exp racket
>
> #lang scribble/text => #lang t-exp/text
>
> #lang scribble/html => #lang t-exp/html
>
> Keep the old @-names for backward compatibility of course.

That sounds pretty good -- it follows the original intention of these
being a convenient and uniform syntax for "text-rich expressions".
That's if there's enough collective will-power to change it now...



On Mon, Sep 26, 2016 at 11:34 AM, Greg Trzeciak <gtrz...@gmail.com> wrote:
> IMHO the "text expression" does precisely the same as current use of
> scribble -> pigeonholing the syntax for one use: in the case of
> scribble it is "documentation" in the case of text - "text
> processing".

It's fine for just "text", without the "processing" -- since the idea
does revolve around text in code in all kinds of ways.

> - You can place the expression "AT ANY PLACE inside your text or code"
> - Expression is identified by the selected identifier "AT THE FRONT of
> the expression" -> default @
> - With at-exp the function is "AT THE FRONT followed by
> brackets/braces"
> Ok maybe stretching it a bit but each to their own.

Yeah, I think that this is stretching it... I think that it's perfectly
fine to stick with "@" or "at" for historical reasons, but the confusion
is certainly there, and that's not new.