In a recent discussion on #clojure it was pointed out that another
language called newLISP has an excellent feature that would be neat to
adopt into Clojure, namely its special text delimiters {} and
[text][/text]. It uses these delimiters to specify verbatim text (i.e.
what's between the delimiters is *exactly* what the string is, including
the newlines).
This feature makes it incredibly easy to write and include various
bits of text in the language such as example code, html, and it makes
writing regular expressions simple by avoiding the need for some
escapes.
For example (newLISP code):
(replace {"quoted" text} my-str {"quoted" string})
vs
(replace "\"quoted\" text" my-str "\"quoted\" string")
As this has numerous advantages we discussed how such a construct
could be brought in to the benefit of Clojure, as in Clojure both the
{} and [] characters are reserved.
The following candidates were considered and rejected for various
reasons:
<>      ; rejected because it conflicts with expressions like (< x 1)
#""     ; rejected because it already represents a regex
#[]     ; rejected because it implies some sort of data structure, like sets (#{})
[t][/t] ; rejected because it conflicts with vectors
Finally we agreed that #s{ ... } would be a good choice, as it fits
nicely with Clojure's existing syntax and its tendency to use the sharp
sign to signify a shorthand for something. On IRC, 'Chousuke' pointed out that
this construct could be used to make it easier to write doc strings
that include sample code for Clojure's functions, but of course there
are many other uses for such a construct (which I should note exists
in many other languages as well, even bash, but I referenced newLISP
as it's also a lisp and has a particularly elegant implementation).
Any and all input is welcome on this proposal!
Kind regards and thanks in advance for taking this into consideration,
Greg (irc: itistoday)
Reader macros have full access to the text stream, so it would be
straightforward to define a Perlish heredoc syntax for big literals,
e.g.
#_SOMETEXT
foo bar
...
SOMETEXT
I don't really see much point in this proposal as a whole, though --
almost every syntax will require escaping *something*, whether it be
quotes or curly braces -- so don't view this as me championing anything.
I certainly consider triple-quotes to be nicer than #s{}, which is
(IMO) visually repellent.
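
The heredoc idea above can be sketched as an ordinary function over text rather than a real reader macro. This is only an illustration under stated assumptions: `read-heredoc` is a hypothetical name, and the `#_` prefix is copied from the example in the post (in Clojure itself `#_` is the discard form, so an actual syntax would need a different prefix).

```clojure
(require '[clojure.string :as str])

(defn read-heredoc
  "Hypothetical helper: given text whose first line is #_TAG, returns the
  lines up to (not including) the first line equal to TAG, joined with
  newlines. Returns nil if the header is malformed or TAG never appears."
  [s]
  (let [[header & lines] (str/split-lines s)]
    (when (and header
               (.startsWith ^String header "#_")
               (> (count header) 2))
      (let [tag         (subs header 2)
            ;; body = lines before the closing tag, more = the rest
            [body more] (split-with #(not= tag %) lines)]
        (when (seq more)            ; the closing tag line was found
          (str/join "\n" body))))))

;; Quotes and braces in the body need no escaping.
(read-heredoc "#_SOMETEXT\n(println \"{}\")\nSOMETEXT")
```

The only thing the body cannot contain is a line consisting of exactly the chosen tag, and the writer picks the tag.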
The reader macro approach has no significant advantage over triple
quotes, because either way involves a change to Clojure's Java code.
Indeed, it might be impossible to escape } within the reader macro
version, which defeats the point.
Sorry, that should read "quotes in them". ;-)
My understanding of a digraph is basically two characters that
together represent one real character. e.g. if you type ^Ka' in Vim
it turns the digraph consisting of a lowercase A and an apostrophe
into á (a acute).
> that $+ not be followed by a whitespace to count as a string? If so, I
> think that's undesirable as that is 1) confusing, and 2) limits the
> sorts of strings that you can create with it when the point of this
> addition would be to expand your freedom when creating strings. If
> I've misunderstood however, please let me know.
No, he's saying you could use $+ some string + or $% some string % or
$* some string * etc. to mean the same as " some string ", so the
thing immediately following the $ sign must not be a space, but could
be basically anything else.
The bit where he talks about digraphs was to do with being able to
quote the + character when you were using $+ and + for the delimiters.
He was proposing you use $+ to mean a + where it occurs in the middle
of the string. So:
$+something $+ else+ would be equivalent to "something + else". Of
course in this case you could just use $@something + else@ instead.
Where a $x is found in the middle of the string and the x is not a
delimiter (e.g. $#===$@===#) you would just leave it as-is (i.e. it
would be equivalent to "===$@===".)
--
Michael Wood <esio...@gmail.com>
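
The rules described in the message above can be sketched as a small function. This is a minimal sketch and not part of any real Clojure reader: `read-dollar-string` is a hypothetical name, and it assumes the literal starts at the beginning of the input and that the delimiter is not `$` itself.

```clojure
(defn read-dollar-string
  "Hypothetical helper: parses a $-delimited literal at the start of s and
  returns the string it denotes, or nil if s is not a well-formed literal."
  [s]
  (when (and (>= (count s) 2)
             (= \$ (first s))
             (not (Character/isWhitespace (nth s 1))))
    (let [delim (nth s 1)]
      (loop [chars (drop 2 s)
             out   (StringBuilder.)]
        (let [[c nxt] chars]
          (cond
            ;; ran out of input before a closing delimiter
            (nil? c) nil
            ;; $ followed by the delimiter is an escaped delimiter
            (and (= c \$) (= nxt delim))
            (recur (drop 2 chars) (.append out delim))
            ;; a bare delimiter closes the literal
            (= c delim) (str out)
            ;; anything else, including $x for other x, is kept as-is
            :else (recur (rest chars) (.append out c))))))))

(read-dollar-string "$+something $+ else+")
```

The three cases in the `cond` correspond to the three rules in the message: `$` plus the delimiter is an escape, a bare delimiter ends the string, and `$` before any other character is left untouched.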
My only concern, and perhaps John could elaborate on this because he
touched on it, would be how are Java nested classes protected from
this? Wouldn't it interfere with them? Or will the delimiters be
restricted to non-alphanumeric characters?
#s{...} ; sure!
#s{{...}{{{{}}}} ... } ; just fine!
#s{.{..} ; no can do...
Not nearly as flexible as some other ideas mentioned, but deserving of mention.
- Jeff
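
The three cases above follow from simply counting braces. The following is a minimal sketch of that rule, assuming the literal starts at the beginning of the input; `read-balanced-braces` is a hypothetical name and not an existing Clojure function.

```clojure
(defn read-balanced-braces
  "Hypothetical helper: returns the text inside a #s{...} literal at the
  start of s, or nil when the braces never balance."
  [s]
  (when (.startsWith ^String s "#s{")
    (loop [i 3, depth 1]               ; index 3 is just past the opening {
      (when (< i (count s))
        (let [c     (nth s i)
              depth (cond (= c \{) (inc depth)
                          (= c \}) (dec depth)
                          :else    depth)]
          (if (zero? depth)
            (subs s 3 i)               ; everything between the outer braces
            (recur (inc i) depth)))))))

(read-balanced-braces "#s{{...}{{{{}}}} ... }")
```

Nested braces pass through untouched as long as they pair up, while a lone `{` or `}` in the text leaves the count unbalanced, which is why the third case cannot be expressed.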
Yes, please. I'd like to pile on here with a few ideas and questions
(0) I'm not feeling the itch for verbatim strings, seeing as clojure
already does multi-line literals (with escaping) and has special
syntax for regex patterns.
(1) If it's all the same to everyone else, just use python's
triple-quotes. In practice, they work well enough, but if we're using
them for verbatim strings there still wouldn't be a way to embed, e.g.
a code fragment demonstrating use of a raw string in a raw string.
Is this so terrible?
(2) Perlish/Sedish choose-your-own-quote has always struck me as an
ugly hack. More important though are worries about making tooling more
complicated:
How much more complex would this make, e.g. correct syntax
highlighting in emacs, in eclipse?
What about tools that wish to read clojure code as data but are not
themselves clojure? We wouldn't be doing them any favors by
unnecessarily complicating the surface syntax.
(3) Perhaps something akin to Lua's approach (mentioned previously in
this thread) could address the limitations of (1) without the ugliness
of (2).
Just my 2c
Ben
On Mon, Oct 12, 2009 at 01:31, James Reeves <weave...@googlemail.com> wrote:
>
> What if you need to use braces? It seems to me that any syntax for
> representing long strings needs a terminator that is unlikely to occur
> within the string itself. For example, Python uses """, and XML CDATA
> uses ]]>, both of which are character sequences unlikely to turn up in
> a string. By contrast, an ending brace } is not rare enough to be used
> as a terminator, IMO.