
Unicode usenet posting. This is a test.


RG

Sep 26, 2010, 3:42:57 AM

I read news on a Mac using MT-Newswatcher, which is a Carbon application.
Carbon apps don't quite handle unicode correctly, so I can't actually
post a unicode message using MT-Newswatcher, and all of the Cocoa
newsreaders that I tried suck. So I'm posting this using a little
Python script, in part to test it to make sure it works, and in part
to stir the pot a little on the ongoing debate about Lisp and its
suitability for various tasks. Here's the Python code I used to
post this:

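import codecs
from StringIO import StringIO  # Python 2; on Python 3 use io.BytesIO instead
from nntplib import NNTP
c = NNTP('news.albasani.net', user='...', password='...')
# Decode the file as UTF-8, re-encode it to bytes, and post it as a file-like object.
c.post(StringIO(codecs.open('testpost', 'r', 'utf-8').read().encode('utf-8')))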

Actually, that last line could have been just:

c.post(open('testpost'))

(I first tried this in Lisp, but couldn't find an NNTP library that supported
posting.)

And here's some unicode content. This is a factorial function written
using an embedded Y combinator:

(((λ (f) ((λ (g) (g g)) (λ (h) (λ (x) ((f (h h)) x)))))
(λ (f) (λ (n) (if (zerop n) 1 (* n (f (1- n)))))))
15)

You should see a bunch of Greek lower-case lambdas in that code.

And just for good measure, some «European style quotes» and “balanced smart
quotes” which I intend some day to try to convince people to start using
to eliminate the scourge of backslash escapes. But that's a topic for
another day.

rg

RG

Sep 26, 2010, 3:48:50 AM

In article <i7mtl5$9d9$2...@news.albasani.net>,
RG <rNOS...@mflownet.com> wrote:

> I read news on a Mac using MT-Newswatcher, which is a Carbon application.
> Carbon apps don't quite handle unicode correctly, so I can't actually
> post a unicode message using MT-Newswatcher, and all of the Cocoa
> newsreaders that I tried suck. So I'm posting this using a little
> Python script, in part to test it to make sure it works, and in part
> to stir the pot a little on the ongoing debate about Lisp and its
> suitability for various tasks. Here's the Python code I used to
> post this:
>
> from nntplib import NNTP
> c = NNTP('news.albasani.net', user='...', password='...')
> c.post(StringIO(codecs.open('testpost', 'r',
> 'utf-8').read().encode('utf-8')))
>
> (I first tried this in Lisp, but couldn't find an NNTP library that supported
> posting.)
>
> And here's some unicode content. This is a factorial function written
> using an embedded Y combinator:
>
> (((λ (f) ((λ (g) (g g)) (λ (h) (λ (x) ((f (h h)) x)))))
> (λ (f) (λ (n) (if (zerop n) 1 (* n (f (1- n)))))))
> 15)
>
> You should see a bunch of Greek lower-case lambdas in that code.
>
> And just for good measure, some «European style quotes» and ╲balanced
> smart
> quotes╡ which I intend some day to try to convince people to start using
> to eliminate the scourge of backslash escapes. But that's a topic for
> another day.
>
> rg

Blast, it worked when I tried it on alt.test. I wonder why it's not
working on cll.

rg

Raymond Toy

Sep 26, 2010, 8:32:39 AM

On 9/26/10 3:48 AM, RG wrote:
> In article <i7mtl5$9d9$2...@news.albasani.net>,

> Blast, it worked when I tried it on alt.test. I wonder why it's not
> working on cll.

It worked for me. I see the lambdas and the different sets of quotes.
Don't know if the European-style quotes are right but the balanced
quotes look right.

Ray

Spiros Bousbouras

Sep 26, 2010, 8:40:47 AM

On Sun, 26 Sep 2010 07:42:57 +0000 (UTC)
RG <rNOS...@mflownet.com> wrote:

> Unicode usenet posting. This is a test.

Has Kenny commandeered your account or is this now a general purpose
group ?


Your header is all wrong. For example
Content-Transfer-Encoding: utf-8

From RFC 2045:

   The Content-Transfer-Encoding field's value is a single token
   specifying the type of encoding, as enumerated below. Formally:

     encoding := "Content-Transfer-Encoding" ":" mechanism

     mechanism := "7bit" / "8bit" / "binary" /
                  "quoted-printable" / "base64" /
                  ietf-token / x-token

A correct header for UTF-8 transmission might contain:
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
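
A minimal sketch of producing those three headers with the standard email package in Python 3, assuming the article body is plain text. The function name build_article, the sender address and the newsgroup are hypothetical placeholders, and actually sending the article to a news server is left out.

```python
from email.message import EmailMessage


def build_article(body, subject, sender, newsgroups):
    """Build a plain-text article whose headers declare UTF-8 sent as 8bit."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Newsgroups"] = newsgroups
    msg["Subject"] = subject
    # set_content fills in MIME-Version and Content-Type (charset=utf-8);
    # cte="8bit" asks for the raw UTF-8 bytes instead of base64 or
    # quoted-printable.
    msg.set_content(body, charset="utf-8", cte="8bit")
    return msg


article = build_article(
    "(λ (x) x)", "Unicode test", "RG <rg@example.invalid>", "alt.test"
)
wire_bytes = article.as_bytes()  # headers plus the UTF-8 encoded body
```

Note that the transfer encoding names how the bytes travel (8bit), while the character set (UTF-8) belongs in the Content-Type header.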

> And just for good measure, some «European style quotes» and “balanced smart
> quotes” which I intend some day to try to convince people to start using
> to eliminate the scourge of backslash escapes. But that's a topic for
> another day.

I don't see how they would help to eliminate backslash escapes. Let's
imagine that strings were delimited by « and ». If you wanted a
string which contained a » you would still need to escape it.

--
Doesn't love always begin that way , with the illusion more real than
the woman ?
11001001

Pascal J. Bourguignon

Sep 26, 2010, 11:11:16 AM

Spiros Bousbouras <spi...@gmail.com> writes:

> I don't see how they would help to eliminate backslash escapes. Let's
> imagine that strings were delimited by « and ». If you wanted a
> string which contained a » you would still need to escape it.

Unless it's closing a «.

« Jean a dit « Jacques a dit lève toi ! » à Jeanne. »

Now, of course if you quote Hungarian, you could have a problem, since
in the second level quote they reverse the closing and opening quotes:

« Sarkozy a dit: „A feleségem azt mondta: \»Nekem van egy kis golyó\«”. »

so you will have indeed to backslash them to avoid counting them as
closing or opening a level.


--
__Pascal Bourguignon__ http://www.informatimago.com/

RG

Sep 26, 2010, 12:05:42 PM

> Has Kenny commandeered your account or is this now a general purpose
> group?

Why does it have to be one or the other?

> Your header is all wrong.

Ah. Then why did it work on alt.test? (Actually, it seems to be a problem
in my newsreader.)

> I don't see how they would help to eliminate backslash escapes. Let's
> imagine that strings were delimited by « and ». If you wanted a
> string which contained a » you would still need to escape it.

Nope. You just enclose it in the other kind of balanced quote, e.g.:

“This is a European style close-quote: »”

Here's the code I use to read balanced quotes:

(require :named-readtables)
(use-package :named-readtables)

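;; Return a reader-macro function for strings delimited by C1 ... C2.
;; The opening C1 has already been consumed, so the nesting count starts at 1;
;; nested C1/C2 pairs are kept verbatim in the resulting string.
(defun make-string-reader (c1 c2)
  (lambda (stream c)
    (declare (ignore c))
    (with-output-to-string (s)
      (loop for c = (read-char stream)
            with cnt = 1
            if (eql c c1) do (incf cnt)
            else if (eql c c2) do (decf cnt)
            until (and (eql c c2) (eql cnt 0))
            do (princ c s)))))

(defreadtable balanced-quotes
  (:merge :standard)
  (:case :upcase)
  (:macro-char #\« (make-string-reader #\« #\»))
  (:macro-char #\“ (make-string-reader #\“ #\”)))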
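
For readers who don't run Lisp, the same depth-counting idea can be sketched in Python. This is only an illustration of the technique, not a port of the readtable code: the function name read_balanced and its (contents, next index) return convention are assumptions made for this example.

```python
def read_balanced(text, start, open_ch, close_ch):
    """Read a balanced-quote string whose opening delimiter is at text[start].

    Returns (contents, index just past the closing delimiter).
    """
    if text[start] != open_ch:
        raise ValueError("expected opening delimiter")
    depth = 1  # the opening delimiter at text[start] is already counted
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated string")


nested = read_balanced("«a «b» c» tail", 0, "«", "»")
other_kind = read_balanced("“close: »”", 0, "“", "”")
```

Nested pairs of the same kind pass through untouched, and a lone » is harmless inside “...” because only the delimiters of the enclosing kind are counted.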

rg

RG

Sep 26, 2010, 12:07:02 PM

In article <i7neh7$e22$1...@news.eternal-september.org>,
Raymond Toy <toy.r...@gmail.com> wrote:

My headers were apparently not set correctly. Also there seems to be
some weirdness in my newsreader. The same article gets rendered
differently depending on whether it is in alt.test or cll.

rg

RG

Sep 26, 2010, 12:30:37 PM

In article <87vd5sy...@kuiper.lan.informatimago.com>,
p...@informatimago.com (Pascal J. Bourguignon) wrote:

> Spiros Bousbouras <spi...@gmail.com> writes:
>
> > I don't see how they would help to eliminate backslash escapes. Let's
> > imagine that strings were delimited by « and ». If you wanted a
> > string which contained a » you would still need to escape it.
>

> Unless it's closing a «.
>
> « Jean a dit « Jacques a dit lève toi ! » à Jeanne. »
>
> Now, of course if you quote Hungarian, you could have a problem, since
> in the second level quote they reverse the closing and opening quotes:
>
> « Sarkozy a dit: „A feleségem azt mondta: \»Nekem van egy kis golyó\«”. »
>
> so you will have indeed to backslash them to avoid counting them as
> closing or opening a level.

Are you sure you typed that correctly? I think it should be:

« Sarkozy a dit: „A feleségem azt mondta: \»Nekem van egy kis golyó\«‟. »

though I don't know Hungarian so I could be wrong.

But you don't need to backslash these quotes. All you need is an outermost
quote of a different kind:

Welcome to Clozure Common Lisp Version 1.6-dev-r14294M-trunk (DarwinX8664)!
? (in-readtable balanced-quotes)
NIL
? “ Sarkozy a dit: „A feleségem azt mondta: »Nekem van egy kis golyó«‟. ”


" Sarkozy a dit: „A feleségem azt mondta: »Nekem van egy kis golyó«‟. "

rg

Tamas K Papp

Sep 26, 2010, 12:37:58 PM

Hungarian omits the substantive verb in certain situations. The
correct version is: "Kicsik a golyóim." (For those using Google
translate to decipher this: prepend "I have".)

But it was still funny.

Tamas

Raffael Cavallaro

Sep 26, 2010, 12:57:41 PM

On 2010-09-26 03:44:37 -0400, RG said:

> I read news on a Mac using MT-Newswatcher, which is a Carbon application.
> Carbon apps don't quite handle unicode correctly, so I can't actually
> post a unicode message using MT-Newswatcher, and all of the Cocoa
> newsreaders that I tried suck.

I read news on a Mac using Unison. It's not the greatest piece of
software ever written, but I wouldn't say it sucks either.

warmest regards,

Ralph


--
Raffael Cavallaro

Ron Garret

Sep 26, 2010, 1:47:21 PM

On 2010-09-26 09:57:41 -0700, Raffael Cavallaro said:

> On 2010-09-26 03:44:37 -0400, RG said:
>
>> I read news on a Mac using MT-Newswatcher, which is a Carbon application.
>> Carbon apps don't quite handle unicode correctly, so I can't actually
>> post a unicode message using MT-Newswatcher, and all of the Cocoa
>> newsreaders that I tried suck.
>
> I read news on a Mac using Unison. It's not the greatest piece of
> software ever written, but I wouldn't say it sucks either.


I tried Unison a long time ago and ran into some serious problem that I
can no longer remember (I think it crashed) so I abandoned it. But I
tried it again just now and it seems to be much improved. Thanks for
the pointer!

rg

--

Pascal J. Bourguignon

Sep 26, 2010, 3:03:53 PM

RG <rNOS...@flownet.com> writes:

If you have N pairs of quotes, what happens when you use N+1 different
pairs in the string?

Or you could say that the language provides concatenate, so you don't
need to support strings containing N+1 different kinds of pairs of
quotes, the programmer being able to concatenate differently delimited
strings (eg. in Modula-2 you can write strings 'Hello' or "Hello", so
that you can write concatenate("'",'"') to get "'\"").

Having a backslash may still be useful (it may have other uses too), and
having a lot of quotes so that you can avoid backslash is nice too.

Raffael Cavallaro

Sep 26, 2010, 3:08:05 PM

On 2010-09-26 13:47:21 -0400, Ron Garret said:

> I tried Unison a long time ago and ran into some serious problem that I
> can no longer remember (I think it crashed) so I abandoned it. But I
> tried it again just now and it seems to be much improved.

Yeah, it used to have some pretty serious bugs in early versions - I
remember going back to MT-Newswatcher for about a year after the first
time I tried Unison. They've slowly eliminated the serious problems and
it's pretty usable now. Only took them about 3 years ;^)

> Thanks for the pointer!

You're very welcome!

Ron Garret

Sep 26, 2010, 3:40:22 PM

Your head explodes.

> Or you could say that the language provides concatenate, so you don't
> need to support strings containing N+1 different kinds of pairs of
> quotes, the programmer being able to concatenate differently delimited
> strings (eg. in Modula-2 you can write strings 'Hello' or "Hello", so
> that you can write concatenate("'",'"') to get "'\"").
>
> Having a backslash may still be useful (it may have other uses too), and
> having a lot of quotes so that you can avoid backslash is nice too.

Sure. There are situations in which backslashes can't be avoided. But
wouldn't you rather be able to write this:

«<input type="button" onClick="alert(\"Click\")">»

instead of this:

"<input type=\"button\" onClick=\"alert(\\\"Click\\\")\">"

?

rg

Xah Lee

Sep 26, 2010, 5:35:58 PM

2010-09-26

great post. I heartily support it. I think we should all embrace
unicode rather than still doing ascii hacks such as ``this'' or lots of
backslash toothpick syndromes (especially when in regex parsing regex
or other langs).

i'd post several links to articles related to this... but i guess i've
done that too much and don't want people to say i'm here just to sell my
site.

but anyway, there are articles about unicode popularity or usage
stats, degree of support and spec of unicode in many langs, reasons
why it's much better, emacs's support, unicode char browsers etc...

unicode is somewhat my love i guess, similar to my love of
languages (human or comp), math symbols, writing systems, logos, ...

your post shows on my screen correctly in google groups, e.g.
http://groups.google.com/group/comp.lang.lisp/browse_frm/thread/73c8db9bf6ef818d/

am quite happy google in recent months cleaned up newsgroup spam.

Xah

Kenneth Tilton

Sep 26, 2010, 6:04:23 PM

On 9/26/2010 8:40 AM, Spiros Bousbouras wrote:
> On Sun, 26 Sep 2010 07:42:57 +0000 (UTC)
> RG<rNOS...@mflownet.com> wrote:
>
>> Unicode usenet posting. This is a test.
>
> Has Kenny commandeered your account or is this now a general purpose
> group ?

:) Your concern is compelling as long as you do not add to the noise.

>
>
> Your header is all wrong. For example
> Content-Transfer-Encoding: utf-8

Doh! You want on topic? Check out this Lisp Web application:
http://teamalgebra.com/

Not only does it use Lisp, it uses a Lisp database, AllegroGraph 4.0.6.

I dissect the iPad/HMH Algebra 1 competition here:
http://www.stuckonalgebra.com/ipad_houghton.html That's what happens
when you use Objective-C. Kidding, this looks like a blast:

http://www.youtube.com/user/algebratouch#p/a/u/1/8LM0xdOzav0
http://www.youtube.com/user/algebratouch#p/a/u/2/A4SdNUwgkcg

kt

--
http://www.stuckonalgebra.com
"The best Algebra tutorial program I have seen... in a class by itself."
Macworld

Xah Lee

Sep 26, 2010, 6:20:33 PM

On Sep 26, 5:40 am, Spiros Bousbouras <spi...@gmail.com> wrote:
> > And just for good measure, some «European style quotes» and “balanced smart
> > quotes” which I intend some day to try to convince people to start using
> > to eliminate the scourge of backslash escapes.  But that's a topic for
> > another day.
>
> I don't see how they would help to eliminate backslash escapes. Let's
> imagine that strings were delimited by « and ». If you wanted a
> string which contained a » you would still need to escape it.

I thought about these, but ultimately I don't think it is avoidable.

However, using matching pairs eliminates many unnecessary escapes.

For example in my recent essay about html6, I thought about the escape
issue. Ultimately, if your content refers to the language itself,
then you will need escapes, unless your language can switch to another
quoting mechanism.
For example, in perl:

"this"
'this'
q[this]
q(this)
q{this}

print <<'xyzxyz';
this
xyzxyz


are all equivalent (technically except the last one, which contains
an extra newline).

Note that it has variable quoting chars. Similarly with Python. e.g.

"this"
'this'
"""this"""
'''this'''

In python, there is less variability than in perl.
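
A small sketch of the Python forms listed above, plus two common ways of dodging escapes (choosing a delimiter the content doesn't contain, and raw strings). The variable names are made up for this illustration.

```python
# The four literal forms listed above all denote the same string.
forms = ["this", 'this', """this""", '''this''']
assert all(f == "this" for f in forms)

# Picking a delimiter that the content does not contain avoids escapes.
single_quoted = '<input type="button">'
double_quoted = "<input type=\"button\">"
assert single_quoted == double_quoted

# A raw string leaves backslashes alone, which helps with regexes.
raw_pattern = r"\d+\.\d+"
escaped_pattern = "\\d+\\.\\d+"
assert raw_pattern == escaped_pattern
```

None of this removes the need for escapes once the content itself contains every available delimiter.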

Now, suppose you need to quote perl code in perl itself or python
in python. e.g. suppose you are writing a perl script that parses a
perl snippet that contains all of perl's quoting mechanisms; then
basically you'll start to need escapes. (worse when this is nested. A
practical example that happens to me often is when writing a blog in
html about using perl to parse html, and the html contains complex
javascript or php, which may contain regex strings, and you need to
show the perl source code in html with syntax highlight markup.)

in the study of symbolic logic, this is a form of self reference, and
is an unavoidable problem (you can't avoid escaping chars yet still
have the ability to quote the lang itself).

the variable quoting chars also introduce some complexity. Namely,
your lang at syntax level is no longer simple. e.g. in emacs lisp,
whenever the straight double quote appears, it has only one
meaning (except in special cases such as in comments or when
escaped). Or, when you need to get strings in a lang, the only char
you need to look for is the straight double quote. In langs with variable
quotes such as perl, this is no longer true, in one or both ways.

whatever your philosophy in lang design with regard to
quoting mechanisms, unicode introduces many proper matching pairs that
are helpful, and that avoid multiple semantic meanings for a given char.

in a similar way, this is one of my pet peeves in math notation and
computer lang syntaxes. e.g. In Mathematica, paren is used for one
single purpose only, always. Namely, grouping. The square bracket []
has one single purpose only, namely as bracket for function arguments.
The curly brackets {} again have one single purpose only, as
syntactic sugar for lists, e.g. List[1,2] is the same as {1,2}. In traditional
math notation and most comp langs, it's all context dependent soup.

• 〈Strings in Perl and Python〉
http://xahlee.org/perl-python/quoting_strings.html

• 〈Strings in PHP〉
http://xahlee.org/php/quoting_strings.html

• 〈HTML6, Your HTML/XML Simplified〉
http://xahlee.org/comp/html6.html

• 〈Matching Brackets in Unicode〉
http://xahlee.org/comp/unicode_matching_brackets.html

Xah ∑ xahlee.org

Spiros Bousbouras

Sep 27, 2010, 5:13:25 PM

On Sun, 26 Sep 2010 12:40:22 -0700
Ron Garret <rNOS...@flownet.com> wrote:
>
> Sure. There are situations in which backslashes can't be avoided. But
> wouldn't you rather be able to write this:
>
> «<input type="button" onClick="alert(\"Click\")">»
>
> instead of this:
>
> "<input type=\"button\" onClick=\"alert(\\\"Click\\\")\">"

So in your alternative syntax the backslash itself does not need to be escaped
within European quotes ? Anyway , personally I prefer

* (defmacro no-escapes (string special-char)
    (substitute #\" special-char (copy-seq string)))
NO-ESCAPES

* (no-escapes "<input type=@button@ onClick=@alert(\\@Click\\@)@>" #\@)
"<input type=\"button\" onClick=\"alert(\\\"Click\\\")\">"

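The same substitution trick is easy to sketch in Python, for comparison. This is a plain function evaluated at run time rather than a macro expanded at read time, and the name no_escapes is simply borrowed from the Lisp version above.

```python
def no_escapes(text, special_char):
    """Return text with every occurrence of special_char replaced by '"'."""
    return text.replace(special_char, '"')


# A raw string keeps the two backslashes literal, as "\\" does in the Lisp call.
result = no_escapes(r'<input type=@button@ onClick=@alert(\@Click\@)@>', "@")
```

The cost is that the stand-in character can no longer appear in the text as itself.
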
Spiros Bousbouras

Sep 27, 2010, 5:24:59 PM

On Sun, 26 Sep 2010 18:04:23 -0400
Kenneth Tilton <kent...@gmail.com> wrote:
> Doh! You want on topic?

I would prefer on topic and not <already posted a dozen times>
;-)

Kenneth Tilton

Sep 27, 2010, 5:40:30 PM

Spiros, your weak mastery of Asian wisdom is showing: one can never step
in the same stream of spam twice.

Consider this brilliant dissection of the new iPad app for Algebra, just
added today:

http://www.stuckonalgebra.com/ipad_houghton.html

Meanwhile, I must say your support and encouragement of the most serious
new Lisp application since ITA is.... typical. :)

Ron Garret

Sep 27, 2010, 5:50:42 PM

On 2010-09-27 14:13:25 -0700, Spiros Bousbouras said:

> On Sun, 26 Sep 2010 12:40:22 -0700
> Ron Garret <rNOS...@flownet.com> wrote:
>>
>> Sure. There are situations in which backslashes can't be avoided. But
>> wouldn't you rather be able to write this:
>>
>> «<input type="button" onClick="alert(\"Click\")">»
>>
>> instead of this:
>>
>> "<input type=\"button\" onClick=\"alert(\\\"Click\\\")\">"
>
> So in your alternative syntax the backslash itself does not need to be escaped
> within European quotes ?

That's right.

? «hello \world»
"hello \\world"


> Anyway , personally I prefer
>
> * (defmacro no-escapes (string special-char)
> (substitute #\" special-char (copy-seq string)))
> NO-ESCAPES
>
> * (no-escapes "<input type=@button@ onClick=@alert(\\@Click\\@)@>" #\@)
> "<input type=\"button\" onClick=\"alert(\\\"Click\\\")\">"

You're kidding, right?

rg

Spiros Bousbouras

Sep 28, 2010, 4:33:23 PM

On Mon, 27 Sep 2010 14:50:42 -0700

No , I'm being perfectly serious. For one thing my code is portable ,
yours isn't. But let's imagine an ideal world where every Lisp
implementation supports UTF-8.

I still feel that your solution emphasises the wrong thing. It is the
"" which requires special treatment and my code treats it specially
whereas your code treats specially the delimiters of the string while
leaving " itself unchanged.

--
My rebuttal to Greenspun's 10th Rule of Programming:
If the half left out contains the LOOP macro then it's a clear
improvement.

RG

Sep 28, 2010, 5:36:16 PM

In article <nisoo.2710$Fr2...@newsfe11.ams2>,
Spiros Bousbouras <spi...@gmail.com> wrote:

I would characterize the situation differently. I would not say that ""
requires special treatment, I would say that "" is fundamentally broken.
Using the same character as both the open and close delimiter of a
string is a Really Bad Idea. The Right Way to write the code in
question is this:

«<input type=«button» onClick=«alert(«Click»)»>»

which, if you're going to be post-processing anyway, can be easily made
to render in a way that can be parsed by existing web browsers.

http://rondam.blogspot.com/2009/08/html-is-object-code.html

rg

Spiros Bousbouras

Sep 28, 2010, 6:01:36 PM

Broken or not we're stuck with it. Which makes me wonder , is there any
programming language where the start of a string is signified by a
different character or sequence of characters than the end ?

> The Right Way to write the code in
> question is this:
>
> «<input type=«button» onClick=«alert(«Click»)»>»

Agreed , if we could rewrite programming history this would be much
better.

RG

Sep 28, 2010, 6:12:03 PM

In article <4Btoo.10$S6...@newsfe06.ams2>,
Spiros Bousbouras <spi...@gmail.com> wrote:

Only if we decide we're stuck with it.

> Which makes me wonder , is there any
> programming language where the start of a string is signified by a
> different character or sequence of characters than the end ?

Yes. Common Lisp :-)

Seriously though, Python is an existence proof that the lexical
conventions for strings can be changed.

> > The Right Way to write the code in
> > question is this:
> >
> > «<input type=«button» onClick=«alert(«Click»)»>»
>
> Agreed , if we could rewrite programming history this would be much
> better.

I don't see why we would have to rewrite history to use:

«<input type=«button» onClick=«alert(«Click»)»>»

any more than we have to rewrite history to use:

"<input type=@button@ onClick=@alert(\\@Click\\@)@>"

rg

Xah Lee

Sep 28, 2010, 7:52:03 PM

lol Ron. I didn't realize the RG in this thread is you, until I saw
your name Ron Garret in one of the posts.

Your name change many years ago is already confusing, why not put your
full name in your signature?

btw, same thing with kenny and others... Please put your full name!

Xah

Rob Warnock

Sep 28, 2010, 9:23:22 PM

Spiros Bousbouras <spi...@gmail.com> wrote:
+---------------

| RG <rNOS...@flownet.com> wrote:
| > I would characterize the situation differently. I would not say that ""
| > requires special treatment, I would say that "" is fundamentally broken.
| > Using the same character as both the open and close delimiter of a
| > string is a Really Bad Idea.
|
| Broken or not we're stuck with it. Which makes me wonder , is there any
| programming language where the start of a string is signified by a
| different character or sequence of characters than the end ?
+---------------

Yes, actually: Lua, while allowing traditional matching single or
double quotes for literal strings -- with the usual C-style escapes,
e.g., 'ding!\a' or "line 1\nline 2" -- also provides "long brackets":

http://www.lua.org/manual/5.1/manual.html#2.1
2.1 - Lexical Conventions
...
Literal strings can also be defined using a long format enclosed by
*long brackets*. We define an *opening long bracket of level n* as an
opening square bracket followed by n equal signs followed by another
opening square bracket. So, an opening long bracket of level 0 is
written as [[, an opening long bracket of level 1 is written as [=[,
and so on. A *closing long bracket* is defined similarly; for instance,
a closing long bracket of level 4 is written as ]====]. A long string
starts with an opening long bracket of any level and ends at the first
closing long bracket of the same level. Literals in this bracketed
form can run for several lines, do not interpret any escape sequences,
and ignore long brackets of any other level. They can contain anything
except a closing bracket of the proper level.

Lua also provides both "short" and "long" comments, re-using the
"long bracket" concept for the latter:

A comment starts with a double hyphen (--) anywhere outside a string.
If the text immediately after -- is not an opening long bracket, the
comment is a short comment, which runs until the end of the line.
Otherwise, it is a long comment, which runs until the corresponding
closing long bracket. Long comments are frequently used to disable
code temporarily.
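
The long-bracket string rule quoted above can be sketched in a few lines of Python. This is only an illustration of the matching rule, not Lua's actual lexer: the function name read_long_string is hypothetical, and the detail that Lua skips a newline directly after the opening bracket is deliberately left out.

```python
import re

# An opening long bracket: "[", then n equal signs, then "[".
_OPEN = re.compile(r"\[(=*)\[")


def read_long_string(source, start=0):
    """Read a Lua-style long string beginning at source[start].

    Returns (contents, index just past the closing long bracket).
    """
    m = _OPEN.match(source, start)
    if m is None:
        raise ValueError("no opening long bracket at this position")
    level = len(m.group(1))
    closer = "]" + "=" * level + "]"
    # The string ends at the first closing bracket of the same level;
    # brackets of any other level are ordinary characters.
    end = source.find(closer, m.end())
    if end == -1:
        raise ValueError("unterminated long string")
    return source[m.end():end], end + len(closer)


level_two = read_long_string("[==[a ]] b ]=] c]==] rest")
no_escape_processing = read_long_string(r"[[line\n]]")
```

Because the level is chosen by the writer, any given text can be quoted without escapes by picking a level whose closing bracket does not occur in it.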


-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607

Xah Lee

Sep 28, 2010, 11:18:11 PM

On Sep 28, 6:23 pm, r...@rpw3.org (Rob Warnock) wrote:
> Spiros Bousbouras  <spi...@gmail.com> wrote:

nice one!

this solves the problem of what heredoc is trying to solve but more
elegantly and stays within ascii.

though, thinking about it, it still doesn't solve the self referencing
problem. i.e. when a lua program needs to quote arbitrary lua source
code.

Xah

Antony

Sep 29, 2010, 5:30:02 AM

On 9/28/2010 2:36 PM, RG wrote:
> Using the same character as both the open and close delimiter of a
> string is a Really Bad Idea.
Interesting observation. So many problems caused by that one characteristic.

-Antony

Tim Bradshaw

Sep 29, 2010, 7:46:14 AM

On 2010-09-29 00:52:03 +0100, Xah Lee said:
>
> btw, samething with kenny and others... Please put your full name!

I agree with this: though I tend to ignore signatures, so I find it's
always nicer if people put their full names in the from line (obviously
some people worry about spam etc: my experience, having basically
always done this, is that the results are not (any more) catastrophic).

Thomas A. Russ

Sep 29, 2010, 1:41:31 PM

Spiros Bousbouras <spi...@gmail.com> writes:

> On Tue, 28 Sep 2010 14:36:16 -0700 RG <rNOS...@flownet.com> wrote:
> > Using the same character as both the open and close delimiter of a
> > string is a Really Bad Idea.
>
> Broken or not we're stuck with it. Which makes me wonder , is there any
> programming language where the start of a string is signified by a
> different character or sequence of characters than the end ?

Yes. There are likely others, but PostScript uses () as string
delimiters. And they allow matched pairs of () inside the string
without needing escape characters. It is only unmatched nested
parentheses that need escape characters.

--
Thomas A. Russ, USC/Information Sciences Institute
