multiline indented string literal proposal


roger peppe

Nov 20, 2009, 6:41:39 AM
to golang-nuts
ok, i know this has a subnanoscopic chance of getting adopted,
but i thought i'd describe it anyway.

currently there are two kinds of string literal, "..." and `...`.
the former allows \ escapes and spanning lines with \; the latter allows
no escapes (not even for ` itself) and also allows multiple lines.

the case that's always been awkward with conventional quoting
syntax is when there's a substantial string that spans multiple lines,
for example a template or a subprogram in an embedded language.

a) if it spans more than a page, it's not easy to see where the string
stops and program starts (particularly if the embedded syntax and
the program code syntax are similar or identical)

b) it doesn't work well with program indentation - if the block of code
containing the string is indented, then the contents of the string
get the indentation too, which is unnecessary and possibly
error-inducing.

c) it doesn't self-embed well. for instance, currently it's not
possible to embed one `...` string inside another. multiple
embedding of \ gives exponential increase, and consequent unreadability.
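
Problems (b) and (c) can be reproduced with the existing literals. The following is a minimal sketch, not part of the original post; the function names `indented`, `nest` and `backslashes` are made up for illustration. It shows that a `...` string written inside an indented block carries the indentation in its value, and that repeatedly quoting a string with strconv.Quote keeps multiplying the backslashes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// indented returns a raw string written inside an indented block.
// The whitespace before "line 2" and before the closing backquote
// is part of the value, which is problem (b).
func indented() string {
	return `line 1
	line 2
	`
}

// nest quotes s as an interpreted string literal, depth times over.
func nest(s string, depth int) string {
	for i := 0; i < depth; i++ {
		s = strconv.Quote(s)
	}
	return s
}

// backslashes counts the backslashes needed to embed a single
// backslash at the given nesting depth, which is problem (c).
func backslashes(depth int) int {
	return strings.Count(nest(`\`, depth), `\`)
}

func main() {
	fmt.Printf("%q\n", indented())
	for d := 0; d <= 3; d++ {
		fmt.Println(d, backslashes(d), nest(`\`, d))
	}
}
```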

here's a proposal that avoids these problems
and still fits (nicely, i think) within the go syntax:


An indented string literal starts with a " character followed
by optional whitespace characters, followed by a single non-whitespace
character, the indent character, followed by a newline.
The indent character may not be \, " or /.
This initial sequence is not part of the value of the string.

Each subsequent line must consist of optional whitespace characters,
followed by either the indent character, in which case
the rest of the line is appended to the string (including
the trailing newline), or ", in which case the string literal terminates.

Other than that, all the rules are the same as for the
other kinds of string literals.
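
The rules above can be sketched as a small decoder. This is not the lex.c change mentioned in this post, only an illustration written in Go; the name `decodeIndented` is hypothetical. It assumes the input starts at the opening " and that the text after the indent character is taken literally, with no escape processing.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// decodeIndented returns the value of an indented string literal.
// src must begin at the opening quote; anything after the closing
// quote on the terminating line is ignored by this sketch.
func decodeIndented(src string) (string, error) {
	lines := strings.Split(src, "\n")

	// First line: '"', optional whitespace, one indent character.
	head := lines[0]
	if !strings.HasPrefix(head, `"`) {
		return "", errors.New("literal must start with a double quote")
	}
	indent := strings.TrimLeftFunc(head[1:], unicode.IsSpace)
	r, size := utf8.DecodeRuneInString(indent)
	if size == 0 || size != len(indent) {
		return "", errors.New("first line must end with exactly one indent character")
	}
	if strings.ContainsRune(`\"/`, r) {
		return "", errors.New(`indent character may not be \, " or /`)
	}

	var b strings.Builder
	for _, line := range lines[1:] {
		rest := strings.TrimLeftFunc(line, unicode.IsSpace)
		switch {
		case strings.HasPrefix(rest, `"`):
			// Closing quote: the literal is complete.
			return b.String(), nil
		case strings.HasPrefix(rest, indent):
			// Keep everything after the indent character, plus the newline.
			b.WriteString(rest[len(indent):])
			b.WriteByte('\n')
		default:
			return "", fmt.Errorf("line %q starts with neither the indent character nor a quote", line)
		}
	}
	return "", errors.New("unterminated indented string literal")
}

func main() {
	src := "\"#\n\t\t#line 1\n\t\t#line 2\n\t\t\""
	s, err := decodeIndented(src)
	fmt.Printf("%q %v\n", s, err) // prints "line 1\nline 2\n" <nil>
}
```

Because every continuation line has to start with the indent character or the closing quote, a stray line is rejected instead of being absorbed into the string.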

Note that because of the strict rules on indented lines,
it's very unlikely that an accidentally unterminated string
constant will get parsed correctly as an indented string literal.
Note also, that it's potentially possible to allow whitespace
and comments inside the string literal. I haven't yet decided if that's
a good or a bad idea.

I've implemented this scheme - it adds two extra functions and about 80
lines to src/cmd/gc/lex.c


Examples:

func f() string {
      return "#
              #line 1
              #line 2
              "
}

is the same as:

func f() string {
      return "line 1\nline 2\n";
}

// parse a template string, and return a function (closure) which
// instantiates it with respect to some environment
// variables held in env.

type Env map[string] string
func Parsetemplate(string) func(e Env) string
var render = Parsetemplate("|
|<html>
|<body>
|<h1 $header>
|$text $etc
|</body>
|</html>
")

fmt.Printf("%s", render(Env{"header": "Title", "text": "Hello,",
"etc": "world"}));

roger peppe

Nov 20, 2009, 9:02:58 AM
to golang-nuts
2009/11/20 i wrote:
> // parse a template string, and return a function (closure) which
> // instantiates it with respect to some environment
> // variables held in env.
>
> type Env map[string] string
> func Parsetemplate(string) func(e Env) string

as a little bit of example code, i've attached an implementation
of the above interface. it's quite a nice reminder of how
powerful closures are (and also how tail recursion optimisation
cannot always be replaced by goto).

it also demonstrates another possible implementation strategy for the
existing template package (which i didn't know about until
i got a compilation conflict!), potentially speeding it up and
simplifying it.
tmplate.go
tsttmplate.go
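
The attachments are not reproduced in this thread. For readers who want something concrete, here is a minimal sketch of the same interface in present-day Go. It is not the attached implementation: the $name syntax, the helper `isNameByte`, and the choice to render a missing variable as an empty string are all assumptions. The template is split once into pieces, each piece is a closure, and the returned closure runs them against an environment.

```go
package main

import (
	"fmt"
	"strings"
)

type Env map[string]string

// isNameByte reports whether c may appear in a variable name.
func isNameByte(c byte) bool {
	return c == '_' || '0' <= c && c <= '9' ||
		'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z'
}

// Parsetemplate parses tmpl once and returns a closure that
// instantiates it with respect to the variables held in an Env.
func Parsetemplate(tmpl string) func(Env) string {
	var pieces []func(Env, *strings.Builder)
	for i := 0; i < len(tmpl); {
		if tmpl[i] == '$' {
			j := i + 1
			for j < len(tmpl) && isNameByte(tmpl[j]) {
				j++
			}
			if j > i+1 {
				// Variable reference: look the name up at render time.
				name := tmpl[i+1 : j]
				pieces = append(pieces, func(e Env, b *strings.Builder) {
					b.WriteString(e[name])
				})
				i = j
				continue
			}
		}
		// Literal text up to the next '$' (a lone '$' stays literal).
		j := strings.IndexByte(tmpl[i+1:], '$')
		if j < 0 {
			j = len(tmpl)
		} else {
			j += i + 1
		}
		text := tmpl[i:j]
		pieces = append(pieces, func(_ Env, b *strings.Builder) {
			b.WriteString(text)
		})
		i = j
	}
	return func(e Env) string {
		var b strings.Builder
		for _, p := range pieces {
			p(e, &b)
		}
		return b.String()
	}
}

func main() {
	render := Parsetemplate("<h1 $header>\n$text $etc\n")
	fmt.Print(render(Env{"header": "Title", "text": "Hello,", "etc": "world"}))
}
```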

roger peppe

Nov 20, 2009, 10:45:46 AM
to Ian Lance Taylor, golang-nuts
2009/11/20 Ian Lance Taylor <ia...@google.com>:
> roger peppe <rogp...@gmail.com> writes:
>
>> func f() string {
>>       return "#
>>               #line 1
>>               #line 2
>>               "
>> }
>
> I guess my question would be whether this is significantly better than
>
>        return "\n"
>               "line 1\n"
>               "line 2\n"
>
> In your version you don't have to write \n, but you do have to write
> something like #.

because it's trivial to take a piece of text, paste it into the text
editor and do s/^/#/ or whatever your editor's idiom is.
no need to quote backslashes.
and it's equally trivial to unquote it (for instance to run it through
a shell command)
and it nests well:

interpreter("#
# print("#
# # doubly embedded text
# ");
");

i know it's unusual, but i think it works quite well.
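
The quoting and unquoting steps described above amount to adding or removing a prefix on every line. A minimal sketch in Go, with the hypothetical helper names `quoteLines` and `unquoteLines`, shows that applying the step twice nests cleanly and that it reverses without any escaping:

```go
package main

import (
	"fmt"
	"strings"
)

// quoteLines puts indent in front of every line of text,
// the equivalent of s/^/#/ in an editor.
func quoteLines(text, indent string) string {
	var b strings.Builder
	for _, line := range strings.SplitAfter(text, "\n") {
		if line == "" {
			continue // empty piece after the final newline
		}
		b.WriteString(indent)
		b.WriteString(line)
	}
	return b.String()
}

// unquoteLines removes one leading indent from every line of text.
func unquoteLines(text, indent string) string {
	var b strings.Builder
	for _, line := range strings.SplitAfter(text, "\n") {
		b.WriteString(strings.TrimPrefix(line, indent))
	}
	return b.String()
}

func main() {
	inner := "doubly embedded text\n"
	once := quoteLines(inner, "#")
	twice := quoteLines(once, "#")
	fmt.Print(twice)
	fmt.Println(unquoteLines(unquoteLines(twice, "#"), "#") == inner)
}
```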

roger peppe

Nov 20, 2009, 11:34:28 AM
to Ian Lance Taylor, golang-nuts
>> I guess my question would be whether this is significantly better than

the other thing is that inside an indent-quoted string,
everything to the right of the indent character is literally
the text itself, so you can practically forget that you're editing
text inside a string
and concentrate entirely on the text itself, making it more visually transparent
and thus less error-prone.

Tonic Artos

Nov 20, 2009, 8:35:13 PM
to roger peppe, Ian Lance Taylor, golang-nuts
2009/11/21 roger peppe <rogp...@gmail.com>:
>>> I guess my question would be whether this is significantly better than
>
> the other thing is that inside an indent-quoted string,
> everything to the right of the indent character is literally
> the text itself, so you can practically forget that you're editing
> text inside a string
> and concentrate entirely on the text itself, making it more visually transparent
> and thus less error-prone.

Just use a new char, maybe ¨, for indented string literals, or add a
prefix to ", say ^. This gets rid of special formatting within the
string.

Using a string format prefix char

func f() string {
  return ^"some text
               some indented text
            some text"
}

or using a new char courtesy of my composite key compose(", ")

func f() string {
  return ¨some text
              some indented text
          more text¨
}

Edward Marshall

Nov 20, 2009, 10:55:15 PM
to golang-nuts
On Fri, Nov 20, 2009 at 7:35 PM, Tonic Artos <ghata...@gmail.com> wrote:
> func f() string {
>   return ¨some text
>               some indented text
>           more text¨
> }

Is this really any more readable than:

func f() string {
return
"some text\n"
"some indented text\n"
"more text\n"
}

Just curious, since the above already works today.

--
Ed Marshall <e...@logic.net>
Felix qui potuit rerum cognoscere causas.
http://esm.logic.net/

roger peppe

Nov 21, 2009, 10:33:00 AM
to golang-nuts
2009/11/21 Tonic Artos <ghata...@gmail.com>:
> func f() string {
>   return ^"some text
>                some indented text
>             some text"
> }

the point of the character at the start of the line
is so that the contents of the string are not
sensitive to indentation white space.
i can't see how that can work in your example.

2009/11/21 Edward Marshall <e...@logic.net>:
> Is this really any more readable than:
> func f() string {
> return
> "some text\n"
> "some indented text\n"
> "more text\n"
> }

i think that:

code := "|
|func f() string {
|return
| "some text\n"
| "some indented text\n"
| "more text\n"
|}
";

is easier to read (and more easily maintained) than:

code := "func f() string {\n"
"return\n"
" \"some text\\n\"\n"
" \"some indented text\\n\"\n"
" \"more text\\n\"\n"
"}\n";

but it seems like i'm the only one that does :-)

Brian Slesinsky

Nov 21, 2009, 6:05:12 PM
to golang-nuts
We have some perfectly good ways of quoting multiline strings in other
languages. I like the way we do it in email since nesting levels are
clear, but "here" documents from the Bourne shell would also work.

Perhaps using '^' to indicate that an attachment follows:

func GetPythonScript() string {
  return ^
    > def foo():
    >   print "hello"
}

(And this should still look okay after someone replies to this
message.)

- Brian

roger peppe

Nov 22, 2009, 4:19:05 PM
to Brian Slesinsky, golang-nuts
2009/11/21 Brian Slesinsky <bsles...@gmail.com>:
> I like the way we do it in email since nesting levels are
> clear, but "here" documents from the Bourne shell would also work.

that's where the idea came from.