Multiline strings semantics (leading newline)

Olov Lassus

27.10.2011, 04:09:37
to mi...@dartlang.org
Dart VM currently ignores the first leading newline in multiline
strings. I was pretty sure that this was a bug, especially since dartc
doesn't do that, so I opened up issue 240
<http://code.google.com/p/dart/issues/detail?id=240>. It got adjusted
to a spec bug since this is an as yet undocumented feature.

Two existing languages where multiline strings are used a lot are
Python and Ruby, and neither of them ignores the first leading newline.
Java doesn't support multiline strings but C# does (when using @
string literals) and doesn't ignore the first leading newline either.

Now I can definitely see why ignoring it can be a desirable thing
for many uses. But then there's this familiarity thing that has been
mentioned a number of times. I'd be curious to hear your thoughts
about the balance of the two.

Ignoring the first leading newline in multiline strings basically turns
var s = """
hellojed
""";

into
var s = """hellojed
""";
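
For comparison, here is a minimal Python sketch of the two behaviors. Python keeps the leading newline, and the helper `drop_first_newline` (a hypothetical name, not part of Dart or Python) imitates what the VM is described as doing. It assumes the newline is a plain `\n`.

```python
def drop_first_newline(body: str) -> str:
    """Imitate the VM behavior described above: ignore one leading newline.

    Assumption: the newline is a single "\n" character.
    """
    if body.startswith("\n"):
        return body[1:]
    return body


# Python itself keeps the leading newline of a triple-quoted string.
python_style = """
hellojed
"""

dart_vm_style = drop_first_newline(python_style)

print(repr(python_style))   # '\nhellojed\n'
print(repr(dart_vm_style))  # 'hellojed\n'
```

Only the first newline is dropped; a second one directly after it would stay in the string.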

/Olov

Peter Ahé

27.10.2011, 06:20:12
to Olov Lassus, mi...@dartlang.org
I think I championed the current behavior. I guess I had misunderstood
how it worked in other languages.

Cheers,
Peter

Peter Ahé

27.10.2011, 06:20:43
to Olov Lassus, mi...@dartlang.org
Just to be clear: I think you're right, it is a mistake.

Cheers,
Peter

Olov Lassus

27.10.2011, 06:51:47
to Peter Ahé, mi...@dartlang.org
Glad I took it to misc then. I have updated issue 240 based on your
reply <http://code.google.com/p/dart/issues/detail?id=240>.

/Olov

Jim Hugunin

27.10.2011, 09:35:53
to Olov Lassus, Peter Ahé, mi...@dartlang.org
Hmm, I was just commenting on a separate code review yesterday how much I liked this feature of ignoring an initial newline in multiline strings.

In Python, I will often start my multiline strings with a line continuation character so that I can format them as desired without a leading newline, i.e.

body = '''\
<html>
  <head>
  </head>
  ..
</html>'''

Since Dart doesn't have the line continuation character - which makes me happy - I thought this was a nice way to be able to get that behavior.
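
A small sketch of the Python pattern mentioned above, showing that the backslash after the opening quotes removes the leading newline. The variable names are only for illustration.

```python
# The backslash right after the opening quotes escapes the newline,
# so the string starts directly with "<html>".
with_continuation = '''\
<html>
</html>'''

# Without the backslash the string starts with a newline.
without_continuation = '''
<html>
</html>'''

print(repr(with_continuation))     # '<html>\n</html>'
print(repr(without_continuation))  # '\n<html>\n</html>'
```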

I'm not sure what my actual opinion is, but I wanted to share what it was that I liked about the current design.

Thanks - Jim

Peter Ahé

27.10.2011, 10:20:20
to Jim Hugunin, Olov Lassus, mi...@dartlang.org
Hi Jim,

Do you have a sense for how frequently the trailing newline leads to
poorly formatted strings in Python?

Cheers,
Peter

John Tamplin

27.10.2011, 10:24:26
to Peter Ahé, Jim Hugunin, Olov Lassus, mi...@dartlang.org
On Thu, Oct 27, 2011 at 10:20 AM, Peter Ahé <a...@google.com> wrote:
> Do you have a sense for how frequently the trailing newline leads to
> poorly formatted strings in Python?

Perl also has multiline strings, via <<TAG, which requires that the string start on the next line.

Personally, it makes sense that a multiline string would start after a line break, but my objection to them has always been that the indentation doesn't fit with the rest of the program.  I'm not sure there is a good way to fix that though, since the first line may well have leading spaces.

--
John A. Tamplin
Software Engineer (GWT), Google

Jim Hugunin

27.10.2011, 10:29:07
to Peter Ahé, Olov Lassus, mi...@dartlang.org
For a really quick read, I just looked at the scripts in frog.  There are two true multiline strings in there (all of the others are doc comments).

Both of those would be better off without a leading newline.  One (in codegen.py) would have caused really ugly code with a leading newline, so it follows the pattern of starting the string on the same line as the '''.  However, this leads to an excessively long line and harder-to-read formatting.

The other one (in tokenizer_gen.py) doesn't worry about the initial newline and happens to turn out okay in this case.

BTW - The reason that these didn't use the line continuation character is that my brain is still struggling to move back and forth between Dart and Python and since Dart doesn't have line continuations...

My rough sense is that almost all Python multiline strings look better and format better if they use the '''\ pattern to start the true string on the next line.  However, this is just a gut feeling.

Thanks - Jim

Here's the semi-ugly string for codegen.py:

HEADER = '''// Copyright (c) 2011, the Dart project authors.  Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
// Generated by %s.

'''

Colin Putney

27.10.2011, 14:05:32
to Olov Lassus, mi...@dartlang.org
On Thu, Oct 27, 2011 at 1:09 AM, Olov Lassus <olov....@gmail.com> wrote:
> Dart VM currently ignores the first leading newline in multiline
> strings. I was pretty sure that this was a bug, especially since dartc
> doesn't do that, so I opened up issue 240
> <http://code.google.com/p/dart/issues/detail?id=240>. It got adjusted
> to a spec bug since this is a yet undocumented feature.
>
> Two existing languages where multiline strings are used a lot are
> Python and Ruby, and neither of them ignore the first leading newline.
> Java doesn't support multiline strings but C# does (when using @
> string literals) and doesn't ignore the first leading newline either.

It's worth noting that both Ruby and Python have ways to create
multiline strings where the leading newline *is* ignored. Ruby has
Heredoc literals, like this:

$ irb
ruby-1.9.2-p180 :001 > s = <<END
ruby-1.9.2-p180 :002"> hellojed
ruby-1.9.2-p180 :003"> END
=> "hellojed\n"
ruby-1.9.2-p180 :004 >

Perl, PHP and Bash all support Heredoc as well.

In Python you can escape the initial newline to get a Heredoc-like effect.

I think ignoring the leading newline leads to better-looking code,
without being confusing to users of other languages. Also, consider
the failure mode. Most of the time, people who expect a leading
newline to be part of the string won't actually want the newline.
They'll just put the beginning of the string right after the quotes,
and get what they expect, never knowing that it could have been nicely
formatted. I bet the number of people confused by the absence of a
newline they expected would be vanishingly small.

Colin

Bob Nystrom

27.10.2011, 14:54:06
to Colin Putney, Olov Lassus, mi...@dartlang.org
Out of curiosity, I hunted through our existing Dart code a bit to see how we use multi-line strings. Here are some examples:

    return new View.html(
        '''
        <div>
          Add or remove feeds in
          <a href="https://www.google.com/reader" target="_blank">
            Google Reader</a>'s "Subscriptions".
          Then come back here and click "Done" and we'll load your updated
          list of subscriptions.
        </div>
        ''');

    return '''<table width="90%" border=1 cellspacing="0" cellpadding="2">
            <tr bgcolor="#c3d9ff"> 
              ${cellStart} Shortcut Key </th>
              ${cellStart} Action </th>
            </tr>
            ...
            <tr>
              ${cellStart} p </th>
              ${cellStart} Previous Category </th>
            </tr>

        </table>''';

    node = new Element.html('''
        <div class="$storyClass">
          <div class="story-shadow"></div>
          ...
          <div class="caption">
            <div class="snippet"></div>
          </div>
        </div>''');

    window.console.warn('''Could not find an exact solution. LastY=${lastY},
        targetY=${targetY} lastX=$lastX delta=$delta  deltaX=$deltaX
        deltaY=$deltaY''');

So it looks like we don't even have a consistent style on how we use them. Personally, I like the look of the first example here. Given that, it would make me happy if it stripped leading (and trailing?) newlines, and also removed as many indentation characters from each line as there are before the opening '''. So this:

    return new View.html(
        '''
        <div>
          Add or remove feeds in
          <a href="https://www.google.com/reader" target="_blank">
            Google Reader</a>'s "Subscriptions".
          Then come back here and click "Done" and we'll load your updated
          list of subscriptions.
        </div>
        ''');

Would yield a string that's exactly:

<div>\n
__Add or remove feeds in\n
__<a href="https://www.google.com/reader" target="_blank">\n
____Google Reader</a>'s "Subscriptions".\n
__Then come back here and click "Done" and we'll load your updated\n
__list of subscriptions.\n
</div>

But I don't have much of an opinion on this one way or the other.
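
As a rough illustration of that idea, here is a Python sketch rather than anything Dart actually does. The function name `strip_to_opening_indent` and its parameters are hypothetical. It assumes the caller already knows how many indentation characters precede the opening quotes, and it treats the whitespace-only line holding the closing quotes as the trailing newline to strip.

```python
def strip_to_opening_indent(body: str, indent_width: int) -> str:
    """Sketch of the proposed rule (an assumption, not Dart semantics).

    body: raw text between the opening and closing quotes.
    indent_width: number of indentation characters before the opening quotes.
    """
    lines = body.split("\n")
    # Drop the (empty) remainder of the line that holds the opening quotes.
    if lines and lines[0].strip() == "":
        lines = lines[1:]
    # Drop the whitespace-only line that holds the closing quotes.
    if lines and lines[-1].strip() == "":
        lines = lines[:-1]
    stripped = []
    for line in lines:
        # Remove at most indent_width leading spaces or tabs.
        prefix = line[:indent_width]
        removable = len(prefix) - len(prefix.lstrip(" \t"))
        stripped.append(line[removable:])
    return "\n".join(stripped)


# Shortened version of the View.html example, with the opening quotes
# indented by 8 spaces.
raw = (
    "\n"
    + " " * 8 + "<div>\n"
    + " " * 10 + "Add or remove feeds in\n"
    + " " * 8 + "</div>\n"
    + " " * 8
)

print(strip_to_opening_indent(raw, 8))
```

Python's standard `textwrap.dedent` does something related, but it removes the whitespace common to all lines instead of a width taken from the opening quotes.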

- bob

John Tamplin

27.10.2011, 15:00:07
to Bob Nystrom, Colin Putney, Olov Lassus, mi...@dartlang.org
On Thu, Oct 27, 2011 at 11:54 AM, Bob Nystrom <rnys...@google.com> wrote:
> Given that, it would make me happy if it stripped leading (and trailing?) newlines, and also removed as many indentation characters from each line as there are before the opening '''. So this:
>
>     return new View.html(
>         '''
>         <div>
>           Add or remove feeds in
>           <a href="https://www.google.com/reader" target="_blank">
>             Google Reader</a>'s "Subscriptions".
>           Then come back here and click "Done" and we'll load your updated
>           list of subscriptions.
>         </div>
>         ''');
>
> Would yield a string that's exactly:
>
> <div>\n
> __Add or remove feeds in\n
> __<a href="https://www.google.com/reader" target="_blank">\n
> ____Google Reader</a>'s "Subscriptions".\n
> __Then come back here and click "Done" and we'll load your updated\n
> __list of subscriptions.\n
> </div>
>
> But I don't have much of an opinion on this one way or the other.

Great idea -- if the ''' is on a line with only whitespace, strip off the same amount of whitespace from each following line.  It would be an error to not have that exact whitespace as a prefix to each following line (otherwise mixing tabs/spaces in lines would be a problem).

This lets it fit in the rest of the indentation in surrounding code when needed, and doesn't break any existing uses.
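
A minimal Python sketch of that stricter rule, under a few stated assumptions: `strip_exact_prefix` is a hypothetical helper, the whitespace before the opening quotes is passed in as `prefix`, and completely empty lines are allowed through, which the rule above does not spell out.

```python
def strip_exact_prefix(body: str, prefix: str) -> str:
    """Strip `prefix` from every line; reject lines that lack it.

    body: raw text between the opening and closing quotes.
    prefix: the exact whitespace found before the opening quotes.
    """
    lines = body.split("\n")
    # Drop the empty remainder of the line that holds the opening quotes.
    if lines and lines[0] == "":
        lines = lines[1:]
    # Drop the line that holds only the indentation of the closing quotes.
    if lines and lines[-1] == prefix:
        lines = lines[:-1]
    result = []
    for number, line in enumerate(lines, start=1):
        if line == "":
            # Assumption: empty lines are accepted as they are.
            result.append(line)
        elif line.startswith(prefix):
            result.append(line[len(prefix):])
        else:
            raise ValueError(
                f"line {number} does not start with the expected indentation"
            )
    return "\n".join(result)


# Tab-indented literal: every line carries the same "\t" prefix.
print(strip_exact_prefix("\n\t<div>\n\t  text\n\t</div>\n\t", "\t"))

# Mixing tabs and spaces is rejected instead of being silently mis-stripped.
try:
    strip_exact_prefix("\n\t<div>\n    text\n\t", "\t")
except ValueError as error:
    print("rejected:", error)
```

Comparing against the exact prefix string, not just its length, is what catches the mixed tabs/spaces case.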

Olov Lassus

27.10.2011, 15:26:16
to John Tamplin, Bob Nystrom, Colin Putney, mi...@dartlang.org
On Thu, Oct 27, 2011 at 9:00 PM, John Tamplin <j...@google.com> wrote:
> Great idea -- if the ''' is on a line with only whitespace, strip off the
> same amount of whitespace from each following line.  It would be an error to
> not have that exact whitespace as a prefix to the each following line
> (otherwise mixing tabs/spaces in lines would be a problem).
> This lets it fit in the rest of the indentation in surrounding code when
> needed, and doesn't break any existing uses.

I really like this approach too. Seems like ignoring leading newline
is here to stay for now (as clarified in
<http://code.google.com/p/dart/issues/detail?id=240#c5>) so I guess
dartc should mimic the current VM behavior. I'll file another bug for
that unless someone tells me it's not necessary.

My takeaway from this thread (thanks everyone for participating) is that
1) Ignoring leading newline is indeed useful and lends itself to more
beautifully formatted code.
2) The familiarity issue isn't that big of a deal since it is likely
that many programmers don't intuitively understand leading newline
semantics in other languages, or at least don't use it consistently.

/Olov
