[Python-ideas] Implicit string literal concatenation considered harmful?


Guido van Rossum

May 10, 2013, 2:48:51 PM
to Python-Ideas
I just spent a few minutes staring at a bug caused by a missing comma
-- I got a mysterious argument count error because instead of foo('a',
'b') I had written foo('a' 'b').
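
For illustration, the failure mode can be reproduced in a few lines. The function foo below is a made-up stand-in, not the code from the actual bug:

```python
def foo(first, second):
    """Hypothetical two-argument function standing in for the real one."""
    return first + "-" + second

# With the comma, two arguments are passed.
ok = foo('a', 'b')

# Without it, the adjacent literals merge into the single argument 'ab',
# so the call fails with an argument count error.
try:
    foo('a' 'b')
except TypeError as exc:
    error = str(exc)
```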

This is a fairly common mistake, and IIRC at Google we even had a lint
rule against this (there was also a Python dialect used for some
specific purpose where this was explicitly forbidden).

Now, with modern compiler technology, we can (and in fact do) evaluate
compile-time string literal concatenation with the '+' operator, so
there's really no reason to support 'a' 'b' any more. (The reason was
always rather flimsy; I copied it from C but the reason why it's
needed there doesn't really apply to Python, as it is mostly useful
inside macros.)
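
A quick way to see this, at least on CPython (constant folding is an implementation detail, not a language guarantee), is to compare the two spellings. In this sketch the adjacent literals are already one constant after parsing, while the '+' form parses as a binary operation that the compiler then folds:

```python
import ast

# Adjacent literals are a single constant as soon as the source is parsed.
implicit = ast.parse("'a' 'b'", mode="eval").body
# The '+' spelling is an ordinary binary operation in the syntax tree...
explicit = ast.parse("'a' + 'b'", mode="eval").body

implicit_is_constant = isinstance(implicit, ast.Constant) and implicit.value == 'ab'
explicit_is_binop = isinstance(explicit, ast.BinOp)

# ...but CPython folds it while compiling, so the code object stores 'ab'.
folded = 'ab' in compile("x = 'a' + 'b'", "<example>", "exec").co_consts
```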

Would it be reasonable to start deprecating this and eventually remove
it from the language?

--
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas

Matt Chaput

May 10, 2013, 2:50:47 PM
to python...@python.org
On 5/10/2013 2:48 PM, Guido van Rossum wrote:
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

Yes please! I've been bitten by the same issue more than once.

Matt

Dave Peticolas

May 10, 2013, 2:58:49 PM
to Guido van Rossum, Python-Ideas
2013/5/10 Guido van Rossum <gu...@python.org>
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

From my perspective as a Python user (not knowing anything about the
ramifications for the required changes to the parser, etc.) it is very reasonable.
This bug is very hard to spot when it happens, and an argument count error
is really one of the more benign forms it can take.

 




--
--Dave Peticolas

Antoine Pitrou

May 10, 2013, 3:16:13 PM
to python...@python.org
On Fri, 10 May 2013 11:48:51 -0700
Guido van Rossum <gu...@python.org> wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

I'm rather -1. It's quite convenient and I don't want to add some '+'
signs everywhere I use it. I'm sure many people also have long string
literals out there and will have to endure the pain of a dull task to
"fix" their code.

However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
the "continuation" is on the same line.

Regards

Antoine.

Ezio Melotti

May 10, 2013, 3:18:19 PM
to Antoine Pitrou, python-ideas
On Fri, May 10, 2013 at 10:16 PM, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Fri, 10 May 2013 11:48:51 -0700
> Guido van Rossum <gu...@python.org> wrote:
>>
>> Would it be reasonable to start deprecating this and eventually remove
>> it from the language?
>
> I'm rather -1. It's quite convenient and I don't want to add some '+'
> signs everywhere I use it. I'm sure many people also have long string
> literals out there and will have to endure the pain of a dull task to
> "fix" their code.
>
> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
> the "continuation" is on the same line.
>

I was going to say the exact same thing -- you just read my mind :)

Alexander Belopolsky

May 10, 2013, 3:26:10 PM
to Guido van Rossum, Python-Ideas
On Fri, May 10, 2013 at 2:48 PM, Guido van Rossum <gu...@python.org> wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').

I had a similar experience just a few weeks ago. The bug was in a long list written like this:

['item11', 'item12', ..., 'item17',
 'item21', 'item22', ..., 'item27'
 ...
 'item91', 'item92', ..., 'item97']

Clearly the bug crept in when more items were added. (I try to keep redundant commas at the end of the list to avoid this, but not everyone likes this style.)

> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

+1, but I would start by requiring () around concatenated strings.

M.-A. Lemburg

May 10, 2013, 3:28:48 PM
to Antoine Pitrou, python...@python.org
On 10.05.2013 21:16, Antoine Pitrou wrote:
> On Fri, 10 May 2013 11:48:51 -0700
> Guido van Rossum <gu...@python.org> wrote:
>> I just spent a few minutes staring at a bug caused by a missing comma
>> -- I got a mysterious argument count error because instead of foo('a',
>> 'b') I had written foo('a' 'b').
>>
>> This is a fairly common mistake, and IIRC at Google we even had a lint
>> rule against this (there was also a Python dialect used for some
>> specific purpose where this was explicitly forbidden).
>>
>> Now, with modern compiler technology, we can (and in fact do) evaluate
>> compile-time string literal concatenation with the '+' operator, so
>> there's really no reason to support 'a' 'b' any more. (The reason was
>> always rather flimsy; I copied it from C but the reason why it's
>> needed there doesn't really apply to Python, as it is mostly useful
>> inside macros.)
>>
>> Would it be reasonable to start deprecating this and eventually remove
>> it from the language?
>
> I'm rather -1. It's quite convenient and I don't want to add some '+'
> signs everywhere I use it. I'm sure many people also have long string
> literals out there and will have to endure the pain of a dull task to
> "fix" their code.
>
> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
> the "continuation" is on the same line.

Nice idea.

I mostly use this feature when writing multi-line or too-long-to-fit-on-
one-editor-line string literals:

s = ('abc\n'
     'def\n'
     'ghi\n')
t = ('some long paragraph spanning multiple lines in an editor, '
     'without newlines')

This looks and works much better than triple-quoted string literals,
esp. when defining such string literals in indented code.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 10 2013)
>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46
2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

Guido van Rossum

May 10, 2013, 3:30:15 PM
to Antoine Pitrou, python...@python.org
On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Fri, 10 May 2013 11:48:51 -0700
> Guido van Rossum <gu...@python.org> wrote:
>> I just spent a few minutes staring at a bug caused by a missing comma
>> -- I got a mysterious argument count error because instead of foo('a',
>> 'b') I had written foo('a' 'b').
>>
>> This is a fairly common mistake, and IIRC at Google we even had a lint
>> rule against this (there was also a Python dialect used for some
>> specific purpose where this was explicitly forbidden).
>>
>> Now, with modern compiler technology, we can (and in fact do) evaluate
>> compile-time string literal concatenation with the '+' operator, so
>> there's really no reason to support 'a' 'b' any more. (The reason was
>> always rather flimsy; I copied it from C but the reason why it's
>> needed there doesn't really apply to Python, as it is mostly useful
>> inside macros.)
>>
>> Would it be reasonable to start deprecating this and eventually remove
>> it from the language?
>
> I'm rather -1. It's quite convenient and I don't want to add some '+'
> signs everywhere I use it. I'm sure many people also have long string
> literals out there and will have to endure the pain of a dull task to
> "fix" their code.

Fixing this is an easy task for lib2to3 though.

I think the "convenience" argument doesn't cut it -- if Python didn't
have it, can you imagine it being added? It would never make it past
all the examples of code broken by missing commas.

> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
> the "continuation" is on the same line.

There are plenty of examples where the continuation isn't on the same
line (some were already posted here).

--
--Guido van Rossum (python.org/~guido)

Ethan Furman

May 10, 2013, 3:07:10 PM
to Python-Ideas
On 05/10/2013 11:48 AM, Guido van Rossum wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

Sounds good to me.

--
~Ethan~

Antoine Pitrou

May 10, 2013, 3:37:07 PM
to python...@python.org
On Fri, 10 May 2013 12:30:15 -0700
Guido van Rossum <gu...@python.org> wrote:
> On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou <soli...@pitrou.net> wrote:
> > On Fri, 10 May 2013 11:48:51 -0700
> > Guido van Rossum <gu...@python.org> wrote:
> >> I just spent a few minutes staring at a bug caused by a missing comma
> >> -- I got a mysterious argument count error because instead of foo('a',
> >> 'b') I had written foo('a' 'b').
> >>
> >> This is a fairly common mistake, and IIRC at Google we even had a lint
> >> rule against this (there was also a Python dialect used for some
> >> specific purpose where this was explicitly forbidden).
> >>
> >> Now, with modern compiler technology, we can (and in fact do) evaluate
> >> compile-time string literal concatenation with the '+' operator, so
> >> there's really no reason to support 'a' 'b' any more. (The reason was
> >> always rather flimsy; I copied it from C but the reason why it's
> >> needed there doesn't really apply to Python, as it is mostly useful
> >> inside macros.)
> >>
> >> Would it be reasonable to start deprecating this and eventually remove
> >> it from the language?
> >
> > I'm rather -1. It's quite convenient and I don't want to add some '+'
> > signs everywhere I use it. I'm sure many people also have long string
> > literals out there and will have to endure the pain of a dull task to
> > "fix" their code.
>
> Fixing this is an easy task for lib2to3 though.

Assuming someone does it :-)

You may also have to "fix" other software. For example, I don't know if
gettext supports fetching literals from triple-quoted Python strings,
while it works with string continuations.

As for "+", saying it is a replacement is a bit simplified, because
the syntax definition (for method calls) or operator precedence (for
e.g. %-formatting) may force you to add parentheses.
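
A small sketch of the precedence point for method calls (the literals are arbitrary): adjacent literals are merged before anything else applies to them, whereas '+' binds more loosely than attribute access, so a mechanical rewrite changes the result unless parentheses are added.

```python
# Implicit concatenation: the merged literal 'abcdef' is upper-cased.
implicit = 'abc' 'def'.upper()

# Naive rewrite with '+': only 'def' is upper-cased.
naive = 'abc' + 'def'.upper()

# Parentheses restore the original meaning.
fixed = ('abc' + 'def').upper()
```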

Regards

Antoine.

Barry Warsaw

May 10, 2013, 3:41:16 PM
to python...@python.org
On May 10, 2013, at 09:28 PM, M.-A. Lemburg wrote:

>>> Would it be reasonable to start deprecating this and eventually remove
>>> it from the language?

I'm pretty mixed. OT1H, you're right, it's a common mistake and often *very*
hard to spot. A SyntaxWarning when it appears on a single line doesn't help
because I'm much more likely to forget a trailing comma in situations like:

files = [
    '/tmp/foo',
    '/etc/passwd'
    '/etc/group',
    '/var/cache',
    ]

(g'wan, spot the missing comma ;).
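
Running a list like the one above makes the problem visible. This is a sketch with the same four paths; nothing is raised, the list is simply one element short:

```python
files = [
    '/tmp/foo',
    '/etc/passwd'   # missing comma: merges with the next literal
    '/etc/group',
    '/var/cache',
]

count = len(files)    # one fewer element than intended
merged = files[1]     # the two paths glued together
```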

OTOH, doing things like:

>s = ('abc\n'
> 'def\n'
> 'ghi\n')
>t = ('some long paragraph spanning multiple lines in an editor, '
> 'without newlines')

Is pretty common in code I see all the time. I'm not sure why; I use it
occasionally, but only very rarely. A lot of folks like this style a lot
though from what I can tell.

>This looks and works much better than triple-quoted string literals,
>esp. when defining such string literals in indented code.

I also see this code a lot:

from textwrap import dedent

s = dedent("""\
    abc
    def
    ghi
    """)

I think having to deal with indentation could be a common reason why people
use implicit concatenation instead of TQS.
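
To make the comparison concrete, here is a sketch of the three spellings inside an indented block (the function name is invented for the example). The implicit-concatenation and dedent versions give the same string; the plain triple-quoted one keeps the source indentation.

```python
from textwrap import dedent

def build():
    # Implicit concatenation: no indentation leaks into the value.
    a = ('abc\n'
         'def\n'
         'ghi\n')
    # Triple-quoted string cleaned up with textwrap.dedent.
    b = dedent("""\
        abc
        def
        ghi
        """)
    # Plain triple-quoted string: leading spaces become part of the value.
    c = """\
        abc
        def
        ghi
        """
    return a, b, c

a, b, c = build()
```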

All things considered, I think the difficult-to-spot bugginess of implicit
concatenation outweighs the occasional convenience of it.

-Barry

M.-A. Lemburg

May 10, 2013, 3:46:44 PM
to Guido van Rossum, Antoine Pitrou, python...@python.org
On 10.05.2013 21:30, Guido van Rossum wrote:
> On Fri, May 10, 2013 at 12:16 PM, Antoine Pitrou <soli...@pitrou.net> wrote:
>> On Fri, 10 May 2013 11:48:51 -0700
>> Guido van Rossum <gu...@python.org> wrote:
>>> I just spent a few minutes staring at a bug caused by a missing comma
>>> -- I got a mysterious argument count error because instead of foo('a',
>>> 'b') I had written foo('a' 'b').
>>>
>>> This is a fairly common mistake, and IIRC at Google we even had a lint
>>> rule against this (there was also a Python dialect used for some
>>> specific purpose where this was explicitly forbidden).
>>>
>>> Now, with modern compiler technology, we can (and in fact do) evaluate
>>> compile-time string literal concatenation with the '+' operator, so
>>> there's really no reason to support 'a' 'b' any more. (The reason was
>>> always rather flimsy; I copied it from C but the reason why it's
>>> needed there doesn't really apply to Python, as it is mostly useful
>>> inside macros.)
>>>
>>> Would it be reasonable to start deprecating this and eventually remove
>>> it from the language?
>>
>> I'm rather -1. It's quite convenient and I don't want to add some '+'
>> signs everywhere I use it. I'm sure many people also have long string
>> literals out there and will have to endure the pain of a dull task to
>> "fix" their code.
>
> Fixing this is an easy task for lib2to3 though.

Think about code written to work in Python 2 and 3.

Python 2 would have to get the compile-time concatenation as well,
to prevent slow-downs due to run-time concatenation. And there would
have to be a tool to add the '+' signs and parens
to the Python 2 code...

s = ('my name is %s and '
     'I live on %s street' % ('foo', 'bar'))

-->

s = ('my name is %s and ' +
     'I live on %s street' % ('foo', 'bar'))

results in:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: not all arguments converted during string formatting

The second line is also a good example of how removing the feature
would introduce a new difficult to see error :-)
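
The failure is easy to reproduce. In the sketch below, % binds more tightly than +, so in the rewritten form it is applied to the second literal alone; wrapping the concatenation in its own parentheses restores the original behaviour.

```python
args = ('foo', 'bar')

# Original: the two literals are merged first, then formatted.
original = ('my name is %s and '
            'I live on %s street' % args)

# Naive '+' rewrite: '%' sees only the second literal, which has one
# placeholder for two arguments.
try:
    broken = ('my name is %s and ' +
              'I live on %s street' % args)
except TypeError as exc:
    message = str(exc)

# Extra parentheses around the concatenation fix it.
fixed = (('my name is %s and ' +
          'I live on %s street') % args)
```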

IMO, this issue is something for an editor or a lint tool to highlight,
not the Python compiler.

--
Marc-Andre Lemburg
eGenix.com


MRAB

May 10, 2013, 3:54:58 PM
to python-ideas
I'm not so sure.

Currently, parentheses, brackets and braces effectively make Python
ignore a newline within them.

(1
+2)

is the same as:

(1+2)

and:

[1
+2]

is the same as:

[1+2]

Under the proposal:

("a"
"b")

or:

("a" "b")

would be the same as:

("ab")

but:

["a"
"b"]

or:

["a" "b"]

would be a syntax error.

Michael Foord

May 10, 2013, 4:09:08 PM
to Antoine Pitrou, Python-Ideas
On 10 May 2013 20:16, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Fri, 10 May 2013 11:48:51 -0700
> Guido van Rossum <gu...@python.org> wrote:
> > I just spent a few minutes staring at a bug caused by a missing comma
> > -- I got a mysterious argument count error because instead of foo('a',
> > 'b') I had written foo('a' 'b').
> >
> > This is a fairly common mistake, and IIRC at Google we even had a lint
> > rule against this (there was also a Python dialect used for some
> > specific purpose where this was explicitly forbidden).
> >
> > Now, with modern compiler technology, we can (and in fact do) evaluate
> > compile-time string literal concatenation with the '+' operator, so
> > there's really no reason to support 'a' 'b' any more. (The reason was
> > always rather flimsy; I copied it from C but the reason why it's
> > needed there doesn't really apply to Python, as it is mostly useful
> > inside macros.)
> >
> > Would it be reasonable to start deprecating this and eventually remove
> > it from the language?
>
> I'm rather -1. It's quite convenient and I don't want to add some '+'
> signs everywhere I use it. I'm sure many people also have long string
> literals out there and will have to endure the pain of a dull task to
> "fix" their code.
>
> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
> the "continuation" is on the same line.



I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.

Michael
 
> Regards
>
> Antoine.





--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

Haoyi Li

May 10, 2013, 4:24:32 PM
to python-ideas
+1; I've been bitten by this many times.

As already mentioned, one big use case where this is useful is having multiline string literals without having all the annoying indentation leak into your code. I think this could be easily fixed with a convenient .dedent() or .strip_margin() function.

Ezio Melotti

May 10, 2013, 4:40:08 PM
to python-ideas
On Fri, May 10, 2013 at 10:54 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> Under the proposal:
>
> ("a"
> "b")
>
> or:
>
> ("a" "b")
>
> would be the same as:
>
> ("ab")
>
> but:
>
> ["a"
> "b"]
>
> or:
>
> ["a" "b"]
>
> would be a syntax error.
>

This would actually be fine with me. I use implicit string literal
concatenation mostly within (...), and even though I've seen (and
sometimes written) code like
    ['this is a '
     'long string',
     'this is another '
     'long string']
I agree that requiring extra (...) in this case is reasonable, i.e.:
    [('this is a '
      'long string'),
     ('this is another '
      'long string')]
The same would apply to other literals like {...} (for both sets and
dicts), and possibly for tuples too (assuming that it's possible to
figure out when a tuple is being created).

I also write code like:
    raise SomeException('this is a long message '
                        'that spans 2 lines')
or even:
    self.assertTrue(somefunc(), 'somefunc() returned '
                    'a false value and this is wrong')
In these cases I wouldn't like redundant (...) (or even worse extra
'+'s), especially for the first case.

I also think that forgetting a comma in a list of function args
between two string literal args is quite uncommon, whereas forgetting
it in a sequence of strings (list, set, dict, tuple) is much more
common, so this approach should cover most of the cases.

Best Regards,
Ezio Melotti

Serhiy Storchaka

May 10, 2013, 5:12:26 PM
to python...@python.org
10.05.13 23:40, Ezio Melotti wrote:

> I also think that forgetting a comma in a list of function args
> between two string literal args is quite uncommon, whereas forgetting
> it in a sequence of strings (list, set, dict, tuple) is much more
> common, so this approach should cover most of the cases.

Tuples.

Antonio Messina

May 10, 2013, 5:17:21 PM
to Ezio Melotti, python-ideas
My 2 cents: as a user, I often split very long text lines (mostly log
entries or exception messages) into multiple lines in order to stay
under 80chars (PEP8 docet), like:

log.warning("Configuration item '%s' was renamed to '%s',"
            " please change occurrences of '%s' to '%s'"
            " in configuration file '%s'.",
            oldkey, newkey, oldkey, newkey, filename)

This should become (if I understand the proposal) something like:

log.warning("Configuration item '%s' was renamed to " % oldkey +
            "'%s', please change occurrences of '%s'" % (newkey, oldkey) +
            " to '%s' in configuration file '%s'." % (newkey, filename))

but imagine what would happen if you have to rephrase the text, and
reorder the variables and fix the `+` signs...

On the other hand, I think I've only got the ``func("a" "b")`` error
once or twice in my life.

.a.


--
antonio....@gmail.com
antonio...@uzh.ch +41 (0)44 635 42 22
GC3: Grid Computing Competence Center http://www.gc3.uzh.ch/
University of Zurich
Winterthurerstrasse 190
CH-8057 Zurich Switzerland

Ian Cordasco

May 10, 2013, 5:53:10 PM
to Antonio Messina, python-ideas
On Fri, May 10, 2013 at 5:17 PM, Antonio Messina
<antonio....@gmail.com> wrote:
> My 2 cents: as an user, I often split very long text lines (mostly log
> entries or exception messages) into multiple lines in order to stay
> under 80chars (PEP8 docet), like:
>
> log.warning("Configuration item '%s' was renamed to '%s',"
> " please change occurrences of '%s' to '%s'"
> " in configuration file '%s'.",
> oldkey, newkey, oldkey, newkey, filename)

Actually it would just become

log.warning(("Configuration item '%s' was renamed to '%s'," +
" please change occurrences of '%s' to '%s'" +
" in configuration file '%s'."),
oldkey, newkey, oldkey, newkey, filename)

Perhaps without the inner set of parentheses. The issue of string
formatting wouldn't apply here since log.* does the formatting for
you. A more apt example of what they were talking about earlier is

s = ("foo %s bar bogus" % (var1)
"spam %s spam %s spam" % (var2, var3))

Would have to become

s = (("foo %s bar bogus" % (var1)) +
("spam %s spam %s spam" % (var2, var3)))

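Strictly speaking the inner parentheses are only there for clarity: %
binds more tightly than +, so each literal is formatted on its own before
the results are concatenated. The trap is the opposite case, where a single
% is meant to apply to the whole concatenated string.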

Antonio Messina

May 10, 2013, 6:00:16 PM
to Ian Cordasco, python-ideas
On Fri, May 10, 2013 at 11:53 PM, Ian Cordasco
<graffatc...@gmail.com> wrote:
> On Fri, May 10, 2013 at 5:17 PM, Antonio Messina
> <antonio....@gmail.com> wrote:
>> My 2 cents: as an user, I often split very long text lines (mostly log
>> entries or exception messages) into multiple lines in order to stay
>> under 80chars (PEP8 docet), like:
>>
>> log.warning("Configuration item '%s' was renamed to '%s',"
>> " please change occurrences of '%s' to '%s'"
>> " in configuration file '%s'.",
>> oldkey, newkey, oldkey, newkey, filename)
>
> Actually it would just become
>
> log.warning(("Configuration item '%s' was renamed to '%s'," +
> " please change occurrences of '%s' to '%s'" +
> " in configuration file '%s'."),
> oldkey, newkey, oldkey, newkey, filename)
>
> Perhaps without the inner set of parentheses. The issue of string
> formatting wouldn't apply here since log.* does the formatting for
> you. A more apt example of what they were talking about earlier is

You are right, I've picked up the wrong example. Please rephrase it
using "raise SomeException()" instead of "log.warning()", which is the
other case I often have to split the string over multiple lines:

raise ConfigurationError("Configuration item '%s' was renamed to '%s',"
                         " please change occurrences of '%s' to '%s'"
                         " in configuration file '%s'." %
                         (oldkey, newkey, oldkey, newkey, filename))

.a.


Juancarlo Añez

May 10, 2013, 6:07:24 PM
to Guido van Rossum, Antoine Pitrou, python...@python.org

On Fri, May 10, 2013 at 3:00 PM, Guido van Rossum <gu...@python.org> wrote:
> There are plenty of examples where the continuation isn't on the same
> line (some were already posted here).

+1

I've never used the feature and don't intend to.

A related annoyance is the trailing comma at the end of stuff (probably a leftover from a previous edit). For example:

def fun(a, b, c,):

Parses fine. But the one that has bitten me is the comma at the end of a line:

>>> x = 1,
>>> x
(1,)
>>> x == 1, # inconsistency?
(False,)
>>> x == (1,)
True
>>> y = a_very_long_call(param1, param2, param3), # this trailing comma is difficult to spot

I'd prefer that the syntax for creating one-tuples required the parentheses, and that trailing commas were disallowed.

Cheers,

--
Juancarlo Añez

Jonathan Eunice

May 10, 2013, 6:58:11 PM
to python...@python.org
Implicit string concatenation is one of those rare places where Python
turns an oddly blind eye to the "Explicit is better than implicit"
rule it otherwise loves.

I can't speak to how much inconvenience/breakage it would cause,
but deprecating it would seem to increase the language's "coherence"
with--or at least, adherence to--its principles.

But I have no real stake in it; I seldom if ever use the construct.

I prefer a trick I learned in Perl: Using a "here document" cleanup
function. This allows multi-line literal strings to be stated in a
program, at whatever level of indentation is appropriate for code
clarity, and the indentation to be automatically removed. For
example, using [textdata](https://pypi.python.org/pypi/textdata):

from textdata import *

data = lines("""
There was an old woman who lived in a shoe.
She had so many children, she didn't know what to do;
""")

will result in:

['There was an old woman who lived in a shoe.',
"She had so many children, she didn't know what to do;"]

This is dedented, but also had some blank lines at the start
and end removed (the blanks make Python formatting look good,
but might gunk up subsequent processing). `lines` can also
do what implicit concatenation does:

data = lines(join=True, text="""
this
that
""")

gives a purely concatenated result:

'thisthat'

The `join` kwarg, given a string, will join on that string. Some
edge-case options aside, being able to state indented literal strings
without fussing with line-by-line quoting is the jewel.

Especially if Python's implicit string concatenation is going to be
deprecated, you might consider adding an "ease of multi-line string
specification" function like `lines` to the standard library (in
`textwrap`?) to ease the passing.
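
As a rough idea of what such a helper could look like using only the standard library: the function below is a hypothetical sketch, not textdata's implementation and not an existing `textwrap` function. The name `dedent_lines` and the way its `join` parameter works (a separator string rather than `True`) are assumptions made for the example.

```python
from textwrap import dedent

def dedent_lines(text, join=None):
    """Dedent text, drop blank lines at both ends, and split into lines.

    If join is a string, return the lines joined on it instead of a list.
    """
    lines = dedent(text).splitlines()
    # Strip leading and trailing blank lines left by the triple quotes.
    while lines and not lines[0].strip():
        lines.pop(0)
    while lines and not lines[-1].strip():
        lines.pop()
    return lines if join is None else join.join(lines)

data = dedent_lines("""
    this
    that
    """)
joined = dedent_lines("""
    this
    that
    """, join="")
```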

Eli Bendersky

May 10, 2013, 7:27:32 PM
to Guido van Rossum, Python-Ideas
On Fri, May 10, 2013 at 11:48 AM, Guido van Rossum <gu...@python.org> wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

I would also be happy to see this error-prone syntax go (was bitten by it a number of times in the past), but I have a practical question here:

Realistically, what do "start deprecating" and "eventually remove" mean here? This is a significant backwards-compatibility breaking change that will *definitely* break existing code. So would it be removed just in "Python 4"? Or are you talking about an actual 3.x release like "deprecate in 3.4 and remove in 3.5"?

Eli

 

Guido van Rossum

May 10, 2013, 7:41:34 PM
to Eli Bendersky, Python-Ideas
It's probably common enough that we'd have to do a silent deprecation
in 3.4, a noisy deprecation in 3.5, and then delete it in 3.6, or so.
Or maybe even more conservative.

Plus we should work on a conversion tool that adds + and () as needed,
*and* tell authors of popular lint tools to add rules for this.

The hacky proposals for making it a syntax warning "sometimes" don't
feel right to me.

I do realize that this will break a lot of code, and that's the only
reason why we may end up punting on this, possibly until Python 4, or
forever. But I don't think the feature is defensible from a language
usability POV. It's just about backward compatibility at this point.

Philip Jenvey

May 10, 2013, 7:43:43 PM
to Michael Foord, Antoine Pitrou, Python-Ideas

On May 10, 2013, at 1:09 PM, Michael Foord wrote:

> On 10 May 2013 20:16, Antoine Pitrou <soli...@pitrou.net> wrote:
>
> I'm rather -1. It's quite convenient and I don't want to add some '+'
> signs everywhere I use it. I'm sure many people also have long string
> literals out there and will have to endure the pain of a dull task to
> "fix" their code.
>
> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
> the "continuation" is on the same line.
>
> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.

Strongly -1 on this proposal, I also use this quite often.

--
Philip Jenvey

Nick Coghlan

May 10, 2013, 7:51:28 PM5/10/13
to Guido van Rossum, Python-Ideas


On 11 May 2013 04:50, "Guido van Rossum" <gu...@python.org> wrote:
>
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

I could live with it if we get "dedent()" as a string method. I'd be even happier if constant folding was extended to platform independent method calls on literals, but I don't believe there's a sane way to maintain the "platform independent" constraint.

OTOH, it's almost on the scale of "remove string mod formatting". Shipping at least a basic linting tool in the stdlib would probably be almost as effective and substantially less disruptive. lib2to3 should provide some decent infrastructure for that.

Cheers,
Nick.
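
For illustration, a basic lint check along these lines can be sketched with the stdlib tokenize module. This is only a sketch under stated assumptions: the function name find_implicit_concatenations is made up for this example, and on Python 3.12 and later f-strings are tokenized differently, so they would not be caught here.

```python
import io
import tokenize


def find_implicit_concatenations(source):
    """Return (line, same_line) for each pair of adjacent string literals.

    Hypothetical helper, not an existing stdlib API. NL and COMMENT tokens
    are skipped so that continuations inside parentheses are still seen as
    adjacent; NEWLINE is kept because it separates statements.
    """
    hits = []
    previous = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NL, tokenize.COMMENT):
            continue
        if (tok.type == tokenize.STRING
                and previous is not None
                and previous.type == tokenize.STRING):
            # same_line is True for the suspicious foo('a' 'b') form.
            hits.append((tok.start[0], previous.end[0] == tok.start[0]))
        previous = tok
    return hits


sample = (
    "foo('a' 'b')\n"
    "x = ('long '\n"
    "     'string')\n"
    "y = ['a', 'b']\n"
)
report = find_implicit_concatenations(sample)
```

A tool could warn only where same_line is True, which roughly corresponds to the SyntaxWarning idea mentioned earlier in the thread, and leave multi-line continuations alone.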

Guido van Rossum

May 10, 2013, 7:57:02 PM5/10/13
to Nick Coghlan, Python-Ideas
Well, I think if you can live with

x = ('foo\n'
     'bar\n'
     'baz\n'
    )

I think you could live with

x = ('foo\n' +
     'bar\n' +
     'baz\n'
    )

as well... (Extra points if you figure out how to have a + on the last line too. :-)

So, as I said, it's not the convenience that matters, it's how much it is in use. :-(

--Guido
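
A small sketch of the two points above, for readers who want to try it: the implicit and explicit spellings produce the same string, and a missing comma silently merges two items. The function foo and the list contents are placeholders invented for this example.

```python
def foo(first, second):
    """Placeholder two-argument function, standing in for the original foo()."""
    return first + second


implicit = ('foo\n'
            'bar\n'
            'baz\n')
explicit = ('foo\n' +
            'bar\n' +
            'baz\n')

# Missing comma after 'beta': the list has two items, not three.
names = ['alpha', 'beta' 'gamma']

# Missing comma in a call: only one argument is actually passed.
try:
    foo('a' 'b')
except TypeError as exc:
    error = exc
```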

Ian Cordasco

May 10, 2013, 7:57:47 PM5/10/13
to Nick Coghlan, code-q...@python.org, Python-Ideas


On May 10, 2013 7:51 PM, "Nick Coghlan" <ncog...@gmail.com> wrote:
>
>
> On 11 May 2013 04:50, "Guido van Rossum" <gu...@python.org> wrote:
> >
> > I just spent a few minutes staring at a bug caused by a missing comma
> > -- I got a mysterious argument count error because instead of foo('a',
> > 'b') I had written foo('a' 'b').
> >
> > This is a fairly common mistake, and IIRC at Google we even had a lint
> > rule against this (there was also a Python dialect used for some
> > specific purpose where this was explicitly forbidden).
> >
> > Now, with modern compiler technology, we can (and in fact do) evaluate
> > compile-time string literal concatenation with the '+' operator, so
> > there's really no reason to support 'a' 'b' any more. (The reason was
> > always rather flimsy; I copied it from C but the reason why it's
> > needed there doesn't really apply to Python, as it is mostly useful
> > inside macros.)
> >
> > Would it be reasonable to start deprecating this and eventually remove
> > it from the language?
>
> I could live with it if we get "dedent()" as a string method. I'd be even happier if constant folding was extended to platform independent method calls on literals, but I don't believe there's a sane way to maintain the "platform independent" constraint.
>
> OTOH, it's almost on the scale of "remove string mod formatting". Shipping at least a basic linting tool in the stdlib would probably be almost as effective and substantially less disruptive. lib2to3 should provide some decent infrastructure for that.

I have cc'd the code-quality mailing list since several linter authors are subscribed there.

Alexander Belopolsky

May 10, 2013, 8:08:00 PM5/10/13
to Guido van Rossum, Python-Ideas

On Fri, May 10, 2013 at 7:57 PM, Guido van Rossum <gu...@python.org> wrote:
I think you could live with

x = ('foo\n' +
     'bar\n' +
     'baz\n'
    )

as well... (Extra points if you figure out how to have a + on the last line too. :-)

Does this earn a point?

x = (+ 'foo\n'
     + 'bar\n'
     + 'baz\n'
    )

:-))

Bruce Leban

May 10, 2013, 8:29:28 PM5/10/13
to Nick Coghlan, Python-Ideas
I got bit by this quite recently, leaving out a comma in a long list of strings and I only found the bug by accident.

This being python "ideas" I'll throw one out.

Add another prefix character to strings:

    a = [m'abc'
         'def']   # equivalent to ['abcdef']

A string with an m prefix is continued on one or more following lines. A string must have an m prefix to be continued (but this change would have to be phased in). A conversion tool need merely recognize the string continuations and insert m's. I chose the m character for multi-line but the character choice is available for bikeshedding. The m prefix can be combined with u and/or r but not with triple-quotes. The following are not allowed:

    b = ['abc'    # syntax error (m is required for continuation)
         'def']

    c = [m'abc']  # syntax error (when m is used, continuation lines must be present)

    d = [m'abc'
         m'def']  # syntax error (m only allowed for first string)

Cases c and d are prohibited to guard against comma errors with these forms. Consider these cases with missing or extra commas.

    e = [m'abc',  # extra comma causes syntax error
         'def']

    f = [m'abc'   # missing comma causes syntax error
         m'def',
         'ghi']

Yes, I know this doesn't guard against all comma errors. You could protect against more with prefix and suffix (e.g., an m at the end of the last string) but I'm skeptical it's worth it.

Conversion to this could be done in three stages:

(1) accept m's (case a), deprecate missing m's (case b), error for misused m's (cases c-f)
(2) warn on missing m's (case b)
(3) error on missing m's (case b)

--- Bruce
Latest blog post: Alice's Puzzle Page http://www.vroospeak.com
Learn how hackers think: http://j.mp/gruyere-security

Michael Mitchell

May 10, 2013, 8:36:02 PM5/10/13
to Alexander Belopolsky, Python-Ideas

On Fri, May 10, 2013 at 7:08 PM, Alexander Belopolsky <alexander....@gmail.com> wrote:
Does this earn a point?

x = (+ 'foo\n'
     + 'bar\n'
     + 'baz\n'
    )

Plus doesn't make sense as a unary operator on strings. 

x = ('foo\n' +
     'bar\n' + 
     'baz\n' +
     '')

This would work.

Greg Ewing

May 10, 2013, 8:43:48 PM5/10/13
to python...@python.org
Antoine Pitrou wrote:
> As for "+", saying it is a replacement is a bit simplified, because
> the syntax definition (for method calls) or operator precedence (for
> e.g. %-formatting) may force you to add parentheses.

Maybe we could turn ... into a "string continuation
operator":

print("This is example %d of a line that is "...
"too long" % example_number)

--
Greg

MRAB

May 10, 2013, 9:12:56 PM5/10/13
to python-ideas, python-ideas
It wouldn't help with:

f = [m'abc'
'def'
'ghi']

vs:

f = [m'abc'
'def',
'ghi']

I think I'd go more for a triple-quoted string with a prefix for
dedenting and removing newlines:

f = [m'''
abc
def
ghi
''']

where f == ['abcdefghi'].
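
For comparison, roughly the same result is available today at run time with textwrap.dedent. This is only a sketch: the helper name merge is made up here, and the proposed m''' prefix itself does not exist.

```python
import textwrap


def merge(block):
    """Dedent a triple-quoted block and join its lines with no separator.

    Hypothetical run-time stand-in for the proposed m''' prefix.
    """
    return "".join(textwrap.dedent(block).splitlines())


f = [merge('''
    abc
    def
    ghi
    ''')]
```

Unlike a prefix handled by the compiler, this does the work every time the expression is evaluated.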

Joao S. O. Bueno

May 10, 2013, 10:13:54 PM5/10/13
to python-ideas
Any chance that, along with this deprecation, there could come a syntax for
ignoring indentation space inside multiline strings?

Think of something along these lines:

logger.warn(I'''C: 'Ello, Miss?
Owner: What do you mean "miss"?
C: I'm sorry, I have a cold. I wish to make a complaint!
O: We're closin' for lunch.
C: Never mind that, my lad. I wish to complain about this
parrot what I purchased not half an hour ago from
this very boutique.\
''')

Against:

logger.warn('Owner: What do you mean "miss"?\n' +
            'C: I\'m sorry, I have a cold. I wish to make a complaint!\n' +
            'O: We\'re closin\' for lunch.\n' +
            'C: Never mind that, my lad. I wish to complain about this\n' +
            'parrot what I purchased not half an hour ago from this very boutique.\n'
            )

I know this suggestion has come and gone before, but it still looks
like a good idea to me.
There is no ambiguity: you either punch in enough spaces to get
your content aligned
with the i''' in the first line, or get a SyntaxError.

Stephen J. Turnbull

May 10, 2013, 11:31:06 PM5/10/13
to MRAB, python-ideas
MRAB writes:

> I think I'd go more for a triple-quoted string with a prefix for
> dedenting and removing newlines:
>
> f = [m'''
> abc
> def
> ghi
> ''']
>
> where f == ['abcdefghi'].

Cool enough, but

>>> f = [m'''
... abc
... def
... ghi
... ''']
>>> f == ['abc def ghi']
True

Worse,

>>> f = [m'''
... abc
... def
... ghi
... ''']
>>> f == ['abc def ghi']
True

Yikes! (Yeah, I know about consenting adults.)

Mark Janssen

May 10, 2013, 11:43:00 PM5/10/13
to Matt Chaput, python...@python.org
On Fri, May 10, 2013 at 11:50 AM, Matt Chaput <ma...@whoosh.ca> wrote:
> On 5/10/2013 2:48 PM, Guido van Rossum wrote:
>>
>> Would it be reasonable to start deprecating this and eventually remove
>> it from the language?
>
> Yes please! I've been bitten by the same issue more than once.

+1

-m

Nick Coghlan

May 10, 2013, 11:37:04 PM5/10/13
to Bruce Leban, Python-Ideas
On Sat, May 11, 2013 at 10:29 AM, Bruce Leban <br...@leapyear.org> wrote:
> I got bit by this quite recently, leaving out a comma in a long list of
> strings and I only found the bug by accident.
>
> This being python "ideas" I'll throw one out.
>
> Add another prefix character to strings:
>
> a = [m'abc'
> 'def'] # equivalent to ['abcdef']

As MRAB suggested, a prefix for a compile time dedent would likely be
more useful - then you'd just use a triple quoted string and be done
with it. The other one I occasionally wish for is a compile time
equivalent of str.split (if we had that, we likely wouldn't see APIs
like collections.namedtuple and enum.Enum accepting space separated
strings).

Amongst my ideas-so-farfetched-I-never-even-wrote-them-up (which is
saying something, given some of the ideas I *have* written up) is a
notation like:

!processor!"STRING LITERAL"

Where the compile time string processors had to be registered through
an appropriate API (probably in the sys module). Then you would just
define preprocessors like "merge" or "dedent" or "split" or "sh" or
"format" and get the appropriate compile time raw string->AST
translation.

So for this use case, you would do:

a = [!merge!"""\
abc
def"""]

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Mark Janssen

May 10, 2013, 11:50:56 PM5/10/13
to Greg Ewing, python...@python.org
> Maybe we could turn ... into a "string continuation
> operator":
>
> print("This is example %d of a line that is "...
> "too long" % example_number)

I think that is an awesome idea.

--
MarkJ
Tacoma, Washington

Raymond Hettinger

May 11, 2013, 12:04:56 AM5/11/13
to Guido van Rossum, Python-Ideas

On May 10, 2013, at 11:48 AM, Guido van Rossum <gu...@python.org> wrote:

Would it be reasonable to start deprecating this and eventually remove
it from the language?

I don't think it would be missed.  I very rarely see it used in practice.


Raymond

Philip Jenvey

May 11, 2013, 12:20:16 AM5/11/13
to Raymond Hettinger, Python-Ideas
Really? I see it used over multiple lines all over the place.

--
Philip Jenvey

Andrew Barnert

May 11, 2013, 1:05:55 AM5/11/13
to Mark Janssen, python...@python.org
On May 10, 2013, at 20:50, Mark Janssen <dreamin...@gmail.com> wrote:

>> Maybe we could turn ... into a "string continuation
>> operator":
>>
>> print("This is example %d of a line that is "...
>> "too long" % example_number)
>
> I think that is an awesome idea.

How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols).

Also, this gives two ways to do it, that have the exact same effect when they're both legal. The only difference is that the new way is only legal in a restricted set of cases.

By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples?

Random832

May 11, 2013, 1:11:44 AM5/11/13
to python...@python.org
On 05/11/2013 01:05 AM, Andrew Barnert wrote:
> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols).
>
> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples?
You just answered your own question. The reason it's better than + in
the same position, for those people, is that it would have higher
precedence than %.
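
The precedence point can be checked directly. In this sketch (the variable names are only illustrative), adjacent literals are joined before % is applied, while + binds more loosely than %, so the explicit form needs parentheses around the whole concatenation.

```python
example_number = 3

# Adjacent literals are merged first, then % formats the whole string.
implicit = ("This is example %d of a line that is "
            "too long" % example_number)

# With +, the concatenation must be parenthesized before applying %.
grouped = ("This is example %d of a line that is " +
           "too long") % example_number

# Without the extra parentheses, % applies only to "too long", which has
# no format specifier, so formatting fails.
try:
    broken = ("This is example %d of a line that is " +
              "too long" % example_number)
except TypeError as exc:
    failure = str(exc)
```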

Andrew Barnert

May 11, 2013, 1:12:58 AM5/11/13
to Nick Coghlan, Python-Ideas
On May 10, 2013, at 20:37, Nick Coghlan <ncog...@gmail.com> wrote:

> On Sat, May 11, 2013 at 10:29 AM, Bruce Leban <br...@leapyear.org> wrote:
>> I got bit by this quite recently, leaving out a comma in a long list of
>> strings and I only found the bug by accident.
>>
>> This being python "ideas" I'll throw one out.
>>
>> Add another prefix character to strings:
>>
>> a = [m'abc'
>> 'def'] # equivalent to ['abcdef']
>
> As MRAB suggested, a prefix for a compile time dedent would likely be
> more useful - then you'd just use a triple quoted string and be done
> with it. The other one I occasionally wish for is a compile time
> equivalent of str.split (if we had that, we likely wouldn't see APIs
> like collections.namedtuple and enum.Enum accepting space separated
> strings).

Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant?

If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants. (Doesn't the + optimization already make that assumption anyway?)

Stephen J. Turnbull

May 11, 2013, 1:16:00 AM5/11/13
to Mark Janssen, python...@python.org
Mark Janssen writes:
> > Maybe we could turn ... into a "string continuation
> > operator":
> >
> > print("This is example %d of a line that is "...
> > "too long" % example_number)
>
> I think that is an awesome idea.

Violates TOOWTDI.

>>> print("This is an" + # traditional explicit operator
... " %s idea." % ("awesome" if False else "unimpressive"))
This is an unimpressive idea.
>>>

already works (and always has AFAIK -- modulo the ternary operator, of
course).

Andrew Barnert

May 11, 2013, 1:15:48 AM5/11/13
to Random832, python...@python.org
On May 10, 2013, at 22:11, Random832 <rand...@fastmail.us> wrote:

> On 05/11/2013 01:05 AM, Andrew Barnert wrote:
>> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols).
>>
>> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples?
> You just answered your own question. The reason it's better than + in the same position, for those people, is that it would have higher precedence than %.

Ah, that makes sense.

Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier.

Also, doesn't this imply that ... is now an operator in some contexts, but a literal in others?

Georg Brandl

May 11, 2013, 1:24:39 AM5/11/13
to python...@python.org
Am 11.05.2013 01:43, schrieb Philip Jenvey:
>
> On May 10, 2013, at 1:09 PM, Michael Foord wrote:
>
>> On 10 May 2013 20:16, Antoine Pitrou <soli...@pitrou.net> wrote:
>>
>> I'm rather -1. It's quite convenient and I don't want to add some '+'
>> signs everywhere I use it. I'm sure many people also have long string
>> literals out there and will have to endure the pain of a dull task to
>> "fix" their code.
>>
>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
>> the "continuation" is on the same line.
>>
>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.
>
> Strongly -1 on this proposal, I also use this quite often.

-1 here. I use it a lot too, and find it very convenient, and while I could
live with the change, I think it should have been made together with the lot
of other syntax changes going to Python 3.

Georg

Random832

May 11, 2013, 1:30:56 AM5/11/13
to Andrew Barnert, python...@python.org
On 05/11/2013 01:15 AM, Andrew Barnert wrote:
> Ah, that makes sense.
>
> Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier.
>
Well, technically the same would apply to .format(), I guess.

Steven D'Aprano

May 11, 2013, 1:36:39 AM5/11/13
to python...@python.org
On 11/05/13 04:48, Guido van Rossum wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?


Not unless you guarantee that compile-time folding of string literals with '+' will be a language feature rather than an optional optimization.

I frequently use implicit string concatenation for long strings, or to keep within the 80 character line limit. I teach people to prefer it over '+' because:

- constant folding is an implementation detail that is not guaranteed, and not all versions of Python support;

- even when provided, constant folding is an optimization which might not be present in the future[1];

- implicit string concatenation is a language feature, so every Python must support it;

- and is nicer than the current alternatives involving backslashes or triple-quoted strings.


The problems caused by implicit string concatenation are uncommon and mild. Having two string literals immediately next to each other is uncommon; forgetting the comma makes it rarer. So I think the benefit of implicit string concatenation far outweighs the occasional problem.






[1] I recall you (GvR) publicly complaining about CPython optimizations and suggesting that they are more effort than they are worth and should be dropped. I don't recall whether you explicitly included constant folding in that.




--
Steven

Greg Ewing

May 11, 2013, 1:39:57 AM5/11/13
to python...@python.org
Andrew Barnert wrote:
> Except that % formatting is supposed to be one of those
> "we haven't deprecated it, but we will, so stop using it" features,
> so it seems a little odd to add new syntax to make % formatting easier.

Except that the same problem also occurs with .format() formatting.

>>> "a{}b" "c{}d" .format(1,2)
'a1bc2d'

but

>>> "a{}b" + "c{}d" .format(1,2)
'a{}bc1d'

so you need

>>> ("a{}b" + "c{}d") .format(1,2)
'a1bc2d'

> Also, doesn't this imply that ... is now an operator in some contexts,
> but a literal in others?

It would have different meanings in different contexts, yes.

But I wouldn't think of it as an operator, more as a token
indicating string continuation, in the same way that the
backslash indicates line continuation.

--
Greg

Random832

May 11, 2013, 1:44:30 AM5/11/13
to python...@python.org
On 05/11/2013 01:36 AM, Steven D'Aprano wrote:
> Not unless you guarantee that compile-time folding of string literals
> with '+' will be a language feature rather than an optional optimization.

What makes you think that implicit concatenation being compile-time
isn't optional?

Steven D'Aprano

May 11, 2013, 1:53:53 AM5/11/13
to python...@python.org
On 11/05/13 15:12, Andrew Barnert wrote:

> Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant?


String constants do not need to be concatenated only at import time.

Strings frequently need to be concatenated at run-time, or at function call time, or inside loops. For constants known at compile time, it is better to use a string literal rather than a string calculated at run-time for the same reason that it is better to write 2468 rather than 2000+400+60+8 -- because it better reflects the way we think about the program, not just because of the run-time expense of extra unnecessary additions/concatenations.


> If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants.

In principle, the keyhole optimizer could make that assumption. In practice, there is a limit to how much effort people put into the optimizer. Constant-folding method calls is probably past the point of diminishing returns.



--
Steven

Steven D'Aprano

May 11, 2013, 2:00:34 AM5/11/13
to python...@python.org
On 11/05/13 15:44, Random832 wrote:
> On 05/11/2013 01:36 AM, Steven D'Aprano wrote:
>> Not unless you guarantee that compile-time folding of string literals with '+' will be a language feature rather than an optional optimization.
>
> What makes you think that implicit concatenation being compile-time isn't optional?

http://docs.python.org/3/reference/lexical_analysis.html#string-literal-concatenation

In the sense that there is no ISO standard for Python, *everything* is optional if Guido decrees that it is. But compile-time implicit concatenation is a documented language feature, not an implementation-dependent optimization.



--
Steven
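
One way to see the distinction is to look at what the parser produces, before any optimizer runs. This sketch assumes Python 3.8 or later, where string literals appear as ast.Constant nodes; the variable names are only illustrative.

```python
import ast

# Adjacent literals are already a single constant in the parse tree.
implicit = ast.parse("'a' 'b'", mode="eval").body

# The + form is a binary operation at this stage; whether it is later
# folded into one constant is up to the implementation.
explicit = ast.parse("'a' + 'b'", mode="eval").body

implicit_is_constant = isinstance(implicit, ast.Constant)
explicit_is_binop = isinstance(explicit, ast.BinOp)
```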

Andrew Barnert

May 11, 2013, 3:13:27 AM5/11/13
to Steven D'Aprano, python...@python.org
On May 10, 2013, at 22:53, Steven D'Aprano <st...@pearwood.info> wrote:

> On 11/05/13 15:12, Andrew Barnert wrote:
>
>> Why does it need to be compile time? Do people really run into cases that frequently where the cost of concatenating or dedenting strings at import time is significant?
>
>
> String constants do not need to be concatenated only at import time.
>
> Strings frequently need to be concatenated at run-time, or at function call time, or inside loops. For constants known at compile time, it is better to use a string literal rather than a string calculated at run-time for the same reason that it is better to write 2468 rather than 2000+400+60+8 -- because it better reflects the way we think about the program, not just because of the run-time expense of extra unnecessary additions/concatenations.

Well, you have the choice of either:

count = 2000 + 400 + 60 + 8
for e in hugeiter:
    foo(e, count)

Or:

for e in hugeiter:
    foo(e, 2468) # 2000 + 400 + 60 + 8

And again, considering that the whole point of string concatenation is dealing with cases that are hard to fit into 80 cols otherwise, the former option is, if anything, even more appropriate.

>> If so, it seems like something more dramatic might be warranted, like allowing the compiler to assume that method calls on literals have the same effect at compile time as at runtime so it can turn them into constants.
>
> In principle, the keyhole optimizer could make that assumption. In practice, there is a limit to how much effort people put into the optimizer. Constant-folding method calls is probably past the point of diminishing returns.

Adding new optimizations just for the hell of it is obviously not a good idea. But we're talking about the cost of adding an optimization vs. adding a new type of auto-dedenting string literal. It seems like about the same scope either way, and the former doesn't require any changes to the grammar, docs, other implementations, etc.--or, more importantly, existing user code. And it might even improve other related cases.

If the problem is so important we're seriously considering changing the syntax, it seems a little unwarranted to reject the optimization out of hand. Or, contrarily, if the optimization is obviously not worth doing, changing the syntax to let people do the same optimization manually seems excessive.

Stefan Behnel

May 11, 2013, 5:37:54 AM5/11/13
to python...@python.org
Georg Brandl, 11.05.2013 07:24:
> Am 11.05.2013 01:43, schrieb Philip Jenvey:
>> On May 10, 2013, at 1:09 PM, Michael Foord wrote:
>>> On 10 May 2013 20:16, Antoine Pitrou wrote:
>>>
>>> I'm rather -1. It's quite convenient and I don't want to add some '+'
>>> signs everywhere I use it. I'm sure many people also have long string
>>> literals out there and will have to endure the pain of a dull task to
>>> "fix" their code.
>>>
>>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
>>> the "continuation" is on the same line.
>>>
>>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.
>>
>> Strongly -1 on this proposal, I also use this quite often.
>
> -1 here. I use it a lot too, and find it very convenient, and while I could
> live with the change, I think it should have been made together with the lot
> of other syntax changes going to Python 3.

I used to sort-of dislike it in the past and only recently started using it
more often, specifically for dealing with long string literals. I really
like it for that, although I've also been bitten by the "missing comma" bug.

I guess I'm -0.5 on removing it.

Stefan

Stefan Behnel

May 11, 2013, 5:42:31 AM5/11/13
to python...@python.org
Ezio Melotti, 10.05.2013 22:40:
> ['this is a '
> 'long string',
> 'this is another '
> 'long string']
> I agree that requiring extra (...) in this case is reasonable, i.e.:
> [('this is a '
> 'long string'),
> ('this is another '
> 'long string')]

-1, IMHO this makes it more verbose and thus harder to read, because it
takes a while to figure out that the parentheses are not meant to surround
tuples in this case, which would be the one obvious reason to spot them
inside of a list.

In a way, it's the reverse of the "spot the missing comma" problem, more
like a "spot that there's really no comma". That's just as bad, if you ask me.

Stefan Behnel

May 11, 2013, 6:00:36 AM5/11/13
to python...@python.org
Steven D'Aprano, 11.05.2013 07:53:
> In principle, the keyhole optimizer could make that assumption. In
> practice, there is a limit to how much effort people put into the
> optimizer. Constant-folding method calls is probably past the point of
> diminishing returns.

Plus, such an optimisation can have a downside. Contrived example:

if DEBUG:
    print('a'.replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')
             .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')
             .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa'))

Expanding this into a string literal will trade space for time, whereas the
original code clearly trades time for space. The same applies to string
splitting. A list of many short strings takes up more space than a split
call on one large string.

May not seem like a major concern in most cases that involve string
literals, but we shouldn't ignore the possibility that the author of the
code might have used the explicit method call quite deliberately.

Stefan
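
To put a rough number on the contrived example: each of the six replace() calls multiplies the length by eight, so a folded literal would hold 8**6 characters. The loop below is just a sketch that reproduces the chain at run time.

```python
expanded = 'a'
for _ in range(6):
    # Same effect as one .replace('a', 'aaaaaaaa') link in the chain above.
    expanded = expanded.replace('a', 'aaaaaaaa')

folded_size = len(expanded)  # 8 ** 6 == 262144 characters
```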

Mark Lawrence

May 11, 2013, 7:10:37 AM5/11/13
to python...@python.org
On 11/05/2013 06:15, Andrew Barnert wrote:
>
> Ah, that makes sense.
>
> Except that % formatting is supposed to be one of those "we haven't deprecated it, but we will, so stop using it" features, so it seems a little odd to add new syntax to make % formatting easier.
>

I don't think so, see
http://mail.python.org/pipermail/python-dev/2012-February/116790.html

--
If you're using GoogleCrap™ please read this
http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence

Stefan Drees

May 11, 2013, 7:43:53 AM5/11/13
to python...@python.org
Am 11.05.13 02:43, schrieb Greg Ewing:
> Antoine Pitrou wrote:
>> As for "+", saying it is a replacement is a bit simplified, because
>> the syntax definition (for method calls) or operator precedence (for
>> e.g. %-formatting) may force you to add parentheses.
>
> Maybe we could turn ... into a "string continuation
> operator":
>
> print("This is example %d of a line that is "...
> "too long" % example_number)
>

I am at least trying to follow the complete thread, so this is only
late feedback on this proposal from me:

The mysterious type [Ellipsis] comes to the rescue with all three of its
characters, helping to stay below 80 chars?

In this message I avoid adding or subtracting further numbers so as
not to overflow the result ;-) but I somehow like the current two possible
ways of doing "it": when manually migrating code, e.g. from PHP to
Python, I may either remove the dots or replace them with plus signs.

So I have a fast working migrated code base and then - while the clients
work with it - I have a more relaxed schedule to further clean it up.

[Ellipsis]: http://docs.python.org/3.3/reference/datamodel.html#index-8

All the best,
Stefan.

Serhiy Storchaka

May 11, 2013, 10:12:27 AM5/11/13
to python...@python.org
11.05.13 13:00, Stefan Behnel написав(ла):

> Plus, such an optimisation can have a downside. Contrived example:
>
> if DEBUG:
>     print('a'.replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')
>              .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa')
>              .replace('a', 'aaaaaaaa').replace('a', 'aaaaaaaa'))
>
> Expanding this into a string literal will trade space for time, whereas the
> original code clearly trades time for space. The same applies to string
> splitting. A list of many short strings takes up more space than a split
> call on one large string.
>
> May not seem like a major concern in most cases that involve string
> literals, but we shouldn't ignore the possibility that the author of the
> code might have used the explicit method call quite deliberately.

x = 0
if x:
    x = 9**9**9

Joao S. O. Bueno

May 11, 2013, 10:21:02 AM5/11/13
to Stephen J. Turnbull, python-ideas
Please - check my e-mail correctly

On 11 May 2013 00:31, Stephen J. Turnbull <ste...@xemacs.org> wrote:
> MRAB writes:
>
> > I think I'd go more for a triple-quoted string with a prefix for
> > dedenting and removing newlines:
> >
> > f = [m'''
> > abc
> > def
> > ghi
> > ''']
> >
I think the prefix idea is obvious, and I used the letter "i" in my message,
for "indented". It may indeed be a poor choice, since it may
sometimes go unnoticed so close to the quotes.

> > where f == ['abcdefghi'].
>
> Cool enough, but
>
>>>> f = [m'''
> ... abc
> ... def
> ... ghi
> ... ''']
>>>> f == ['abc def ghi']
> True

In my proposal, this would yield a SyntaxError: any contents of the
string would have to be
indented to the same level as the prefix. Sorry if that was not clear enough.

Serhiy Storchaka

May 11, 2013, 10:28:48 AM5/11/13
to python...@python.org
11.05.13 03:43, Greg Ewing wrote:

> Maybe we could turn ... into a "string continuation
> operator":
>
> print("This is example %d of a line that is "...
> "too long" % example_number)
>

Maybe "/"? ;)

João Bernardo

unread,
May 11, 2013, 11:03:09 AM5/11/13
to Guido van Rossum, Python-Ideas

> Would it be reasonable to start deprecating this and eventually remove
> it from the language?

-1... I find it very useful and clean for multiple lines and actually don't remember having bugs because of it.
It could be deprecated/removed when used on a single line though.


Kabie

unread,
May 11, 2013, 12:09:02 PM5/11/13
to Python-Ideas
+1 for this

I don't understand why anyone would want to use this on a single line of text.

But please keep it for multiple lines usage.


2013/5/11 João Bernardo <jbv...@gmail.com>

Christian Tismer

unread,
May 11, 2013, 12:37:54 PM5/11/13
to Stefan Behnel, python...@python.org
On 11.05.13 11:37, Stefan Behnel wrote:
> Georg Brandl, 11.05.2013 07:24:
>> Am 11.05.2013 01:43, schrieb Philip Jenvey:
>>> On May 10, 2013, at 1:09 PM, Michael Foord wrote:
>>>> On 10 May 2013 20:16, Antoine Pitrou wrote:
>>>>
>>>> I'm rather -1. It's quite convenient and I don't want to add some '+'
>>>> signs everywhere I use it. I'm sure many people also have long string
>>>> literals out there and will have to endure the pain of a dull task to
>>>> "fix" their code.
>>>>
>>>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
>>>> the "continuation" is on the same line.
>>>>
>>>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.
>>> Strongly -1 on this proposal, I also use this quite often.
>> -1 here. I use it a lot too, and find it very convenient, and while I could
>> live with the change, I think it should have been made together with the lot
>> of other syntax changes going to Python 3.
> I used to sort-of dislike it in the past and only recently started using it
> more often, specifically for dealing with long string literals. I really
> like it for that, although I've also been bitten by the "missing comma" bug.
>
> I guess I'm -0.5 on removing it.
>

I'm +1 on removing it, if it is combined with better indentation options
for triple-quoted strings.

So if there was some notation (not specified yet how) that triggers correct
indentation at compile time without extra functional hacks, so that

    long_text = """
        this text is left justified
          and this line indents by two spaces
        """

is stripped of the leading and trailing \n and the indentation is justified,
then I think the need for the implicit whitespace operator would be small.

cheers -- chris

--
Christian Tismer :^) <mailto:tis...@stackless.com>
Software Consulting : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/
14482 Potsdam : PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776 fax +49 (30) 700143-0023
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Juancarlo Añez

unread,
May 11, 2013, 12:39:35 PM5/11/13
to Python-Ideas
After reading about other people's use-cases, I'm now:

-1

I think that all that's required for solving Guido's original use case is a new warning in pylint, pep8, or flake. PEP8 could be updated to discourage the use of automatic concatenation in those places.

The warning would apply only to automatic concatenations within parameter passing and structures, and not to assignments or formatting through %.

Doing it this way would solve the use case by declaring certain uses of automatic concatenation a "code smell", and automating detection of the bad uses, without any changes to the language.

All that Guido needs to do is change PEP8, and wait for the static analyzers to follow.
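
To give a rough idea of how small such a check could be, here is a minimal sketch built on the standard `tokenize` module. The function name `find_implicit_concat` and the `same_line_only` flag are invented for this illustration; this is not how pylint, pep8 or any other existing tool does it, and unlike the rule described above it does not tell argument lists apart from assignments or % formatting.

```python
import io
import tokenize


def find_implicit_concat(source, same_line_only=False):
    """Return the (row, col) start of every string literal that directly
    follows another string literal with no operator or comma in between."""
    ignorable = {tokenize.NL, tokenize.COMMENT}
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in ignorable:
            # Line breaks inside brackets and comments do not separate literals.
            continue
        if (prev is not None
                and prev.type == tokenize.STRING
                and tok.type == tokenize.STRING
                and (not same_line_only or prev.end[0] == tok.start[0])):
            hits.append(tok.start)
        prev = tok
    return hits


# The missing-comma call from the start of the thread:
print(find_implicit_concat("foo('a' 'b')\n"))
# A literal deliberately split over two lines, ignored with same_line_only=True:
print(find_implicit_concat("x = ['a',\n     'b'\n     'c']\n", same_line_only=True))
```

With `same_line_only=True` the check only reports literals that sit next to each other on one line, which is the narrower rule several people in this thread said they could live with.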

Cheers,

--
Juancarlo Añez

Nick Coghlan

unread,
May 11, 2013, 12:48:30 PM5/11/13
to Christian Tismer, Stefan Behnel, python...@python.org
On Sun, May 12, 2013 at 2:37 AM, Christian Tismer <tis...@stackless.com> wrote:

> So if there was some notation (not specified yet how) that triggers correct
> indentation at compile time without extra functional hacks, so that
>
> long_text = """
> this text is left justified
> and this line indents by two spaces
> """
>
> is stripped the leading and trailing \n and indentation is justified,
> then I think the need for the implicit whitespace operator would be small.

Through participating in this thread, I've realised that the
distinction between when I use a triple quoted string (with or without
textwrap.dedent()) and when I use implicit string concatenation is
whether or not I want the newlines in the result. Often I can avoid
the issue entirely by splitting a statement into multiple pieces, but

I think Guido's right that if we didn't have implicit string
concatenation there's no way we would add it ("just use a triple
quoted string with escaped newlines" or "just use runtime string
concatenation"), but given that we *do* have it, I don't think it's
worth the hassle of removing it over a bug that a lint program should
be able to pick up.
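
A small sketch of that distinction, using only what exists today (`textwrap.dedent()` runs at run time, not at compile time; the variable names are just for illustration):

```python
import textwrap

# Triple-quoted string: the newlines are part of the result;
# textwrap.dedent() removes the common leading whitespace at run time.
with_newlines = textwrap.dedent("""\
    this text is left justified
      and this line indents by two spaces
    """)

# Implicit concatenation: one logical string, no newlines added.
without_newlines = ("this text is split across source lines "
                    "but contains no line breaks")

print(repr(with_newlines))
print(repr(without_newlines))
```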

So I'm back to where I started, which is that if this kind of problem
really bothers anyone, start thinking seriously about the idea of a
standard library linter.

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Nick Coghlan

unread,
May 11, 2013, 12:49:14 PM5/11/13
to Christian Tismer, Stefan Behnel, python...@python.org
On Sun, May 12, 2013 at 2:48 AM, Nick Coghlan <ncog...@gmail.com> wrote:
> On Sun, May 12, 2013 at 2:37 AM, Christian Tismer <tis...@stackless.com> wrote:
>
>> So if there was some notation (not specified yet how) that triggers correct
>> indentation at compile time without extra functional hacks, so that
>>
>> long_text = """
>> this text is left justified
>> and this line indents by two spaces
>> """
>>
>> is stripped the leading and trailing \n and indentation is justified,
>> then I think the need for the implicit whitespace operator would be small.
>
> Through participating in this thread, I've realised that the
> distinction between when I use a triple quoted string (with or without
> textwrap.dedent()) and when I use implicit string concatenation is
> whether or not I want the newlines in the result. Often I can avoid
> the issue entirely by splitting a statement into multiple pieces, but

... not always.

(Sorry, got distracted and left the sentence unfinished).

MRAB

unread,
May 11, 2013, 12:55:43 PM5/11/13
to python-ideas
On 11/05/2013 04:37, Nick Coghlan wrote:
> On Sat, May 11, 2013 at 10:29 AM, Bruce Leban <br...@leapyear.org> wrote:
>> I got bit by this quite recently, leaving out a comma in a long list of
>> strings and I only found the bug by accident.
>>
>> This being python "ideas" I'll throw one out.
>>
>> Add another prefix character to strings:
>>
>> a = [m'abc'
>> 'def'] # equivalent to ['abcdef']
>
> As MRAB suggested, a prefix for a compile time dedent would likely be
> more useful - then you'd just use a triple quoted string and be done
> with it. The other one I occasionally wish for is a compile time
> equivalent of str.split (if we had that, we likely wouldn't see APIs
> like collections.namedtuple and enum.Enum accepting space separated
> strings).
>
> Amongst my ideas-so-farfetched-I-never-even-wrote-them-up (which is
> saying something, given some of the ideas I *have* written up) is a
> notation like:
>
> !processor!"STRING LITERAL"
>
> Where the compile time string processors had to be registered through
> an appropriate API (probably in the sys module). Then you would just
> define preprocessors like "merge" or "dedent" or "split" or "sh" or
> "format" and get the appropriate compile time raw string->AST
> translation.
>
> So for this use case, you would do:
>
> a = [!merge!"""\
> abc
> def"""
>
Do you really need the "!"? String literals can already have a prefix,
such as "r".

At compile time, the string literal could be preprocessed according to
its prefix (some kind of import hook, working on the AST?). The current
prefixes are "" (plain literal), "r", "b", "u", etc.

Christian Tismer

unread,
May 11, 2013, 1:05:05 PM5/11/13
to Nick Coghlan, Python-Ideas
Ah, I see we are on the same path here.
Just not sure if it is right to move into a compile-time preprocessor
language or to just handle the most common cases with a simple
prefix?
One example is code snippets which need proper de-indentation.
I think a simple stripping of white-space in

    text = s"""
        leftmost column
          two-char indent
        """

would solve 95 % of common indentation and concatenation cases.
I don't think provision for merging is needed very often.
If text occurs deeply nested in code, then it is also quite likely to
be part of an expression, anyway.
My major use-case is text constants in a class or function that
is multiple lines long and should be statically ready to use without
calling a function.

(here an 's' as a strip prefix, but I'm not sold on that)

cheers - chris

--
Christian Tismer :^) <mailto:tis...@stackless.com>
Software Consulting : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/
14482 Potsdam : PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776 fax +49 (30) 700143-0023
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

M.-A. Lemburg

unread,
May 11, 2013, 1:24:02 PM5/11/13
to Christian Tismer, Nick Coghlan, Python-Ideas
On 11.05.2013 19:05, Christian Tismer wrote:
> I think a simple stripping of white-space in
>
> text = s"""
> leftmost column
> two-char indent
> """
>
> would solve 95 % of common indentation and concatenation cases.
> I don't think provision for merging is needed very often.
> If text occurs deeply nested in code, then it is also quite likely to
> be part of an expression, anyway.
> My major use-case is text constants in a class or function that
> is multiple lines long and should be statically ready to use without
> calling a function.
>
> (here an 's' as a strip prefix, but I'm not sold on that)

This is not a good solution for long lines where you don't want to
have embedded line endings. Taken from existing code:

_litmonth = ('(?P<litmonth>'
             'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|'
             'mär|mae|mrz|mai|okt|dez|'
             'fev|avr|juin|juil|aou|aoû|déc|'
             'ene|abr|ago|dic|'
             'out'
             ')[a-z,\.;]*')

or
                    raise errors.DataError(
                        'Inconsistent revenue item currency: '
                        'transaction=%r; transaction_position=%r' %
                        (transaction, transaction_position))

We usually try to keep the code line length under 80 chars,
so splitting literals in that way is rather common, esp. in
nested code paths.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 11 2013)
>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46
2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

Ian Cordasco

unread,
May 11, 2013, 2:18:45 PM5/11/13
to Nick Coghlan, code-q...@python.org, python...@python.org
On Sat, May 11, 2013 at 12:48 PM, Nick Coghlan <ncog...@gmail.com> wrote:
> On Sun, May 12, 2013 at 2:37 AM, Christian Tismer <tis...@stackless.com> wrote:
>
>> So if there was some notation (not specified yet how) that triggers correct
>> indentation at compile time without extra functional hacks, so that
>>
>> long_text = """
>> this text is left justified
>> and this line indents by two spaces
>> """
>>
>> is stripped the leading and trailing \n and indentation is justified,
>> then I think the need for the implicit whitespace operator would be small.
>
> Through participating in this thread, I've realised that the
> distinction between when I use a triple quoted string (with or without
> textwrap.dedent()) and when I use implicit string concatenation is
> whether or not I want the newlines in the result. Often I can avoid
> the issue entirely by splitting a statement into multiple pieces, but
>
> I think Guido's right that if we didn't have implicit string
> concatenation there's no way we would add it ("just use a triple
> quoted string with escaped newlines" or "just use runtime string
> concatenation"), but given that we *do* have it, I don't think it's
> worth the hassle of removing it over a bug that a lint program should
> be able to pick up.
>
> So I'm back to where I started, which is that if this kind of problem
> really bothers anyone, start thinking seriously about the idea of a
> standard library linter.

Really this should be trivial for all of the linters that already
exist. That aside, (and this is not an endorsement for this proposal)
but can you not just do

long_text = """\
this is left justified \
and this is continued on the same line
  and this is indented by two spaces
"""

I'm personally in favor of not allowing the concatenation to be on the
same line but allowing it across multiple lines. While linters would
be great for this, why not just introduce the SyntaxError since (as
has already been demonstrated) some of the concatenation already
happens at compile time.
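
For reference, a quick check of what the backslash-newline form evaluates to today. Note that the text has to start in column 0 for this to work, which is the indentation problem the dedent proposals in this thread are trying to solve:

```python
long_text = """\
this is left justified \
and this is continued on the same line
  and this is indented by two spaces
"""

# Each backslash at the end of a line swallows the newline that follows it.
print(repr(long_text))
```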

Christian Tismer

unread,
May 11, 2013, 2:37:12 PM5/11/13
to M.-A. Lemburg, Python-Ideas

Your first example is a regex, which could be used as-is.

Your second example is indented five levels deep. That is a coding
style which I would propose to write differently for better readability.
And if you stick with it, why not use the "+"?

I want to support constant strings, which should not be somewhere
in the middle of code. Your second example is computed, anyway,
not the case that I want to solve.

cheers - chris

--
Christian Tismer :^) <mailto:tis...@stackless.com>
Software Consulting : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/
14482 Potsdam : PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776 fax +49 (30) 700143-0023
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/


Nick Coghlan

unread,
May 11, 2013, 2:51:09 PM5/11/13
to python-ideas
On Sun, May 12, 2013 at 2:55 AM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> Do you really need the "!"? String literals can already have a prefix,
> such as "r".
>
> At compile time, the string literal could be preprocessed according to
> its prefix (some kind of import hook, working on the AST?). The current
> prefixes are "" (plain literal), "r", "b", "u", etc.

1. Short prefixes are inherently cryptic (especially single letter ones)
2. The existing prefixes control how the source code is converted to a
string, they don't permit conversion to a completely different
construct
3. Short prefixes are not extensible and rapidly run into namespacing issues

As noted, I prefer not to solve this problem at all (and add a basic
lint capability instead). However, if we do try to solve it, then I'd
prefer a syntax that adds a general extensible capability rather than
one that piles additional complications on the existing string prefix
mess.

If we support dedent, do we also support merging adjacent whitespace
characters into a single string? Do we support splitting a string? Do
we support upper case or lower case or taking its length?

Two responses make sense to me: accept the status quo (perhaps with
linter support), or design and champion a general compile time string
processing capability (that doesn't rely on encoding tricks or a
custom import hook). Expanding on the already cryptic string prefix
system does *not* strike me as a reasonable idea at all.

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Mark Janssen

unread,
May 11, 2013, 2:52:15 PM5/11/13
to Andrew Barnert, python...@python.org
>>> Maybe we could turn ... into a "string continuation
>>> operator":
>>>
>>> print("This is example %d of a line that is "...
>>> "too long" % example_number)
>>
>> I think that is an awesome idea.
>
> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols).

It partitions the conceptual space. "+" is a mathematical operator,
but strings are not numbers. That's the negative argument for it.
The positive, further, argument is that the ellipsis has a long history
of being a continuation indicator in text.

> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples?

An interesting correlation indeed.

--
MarkJ
Tacoma, Washington

Ian Cordasco

unread,
May 11, 2013, 2:57:49 PM5/11/13
to Mark Janssen, python...@python.org
On Sat, May 11, 2013 at 2:52 PM, Mark Janssen <dreamin...@gmail.com> wrote:
>>>> Maybe we could turn ... into a "string continuation
>>>> operator":
>>>>
>>>> print("This is example %d of a line that is "...
>>>> "too long" % example_number)
>>>
>>> I think that is an awesome idea.
>>
>> How is this any better than + in the same position? It's harder to notice, and longer (remember that the only reason you're doing this is that you can't fit your strings into 80 cols).
>
> It partitions the conceptual space. "+" is a mathematical operator,
> but strings are not numbers. That's the negative argument for it.
> The positive, further, argument is that the ellipsis has a long history
> of being a continuation indicator in text.

But + is already a supported operation on strings and has been since
at least python 2. It is already there and it doesn't require a new
dunder method for concatenating with the Ellipsis object. It's also
relatively fast and already performed at compile time. If we're going
to remove this implicit concatenation, why do we have to add a fancy
new feature that's non-obvious and going to need extra implementation?

>> By the way, is it just a coincidence that almost all of the people sticking up for keeping or replacing implicit concatenation instead of just scrapping it are using % formatting in their examples?
>
> An interesting correlation indeed.

Albeit one that is probably unrelated. I use str.format everywhere
(mostly because I don't support python 2.5 in most of my work) and I'm
against it. I just haven't given examples against it because others
have already presented examples that I would have provided.

Mark Janssen

unread,
May 11, 2013, 3:22:58 PM5/11/13
to Stephen J. Turnbull, python...@python.org
> > I think that is an awesome idea.
>
> Violates TOOWTDI.
>
> >>> print("This is an" + # traditional explicit operator
> ... " %s idea." % ("awesome" if False else "unimpressive"))
> This is an unimpressive idea.
> >>>

But you see you just helped me demonstrate my point: the Python
interpreter *itself* uses ... as a line-continuation operator!

Also, it won't violate TOOWTDI if the "+" operator is deprecated for
strings. Strings are different from numbers anyway, it's an old
habit/wart to use "+" for them.

*moving out of the way* :))
--
MarkJ
Tacoma, Washington

Ian Cordasco

unread,
May 11, 2013, 3:27:51 PM5/11/13
to Mark Janssen, python-ideas
On Sat, May 11, 2013 at 3:22 PM, Mark Janssen <dreamin...@gmail.com> wrote:
>> > I think that is an awesome idea.
>>
>> Violates TOOWTDI.
>>
>> >>> print("This is an" + # traditional explicit operator
>> ... " %s idea." % ("awesome" if False else "unimpressive"))
>> This is an unimpressive idea.
>> >>>
>
> But you see you just helped me demonstrate my point: the Python
> interpreter *itself* uses ... as a line-continuation operator!

It also uses it when you define a class or function, should those
declarations use Ellipsis everywhere too? (For reference:

>>> class A:
...     a = 1
...     def __init__(self, **kwargs):
...         for k, v in kwargs.items():
...             if k != 'a':
...                 setattr(self, k, v)
...
>>> i = A()

But this is getting off-topic and the question is purely rhetorical.)

--
Ian

M.-A. Lemburg

unread,
May 11, 2013, 5:14:14 PM5/11/13
to Christian Tismer, Python-Ideas

You're not addressing the main point I was trying to make :-)

Triple-quoted strings work for strings that are supposed to
have embedded newlines, but they don't provide a good alternative
for long strings without embedded newlines.

Regarding using '+' in these cases: of course that would be
possible, but it clutters up the code, often requires additional
parens, it's slower and can lead to other weird errors when
forgetting parens, which are not much different than the one
Guido mentioned in his original email.
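
A minimal sketch of the kind of error meant here, assuming % formatting as in the example above (the format strings are shortened and the values 1 and 2 are placeholders):

```python
# Implicit concatenation happens before % is applied,
# so the whole literal is formatted:
implicit = ('transaction=%r; '
            'position=%r' % (1, 2))

# With '+', % binds tighter than +, so only the second piece is formatted
# and the two-element tuple no longer matches its single %r:
try:
    explicit = ('transaction=%r; ' +
                'position=%r' % (1, 2))
except TypeError as exc:
    explicit = exc

# Extra parentheses restore the intended grouping:
fixed = ('transaction=%r; ' +
         'position=%r') % (1, 2)

print(implicit)
print(repr(explicit))
print(fixed)
```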

In all the years I've been writing Python, I've only very rarely
had an issue with missing commas between strings. Most cases I
ran into were missing commas in lists of tuples, not strings:

l = [
    'detect_target_type',
    (None, Is, '"', +1, 'double_quoted_target')
    (None, Is, '\'', +1, 'single_quoted_target'),
    (None, IsIn, separators, 'unquoted_target', 'empty_target'),
    ]

This gives:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
TypeError: 'tuple' object is not callable

:-)

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 11 2013)
>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46
2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

Philip Jenvey

unread,
May 11, 2013, 5:23:45 PM5/11/13
to Georg Brandl, python...@python.org

On May 10, 2013, at 10:24 PM, Georg Brandl wrote:

> Am 11.05.2013 01:43, schrieb Philip Jenvey:
>>
>> On May 10, 2013, at 1:09 PM, Michael Foord wrote:
>>
>>> On 10 May 2013 20:16, Antoine Pitrou <soli...@pitrou.net> wrote:
>>>
>>> I'm rather -1. It's quite convenient and I don't want to add some '+'
>>> signs everywhere I use it. I'm sure many people also have long string
>>> literals out there and will have to endure the pain of a dull task to
>>> "fix" their code.
>>>
>>> However, in your case, foo('a' 'b') could raise a SyntaxWarning, since
>>> the "continuation" is on the same line.
>>>
>>> I'm with Antoine. I love using implicit concatenation for splitting long literals across multiple lines.
>>
>> Strongly -1 on this proposal, I also use this quite often.
>
> -1 here. I use it a lot too, and find it very convenient, and while I could
> live with the change, I think it should have been made together with the lot
> of other syntax changes going to Python 3.

Also note that it was already proposed and rejected for Python 3.

http://www.python.org/dev/peps/pep-3126

--
Philip Jenvey

Ron Adam

unread,
May 11, 2013, 6:19:14 PM5/11/13
to python...@python.org

Greg, I meant to send my reply earlier to the list.


On 05/11/2013 12:39 AM, Greg Ewing wrote:
>> Also, doesn't this imply that ... is now an operator in some contexts,
> > but a literal in others?

Could its use as a literal be deprecated? I haven't seen it used that way
except in examples.


> It would have different meanings in different contexts, yes.
>
> But I wouldn't think of it as an operator, more as a token
> indicating string continuation, in the same way that the
> backslash indicates line continuation.

Yep, it would be a token that the tokenizer would handle. So it would be
handled before anything else just as the line continuation '\' is. After
the file is tokenized, it is removed and won't interfere with anything else.

It could be limited to strings, or expanded to include numbers and possibly
other literals.

    a = "a long text line "...
        "that is continued "...
        "on several lines."

    pi = 3.1415926535...
         8979323846...
         2643383279

You can't do this with a line continuation '\'.


Another option would be to have dedented multi-line string tokens |""" and
|'''. Not too different than r""" or b""".

    s = |"""Multi line string
        |
        |paragraph 1
        |
        |paragraph 2
        |"""

    a = |"""\
        |a long text line \
        |that is continued \
        |on several lines.\
        |"""

The rule for this is: for strings that start with |""" or |''', each
following line needs to be preceded by whitespace + '|', until the
closing quote is reached. The tokenizer would just find and remove them as
it comes across them. Any '|' on a line after the first '|' would be
unaffected, so they don't need to be escaped.
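
The rule can be approximated at run time to see what the result would look like. This is only a sketch: the helper name `strip_margin` is made up, and the proposal itself is for the tokenizer to do this, not a function call.

```python
import re


def strip_margin(text):
    """On every line after the first, drop the leading whitespace and the
    first '|'. Later '|' characters on the same line are left alone."""
    first, *rest = text.split("\n")
    cleaned = [re.sub(r"^\s*\|", "", line, count=1) for line in rest]
    return "\n".join([first] + cleaned)


s = strip_margin("""Multi line string
    |
    |paragraph 1
    |a | b
    |""")
print(repr(s))
```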

It's a very explicit syntax. It's very obvious what is part of the string
and what isn't. Something like this would end the endless debate on
dedents. That alone might be worth it. ;-)

I know the | is also a binary 'or' operator, but its use for that is in a
different context, so I don't think it would be a problem.

Both of these options would be implemented in the tokenizer and are really
just tools for formatting source code rather than actual additions or
changes to the language.

Cheers,
Ron

Greg Ewing

unread,
May 11, 2013, 7:46:30 PM5/11/13
to python...@python.org
Someone wrote:

> By the way, is it just a coincidence that almost all of the people sticking up
> for keeping or replacing implicit concatenation instead of just scrapping it are
> using % formatting in their examples?

In my case this is because it's the context in which I use
this feature most often.

--
Greg

Greg Ewing

unread,
May 11, 2013, 7:55:34 PM5/11/13
to python...@python.org
Ian Cordasco wrote:
> On Sat, May 11, 2013 at 2:52 PM, Mark Janssen <dreamin...@gmail.com> wrote:

>>It partitions the conceptual space. "+" is a mathematical operator,
>>but strings are not numbers.
>
> But + is already a supported operation on strings

I still think about these two kinds of concatenation in
different ways, though. When I use implicit concatenation,
I don't think in terms of taking two strings and joining
them together. I'm just writing a single string literal
that happens to span two source lines.

I believe that distinguishing them visually helps
readability. Using + for both makes things look more
complicated than they really are.

--
Greg

Greg Ewing

unread,
May 11, 2013, 7:59:17 PM5/11/13
to python...@python.org
Ian Cordasco wrote:
> But + is already a supported operation on strings and has been since
> at least python 2. It is already there and it doesn't require a new
> dunder method for concatenating with the Ellipsis object.

There would be no dunder method, because it's not a
run-time operation. It's a syntax for writing a string
literal that spans more than one line. Using it
between any two things that are not string literals
would be a syntax error.

--
Greg

Greg Ewing

unread,
May 11, 2013, 8:11:52 PM5/11/13
to python...@python.org
Mark Janssen wrote:
> Strings are different from numbers anyway, it's an old
> habit/wart to use "+" for them.
>
> *moving out of the way* :))

/me throws a dictionary at Mark Janssen with a bookmark
at the entry for "plus", showing that its usage in English
is much wider than it is in mathematics.

--
Greg

Stephen J. Turnbull

unread,
May 11, 2013, 8:30:04 PM5/11/13
to Mark Janssen, python...@python.org
Mark Janssen writes:

> > > I think that is an awesome idea.
> >
> > Violates TOOWTDI.
> >
> > >>> print("This is an" + # traditional explicit operator
> > ... " %s idea." % ("awesome" if False else "unimpressive"))
> > This is an unimpressive idea.
> > >>>
>
> But you see you just helped me demonstrate my point: the Python
> interpreter *itself* uses ... as a line-continuation operator!

No, it doesn't. It's a (physical) line *separator* there. This:

>>> "This is a syntax" +
  File "<stdin>", line 1
    "this is a syntax " +
                        ^
SyntaxError: invalid syntax
>>>

is a syntax error. If "... " were a line continuation, it would
be a prompt for the rest of the line, but you never get there.

> Also, it won't violate TOOWTDI if the "+" operator is deprecated for
> strings. Strings are different from numbers anyway, it's an old
> habit/wart to use "+" for them.

They're both just mathematical objects that have operations defined on
them. Although in math we usually express multiplication by
juxtaposition, I personally think EIBTI applies here. Ie, IMO using
"+" makes a lot of sense although the precedence argument is a good
one (but not good enough for introducing another operator, especially
using a symbol that already has a different syntactic meaning).

I think it's pretty clear that deprecating compile-time concatenation
by juxtaposition would be massively unpopular, so the deprecation
should be delayed until there's a truly attractive alternative.

I think the various proposals for a dedenting syntax come close, but
there remains too much resistance for my taste, and I suspect Guido
won't push it. I also agree with those who think that it probably
should wait for Python 4, given that it was apparently considered and
rejected for Python 3.

Antoine Pitrou

unread,
May 11, 2013, 8:33:54 PM5/11/13
to python...@python.org
On Sun, 12 May 2013 11:55:34 +1200
Greg Ewing <greg....@canterbury.ac.nz> wrote:
> Ian Cordasco wrote:
> > On Sat, May 11, 2013 at 2:52 PM, Mark Janssen <dreamin...@gmail.com> wrote:
>
> >>It partitions the conceptual space. "+" is a mathematical operator,
> >>but strings are not numbers.
> >
> > But + is already a supported operation on strings
>
> I still think about these two kinds of concatenation in
> different ways, though. When I use implicit concatenation,
> I don't think in terms of taking two strings and joining
> them together. I'm just writing a single string literal
> that happens to span two source lines.
>
> I believe that distinguishing them visually helps
> readability. Using + for both makes things look more
> complicated than they really are.

Agreed.

Regards

Antoine.

Stephen J. Turnbull

unread,
May 11, 2013, 11:10:21 PM5/11/13
to Antoine Pitrou, python...@python.org
Antoine Pitrou writes:
> Greg Ewing <greg....@canterbury.ac.nz> wrote:

> > I believe that distinguishing them visually helps
> > readability. Using + for both makes things look more
> > complicated than they really are.

> Agreed.

In principle, I'm with Guido on this one. TOOWTDI and EIBTI weigh
heavily with me, and I have been bitten by the "sequence of strings
ends with no comma" bug more than once (though never twice in one day
;-). Nor do I really care whether concatenation is a runtime or
compile-time operation. But vox populi is deafening....

BTW, I see no reason not to optimize "'a' + 'b'", as you can always
force runtime evaluation with "''.join(['a','b'])" (which looks insane
here, but probably wouldn't in a case where forcing runtime evaluation
was useful).
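
One way to see the difference is to look at the constants the compiler stores. This is a sketch of what current CPython does; the folding of 'a' + 'b' is an implementation detail and other implementations may behave differently. The helper name `constants` is just for this example.

```python
def constants(source):
    """Constants stored in the code object compiled from `source`."""
    return compile(source, "<example>", "exec").co_consts


implicit = constants("s = 'a' 'b'")             # joined by the parser
explicit = constants("s = 'a' + 'b'")           # folded by CPython's optimizer
runtime = constants("s = ''.join(['a', 'b'])")  # method call, done at run time

print(implicit)
print(explicit)
print(runtime)
```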

Nick Coghlan

unread,
May 12, 2013, 1:10:26 AM5/12/13
to Ron Adam, python...@python.org
On Sun, May 12, 2013 at 8:19 AM, Ron Adam <ron...@gmail.com> wrote:
>
> Greg, I meant to send my reply earlier to the list.
>
>
>
> On 05/11/2013 12:39 AM, Greg Ewing wrote:
>>>
>>> Also, doesn't this imply that ... is now an operator in some contexts,
>>
>> > but a literal in others?
>
>
> Could its use as a literal be deprecated? I haven't seen it used that way
> except in examples.

I take it you don't use Python for multi-dimensional array based
programming, then. The ellipsis literal was added at the request of
the numeric programming folks, so they had a notation for "all
remaining columns" in an index tuple, and it is still used for that
today. The only change related to this in Python 3 was to lift the
syntactic restriction that limited the literal form to container
subscripts. This change eliminated Python 2's discrepancy between
defining index tuples directly in the subscript and in saving them to
a variable first, or passing them as arguments to a function.
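
The behaviour described above can be sketched without any array library. The class `IndexRecorder` below is a hypothetical stand-in for an array type, introduced only for this example; it simply returns whatever index it is given.

```python
class IndexRecorder:
    """Hypothetical stand-in for an array type: returns the index it receives."""

    def __getitem__(self, index):
        return index


grid = IndexRecorder()

# Index tuple written directly in the subscript.
direct = grid[..., 0]

# Python 3 also accepts `...` outside a subscript, so the same index
# can be built first and passed around like any other value.
saved = (..., 0)
via_variable = grid[saved]
```

Both `direct` and `via_variable` end up as the same tuple, which is the consistency that lifting the restriction was meant to provide.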

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Ethan Furman

May 12, 2013, 8:06:21 AM
to Python-Ideas
Wow.

Judging from the size of this thread one might think you had suggested enumerating the string literals. ;)

--
~Ethan~

Ben Darnell

May 13, 2013, 11:20:57 PM
to Ezio Melotti, python-ideas
On Fri, May 10, 2013 at 4:40 PM, Ezio Melotti <ezio.m...@gmail.com> wrote:
> On Fri, May 10, 2013 at 10:54 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
>
> I also think that forgetting a comma in a list of function args
> between two string literal args is quite uncommon, whereas forgetting
> it in a sequence of strings (list, set, dict, tuple) is much more
> common, so this approach should cover most of the cases.

This is my experience as well.  When I've run into problems by forgetting a comma it's nearly always been in a list, not in function arguments.  (and it's never been between two items on the same line, so the proposal in one of the subthreads here to disallow implicit concatenation only between two strings on the same line wouldn't help much).

The problem is that in other languages, a trailing comma is forbidden, while in Python it is optional.  This means that lists like
  [
    1,
    2,
    3,
  ]

may or may not have a comma after the third element.  The comma is there often enough that you can fall out of the habit of checking for it when you extend the list.  The most pythonic solution is therefore to follow the example of the single-element tuple and make the trailing comma mandatory ;)
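
A minimal sketch of the failure mode described above, using made-up list contents: the last element had no trailing comma, and a new element was appended below it without adding one.

```python
# Intended: three entries. The comma after 'beta' was forgotten when
# 'gamma' was appended, so the two literals are silently merged.
names = [
    'alpha',
    'beta'
    'gamma',
]

# With every comma present the list has the expected three entries.
fixed = [
    'alpha',
    'beta',
    'gamma',
]
```

No error is raised for `names`; the only symptom is that it holds two strings instead of three.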

-Ben

Juancarlo Añez

May 14, 2013, 7:10:04 AM
to Ben Darnell, python-ideas

On Mon, May 13, 2013 at 10:50 PM, Ben Darnell <b...@bendarnell.com> wrote:
  [
    1,
    2,
    3,
  ]

Ouch!

--
Juancarlo Añez

Gregory P. Smith

May 14, 2013, 12:36:54 PM
to Ron Adam, Python-Ideas
+1 to adding something like that.  I loathe code that uses textwrap.dedent on constants: it costs memory and adds runtime overhead.

I was just writing up a response to suggest adding auto-dedented multi-line strings to take care of one of the major use cases.  I went with a naive d""" approach, but I also like your | idea here, though it might cause too many people to want to line up the opening | and the following |s.  That isn't necessary at all, and it is actively harmful for code style if it forces tedious reindentation when a refactoring changes the length of the lhs before the opening |""".

-gps

Mark Dickinson

May 14, 2013, 1:24:39 PM
to M.-A. Lemburg, Python-Ideas
On Sat, May 11, 2013 at 6:24 PM, M.-A. Lemburg <m...@egenix.com> wrote:
> On 11.05.2013 19:05, Christian Tismer wrote:
>> I think a simple stripping of white-space in
>>
>>     text = s"""
>>       leftmost column
>>         two-char indent
>>       """
>>
>> would solve 95 % of common indentation and concatenation cases.
>> <snipped>
>
> This is not a good solution for long lines where you don't want to
> have embedded line endings. Taken from existing code:
>
> _litmonth = ('(?P<litmonth>'
>              'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|'
>              'mär|mae|mrz|mai|okt|dez|'
>              'fev|avr|juin|juil|aou|aoû|déc|'
>              'ene|abr|ago|dic|'
>              'out'
>              ')[a-z,\.;]*')
>
> or
>
>                     raise errors.DataError(
>                         'Inconsistent revenue item currency: '
>                         'transaction=%r; transaction_position=%r' %
>                         (transaction, transaction_position))

Agreed.  I use the implicit concatenation a lot for exception messages like the one above; we also tend to keep line length to 80 characters *and* use nice verbose exception messages.  I could live with adding the extra '+' characters and parentheses, but I think it would be a net loss of readability.
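
To make the point about the extra parentheses concrete, here is a small sketch with made-up values (`name` and `count` are not from the original code). It relies only on the fact that `%` binds more tightly than `+`.

```python
name, count = 'spam', 3

# Implicit concatenation: the literals are joined before % is applied,
# so both placeholders are visible to the format operation.
implicit = ('item=%r; '
            'count=%r' % (name, count))

# With '+', the two literals must be parenthesized so that % formats
# the whole string rather than only the second literal.
explicit = (('item=%r; ' +
             'count=%r') % (name, count))

# Without the extra parentheses, % applies to the second literal only
# and fails because that literal has a single placeholder.
try:
    'item=%r; ' + 'count=%r' % (name, count)
except TypeError as exc:
    error = str(exc)
```

The `implicit` and `explicit` forms produce the same string; the difference is only in how much punctuation is needed to get there.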

The _litmonth example looks like a candidate for re.VERBOSE and a triple-quoted string, though.

Mark

M.-A. Lemburg

May 14, 2013, 1:43:39 PM
to Mark Dickinson, Python-Ideas

It's taken out of context, just to demonstrate some real world
example of how long strings are broken down to handy 80 char
code lines.

The _litmonth variable is used as component to build other REs
and those typically also contain (important) whitespace,
so re.VERBOSE won't work.
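
A minimal sketch of why significant whitespace and re.VERBOSE do not mix. The `component` pattern is hypothetical and much simpler than the real expressions referred to above.

```python
import re

# A hypothetical component pattern in which the space is significant.
component = r'(?P<day>\d+) (?P<month>jan|feb|mar)'

plain = re.compile(component)
verbose = re.compile(component, re.VERBOSE)

# Without re.VERBOSE the space must be present in the input.
plain_match = plain.match('12 jan')

# With re.VERBOSE the space in the pattern is ignored, so the pattern
# now matches input without a space and rejects input with one.
verbose_no_space = verbose.match('12jan')
verbose_with_space = verbose.match('12 jan')
```

Escaping the space or using a character class would work under re.VERBOSE, but that means rewriting every component that is combined into the larger expression.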

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 14 2013)

Mark Dickinson

May 14, 2013, 1:57:39 PM
to M.-A. Lemburg, Python-Ideas
On Tue, May 14, 2013 at 6:43 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>> The _litmonth example looks like a candidate for re.VERBOSE and a
>> triple-quoted string, though.
>
> It's taken out of context, just to demonstrate some real world
> example of how long strings are broken down to handy 80 char
> code lines.
>
> The _litmonth variable is used as component to build other REs
> and those typically also contain (important) whitespace,
> so re.VERBOSE won't work.

Ah, okay.  Makes sense.

Thanks,

Mark

Jan Kaliszewski

May 14, 2013, 2:00:40 PM
to python...@python.org
14.05.2013 19:24, Mark Dickinson wrote:

>>                     raise errors.DataError(
>>                         'Inconsistent revenue item currency: '
>>                         'transaction=%r; transaction_position=%r' %
>>                         (transaction, transaction_position))
>
> Agreed.  I use the implicit concatenation a lot for exception
> messages like the one above

Me too.

But what do you think about:

    raise errors.DataError(
        'Inconsistent revenue item currency: '
        c'transaction=%r; transaction_position=%r' %
        (transaction, transaction_position))

c'...' -- for explicit string (c)ontinuation or (c)oncatenation.

Regards.
*j

Antoine Pitrou

May 14, 2013, 2:05:01 PM
to python...@python.org
On Sat, 11 May 2013 19:24:02 +0200
"M.-A. Lemburg" <m...@egenix.com> wrote:
>
> This is not a good solution for long lines where you don't want to
> have embedded line endings. Taken from existing code:
>
> _litmonth = ('(?P<litmonth>'
> 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|'
> 'mär|mae|mrz|mai|okt|dez|'
> 'fev|avr|juin|juil|aou|aoû|déc|'
> 'ene|abr|ago|dic|'
> 'out'
> ')[a-z,\.;]*')

For the record, I know this isn't the point of your message, but you're
probably missing 'fév' (accented) above :-)

Regards

Antoine.

Ramchandra Apte

May 15, 2013, 12:01:17 AM
to python...@googlegroups.com, Python-Ideas, gu...@python.org


On Saturday, 11 May 2013 00:18:51 UTC+5:30, Guido van Rossum wrote:
> I just spent a few minutes staring at a bug caused by a missing comma
> -- I got a mysterious argument count error because instead of foo('a',
> 'b') I had written foo('a' 'b').
>
> This is a fairly common mistake, and IIRC at Google we even had a lint
> rule against this (there was also a Python dialect used for some
> specific purpose where this was explicitly forbidden).
>
> Now, with modern compiler technology, we can (and in fact do) evaluate
> compile-time string literal concatenation with the '+' operator, so
> there's really no reason to support 'a' 'b' any more. (The reason was
> always rather flimsy; I copied it from C but the reason why it's
> needed there doesn't really apply to Python, as it is mostly useful
> inside macros.)
>
> Would it be reasonable to start deprecating this and eventually remove
> it from the language?


Yes, it's very obscure and the benefits are outweighed by the problems it causes.

Christian Tismer

May 15, 2013, 8:18:35 AM
to Greg Ewing, python...@python.org
On 12.05.13 01:55, Greg Ewing wrote:
> Ian Cordasco wrote:
>> On Sat, May 11, 2013 at 2:52 PM, Mark Janssen
>> <dreamin...@gmail.com> wrote:
>
>>> It partitions the conceptual space. "+" is a mathematical operator,
>>> but strings are not numbers.
>>
>> But + is already a supported operation on strings
>
> I still think about these two kinds of concatenation in
> different ways, though. When I use implicit concatenation,
> I don't think in terms of taking two strings and joining
> them together. I'm just writing a single string literal
> that happens to span two source lines.
>
> I believe that distinguishing them visually helps
> readability. Using + for both makes things look more
> complicated than they really are.
>

Thinking more about this, yes I see that "+" is really different
for various reasons, when you just want to write a long string.
"+" involves precedence rules, which is actually too much.

Writing continuation lines with '\' is much less convenient,
because you cannot insert comments.
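
The limitation of backslash continuation can be sketched as follows. The month strings are made up, and the failing source is compiled from a string only so that the error can be observed without breaking the example itself.

```python
# Inside parentheses, each piece of the literal can carry a comment.
pattern = ('jan|feb|mar|'   # English
           'mrz|mai|okt')   # German

# A backslash must be the last character on its line, so a trailing
# comment after it is a syntax error.
bad_source = "s = 'jan|feb|mar|' \\  # English\n    'mrz|mai|okt'\n"
try:
    compile(bad_source, '<example>', 'exec')
    backslash_comment_ok = True
except SyntaxError:
    backslash_comment_ok = False
```

The parenthesized form is what most of the examples in this thread use, and it is the one that depends on implicit concatenation.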

What I still don't like is the complete absence of anything that makes
the concatenation more visible.
So I'm searching for different ways to denote the concatenation of
adjacent strings.
Or, to put it the other way round:
we can also see them as ways to denote the _interruption_ of a string.

Thinking out loud...
A string is built, then we break its construction into pieces that
are glued together by the parser.
Hmm, this sounds again more like triple-quoted strings.
Still searching...

--
Christian Tismer :^) <mailto:tis...@stackless.com>
Software Consulting : Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/
14482 Potsdam : PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776 fax +49 (30) 700143-0023
PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04
whom do you want to sponsor today? http://www.stackless.com/

Cameron Simpson

May 15, 2013, 10:24:36 PM
to Jan Kaliszewski, python...@python.org
On 14May2013 20:00, Jan Kaliszewski <z...@chopin.edu.pl> wrote:
| 14.05.2013 19:24, Mark Dickinson wrote:
|
| >>                    raise errors.DataError(
| >>                        'Inconsistent revenue item currency: '
| >>                        'transaction=%r; transaction_position=%r' %
| >>                        (transaction, transaction_position))
| >
| >Agreed.  I use the implicit concatenation a lot for exception
| >messages like the one above
|
| Me too.
|
| But what do you think about:
|
| raise errors.DataError(
| 'Inconsistent revenue item currency: '
| c'transaction=%r; transaction_position=%r' %
| (transaction, transaction_position))
|
| c'...' -- for explicit string (c)ontinuation or (c)oncatenation.

I'm -1 on it myself.

I'd expect c'' to act like b'' or u'' or r'': making a "string"-ish
thing in a special way. But c'' doesn't; the nearest analog is r''
but c'' goes _backwards_.

I much prefer:
+ 'foo'
over
c'foo'

The former already works and is perfectly clear about what it's
doing. The "c" does not do it any better and is easier to miss,
visually.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

On the contrary of what you may think, your hacker is fully aware
of your company's dress code. He is fully aware of the fact that it
doesn't help him to do his job.
- Gregory Hosler <gregory...@eno.ericsson.se>
