Python 2 ‘print’, coercing arguments to Unicode

Ben Finney

Oct 6, 2015, 5:52:20 AM
to pytho...@python.org
Howdy all,

In Python 2.7, I am seeing this behaviour for ‘print’::

Python 2.7.10 (default, Sep 13 2015, 20:30:50)
[GCC 5.2.1 20150911] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> from __future__ import print_function
>>> import io
>>> print(None)
None
>>> print(None, file=io.StringIO())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'

So, although my string literals are now Unicode objects, apparently
‘print’ still coerces objects using the bytes type ‘str’.

Binding the ‘str’ name to the Unicode type doesn't help::

>>> str = unicode
>>> print(None, file=io.StringIO())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'

The reason I need to do this is that I'm replacing the standard streams
(‘sys.stderr’, etc.) with ‘io.StringIO’ instances in a test suite. That
works great for everything but ‘print’.
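
A minimal sketch of that kind of stream replacement, assuming a context-manager helper (the name `captured_stdout` is hypothetical and is not taken from the test suite in question):

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def captured_stdout():
    """Temporarily replace sys.stdout with an io.StringIO instance."""
    original = sys.stdout
    sys.stdout = io.StringIO()
    try:
        # Hand the replacement stream to the caller for later inspection.
        yield sys.stdout
    finally:
        # Always restore the real stream, even if the body raised.
        sys.stdout = original
```

On Python 2.7, a ‘print’ executed inside such a ‘with’ block would be expected to hit the same TypeError, because ‘io.StringIO.write’ only accepts Unicode text.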

Since this is a test suite for existing code, I don't have the option to
change all the existing statements; I need them to work as-is.

How can I convince ‘print’, everywhere throughout a module, that it
should coerce its arguments using ‘unicode’?

--
\ “Not using Microsoft products is like being a non-smoker 40 or |
`\ 50 years ago: You can choose not to smoke, yourself, but it's |
_o__) hard to avoid second-hand smoke.” —Michael Tiemann |
Ben Finney

Ben Finney

Oct 6, 2015, 6:45:39 AM
to pytho...@python.org
Ben Finney <ben+p...@benfinney.id.au> writes:

> In Python 2.7, I am seeing this behaviour for ‘print’::
>
> Python 2.7.10 (default, Sep 13 2015, 20:30:50)
> [GCC 5.2.1 20150911] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from __future__ import unicode_literals
> >>> from __future__ import print_function
> >>> import io
> >>> print(None)
> None
> >>> print(None, file=io.StringIO())
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> TypeError: unicode argument expected, got 'str'
>
> So, although my string literals are now Unicode objects, apparently
> ‘print’ still coerces objects using the bytes type ‘str’.

To eliminate ‘from __future__ import print_function’ as a possible
factor, here is another demonstration without that::

Python 2.7.10 (default, Sep 13 2015, 20:30:50)
[GCC 5.2.1 20150911] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> import sys
>>> import io
>>> print "foo"
foo
>>> print None
None
>>> sys.stdout = io.StringIO()
>>> print "foo"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
>>> print None
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'

So it appears that even a string literal, which is explicitly Unicode by
the above ‘from __future__ import unicode_literals’, is still being
coerced to a bytes ‘str’ object by ‘print’.

How can I convince ‘print’, everywhere throughout a module, that it
should coerce its arguments using ‘unicode’?

--
\ “Pity the meek, for they shall inherit the earth.” —Donald |
`\ Robert Perry Marquis |
_o__) |
Ben Finney

Laura Creighton

Oct 6, 2015, 8:43:32 AM
to Ben Finney, pytho...@python.org, l...@openend.se
I think the thing you want to convert is your StringIO, not your print.
I think you do this using six.StringIO
https://pythonhosted.org/six/

But I have only read the doc, not done this in practice.

Laura

Laura Creighton

Oct 6, 2015, 8:48:40 AM
to Ben Finney, pytho...@python.org, l...@openend.se
Hmm, now that I read the six documentation again,
@six.python_2_unicode_compatible

seems to be exactly what you are asking for ...

https://pythonhosted.org/six/
Laura

Peter Otten

Oct 6, 2015, 10:31:11 AM
to pytho...@python.org
I don't think this is possible with the print statement, but the print()
function can be replaced with anything you like:

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import unicode_literals
>>> from __future__ import print_function
>>> import io
>>> _print = print
>>> def print(*args, **kw):
...     return _print(*map(unicode, args), **kw)
...
>>> print(None, file=io.StringIO())
>>> outstream = io.StringIO()
>>> print(None, file=outstream)
>>> outstream.getvalue()
u'None\n'
>>> print(None)
None


Ben Finney

Oct 6, 2015, 5:24:05 PM
to pytho...@python.org
Laura Creighton <l...@openend.se> writes:

> Hmm, now that I read the six document again
> @six.python_2_unicode_compatible


Peter Otten <__pet...@web.de> writes:

> I don't think this is possible with the print statement, but the
> print() function can be replaced with anything you like:


Hmm. I am more looking for something that doesn't involve replacing
‘print’, but rather to hook into whatever it uses to coerce the type of
its arguments.

--
\ “The problem with television is that the people must sit and |
`\ keep their eyes glued on a screen: the average American family |
_o__) hasn't time for it.” —_The New York Times_, 1939 |
Ben Finney

Peter Otten

Oct 6, 2015, 6:55:36 PM
to pytho...@python.org
Ben Finney wrote:

>> I don't think this is possible with the print statement, but the
>> print() function can be replaced with anything you like:
>
>
> Hmm. I am more looking for something that doesn't involve replacing
> ‘print’, but rather to hook into whatever it uses to coerce the type of
> its arguments.

Have a look at PyFile_WriteObject in Objects/fileobject.c.
As I understand the code it basically does

if isinstance(obj, unicode) and stream.encoding is not None:
    s = obj.encode(stream.encoding)
else:
    s = str(obj)
stream.write(s)

There's no way to get the original object.
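
Since the coercion itself cannot be hooked, one possible workaround for a test suite is to make the replacement stream tolerant instead. The sketch below is an assumption rather than anything taken from the CPython source: `TolerantStringIO` is a made-up name, and it assumes that any byte strings reaching `write` are plain ASCII.

```python
import io


class TolerantStringIO(io.StringIO):
    """Hypothetical test double: a StringIO whose write() also accepts bytes."""

    def write(self, s):
        # On Python 2, `bytes` is the same type as `str`, which is what
        # print hands to the stream for non-unicode objects.
        if isinstance(s, bytes):
            s = s.decode("ascii")  # assumption: only ASCII byte strings
        return super(TolerantStringIO, self).write(s)
```

Assigning an instance of such a class to ‘sys.stdout’ leaves the existing print statements untouched; whether ASCII is the right decoding depends on what the code under test actually prints.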




Terry Reedy

Oct 6, 2015, 7:08:18 PM
to pytho...@python.org
On 10/6/2015 6:45 AM, Ben Finney wrote:
> Ben Finney <ben+p...@benfinney.id.au> writes:

> How can I convince ‘print’, everywhere throughout a module, that it
> should coerce its arguments using ‘unicode’?

Use Python 3. I am only half joking. Switching to unicode instead of
bytes as the default text type fixed numerous bugs all at once.

--
Terry Jan Reedy


Ben Finney

Oct 6, 2015, 7:23:48 PM
to pytho...@python.org
Peter Otten <__pet...@web.de> writes:

> Have a look at PyFile_WriteObject in Objects/fileobject.c.
> As I understand the code it basically does
>
> if isinstance(obj, unicode) and stream.encoding is not None:
>     s = obj.encode(stream.encoding)
> else:
>     s = str(obj)
> stream.write(s)

So as I understand it I'm looking for the hypothetical

from __future__ import print_coerce_args_to_unicode

and without that, I'm stuck with the hard-coded ‘str’ coercion.

Thanks, at least now I can stop flailing to try to get it working.

--
\ “Fear him, which after he hath killed hath power to cast into |
`\ hell; yea, I say unto you, Fear him.” –Jesus, as quoted in Luke |
_o__) 12:5 |
Ben Finney

Ben Finney

Oct 6, 2015, 7:25:28 PM
to pytho...@python.org
This is all part of a transition to Python 3, so I am fully on board
with that. It doesn't help address the problem to tell me I want to do
what I'm already in pursuit of doing :-)

--
\ “Intellectual property is to the 21st century what the slave |
`\ trade was to the 16th.” —David Mertz |
_o__) |
Ben Finney

wxjm...@gmail.com

Oct 7, 2015, 3:50:55 AM
Le mercredi 7 octobre 2015 01:08:18 UTC+2, Terry Reedy a écrit :
>
> [...] Switching to unicode instead of
> bytes as the default text type fixed numerous bugs all at once.
> [...]

This is a very interesting sentence.