[Python-Dev] Pickling of Enums


Serhiy Storchaka

Feb 15, 2014, 2:01:36 PM
to pytho...@python.org
How should Enum items be pickled, by value or by name?

I think that Enum will be used to collect system-dependent constants, so
the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on
another. If we pickle enums by value, then AddressFamily.AF_INET pickled
on one platform can be unpickled as AddressFamily.AF_UNIX on another
platform. This looks weird and contrary to the nature of enums.
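
A minimal sketch of this concern, assuming two hypothetical enum classes
(FamilyOnA and FamilyOnB, neither of which exists in the stdlib) that stand
in for AddressFamily as it might be defined on two platforms. The two helper
functions are also illustrative: they mimic restoring a member by value and
by name, and do not call pickle itself.

```python
from enum import IntEnum


class FamilyOnA(IntEnum):
    """Hypothetical definition on platform A."""
    AF_UNIX = 1
    AF_INET = 2


class FamilyOnB(IntEnum):
    """Hypothetical definition on platform B, with the numbers swapped."""
    AF_INET = 1
    AF_UNIX = 2


def transfer_by_value(member, target_enum):
    # Mimics "pickle by value": only the number crosses the wire.
    return target_enum(member.value)


def transfer_by_name(member, target_enum):
    # Mimics "pickle by name": only the member name crosses the wire.
    return target_enum[member.name]


by_value = transfer_by_value(FamilyOnA.AF_INET, FamilyOnB)
by_name = transfer_by_name(FamilyOnA.AF_INET, FamilyOnB)

print(by_value.name)  # AF_UNIX: the meaning changed in transit
print(by_name.name)   # AF_INET: the meaning survived
```

Under these assumptions the by-value transfer silently turns AF_INET into
AF_UNIX, while the by-name transfer keeps the member's meaning.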

_______________________________________________
Python-Dev mailing list
Pytho...@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/dev-python%2Bgarchive-30976%40googlegroups.com

Antoine Pitrou

Feb 15, 2014, 2:37:46 PM
to pytho...@python.org
On Sat, 15 Feb 2014 21:01:36 +0200
Serhiy Storchaka <stor...@gmail.com> wrote:
> How should Enum items be pickled, by value or by name?
>
> I think that Enum will be used to collect system-dependent constants, so
> the value of AddressFamily.AF_UNIX can be 1 on one platform and 2 on
> another. If we pickle enums by value, then AddressFamily.AF_INET pickled
> on one platform can be unpickled as AddressFamily.AF_UNIX on another
> platform. This looks weird and contrary to the nature of enums.

I agree with you, they should be pickled by name. An enum is a kind of
global in this regard.

(but of course, before AF_UNIX was an enum it was pickled by value)

Regards

Antoine.

Ethan Furman

Feb 18, 2014, 12:11:35 PM
to pytho...@python.org
On 02/15/2014 11:01 AM, Serhiy Storchaka wrote:
> How should Enum items be pickled, by value or by name?
>
> I think that Enum will be used to collect system-dependent constants, so the value of AddressFamily.AF_UNIX can be 1 on
> one platform and 2 on another. If we pickle enums by value, then AddressFamily.AF_INET pickled on one platform can be
> unpickled as AddressFamily.AF_UNIX on another platform. This looks weird and contrary to the nature of enums.

There is one more wrinkle to pickling by name (it's actually still there in pickle by value, just more obvious in pickle
by name) -- aliases. It seems to me the most common scenario for a name representing different values on different
systems is when on system A they are different, but on system B they are the same:

System A:

    class SystemEnum(Enum):
        value1 = 1
        value2 = 2

System B:

    class SystemEnum(Enum):
        value1 = 1
        value2 = 1

If you're on system B there is no way to pickle (by name or value) value2 such that we get value2 back on system A. The
only way I know of to make that work would be to dispense with identity comparison, use the normal == comparison, and
have aliases actually be separate objects (we could still use singletons, but it would be one per name instead of the
current one per value, and it would also be an implementation detail).
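
To make that alternative concrete, here is a rough sketch. It does not use
the enum module at all: NamedConstant is a hypothetical class invented for
this illustration, with one object per name and == comparing by value. The
two dictionaries stand in for SystemEnum on systems A and B.

```python
class NamedConstant:
    """Hypothetical member type: one object per *name*, equality by value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __eq__(self, other):
        if isinstance(other, NamedConstant):
            return self.value == other.value
        return NotImplemented

    def __hash__(self):
        # Equal objects must hash alike, so hash on the value only.
        return hash(self.value)

    def __repr__(self):
        return f"<NamedConstant {self.name}: {self.value}>"


# Stand-ins for SystemEnum on the two systems.
system_a = {
    "value1": NamedConstant("value1", 1),
    "value2": NamedConstant("value2", 2),
}
system_b = {
    "value1": NamedConstant("value1", 1),
    "value2": NamedConstant("value2", 1),  # same value, still its own object
}

# On system B the two names compare equal but remain distinct objects ...
sent = system_b["value2"]
equal_on_b = sent == system_b["value1"]
same_object_on_b = sent is system_b["value1"]

# ... so the name survives and can be resolved on system A.
received = system_a[sent.name]
```

The cost, as noted above, is that `is` no longer tells you whether two
members are "the same", which is why this conflicts with identity
comparison.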

Thoughts?

--
~Ethan~

Guido van Rossum

Feb 18, 2014, 12:47:43 PM
to Ethan Furman, Python-Dev
I'm confused. Hasn't this all been decided by the PEP long ago?


Ethan Furman

Feb 18, 2014, 1:01:42 PM
to gu...@python.org, Python-Dev
On 02/18/2014 09:47 AM, Guido van Rossum wrote:
>
> I'm confused. Hasn't this all been decided by the PEP long ago?

The PEP only mentions pickling briefly, as in "the normal rules apply". How pickling occurs is an implementation
detail, and it turns out that pickling by name is more robust.

Serhiy, as part of his argument for using the _name_ instead of the _value_ for pickling, brought up the point that
different systems could have different values for the same name. If true in practice (and I believe it is) this raises
the issue of aliases, which currently *cannot* be pickled by name because there is no distinct object for the alias. If
you ask for Color['alias_for_red'] you'll get Color.red instead.
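
A short sketch of that behavior, using a made-up Color enum (the member
names are only for illustration):

```python
from enum import Enum


class Color(Enum):
    red = 1
    alias_for_red = 1  # same value as red, so this becomes an alias


looked_up = Color["alias_for_red"]

# The alias resolves to the canonical member; there is no separate object.
is_same_object = looked_up is Color.red
canonical_name = looked_up.name

# Iteration skips aliases, but __members__ still lists every name.
iterated_names = [member.name for member in Color]
all_names = list(Color.__members__)

print(canonical_name)   # red
print(iterated_names)   # ['red']
print(all_names)        # ['red', 'alias_for_red']
```

Because the member only knows its canonical name, any by-name
serialization of the alias would write "red", never "alias_for_red".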

Using identity comparison was part of the PEP.

I guess the question is which is more important? Identity comparison or this (probably) rare use-case? If we stick
with identity I'm not aware of any work-around for pickling enum members that are aliases on one system, but distinct on
another.

I've been talking about pickling specifically, but this applies to any serialization method.

--
~Ethan~

Antoine Pitrou

Feb 18, 2014, 1:05:52 PM
to pytho...@python.org
On Tue, 18 Feb 2014 10:01:42 -0800
Ethan Furman <et...@stoneleaf.us> wrote:
>
> I guess the question is which is more important? Identity comparison or this (probably) rare use-case? If we stick
> with identity I'm not aware of any work-around for pickling enum members that are aliases on one system, but distinct on
> another.

I don't think identity comparison is important. Enum values are
supposed to act like values, not full-blown objects.

OTOH, the "pickled aliases may end up different on other systems" issue
is sufficiently fringy that we may simply paper over it.

Regards

Antoine.

Guido van Rossum

Feb 18, 2014, 1:05:51 PM
to Ethan Furman, Python-Dev
Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling until after 3.4 has been released).


Serhiy Storchaka

Feb 18, 2014, 1:21:01 PM
to pytho...@python.org
18.02.14 19:11, Ethan Furman wrote:

> There is one more wrinkle to pickling by name (it's actually still there
> in pickle by value, just more obvious in pickle by name) -- aliases. It
> seems to me the most common scenario for a name representing
> different values on different systems is when on system A they are
> different, but on system B they are the same:
>
> System A:
>
>     class SystemEnum(Enum):
>         value1 = 1
>         value2 = 2
>
> System B:
>
>     class SystemEnum(Enum):
>         value1 = 1
>         value2 = 1
>
> If you're on system B there is no way to pickle (by name or value)
> value2 such that we get value2 back on system A. The only way I know of
> to make that work would be to dispense with identity comparison, use the
> normal == comparison, and have aliases actually be separate objects (we
> could still use singletons, but it would be one per name instead of the
> current one per value, and it would also be an implementation detail).
>
> Thoughts?

There are aliases and aliases. If there is a modern name and a deprecated
name, then they should be one object referred to by different names on all
systems. If there are different entities with accidentally equal values,
then they should be different objects.

Ethan Furman

Feb 18, 2014, 1:16:17 PM
to gu...@python.org, Python-Dev
On 02/18/2014 10:05 AM, Guido van Rossum wrote:
> Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles
> enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling
> until after 3.4 has been released).

This conversation wasn't in the PEP, but as I recall we decided to go with value instead of name for json because the
receiving end may not be running Python.

Is having json do it one way and pickle another a problem?

--
~Ethan~

Guido van Rossum

Feb 18, 2014, 2:20:44 PM
to Ethan Furman, Python-Dev
I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)

$ python3
Python 3.4.0rc1+ (default:2ba583191550, Feb 11 2014, 16:05:24)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket, pickle, json, pickletools
>>> socket.AF_INET
<AddressFamily.AF_INET: 2>
>>> pickle.dumps(socket.AF_INET)
b'\x80\x03csocket\nAddressFamily\nq\x00K\x02\x85q\x01Rq\x02.'
>>> json.dumps(socket.AF_INET)
'2'
>>> pickletools.dis(pickle.dumps(socket.AF_INET))
    0: \x80 PROTO      3
    2: c    GLOBAL     'socket AddressFamily'
   24: q    BINPUT     0
   26: K    BININT1    2
   28: \x85 TUPLE1
   29: q    BINPUT     1
   31: R    REDUCE
   32: q    BINPUT     2
   34: .    STOP
highest protocol among opcodes = 2
>>>
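
Read as ordinary Python, the disassembly above amounts to "look up
socket.AddressFamily, build the tuple (2,), and call the class with it".
A small sketch of that reconstruction step follows; the variable names are
made up for illustration, and newer Python releases may produce a
different pickle stream than the one shown in the session.

```python
import socket

# GLOBAL pushes the class; BININT1 and TUPLE1 build the one-element
# argument tuple (the member's value, 2 in the session above).
callable_from_pickle = socket.AddressFamily
args_from_pickle = (socket.AF_INET.value,)

# REDUCE calls the callable with those arguments, i.e. a lookup by value.
restored = callable_from_pickle(*args_from_pickle)

print(restored is socket.AF_INET)  # True
```

Nothing in that stream carries the string AF_INET, which is why this is
pickling by value.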





On Tue, Feb 18, 2014 at 10:16 AM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 02/18/2014 10:05 AM, Guido van Rossum wrote:
>> Hm. But there's an implementation that has made it unscathed through several betas and an RC. AFAICT that beta pickles
>> enums by value. And I happen to think that that is the better choice (but I don't have time to explain this gut feeling
>> until after 3.4 has been released).
>
> This conversation wasn't in the PEP, but as I recall we decided to go with value instead of name for json because the receiving end may not be running Python.
>
> Is having json do it one way and pickle another a problem?
>
> --
> ~Ethan~



Guido van Rossum

Feb 18, 2014, 2:37:15 PM
to Ethan Furman, Python-Dev
Well, I'm against that.


On Tue, Feb 18, 2014 at 11:26 AM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 02/18/2014 11:20 AM, Guido van Rossum wrote:
>>
>> I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about
>> behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)
>
> There's an open issue [1] to switch to pickling by name.
>
> --
> ~Ethan~
>
> [1] http://bugs.python.org/issue20653

Serhiy Storchaka

Feb 18, 2014, 2:46:27 PM
to pytho...@python.org
18.02.14 20:16, Ethan Furman wrote:

> This conversation wasn't in the PEP, but as I recall we decided to go
> with value instead of name for json because the receiving end may not be
> running Python.
>
> Is having json do it one way and pickle another a problem?

We decided to go with value instead of name for JSON because JSON
doesn't support enums, but supports integers and strings, and because
enums are comparable with their values, but not with their names.

>>> json.loads(json.dumps(socket.AF_INET)) == socket.AF_INET
True

We simply had no other choice.
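
A small sketch of the JSON round trip, using two hypothetical classes
(Family and Color are made up here, Family standing in for an IntEnum such
as socket's AddressFamily). It also shows that the value comparison relied
on above holds for IntEnum members, while a plain Enum member is neither
JSON serializable by default nor equal to its raw value.

```python
import json
from enum import Enum, IntEnum


class Family(IntEnum):
    """Hypothetical IntEnum standing in for AddressFamily."""
    AF_INET = 2


class Color(Enum):
    """Hypothetical plain Enum for contrast."""
    red = 1


encoded = json.dumps(Family.AF_INET)   # serialized as a bare integer
decoded = json.loads(encoded)          # comes back as a plain int

# The plain int still compares equal to the IntEnum member, and the
# member can be recovered with a lookup by value.
equal_after_round_trip = decoded == Family.AF_INET
recovered = Family(decoded)

# A plain Enum member has no JSON representation by default.
try:
    json.dumps(Color.red)
    plain_enum_serializable = True
except TypeError:
    plain_enum_serializable = False

print(encoded)                  # 2
print(equal_after_round_trip)   # True
print(plain_enum_serializable)  # False
```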

Ethan Furman

Feb 18, 2014, 2:26:29 PM
to gu...@python.org, Python-Dev
On 02/18/2014 11:20 AM, Guido van Rossum wrote:
>
> I'm confused. AFAICT enums are pickled by value too. What am I missing? Are we confused about terminology or about
> behavior? (I'm just guessing that the pickling happens by value because I don't see the string AF_INET.)

There's an open issue [1] to switch to pickling by name.

--
~Ethan~

[1] http://bugs.python.org/issue20653

Serhiy Storchaka

Feb 18, 2014, 2:53:23 PM
to pytho...@python.org
18.02.14 21:20, Guido van Rossum wrote:

> I'm confused. AFAICT enums are pickled by value too. What am I missing?
> Are we confused about terminology or about behavior? (I'm just guessing
> that the pickling happens by value because I don't see the string AF_INET.)

Pickling was not even working two weeks ago. [1]

[1] http://bugs.python.org/issue20534

Guido van Rossum

Feb 18, 2014, 2:55:58 PM
to Serhiy Storchaka, Python-Dev
Well, I still think it should be done by value.


Ethan Furman

Feb 18, 2014, 3:48:33 PM
to Python-Dev
On 02/18/2014 11:37 AM, Guido van Rossum wrote:
>
> Well, I'm against that.

Given the lack of a tidal wave of support for the idea, I'll let it die with that.

Still, many thanks to Serhiy for greatly improving the way pickling is implemented for Enums, even using values.

--
~Ethan~

Ethan Furman

Feb 18, 2014, 3:55:26 PM
to pytho...@python.org
On 02/18/2014 11:53 AM, Serhiy Storchaka wrote:
> 18.02.14 21:20, Guido van Rossum wrote:
>> I'm confused. AFAICT enums are pickled by value too. What am I missing?
>> Are we confused about terminology or about behavior? (I'm just guessing
>> that the pickling happens by value because I don't see the string AF_INET.)
>
> Pickling was not even working two weeks ago. [1]

For the record, pickling worked just fine for protocols 2 and 3, and 4 didn't exist at the time.
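
One way to check this kind of claim on a given interpreter is to round-trip
a member through each protocol. This is only a sketch: it starts at
protocol 2 to match the statement above, it runs up to whatever
pickle.HIGHEST_PROTOCOL is on the interpreter in use, and the results
dictionary is just an illustrative name.

```python
import pickle
import socket

# Map each protocol number to whether the round trip gave back the
# very same member object.
results = {}
for protocol in range(2, pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(socket.AF_INET, protocol=protocol)
    results[protocol] = pickle.loads(data) is socket.AF_INET

for protocol, ok in sorted(results.items()):
    print(protocol, ok)
```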

--
~Ethan~
