Consider adding an iterable option to dataclass


Neil Girdhar

Aug 10, 2018, 7:01:59 PM
to python-ideas
It would be nice if dataclasses (https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass) had an option to make them a sequence.  This would make

dataclass(frozen=True, order=True, sequence=True)

an optionally-typed version of namedtuple.  It would almost totally supplant it except that namedtuples have a smaller memory footprint.

sequence would simply inherit from collections.abc.Sequence and implement the two methods __len__ and __getitem__.
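
A minimal sketch of what such an option might amount to, written by hand because dataclass has no sequence parameter. The class name Point and its fields are made up for illustration, and the two methods are only one plausible way to implement them:

```python
from collections.abc import Sequence
from dataclasses import dataclass, fields


@dataclass(frozen=True, order=True)
class Point(Sequence):
    x: int
    y: int

    # Hand-written stand-ins for what a hypothetical sequence=True might generate.
    def __len__(self):
        return len(fields(self))

    def __getitem__(self, index):
        # Indexing a tuple of the field values gives int and slice support
        # and raises IndexError when the index is out of range.
        return tuple(getattr(self, f.name) for f in fields(self))[index]


p = Point(1, 2)
x, y = p                    # unpacking goes through Sequence.__iter__
print(len(p), p[0], p[-1])
print(p < Point(1, 3))      # order=True compares field by field
```

Sequence supplies __iter__, __contains__, __reversed__, index and count on top of the two methods, which is what makes the result behave much like a namedtuple.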

Best,

Neil

Steven D'Aprano

Aug 10, 2018, 8:30:53 PM
to python...@python.org
On Fri, Aug 10, 2018 at 04:01:59PM -0700, Neil Girdhar wrote:
> It would be nice if dataclasses
> (https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass)
> had an option to make them a sequence.

Do you have a use-case or reason for this other than "it would be nice"?
Nice in what way? We already have namedtuple, and for backwards
compatibility if no other reason it won't be going away. What benefit do
we get from allowing dataclasses to do what namedtuple already does?

You already mentioned one disadvantage: namedtuple is much more memory
efficient. What corresponding benefit do you see?

Dataclass already supports explicit conversion to tuples and dicts. What
use-cases for sequence-ness don't they support?
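
For reference, the explicit conversions mentioned here are dataclasses.astuple and dataclasses.asdict. A short sketch, with Point as a made-up example class:

```python
from dataclasses import asdict, astuple, dataclass


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
x, y = astuple(p)        # explicit conversion, then ordinary tuple unpacking
as_mapping = asdict(p)   # field names become dict keys
print(x, y, as_mapping)
```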

Conceptually, I think of dataclasses as a record or a struct, not as a
sequence. (I'll admit that I think of namedtuples the same way, and
almost never make use of their tuple-ness.) I would find it strange for
dataclass to support a sequence API out of the box.


--
Steve
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Eric V. Smith

Aug 10, 2018, 10:14:43 PM
to python...@python.org
On 8/10/2018 7:01 PM, Neil Girdhar wrote:
> It would be nice if dataclasses
> (https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass)
> had an option to make them a sequence.  This would make
>
> dataclass(frozen=True, order=True, sequence=True)
>
> an optionally-typed version of namedtuple.  It would almost totally
> supplant it except that namedtuples have a smaller memory footprint.

Note that typing.NamedTuple already gives you typed namedtuples.
Admittedly the feature set is different from dataclasses, though.
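
As a quick illustration of the typed form (Point and its fields are placeholders, not from the thread):

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int = 0   # defaults are supported


p = Point(1)
x, y = p         # it is a real tuple, so unpacking and indexing just work
print(p[0], len(p), isinstance(p, tuple))
```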

> sequence would simply inherit from collections.abc.Sequence and
> implement the two methods __len__ and __getitem__.

Unless I'm misunderstanding you, this falls into the same problem as
setting __slots__: you need to return a new class, in this case since
you can't add inheritance after the fact. I don't think
__instancecheck__ helps you here, but maybe I'm missing something (I'm
not a big user of inheritance or ABCs).

Not that returning a new class is impossible, it's just that I didn't
want to do it in the first go-round with dataclasses.

For slots, I have a sample @add_slots() at
https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py.
Maybe we could do something similar with @add_sequence() and test it
out? It would have to be a little more sophisticated than @add_slots(),
since it would need to iterate over __dataclass_fields__, etc.
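
A rough sketch of what such a decorator could look like. This is not the code from dataclass_tools.py; add_sequence, the inner class name Wrapped, and Point are all hypothetical, and it takes the "return a new class" route described above:

```python
from collections.abc import Sequence
from dataclasses import dataclass, fields


def add_sequence(cls):
    """Hypothetical helper: return a NEW class deriving from cls and Sequence."""
    # fields() reads the dataclass's __dataclass_fields__.
    names = tuple(f.name for f in fields(cls))

    class Wrapped(cls, Sequence):
        def __len__(self):
            return len(names)

        def __getitem__(self, index):
            if isinstance(index, slice):
                return tuple(getattr(self, n) for n in names[index])
            return getattr(self, names[index])  # IndexError if out of range

    # Keep the original name so repr() and error messages stay readable.
    Wrapped.__name__ = cls.__name__
    Wrapped.__qualname__ = cls.__qualname__
    return Wrapped


@add_sequence
@dataclass
class Point:
    x: int
    y: int


p = Point(3, 4)
print(list(p), len(p), isinstance(p, Sequence))
```

One consequence of this approach is that the decorated name refers to a subclass of the class the user wrote, which is the same trade-off as with @add_slots().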

I'm on vacation next week, maybe I'll play around with this.

Eric

Neil Girdhar

Aug 11, 2018, 5:48:38 AM
to python...@googlegroups.com, python...@python.org
My only motivation for this idea is so that I can forget about namedtuple.  Thinking about it again today, I withdraw my suggestion until I one day see a need for it.

On Fri, Aug 10, 2018 at 10:14 PM Eric V. Smith <er...@trueblade.com> wrote:
> On 8/10/2018 7:01 PM, Neil Girdhar wrote:
> > It would be nice if dataclasses
> > (https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass)
> > had an option to make them a sequence.  This would make
> >
> > dataclass(frozen=True, order=True, sequence=True)
> >
> > an optionally-typed version of namedtuple.  It would almost totally
> > supplant it except that namedtuples have a smaller memory footprint.
>
> Note that typing.NamedTuple already gives you typed namedtuples.
> Admittedly the feature set is different from dataclasses, though.
>
> > sequence would simply inherit from collections.abc.Sequence and
> > implement the two methods __len__ and __getitem__.
>
> Unless I'm misunderstanding you, this falls into the same problem as
> setting __slots__: you need to return a new class, in this case since
> you can't add inheritance after the fact. I don't think
> __instancecheck__ helps you here, but maybe I'm missing something (I'm
> not a big user of inheritance or ABCs).
>
> Not that returning a new class is impossible, it's just that I didn't
> want to do it in the first go-round with dataclasses.

That's a fair point.  I'm sure you know that your decorator could always return a new class that inherits from both Sequence and the original class.  As a user of dataclass, I never assumed that it wouldn't do this.

> For slots, I have a sample @add_slots() at
> https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py.
> Maybe we could do something similar with @add_sequence() and test it
> out? It would have to be a little more sophisticated than @add_slots(),
> since it would need to iterate over __dataclass_fields__, etc.
>
> I'm on vacation next week, maybe I'll play around with this.

Cool, have a great vacation.

> Eric

Ivan Levkivskyi

Aug 13, 2018, 7:09:32 AM
to Eric V. Smith, python-ideas
On 11 August 2018 at 01:29, Eric V. Smith <er...@trueblade.com> wrote:
> On 8/10/2018 7:01 PM, Neil Girdhar wrote:
> > [...]
> [...]
>
> > sequence would simply inherit from collections.abc.Sequence and implement the two methods __len__ and __getitem__.
>
> Unless I'm misunderstanding you, this falls into the same problem as setting __slots__: you need to return a new class, in this case since you can't add inheritance after the fact. I don't think __instancecheck__ helps you here, but maybe I'm missing something (I'm not a big user of inheritance or ABCs).


Here are three points to add:

1. collections.abc.Sequence doesn't have a __subclasshook__, i.e. it doesn't support structural behaviour. There was an idea, as part of PEP 544, to make Sequence and Mapping structural, but it was ultimately rejected.
2. Mutating __bases__ doesn't require creating a new class, so one can just add Sequence after creation. That said, I don't like this idea: `typing` used to do some manipulations with bases, and it caused confusion and several subtle bugs until it was "standardised" in PEP 560.
3. In my experience with real-life code, the most-used tuple API of named tuples is unpacking, for example:

    from typing import List, NamedTuple

    class Row(NamedTuple):
        id: int
        name: str

    rows: List[Row] = []  # filled in elsewhere

    for id, name in rows:
        ...
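
The difference in point 1 between a structural ABC (Iterable) and a nominal one (Sequence) can be checked directly. A small sketch with two made-up classes:

```python
from collections.abc import Iterable, Sequence


class OnlyIter:
    def __iter__(self):
        return iter(())


class LooksLikeSequence:
    def __len__(self):
        return 0

    def __getitem__(self, index):
        raise IndexError(index)


# Iterable defines __subclasshook__, so having __iter__ is enough.
is_iterable = isinstance(OnlyIter(), Iterable)

# Sequence does not, so matching methods alone are not enough.
before = isinstance(LooksLikeSequence(), Sequence)

# Explicit registration (or inheritance) is required.
Sequence.register(LooksLikeSequence)
after = isinstance(LooksLikeSequence(), Sequence)

print(is_iterable, before, after)
```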

I proposed adding this some time ago in https://github.com/ericvsmith/dataclasses/issues/21; it would be enough to just generate an __iter__ (btw, such classes would automatically be subclasses of collections.abc.Iterable, which is structural):

    @data(iterable=True)
    class Point:
        x: int
        y: int

    origin = Point(0, 0)
    x, y = origin

But this idea was postponed/deferred. Maybe we can reconsider it?
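
Until something like that exists, the same effect can be had by writing __iter__ by hand. A minimal sketch, assuming the generated method would simply yield the field values in definition order (iterable=True is not an actual dataclass parameter):

```python
from collections.abc import Iterable, Sequence
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int

    # Written by hand; the proposed option would generate something like this.
    def __iter__(self):
        return (getattr(self, f.name) for f in fields(self))


origin = Point(0, 0)
x, y = origin   # unpacking only needs __iter__
print(isinstance(origin, Iterable), isinstance(origin, Sequence))
```

No new class or extra base is needed here, since Iterable recognises any class that defines __iter__.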

--
Ivan

