[Python-ideas] Add recordclass to collections module

Martin Bammer

Sep 1, 2018, 3:48:07 AM

to python...@python.org

Hi,

what about adding recordclass
(https://bitbucket.org/intellimath/recordclass) to the collections module

It is like namedtuple, but elements are writable and it is written in C
and thus much faster.

And for convenience it could be named as namedlist.

Regards,

Martin

_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Steven D'Aprano

Sep 1, 2018, 4:57:27 AM

to python...@python.org

On Sat, Sep 01, 2018 at 09:47:04AM +0200, Martin Bammer wrote:
> Hi,
>
> what about adding recordclass
> (https://bitbucket.org/intellimath/recordclass) to the collections module

The first thing you need to do is ask the author of that library whether
or not he or she is willing to donate the library to the Python stdlib,
which (among other things) means keeping to the same release schedule as
the rest of the stdlib.

> It is like namedtuple, but elements are writable and it is written in C
> and thus much faster.

Faster than what?

> And for convenience it could be named as namedlist.

Why? Is it a list?

How or why is it better than dataclasses?

--
Steve

Jonathan Fine

Sep 1, 2018, 12:11:53 PM

to Martin Bammer, python-ideas

Hi Martin

Summary: Thank you. Your suggestion has good points. To advance it, I
suggest that you (i) provide a pure Python implementation of namedlist,
and (ii) ask that the Python docs for namedtuple provide a link to
namedlist.

Thank you, Martin, for bringing
https://bitbucket.org/intellimath/recordclass to this list's
attention. Here are my first impressions.

Here are the good things I've noticed (although not closely examined).

1. This is released software, available through pip.
2. There's a link on the page to an example in a Jupyter notebook.
3. That page gives performance statistics for the C-implementation.
4. The key idea is simple and well expressed.
5. The promoter (you) is not the package's author.

Of all the suggestions made to this list, I'd say based on the above
that this one is in the top quarter. The credit for this belongs
mostly, of course, to its author, Zaur Shibzukhov. By the way, there's a
mirror of the bitbucket repository here:
https://github.com/intellimath/recordclass.

Here are my suggestions for going forward. They're based on my guess
that there's some need for a mutable variant of named tuple, but not
the same need for a C implementation. And they're based on what I
like, rather than the opinions of many.

1. Produce a pure Python implementation of recordclass.

2. Instead, as you said, call it namedlist.

3. Write some docs for the new class, similar to
https://docs.python.org/3/library/collections.html#collections.namedtuple

4. Once you've done 1-3 above, request that the Python docs reference
the new class in the "See also" for named tuple.

Mutable and immutable is, for me, a key concept in Python. Here's an
easy way to 'modify' a tuple:

>>> orig = tuple(range(5)); orig
(0, 1, 2, 3, 4)
>>> tmp = list(orig); tmp
[0, 1, 2, 3, 4]
>>> tmp[3] += tmp[1]; tmp[4] += tmp[2]
>>> tmp
[0, 1, 2, 4, 6]
>>> result = tuple(tmp); result
(0, 1, 2, 4, 6)

Of course, 'modify' means create a new one, changed in some way. And
if the original is a namedtuple, then it makes sense to use namedlist.
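For a namedtuple, the same copy-and-change pattern is already available through the standard `_replace` method. A minimal sketch follows; the `Point` type and its field names are made up for illustration and do not come from the thread:

```python
from collections import namedtuple

# Hypothetical record type, used only for illustration.
Point = namedtuple("Point", "x y")

orig = Point(x=1, y=2)

# _replace builds a new instance; the original tuple is left untouched.
result = orig._replace(y=orig.x + orig.y)

print(orig)
print(result)
```

A mutable variant would let the last step be written as a plain attribute assignment on the existing object instead of creating a second one.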

Here are some final remarks. (All my own opinions, not deep truth.)

1. Focus on getting and meeting the expressed needs of users. A link
from the Python docs will help here.

2. Don't worry about performance of the pure Python implementation. It
won't hold back progress.

3. I'd personally like to see something like numpy, but for
combinatorial rather than numerical computing. Perhaps the
memoryslots.c (on which recordclass depends) might be useful here. But
that's further in the future.

Once again, thank you, Martin, for bringing this to our attention.
And to Zaur for writing the software.

--
best regards

Jonathan

Angus Hollands

Sep 1, 2018, 1:09:21 PM

to python...@python.org

On Sat, 1 Sep 2018 18:25:21 +1000, Steven D'Aprano wrote:
> > And for convenience it could be named as namedlist.
>
> Why? Is it a list?
>
> How or why is it better than dataclasses?

It would need to be a list if the members are mutable. As to the other questions: yes, do we need another module in the standard library?

Angus

Jonathan Goble

Sep 1, 2018, 1:34:01 PM

to Angus Hollands, python...@python.org

On Sat, Sep 1, 2018 at 1:08 PM Angus Hollands <goos...@gmail.com> wrote:

> As to the other questions, yes, do we need another module in the standard library?

Wouldn't need a new module. This would be a perfect fit for the existing collections module where namedtuple already resides.

I Googled "pypi namedlist", and the top three results were all other implementations of named lists or something similar:

- namedlist <https://pypi.org/project/namedlist/>, which I have personally used and found extremely useful

- list-property <https://pypi.org/project/list-property/>

- mutabletuple <https://pypi.org/project/mutabletuple/>

Clearly the concept is useful enough to have several competing implementations on PyPI, and to me that is a point in favor of picking an implementation and adding it to the stdlib as the one obvious way to do it. So +1 from me.

Robert Vanden Eynde

Sep 1, 2018, 1:39:05 PM

to python-ideas

What's the difference between your proposal and dataclasses, introduced in Python 3.7?


Jonathan Goble

Sep 1, 2018, 1:45:40 PM

to Robert Vanden Eynde, python-ideas

On Sat, Sep 1, 2018 at 1:38 PM Robert Vanden Eynde <rober...@gmail.com> wrote:

> What's the difference between your proposal and dataclasses, introduced in Python 3.7?

A named list would allow sequence operations such as iteration. Dataclasses, IIUC, do not support sequence operations.
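The difference can be seen with a small sketch. The two types below (`PointT`, `PointD`) are hypothetical and exist only to contrast the behaviours; the sketch assumes a dataclass with default options:

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical types for illustration only.
PointT = namedtuple("PointT", "x y")


@dataclass
class PointD:
    x: int
    y: int


# A namedtuple is a sequence, so it can be unpacked and indexed.
x, y = PointT(1, 2)
first = PointT(1, 2)[0]

# A plain dataclass instance defines neither __iter__ nor __getitem__,
# so asking for an iterator raises TypeError.
try:
    iter(PointD(1, 2))
    dataclass_is_iterable = True
except TypeError:
    dataclass_is_iterable = False

print(x, y, first, dataclass_is_iterable)
```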

Thautwarm Zhao

Sep 1, 2018, 2:23:27 PM

to python-ideas

On Sat, 1 Sep 2018 09:47:04 +0200, Martin Bammer wrote:
> what about adding recordclass
> (https://bitbucket.org/intellimath/recordclass) to the collections module
>
> It is like namedtuple, but elements are writable and it is written in C
> and thus much faster.
>
> And for convenience it could be named as namedlist.

There is a problem that prevents you from reaching your goals.

Python's list is already efficient, so a C wrapper around it that supplies a so-called namedlist interface cannot be much more efficient.

Member access in Python bytecode requires looking up the corresponding attribute in a hash table, so accessing a list element by attribute carries an overhead. This has nothing to do with the C implementation. See https://docs.python.org/3/library/dis.html#opcode-LOAD_ATTR
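The point about bytecode can be checked with the `dis` module. This is only a sketch: the two helper functions are made up for illustration, and the exact opcode names printed depend on the CPython version in use.

```python
import dis
from collections import namedtuple

# Hypothetical record type for illustration.
Point = namedtuple("Point", "x y")


def by_attribute(p):
    return p.x


def by_index(p):
    return p[0]


# Collect the opcode names the compiler emitted for each access style.
attr_ops = [ins.opname for ins in dis.get_instructions(by_attribute)]
index_ops = [ins.opname for ins in dis.get_instructions(by_index)]

print(attr_ops)
print(index_ops)
```

Attribute access compiles to a `LOAD_ATTR` instruction regardless of how the underlying type is implemented, whereas indexing compiles to a subscript instruction.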

Steven D'Aprano

Sep 1, 2018, 8:19:40 PM

to python...@python.org

On Sat, Sep 01, 2018 at 05:10:49PM +0100, Jonathan Fine wrote:
> Hi Martin
>
> Summary: Thank you. Your suggestion has good points. I suggest to
> advance it (i) provide a pure Python implementation of namedlist, and
> (ii) ask that the Python docs for namedtuple provide a link to
> namedlist.

Before Martin (and you) get carried away doing these things, there's a
lot more to do first.

For starters, how about answering the questions I asked? Recapping:

- The package author describes this as a record class, not a list,
and it doesn't seem to support any list operations, so why do you
and Martin want to change the name to namedlist?

- What would it mean to insert, sort, append etc named items in a
list? When would you want to do it?

- Have you asked the author what he thinks about putting it in the
standard library?

(The author calls it a "proof of concept", and it is version 0.5.
That doesn't sound like the author considers this a mature product.)

- How is this different from data classes?

If the answer is that this supports iteration, why not add iteration to
data classes? See this thread:

https://mail.python.org/pipermail/python-ideas/2018-August/052683.html

[...]

> 1. Focus on getting and meeting the expressed needs of users. A link
> from the Python docs will help here.

It's not the job of the Python docs to link to every and any third-party
package that somebody might find useful.

It might -- perhaps -- make sense for the docs to mention or link to
third-party libraries such as numpy which are widely recognised as "best
of breed". (Not that numpy needs a link from the std lib.) But in
general it is hardly fair for us to single out some arbitrary third-
party libraries for official recognition while other libraries, perhaps
better or more worthy, are ignored.

Put yourself in the shoes of somebody who has worked hard to get a
package into a mature state, and then the Python docs start linking to a
competing alpha-quality package just because by pure chance, that was
the package that got mentioned on Python-Ideas first.

--
Steve

Zaur Shibzukhov

Sep 2, 2018, 2:24:19 PM

to python-ideas

As the author of `recordclass` I would like to shed some light...

`recordclass` originated as a response to this [question](https://stackoverflow.com/questions/29290359/existence-of-mutable-named-tuple-in-python/29419745#29419745) on Stack Overflow.

`recordclass` was conceived and implemented as a type that would be identical to `namedtuple` in API, memory footprint and speed, except that any element can be replaced by assignment without creating a new instance, as `namedtuple` requires. That is, it is almost identical to `namedtuple` but also supports item assignment (`__setitem__` / `__setslice__`).

The efficiency of `namedtuple` rests on the efficiency of the `tuple` type in Python. To achieve the same efficiency, it was necessary to create a new type, `memoryslots`. Its structure (`PyMemorySlotsObject`) is identical to that of `tuple` (`PyTupleObject`), and it therefore takes up the same amount of memory as `tuple`.

`recordclass` is defined on top of `memoryslots` just as `namedtuple` is defined on top of `tuple`. Attributes are accessed via a descriptor (`itemgetset`), which supports both `__get__` and `__set__` by element index.

The class generated by `recordclass` is:

```
from recordclass import memoryslots, itemgetset

class C(memoryslots):
    __slots__ = ()

    _fields = ('attr_1', ..., 'attr_m')

    attr_1 = itemgetset(0)
    ...
    attr_m = itemgetset(m-1)

    def __new__(cls, attr_1, ..., attr_m):
        'Create new instance of {typename} ({arg_list})'
        return memoryslots.__new__(cls, attr_1, ..., attr_m)
```
etc. following the `namedtuple` definition scheme.

As a result, `recordclass` takes up as much memory as `namedtuple` and supports fast access both by index (`__getitem__` / `__setitem__`) and by attribute name via the descriptor protocol.
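The descriptor scheme described above can be imitated in pure Python. The sketch below is not the library's code: `ItemGetSet` is a hypothetical stand-in for the C-level `itemgetset`, and the built-in `list` stands in for `memoryslots`, so neither the memory layout nor the fixed length of the real type is reproduced.

```python
class ItemGetSet:
    """Hypothetical pure-Python stand-in for the C-level `itemgetset`."""

    def __init__(self, index):
        self.index = index

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not an instance
        return obj[self.index]

    def __set__(self, obj, value):
        obj[self.index] = value


class Point(list):
    # `list` is an assumption standing in for `memoryslots`.
    __slots__ = ()

    _fields = ("x", "y")

    x = ItemGetSet(0)
    y = ItemGetSet(1)

    def __init__(self, x, y):
        super().__init__((x, y))


p = Point(1, 2)
p.x = 10   # attribute assignment is routed to p[0]
p[1] = 5   # index assignment is visible through p.y
print(p.x, p.y, list(p))
```

Because the fields live in the underlying sequence rather than in an instance `__dict__`, access by name and access by index always refer to the same storage.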

Regards,

Zaur

On Saturday, September 1, 2018 at 10:48:07 UTC+3, Martin Bammer wrote:


pawel....@daftcode.pl

Sep 2, 2018, 3:22:30 PM

to python-ideas

This recordclass lib is indeed impressive. I ran some tests on a few simple cases and it is indeed fast (on par with namedtuple), bravo!

I think it would be nice to have this in the standard library. Possible problem: doesn't it violate The Zen of Python's "There should be one-- and preferably only one --obvious way to do it" principle? It is already quite crowded there, as we have SimpleNamespace, namedtuple/NamedTuple and dataclass, all in the standard library. Each of them has its specific niche, and recordclass would possibly have one too, so it shouldn't be a problem (at least with a dedicated docs page).

As for the name I don't like "namedlist", since recordclass doesn't support the most basic list operations, like append or concatenate (+); and it shouldn't since its instances are defined as fixed length objects.

Martin Bammer

Sep 2, 2018, 4:57:56 PM

to Steven D'Aprano, python...@python.org

Hi,

The intention of my first mail was to start a discussion about this
topic: the pros and cons, and possible alternatives.
As long as it is not clear that recordclass or something like it would be
accepted into the collections module,
I do not want to spend any effort on this.

My wish that the collections module gets something like namedtuple, but
writable, is based on personal experience: as projects become bigger and
data structures more complex, it is sometimes useful to have named
items and not just an index. This improves readability and makes
development and maintenance of the code easier.

Another important topic for me is performance. When I write applications
then they should finish their tasks quickly. The performance of
recordclass was one reason for me to use it (some benchmarks with Python
2 can be found here:
https://gist.github.com/grantjenks/a06da0db18826be1176c31c95a6ee572).
I've done some more recent and additional benchmarks with Python 3.7 on
Linux which you can find here: https://github.com/brmmm3/compare-recordclass.
These new benchmarks show that namedtuple is as fast as recordclass in
all cases except named attribute access. Named attribute access is
faster with recordclass.

Compared to dataclass:
dataclass wins only on object size. When it comes to speed and
functionality (indexing, sorting) dataclass would be my last choice.
Yes, it is possible to make dataclass fast by using __slots__, but this
is always an extra programming effort. namedtuple and recordclass are easy
to use with small effort.
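For reference, the extra effort mentioned for dataclasses looks roughly like this. It is a sketch with a made-up `Point` type, written in the style that works on Python 3.7, where `__slots__` has to be spelled out by hand next to the field annotations:

```python
from dataclasses import dataclass


@dataclass
class Point:
    # The field names must be repeated here by hand.
    __slots__ = ("x", "y")
    x: int
    y: int


p = Point(1, 2)
p.x = 10

# With __slots__ declared, instances carry no per-instance __dict__.
has_dict = hasattr(p, "__dict__")
print(p, has_dict)
```

As far as I know, this hand-written form does not combine with field default values, because a default becomes a class attribute that collides with the slot of the same name.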

Adding new items:
This is not possible with namedtuple and also not possible with
recordclass. I see no reason why a namedlist should support this,
because with these object types you define new object types and these
types should not change.

I hope 3.8 will get a namedlist and maybe it will be the recordclass
module (currently my choice). As the author of this module has already
responded to this discussion, I hope he is willing to contribute his code to
the Python project.

Best regards,
Martin

Wes Turner

Sep 2, 2018, 6:03:18 PM

to Zaur Shibzukhov, Python-Ideas

On Sunday, September 2, 2018, Zaur Shibzukhov <szp...@gmail.com> wrote:

> 2018-09-02 22:11 GMT+03:00 Wes Turner <wes.t...@gmail.com>:
> > Does the value of __hash__ change when attributes of a recordclass change?
>
> Currently recordclass's __hash__ is not implemented.

https://docs.python.org/3/glossary.html#term-hashable

https://docs.python.org/3/reference/datamodel.html#object.__hash__

http://www.attrs.org/en/stable/hashing.html
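Why the question matters can be shown with a small sketch. The `Rec` class below is hypothetical and is not how recordclass behaves (its `__hash__` is not implemented); it only illustrates what goes wrong when a mutable object hashes by value:

```python
class Rec:
    """Hypothetical mutable record whose hash follows its contents."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, Rec) and self.x == other.x

    def __hash__(self):
        return hash(self.x)


r = Rec(1)
table = {r: "value"}

# Mutating the key changes its hash after it was stored in the dict.
r.x = 2

# The dict filed the entry under the old hash, so the lookup misses
# even though the very same object is still a key of the dict.
found = r in table
print(found, len(table))
```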

Greg Ewing

Sep 2, 2018, 7:11:06 PM

to Python-Ideas

Zaur Shibzukhov wrote:

> `Recordclass` is defined on top of `memoryslots` just like `namedtuple`
> above `tuple`. Attributes are accessed via a descriptor (`itemgetset`),
> which supports both `__get__` and `__set__` by the element index.
>

> As a result, `recordclass` takes up as much memory as `namedtuple`, it
> supports quick access by `__getitem__` / `__setitem__` and by attribute
> name via the protocol of the descriptors.

I'm not sure why you need a new C-level type for this. Couldn't you
get the same effect just by using __slots__?

e.g.

    class C:

        __slots__ = ('attr_1', ..., 'attr_m')

        def __init__(self, attr_1, ..., attr_m):
            self.attr_1 = attr_1
            ...
            self.attr_m = attr_m

--
Greg

Steven D'Aprano

Sep 2, 2018, 7:50:54 PM

to python...@python.org

On Sun, Sep 02, 2018 at 10:56:50PM +0200, Martin Bammer wrote:

> Compared to dataclass:
> dataclass wins only on the topic object size. When it comes to speed and
> functionality (indexing, sorting) dataclass would be my last choice.

I see no sign that recordclass supports sorting. (But I admit that I
haven't tried it.)

What would it mean to sort a recordclass?

Person = recordclass('Person', 'personalname familyname address')
fred = Person("Fred", "Anderson", "123 Main Street")
fred.sort()
print(fred)

=> output:
Person(personalname='123 Main Street', familyname='Anderson', address='Fred')

[...]

> Adding new items:
> This is not possible with namedtuple and also not possible with
> recordclass. I see no reason why a namedlist should support this,

If you want to change the name and call it a "list", then it needs to
support the same things that lists support.

> because with these object types you define new object types and these
> types should not change.

Sorry, I don't understand that. How do you get "no insertions" from
"can't change the type"? A list remains a list when you insert into it.

In case it isn't clear, I think there is zero justification for renaming
recordclass to namedlist. I don't think "named list" makes sense as a
concept, and recordclass surely doesn't implement a list-like interface.

As for the idea of adding a recordclass or mutable-namedtuple or
whatever to the stdlib, the idea seems reasonable but it's not clear to
me that dataclass wouldn't be suitable.

--
Steve

Jacco van Dorp

Sep 3, 2018, 3:25:36 AM

to python-ideas

This feels really useful to me for making quick changes to a database. Perhaps a database layer could return an object of type Recordclass, and then you simply mutate it and shove it back into the database. Pseudocode:

record = database.execute("SELECT * FROM mytable WHERE primary_key = 15")

record.mostRecentLoggedInTime = time.time()

database.execute(f"UPDATE mytable SET mostRecentLoggedInTime = {record.mostRecentLoggedInTime} WHERE primary_key = {record.primary_key}")

Or any smart database wrapper might just go:

database.updateOrInsert(table = mytable, record = record)

And be smart enough to figure out that we already have a primary key unequal to some sentinel value like None, and do an update, while it could do an insert if the primary key WAS some kind of sentinel value.

This is something I really wanted to do in the past with namedtuples, but I had to use dicts instead.
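The fetch, mutate, write-back cycle from the pseudocode can be sketched with the standard library alone. Everything concrete here is an assumption made for illustration: the in-memory SQLite database, the table layout, the `last_login` column, and `SimpleNamespace` as a stand-in for a mutable record type. Query parameters are bound with `?` placeholders rather than formatted into the SQL string.

```python
import sqlite3
import time
from types import SimpleNamespace

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (primary_key INTEGER PRIMARY KEY, last_login REAL)"
)
conn.execute("INSERT INTO mytable VALUES (?, ?)", (15, 0.0))

# Fetch one row and wrap it in a mutable record.
row = conn.execute(
    "SELECT primary_key, last_login FROM mytable WHERE primary_key = ?", (15,)
).fetchone()
record = SimpleNamespace(primary_key=row[0], last_login=row[1])

# Mutate the record in place, then push the change back.
record.last_login = time.time()
conn.execute(
    "UPDATE mytable SET last_login = ? WHERE primary_key = ?",
    (record.last_login, record.primary_key),
)

stored = conn.execute(
    "SELECT last_login FROM mytable WHERE primary_key = ?", (15,)
).fetchone()[0]
print(stored == record.last_login)
```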

Also, it's rather clear that namedList is a really bad name for a Recordclass. It's clearly not intended to be a list. It's a record you can take out from somewhere, mutate, and push back in. We often use namedtuples as records now, but we can't just mutate those to shove them back in: you have to make new ones, and unless you write a smart wrapper for database handling yourself, you can't just shove them in either. Recordclass could be the gateway drug to a smart database access layer that reduces the amount of SQL we need to write, and that's a good thing in my opinion.

Jonathan Goble

Sep 3, 2018, 3:42:25 AM

to Jacco van Dorp, python-ideas

On Mon, Sep 3, 2018 at 3:25 AM Jacco van Dorp <j.van...@deonet.nl> wrote:

> Also, it's rather clear that namedList is a really bad name for a Recordclass. It's clearly not intended to be a list. It's a record you can take out from somewhere, mutate, and push back in.

So call it "namedrecord", perhaps?

Wes Turner

Sep 3, 2018, 4:01:10 AM

to j.van...@deonet.nl, Python-Ideas

On Mon, Sep 3, 2018 at 3:25 AM Jacco van Dorp <j.van...@deonet.nl> wrote:

> This feels really useful to me for making quick changes to a database. Perhaps a database layer could return an object of type Recordclass, and then you simply mutate it and shove it back into the database. Pseudocode:
>
> record = database.execute("SELECT * FROM mytable WHERE primary_key = 15")
> record.mostRecentLoggedInTime = time.time()
> database.execute(f"UPDATE mytable SET mostRecentLoggedInTime = {record.mostRecentLoggedInTime} WHERE primary_key = {record.primary_key}")
>
> Or any smart database wrapper might just go:
>
> database.updateOrInsert(table = mytable, record = record)
>
> And be smart enough to figure out that we already have a primary key unequal to some sentinel value like None, and do an update, while it could do an insert if the primary key WAS some kind of sentinel value.

SQLAlchemy.orm solves for this (with evented objects with evented attributes):

http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#session-object-states

- Transient, Pending, Persistent, Deleted, Detached

http://docs.sqlalchemy.org/en/latest/orm/session_api.html#sqlalchemy.orm.attributes.flag_modified

- flag_modified isn't necessary in most cases because attribute mutation on mapped classes deriving from Base(declarative_base()) is evented

http://docs.sqlalchemy.org/en/latest/orm/session_events.html#attribute-change-events

http://docs.sqlalchemy.org/en/latest/orm/tutorial.html

There are packages for handling attribute states with the Django ORM, as well:

- https://github.com/romgar/django-dirtyfields

- https://github.com/Suor/django-dirty

What would be the performance impact of instead subclassing from recordclass? IDK.

pyrsistent.PRecord(PMap) is immutable and supports .attribute access:

https://github.com/tobgu/pyrsistent#precord

Chris Angelico

Sep 3, 2018, 4:17:49 AM

to python-ideas

On Mon, Sep 3, 2018 at 5:23 PM, Jacco van Dorp <j.van...@deonet.nl> wrote:
> This feels really useful to me to make some quick changes to a database -
> perhaps a database layer could return an object of type Recordclass, and then
> you just simply mutate it and shove it back into the database. Pseudocode:
>
> record = database.execute("SELECT * FROM mytable WHERE primary_key = 15")
> record.mostRecentLoggedInTime = time.time()
> database.execute(f"UPDATE mytable SET mostRecentLoggedInTime =
> {record.mostRecentLoggedInTime} WHERE primary_key = {record.primary_key}")
>
> Or any smart database wrapper might just go:
>
> database.updateOrInsert(table = mytable, record = record)
>
> And be smart enough to figure out that we already have a primary key unequal
> to some sentinel value like None, and do an update, while it could do an
> insert if the primary key WAS some kind of sentinel value.

In its purest form, what you're asking for is an "upsert" or "merge" operation:

https://en.wikipedia.org/wiki/Merge_(SQL)

In a multi-user transactional database, there are some fundamentally
hard problems to implementing a merge. I'm not 100% certain, so I
won't say "impossible", but it is certainly *extremely difficult* to
implement an operation like this in application-level software without
some form of race condition.

ChrisA

Wes Turner

Sep 3, 2018, 4:33:07 AM

to Chris Angelico, Python-Ideas

On Mon, Sep 3, 2018 at 4:17 AM Chris Angelico <ros...@gmail.com> wrote:

> In its purest form, what you're asking for is an "upsert" or "merge" operation:
>
> https://en.wikipedia.org/wiki/Merge_(SQL)
>
> In a multi-user transactional database, there are some fundamentally
> hard problems to implementing a merge. I'm not 100% certain, so I
> won't say "impossible", but it is certainly *extremely difficult* to
> implement an operation like this in application-level software without
> some form of race condition.

http://docs.sqlalchemy.org/en/latest/orm/contextual.html#contextual-thread-local-sessions

- scoped_session

http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#merging

http://docs.sqlalchemy.org/en/latest/orm/session_basics.html

obj = ExampleObject(attr='value')

assert obj.id is None

session.add(obj)

session.flush()

assert obj.id is not None

session.commit()

Chris Angelico

Sep 3, 2018, 4:40:26 AM

to Python-Ideas

Yep. What does it do if it's on a back-end database that doesn't
provide a merge/upsert intrinsic? What if you have a multi-column
primary key? There are, of course, easier sub-forms of this (e.g. you
mandate that the PK be a single column and be immutable), but if there
is any chance that any other client might simultaneously be changing
the PK of your row, a perfectly reliable upsert/merge basically
depends on the DB itself providing that functionality.
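As one illustration of the database doing the work, SQLite offers `INSERT OR REPLACE`, which resolves the key conflict inside a single statement instead of a check-then-write sequence in application code. The sketch assumes a hypothetical single-column, immutable primary key; the table and the `upsert` helper are made up for this example.

```python
import sqlite3

# Hypothetical schema: a single-column primary key plus one value column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (primary_key INTEGER PRIMARY KEY, name TEXT)")


def upsert(conn, primary_key, name):
    """Insert the row, or replace the existing row with the same key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO mytable (primary_key, name) VALUES (?, ?)",
            (primary_key, name),
        )


upsert(conn, 15, "first")
upsert(conn, 15, "second")  # same key: replaces instead of failing

rows = conn.execute("SELECT primary_key, name FROM mytable").fetchall()
print(rows)
```

Note that `INSERT OR REPLACE` deletes the conflicting row and inserts a new one, so it is not a column-by-column merge, and other database engines spell this operation differently.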

Wes Turner

Sep 3, 2018, 5:23:36 AM

to Chris Angelico, Python-Ideas

There's yet another argument for immutable surrogate primary keys, indeed.

With appropriate foreign key constraints, changing any part of the [composite] PK is a really expensive operation because all references must also be updated (with e.g. ON UPDATE CASCADE), and that doesn't fix e.g. existing URLs or serialized references in cached JSON documents. Far better, IMHO, to just enforce a UNIQUE constraint on those column(s).

UUIDs don't require a central key allocation service (such as AUTOINCREMENT, which is now fixed in MySQL AFAIU).

Should the __hash__() of a recordclass change when attributes are modified? http://www.attrs.org/en/stable/hashing.html has a good explanation.

In general, neither .__hash__() nor id(obj) is a good candidate for a database primary key, because when/if there are collisions (birthday paradox), e.g. when an INSERT or UPSERT or INSERT OR REPLACE fails, it has to change.

Sorry, getting OT: something like COW immutability is actually desirable with SQL databases, too. Database backups generally require offline intervention in order to roll back, if there's even a backup which contains those transactions.

https://en.wikipedia.org/wiki/Temporal_database#Implementations_in_notable_products (SELECT, )

https://django-reversion.readthedocs.io/en/stable/

Zaur Shibzukhov

Sep 3, 2018, 2:17:14 PM

to python-ideas

On Monday, September 3, 2018 at 2:11:06 UTC+3, Greg Ewing wrote:

> I'm not sure why you need a new C-level type for this. Couldn't you
> get the same effect just by using __slots__?
>
> e.g.
>
>     class C:
>
>         __slots__ = ('attr_1', ..., 'attr_m')
>
>         def __init__(self, attr_1, ..., attr_m):
>             self.attr_1 = attr_1
>             ...
>             self.attr_m = attr_m

Yes, you can. The only difference is that access to fields by index is slow. So if you don't need fast access by index, but only by name, then using __slots__ is enough. Recordclass is actually a fixed array with named access to its elements, in the same manner as namedtuple is actually a tuple with named access to its elements.
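To make the trade-off concrete, here is a sketch of index access layered on top of a `__slots__` class. The `Point` class is hypothetical; each indexed access goes through a name lookup in `__slots__` followed by `getattr`/`setattr`, which is the indirect path described above as slow.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    # Index access is emulated: map the index to a slot name first,
    # then go through the normal attribute machinery.
    def __getitem__(self, index):
        return getattr(self, self.__slots__[index])

    def __setitem__(self, index, value):
        setattr(self, self.__slots__[index], value)

    def __len__(self):
        return len(self.__slots__)


p = Point(1, 2)
p[1] = 7          # routed to p.y via setattr
values = list(p)  # iteration falls back to __getitem__ with 0, 1, ...
print(p.y, values)
```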

Zaur Shibzukhov

Sep 4, 2018, 8:04:54 AM

to Wes Turner, Python-Ideas

2018-09-03 1:02 GMT+03:00 Wes Turner <wes.t...@gmail.com>:

> On Sunday, September 2, 2018, Zaur Shibzukhov <szp...@gmail.com> wrote:
> > 2018-09-02 22:11 GMT+03:00 Wes Turner <wes.t...@gmail.com>:
> > > Does the value of __hash__ change when attributes of a recordclass change?
> >
> > Currently recordclass's __hash__ is not implemented.
>
> https://docs.python.org/3/glossary.html#term-hashable
>
> https://docs.python.org/3/reference/datamodel.html#object.__hash__
>
> http://www.attrs.org/en/stable/hashing.html

A correction: recordclass and its base memoryslots do not implement __hash__, but memoryslots implements rich comparison (almost like Python's list).

Chris Barker via Python-ideas

Sep 4, 2018, 3:04:39 PM

to Zaur Shibzukhov, Python-Ideas

Chiming in here:

dataclasses was just added to the stdlib.

I understand that record class is not the same thing, but the use cases do overlap a great deal.

So I think the core goal for anyone that wants to see this in the stdlib is to demonstrate that recordclass adds significant enough value to justify something so similar.

Personally, I don’t see it.

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115   (206) 526-6317   main reception

Chris....@noaa.gov

Eric V. Smith

Sep 4, 2018, 6:16:47 PM

to Chris Barker, Zaur Shibzukhov, Python-Ideas

On 9/4/2018 3:03 PM, Chris Barker via Python-ideas wrote:
> Chiming in here:
>
> dataclasses was just added to the stdlib.
>
> I understand that record class is not the same thing, but the use cases
> do overlap a great deal.
>
> So I think the core goal for anyone that wants to see this in the stdlib
> is to demonstrate that recordclass
> adds significant enough value to justify something so similar.

I've seen three things mentioned that might be different from dataclasses:
- instance size
- speed (not sure of what: instance creation? field access?)
- iterating over fields

But I've not seen concrete examples of the first two where dataclasses
doesn't perform well enough. For the third one, there's already a thread
on this mailing list: "Consider adding an iterable option to dataclass".
I'm contemplating adding it.
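Until such an option exists, iteration over fields can be added by hand. This is only a sketch of one possible approach, using the documented `dataclasses.fields` helper and a made-up `Point` class; it is not the design being contemplated for the dataclass decorator.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int

    def __iter__(self):
        # Yield field values in declaration order.
        return (getattr(self, f.name) for f in fields(self))


x, y = Point(1, 2)         # unpacking now works
as_list = list(Point(3, 4))
print(x, y, as_list)
```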

> Personally, I don’t see it.

I'm skeptical, too.

Eric
