question about added status

Qiwen Chen

Feb 20, 2019, 5:28:23 PM
to zodb
When an unsaved persistent object is set as a sub-object of another persistent object that has saved status, the unsaved persistent object becomes Added. However, when I check _p_jar, it is still None. I have to manually add the object to the connection in order to populate _p_jar with that connection. Is this a bug? Is there a reliable way to check whether an object has been added or not?

Jason Madden

Feb 21, 2019, 7:44:22 AM
to Qiwen Chen, zodb


> On Feb 20, 2019, at 16:24, Qiwen Chen <qwc...@gmail.com> wrote:
>
> when an unsaved persistent object is set as sub-objects of another persistent object with saved status, the unsaved persistent object becomes Added.

'Added' is not a value that Persistent._p_status or Persistent._p_state can have. _p_status can be 'unsaved', 'ghost', 'sticky', 'changed', or 'saved', and _p_state can be -1, 0, 1, or 2 (https://github.com/zopefoundation/persistent/blob/master/persistent/interfaces.py). So I'm not quite sure what you mean.

Is 'Added' an application level concept? Are you talking about being present in the private attribute `_added` of a ZODB `Connection`? Or do you mean "reachable in the object graph from the connection's root"? I'll assume this last meaning because that's the most general (and is basically what it means to a Connection).

> However, when i check _p_jar, it is still None.

That's correct. Simply because an object is reachable from a connection's root doesn't mean its `_p_jar` has been set. Most commonly, that happens automatically at transaction commit time, when a connection conceptually traverses all objects reachable from its root and takes note of any added or updated ones:

py> import persistent, ZODB, ZODB.DemoStorage, transaction
py> db = ZODB.DB(ZODB.DemoStorage.DemoStorage())
py> conn = db.open()
py> class O(persistent.Persistent):
...     pass
py> parent = conn.root.parent = O()
# Just because we're reachable doesn't mean we're saved or have a jar
py> parent._p_status
'unsaved'
py> parent._p_jar # None
py> transaction.commit()
# Once we commit, the modified objects are added to the database and connection
py> parent._p_status
'saved'
py> parent._p_jar
<Connection at 10bc6ff20>

# The same goes for a new child object added to a saved object
py> child = parent.child = O()
py> child._p_status
'unsaved'
py> child._p_jar # None
py> transaction.commit()
py> child._p_status
'saved'
py> child._p_jar
<Connection at 10bc6ff20>


> I will have to manually add the object to the connection in order to populate _p_jar with such connection.

I think that generally, manually adding objects to a connection should not be necessary. There are cases in multi-database setups where an object is initially reachable from multiple connections in different databases, and which one to add it to is ambiguous; that case can require manually adding an object to the correct database connection or you get an InvalidObjectReference exception. But that's rare in my experience.

> Is this a bug?

No, I believe it's by design. Just because an object is *temporarily* reachable doesn't mean that it will be reachable at connection commit time. Prematurely adding it to a connection allocates an OID and generates unnecessary garbage in that case:

# Start fresh for clarity
py> db = ZODB.DB(ZODB.DemoStorage.DemoStorage())
py> conn = db.open()
# Create an object, make it reachable from a database connection
py> parent = conn.root.parent = O()
# Explicitly add it now
py> conn.add(parent)
# This has the effect of allocating an OID for it, meaning it *will* be stored
py> parent._p_oid
b'\x13\x03\xd7\xee\xe8yu\xd7'

# Now go on and do some other work that results in "rolling back" this object
py> conn.root.parent = None
py> transaction.commit()
py> conn.root.parent # No longer present

# Yet this unreachable object was still stored in the database,
# where it will have to be garbage collected away
py> conn.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
<__main__.O object at 0x10ba80a78 oid 0x1303d7eee87975d7 in <Connection at 10bc15aa8>>
# it's visible to new connections, even though it's not reachable
py> conn2 = db.open()
py> conn2.root.parent # None
py> conn2.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
<__main__.O object at 0x10bda8ef8 oid 0x1303d7eee87975d7 in <Connection at 10be3c058>>

# But if we pack the database...
py> db.pack()
# It is no longer visible to *new* connections
py> conn3 = db.open()
py> conn3.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
Traceback (most recent call last):
...
POSKeyError: 0x1303d7eee87975d7

# Very confusingly, it is still visible from connections that had it cached
py> conn3.close()
py> conn2.close()
py> conn.close()
py> conn = db.open()
py> conn2 = db.open()
py> conn3 = db.open()
py> conn.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
<__main__.O object at 0x10ba80a78 oid 0x1303d7eee87975d7 in <Connection at 10bc15aa8>>
py> conn2.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
<__main__.O object at 0x10bda8ef8 oid 0x1303d7eee87975d7 in <Connection at 10be3c058>>
py> conn3.get(b'\x13\x03\xd7\xee\xe8yu\xd7')
Traceback (most recent call last):
...
POSKeyError: 0x1303d7eee87975d7

> Is there a reliable way to check if an object is added or not?

I suppose it's application specific. In an application using zope.container and zope.location, you could check that there's a traversal path back to the root; zope.keyreference's IConnection adapter can also be helpful.

I don't think ZODB offers a shortcut to find out "is this object reachable from the connection's root" before commit time.
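
For illustration only, an application-level "is this reachable from the root" check could be sketched as a plain identity walk over the object graph. The helper name `is_reachable` and the stand-in class `Node` are hypothetical and not part of ZODB or persistent; the sketch only follows instance `__dict__` attributes and built-in containers, and makes no attempt to activate ghost objects or look inside containers such as BTrees.

```python
def is_reachable(root, target):
    """Return True if `target` is found (by identity) while walking from `root`.

    Hypothetical helper: follows instance attributes and built-in
    containers only. It is not a ZODB API.
    """
    seen = set()      # ids of objects already visited, guards against cycles
    stack = [root]    # objects still to be examined
    while stack:
        obj = stack.pop()
        if obj is target:
            return True
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if isinstance(obj, dict):
            stack.extend(obj.keys())
            stack.extend(obj.values())
        elif isinstance(obj, (list, tuple, set, frozenset)):
            stack.extend(obj)
        elif hasattr(obj, "__dict__") and not isinstance(obj, type):
            # Ordinary instance: follow the values of its attributes
            stack.extend(vars(obj).values())
    return False


class Node:
    """Stand-in for an application class (assumption, not a Persistent subclass)."""


root = Node()
root.parent = Node()
child = Node()
print(is_reachable(root, child))   # False: child is not attached anywhere yet
root.parent.child = child
print(is_reachable(root, child))   # True: root -> parent -> child
root.parent = None
print(is_reachable(root, child))   # False: the path was removed again
```

A walk like this touches every object it visits, so on a large database it could be expensive; it is meant to show the idea rather than to be used as-is.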

~Jason

Qiwen Chen

Feb 21, 2019, 12:04:13 PM
to zodb
Thank you very much for the explanation. This is very helpful.