I recently noticed a severe slowdown in copy/paste operations with dexterity objects.
After some investigation I found out that the problems seems to be in ZEXP. For some reason every dexterity object has a persistent reference to it's parent object. When doing a ZEXP all referenced objects are included, which means that you always get the full tree of dexterity objects. As a result ZEXP gets slower and slower the more objects you have. Copy/Paste is doing a ZEXP when copying an object and therefore also gets slower.
As an example let's assume the following contents: (Folder and Subfolder are Dexterity Containers, Files are Dexterity Items)
I recently noticed a severe slowdown in copy/paste operations with dexterity objects.
After some investigation I found out that the problems seems to be in ZEXP. For some reason every dexterity object has a persistent reference to it's parent object. When doing a ZEXP all referenced objects are included, which means that you always get the full tree of dexterity objects. As a result ZEXP gets slower and slower the more objects you have. Copy/Paste is doing a ZEXP when copying an object and therefore also gets slower.
As an example let's assume the following contents: (Folder and Subfolder are Dexterity Containers, Files are Dexterity Items)
> I recently noticed a severe slowdown in copy/paste operations with > dexterity objects.
Interesting.
> After some investigation I found out that the problems seems to be in > ZEXP. > For some reason every dexterity object has a persistent reference to > it's parent object.
This is actually done in plone.folder, and so I think it'll be pretty standard in Plone 4. I've CC'd Andi and Hanno who may know more. __parent__ pointers are what's going to allow us to eventually stop mixing a ton of acquisition stuff into our code, so they're there on purpose. What you're saying is worrying, though!
> When doing a ZEXP all referenced objects are included, which means > that you always get the full tree of dexterity objects. > As a result ZEXP gets slower and slower the more objects you have. > Copy/Paste is doing a ZEXP when copying an object and therefore also > gets slower.
Scary.
> As an example let's assume the following contents: > (Folder and Subfolder are Dexterity Containers, Files are Dexterity > Items)
> Hi >> I recently noticed a severe slowdown in copy/paste operations with >> dexterity objects.
Damn. Someone called this potential problem to my attention last week, but I hadn't gotten around to seeing if it actually existed yet.
>> Any suggestions how this problem can be solved?
> Not off the top of my head, but I'm hoping Hanno and Andi may have > some thoughts?
It's not immediately obvious to me what approach we can take to fixing this. During ZEXP export, the code is simply following the references in the ZODB pickles without actually unpickling objects. So it doesn't really have a way to evaluate which reference corresponds to the __parent__ pointer, afaicr.
On Wed, Feb 24, 2010 at 1:19 AM, Martin Aspeli <optil...@gmail.com> wrote: > This is actually done in plone.folder, and so I think it'll be pretty > standard in Plone 4. I've CC'd Andi and Hanno who may know more. > __parent__ pointers are what's going to allow us to eventually stop > mixing a ton of acquisition stuff into our code, so they're there on > purpose. What you're saying is worrying, though!
Gah! Do we really store __parent__ pointers on content? I thought we explicitly decided not to do this yet, as it has too many unknowns and open problems.
>> When doing a ZEXP all referenced objects are included, which means >> that you always get the full tree of dexterity objects. >> As a result ZEXP gets slower and slower the more objects you have. >> Copy/Paste is doing a ZEXP when copying an object and therefore also >> gets slower.
> Scary.
Not scary, but exactly the expected behavior ;)
If we want to use persistent partent pointers, we have to deal with all the adjustments whenever content is moved around, cut/copy/pasted and exported/imported. Zope3 has a lot of different infrastructure for these tasks, that know about parent pointers and handle them correctly. None of that code is in use in Zope2.
For now I'd strongly suggest we remove parent pointers from any content item. Using them for non persistent view components is quite fine and we can keep doing that, but there's far too many subtle problems when using them on persistent content.
> On Wed, Feb 24, 2010 at 1:19 AM, Martin Aspeli <optil...@gmail.com> wrote: >> This is actually done in plone.folder, and so I think it'll be pretty >> standard in Plone 4.
right.
this might actually also be the reason for the pretty hefty slowdown when running some tests against p4, for example `p.a.blob`. i haven't yet had the time to investigate, but with this pointer i'll try to have a look during friday's tuneup.
> Gah! Do we really store __parent__ pointers on content?
we do. i've just checked again (r26681 & r26682), and the pointers are indeed set on every content item (which makes sense when it's supposed to replace acquisition, of course).
> For now I'd strongly suggest we remove parent pointers from any > content item. Using them for non persistent view components is quite > fine and we can keep doing that, but there's far too many subtle > problems when using them on persistent content.
considering that this problem will probably show up in a lot of places, i'd be +1. martin, what do you think? i can poke a bit more on friday, but i think that's also when we should have a decision at the latest...
> On 24.02.2010, at 11:36, Hanno Schlichting wrote: >> For now I'd strongly suggest we remove parent pointers from any >> content item. Using them for non persistent view components is quite >> fine and we can keep doing that, but there's far too many subtle >> problems when using them on persistent content.
> considering that this problem will probably show up in a lot of places, i'd be +1. martin, what do you think? i can poke a bit more on friday, but i think that's also when we should have a decision at the latest...
btw, could someone — perhaps thomas as the discoverer :) — please file a (critical) bug report on the p4 tracker for this?
> On 24.02.2010, at 11:36, Hanno Schlichting wrote: >> On Wed, Feb 24, 2010 at 1:19 AM, Martin Aspeli <optil...@gmail.com> wrote: >>> This is actually done in plone.folder, and so I think it'll be pretty >>> standard in Plone 4.
> right.
> this might actually also be the reason for the pretty hefty slowdown when running some tests against p4, for example `p.a.blob`. i haven't yet had the time to investigate, but with this pointer i'll try to have a look during friday's tuneup.
>> Gah! Do we really store __parent__ pointers on content?
> we do. i've just checked again (r26681 & r26682), and the pointers are indeed set on every content item (which makes sense when it's supposed to replace acquisition, of course).
>> For now I'd strongly suggest we remove parent pointers from any >> content item. Using them for non persistent view components is quite >> fine and we can keep doing that, but there's far too many subtle >> problems when using them on persistent content.
> considering that this problem will probably show up in a lot of places, i'd be +1. martin, what do you think? i can poke a bit more on friday, but i think that's also when we should have a decision at the latest...
We added them to Dexterity (and then pushed down to plone.folder) on the assumption that we'd want this for future compatibility in an acquisition-free world. I don't think they're used anywhere important, so we can remove them if that's the right thing to do.
Not quite sure what to do with existing Dexterity instances, though...
Andi, will you follow up on this as the owner of p.folder? I'll test with Dexterity and fix that up once we've got a new p.folder release, but for now Plone 4 is the more important consumer of that package.
> On 24.02.2010, at 12:59, Andreas Zeidler wrote: >> On 24.02.2010, at 11:36, Hanno Schlichting wrote: >>> For now I'd strongly suggest we remove parent pointers from any >>> content item. Using them for non persistent view components is quite >>> fine and we can keep doing that, but there's far too many subtle >>> problems when using them on persistent content.
>> considering that this problem will probably show up in a lot of places, i'd be +1. martin, what do you think? i can poke a bit more on friday, but i think that's also when we should have a decision at the latest...
> btw, could someone — perhaps thomas as the discoverer :) — please file a (critical) bug report on the p4 tracker for this?
I'm not sure if the issue really exists in plone4. In a quick test with plone 4.0a5 and archetypes objects doing a zexp export no parents were included. Even though the objects do have a __parent__ pointer. Maybe I missed something or the newer zope version can handle the __parent__ pointer?
I had the issue with plone3.3 and dexterity 1.0a3 Can anybody reproduce the issue with plone4?
> Andi, will you follow up on this as the owner of p.folder?
yes, i will. like i said, i'll try to reproduce thomas' results on friday (during the tuneup). his latest mail makes me a little less worried, though... :)
> I'll test > with Dexterity and fix that up once we've got a new p.folder release, > but for now Plone 4 is the more important consumer of that package.
Andreas Zeidler <a...@zitc.de> writes: > On 24.02.2010, at 11:36, Hanno Schlichting wrote: >> On Wed, Feb 24, 2010 at 1:19 AM, Martin Aspeli <optil...@gmail.com> wrote: >>> This is actually done in plone.folder, and so I think it'll be pretty >>> standard in Plone 4.
> right.
> this might actually also be the reason for the pretty hefty slowdown > when running some tests against p4, for example `p.a.blob`. i haven't > yet had the time to investigate, but with this pointer i'll try to > have a look during friday's tuneup.
>> Gah! Do we really store __parent__ pointers on content?
> we do. i've just checked again (r26681 & r26682), and the pointers > are indeed set on every content item (which makes sense when it's > supposed to replace acquisition, of course).
I don't suppose an implementation using persistent.wref could be used to avoid this problem? Does the ZEXP machinery honor persistent.wref in the same way that the gc does? Could it be made to do so?
>> On 24.02.2010, at 11:36, Hanno Schlichting wrote:
>>> On Wed, Feb 24, 2010 at 1:19 AM, Martin Aspeli<optil...@gmail.com> wrote:
>>>> This is actually done in plone.folder, and so I think it'll be pretty >>>> standard in Plone 4.
>> right.
>> this might actually also be the reason for the pretty hefty slowdown >> when running some tests against p4, for example `p.a.blob`. i haven't >> yet had the time to investigate, but with this pointer i'll try to >> have a look during friday's tuneup.
>>> Gah! Do we really store __parent__ pointers on content?
>> we do. i've just checked again (r26681& r26682), and the pointers >> are indeed set on every content item (which makes sense when it's >> supposed to replace acquisition, of course).
> I don't suppose an implementation using persistent.wref could be used to > avoid this problem? Does the ZEXP machinery honor persistent.wref in > the same way that the gc does? Could it be made to do so?
That might work. The ZEXP export code does not follow weak references, IIRC. David
> I'm not sure if the issue really exists in plone4. > In a quick test with plone 4.0a5 and archetypes objects doing a zexp export no parents were included. > Even though the objects do have a __parent__ pointer. > Maybe I missed something or the newer zope version can handle the __parent__ pointer?
i've grep'd the code a bit, but couldn't find anything that would treat `__parent__` in any special way. i've also added a test[*] trying to reproduce the problem, but it passes both on plone 3 and 4. well, actually i should say both on zope 2.10 and 2.12 as the test is not really plone-specific. in any case, the parents are not part of the export, so maybe the issue is dexterity-specific after all.
however, perhaps i'm missing something or maybe the test setup is wrong. could somebody please review[*] & give me some feedback? tomorrow i'll also try to test ttw as well as moving the test into `p.a.folder` to check against "real" plone content.
Martin Aspeli <optil...@gmail.com> writes: > On 25 February 2010 05:47, David Glick <dgl...@gmail.com> wrote:
>> That might work. The ZEXP export code does not follow weak references, >> IIRC.
> Does the "new" acquisition?
I was thinking that whatever code sets the __parent__ refs (I believe someone said it was moved into p.a.folder?) should do so in such a way that uses persistent.wref. I think it might have to involve a descriptor of some sort since wrefs need to be called.
On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> wrote: > I was thinking that whatever code sets the __parent__ refs (I believe > someone said it was moved into p.a.folder?) should do so in such a way > that uses persistent.wref. I think it might have to involve a > descriptor of some sort since wrefs need to be called.
Having to call them rules out their use for this use-case. The ILocation API is pretty clear here and requires __parent__ to be an attribute that is the parent object - not a callable. One point of introducing these is getting API compatibility with all the Zope Toolkit code and all of that is expecting a simple attribute.
Descriptors and properties don't work with Acquisition. We want to get rid of Acquisition in the end, but we are going to have a transition period. In that period we'll need to support a mix of parent pointers and Acquisition, which rules out the use of descriptors and properties here as well.
> On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> > wrote: >> I was thinking that whatever code sets the __parent__ refs (I believe >> someone said it was moved into p.a.folder?) should do so in such a >> way >> that uses persistent.wref. I think it might have to involve a >> descriptor of some sort since wrefs need to be called.
> Having to call them rules out their use for this use-case. The > ILocation API is pretty clear here and requires __parent__ to be an > attribute that is the parent object - not a callable. One point of > introducing these is getting API compatibility with all the Zope > Toolkit code and all of that is expecting a simple attribute.
> Descriptors and properties don't work with Acquisition. We want to get > rid of Acquisition in the end, but we are going to have a transition > period. In that period we'll need to support a mix of parent pointers > and Acquisition, which rules out the use of descriptors and properties > here as well.
The parent pointers themselves need not be acquisition wrapped though. In fact they currently aren't wrapped unless they've just been set. Using a descriptor would make the behavior more consistent in that regard. Of course it also would add additional complexity, and if this can be addressed more straightforwardly, that would be preferable.
> On Feb 25, 2010, at 1:13 AM, Hanno Schlichting <ha...@hannosch.eu> wrote:
>> On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> wrote:
>>> I was thinking that whatever code sets the __parent__ refs (I believe >>> someone said it was moved into p.a.folder?) should do so in such a way >>> that uses persistent.wref. I think it might have to involve a >>> descriptor of some sort since wrefs need to be called.
>> Having to call them rules out their use for this use-case. The >> ILocation API is pretty clear here and requires __parent__ to be an >> attribute that is the parent object - not a callable. One point of >> introducing these is getting API compatibility with all the Zope >> Toolkit code and all of that is expecting a simple attribute.
>> Descriptors and properties don't work with Acquisition. We want to get >> rid of Acquisition in the end, but we are going to have a transition >> period. In that period we'll need to support a mix of parent pointers >> and Acquisition, which rules out the use of descriptors and properties >> here as well.
> The parent pointers themselves need not be acquisition wrapped though. In > fact they currently aren't wrapped unless they've just been set. Using a > descriptor would make the behavior more consistent in that regard. Of > course it also would add additional complexity, and if this can be addressed > more straightforwardly, that would be preferable.
Wouldn't just setting obj.__parent__ = aq_base(container) achieve the same?
> On 25 February 2010 15:26, Alec Mitchell <ap...@columbia.edu> wrote: >> On Feb 25, 2010, at 1:13 AM, Hanno Schlichting <ha...@hannosch.eu> wrote:
>>> On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> wrote:
>>>> I was thinking that whatever code sets the __parent__ refs (I believe >>>> someone said it was moved into p.a.folder?) should do so in such a way >>>> that uses persistent.wref. I think it might have to involve a >>>> descriptor of some sort since wrefs need to be called.
>>> Having to call them rules out their use for this use-case. The >>> ILocation API is pretty clear here and requires __parent__ to be an >>> attribute that is the parent object - not a callable. One point of >>> introducing these is getting API compatibility with all the Zope >>> Toolkit code and all of that is expecting a simple attribute.
>>> Descriptors and properties don't work with Acquisition. We want to get >>> rid of Acquisition in the end, but we are going to have a transition >>> period. In that period we'll need to support a mix of parent pointers >>> and Acquisition, which rules out the use of descriptors and properties >>> here as well.
>> The parent pointers themselves need not be acquisition wrapped though. In >> fact they currently aren't wrapped unless they've just been set. Using a >> descriptor would make the behavior more consistent in that regard. Of >> course it also would add additional complexity, and if this can be addressed >> more straightforwardly, that would be preferable.
> Wouldn't just setting obj.__parent__ = aq_base(container) achieve the same?
In this case, __parent__ would be a descriptor wrapping a persistent.weakref you mean?
What happens to existing Dexterity content if we suddenly add a __parent__ descriptor on the DexterityContent class? Will existing attributes still shadow it?
And what do we do about content that doesn't have such a __parent__ descriptor? Remember that the setting of __parent__ is actually done in __setitem__ on the container, not by the object itself! Do we set the descriptor as an instance variable? Does that even work? :)
> On 25 February 2010 23:29, Laurence Rowe <laurencer...@gmail.com> wrote: >> On 25 February 2010 15:26, Alec Mitchell <ap...@columbia.edu> wrote: >>> On Feb 25, 2010, at 1:13 AM, Hanno Schlichting <ha...@hannosch.eu> wrote:
>>>> On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> wrote:
>>>>> I was thinking that whatever code sets the __parent__ refs (I believe >>>>> someone said it was moved into p.a.folder?) should do so in such a way >>>>> that uses persistent.wref. I think it might have to involve a >>>>> descriptor of some sort since wrefs need to be called.
>>>> Having to call them rules out their use for this use-case. The >>>> ILocation API is pretty clear here and requires __parent__ to be an >>>> attribute that is the parent object - not a callable. One point of >>>> introducing these is getting API compatibility with all the Zope >>>> Toolkit code and all of that is expecting a simple attribute.
>>>> Descriptors and properties don't work with Acquisition. We want to get >>>> rid of Acquisition in the end, but we are going to have a transition >>>> period. In that period we'll need to support a mix of parent pointers >>>> and Acquisition, which rules out the use of descriptors and properties >>>> here as well.
>>> The parent pointers themselves need not be acquisition wrapped though. In >>> fact they currently aren't wrapped unless they've just been set. Using a >>> descriptor would make the behavior more consistent in that regard. Of >>> course it also would add additional complexity, and if this can be addressed >>> more straightforwardly, that would be preferable.
>> Wouldn't just setting obj.__parent__ = aq_base(container) achieve the same?
> In this case, __parent__ would be a descriptor wrapping a > persistent.weakref you mean?
> What happens to existing Dexterity content if we suddenly add a > __parent__ descriptor on the DexterityContent class? Will existing > attributes still shadow it?
> And what do we do about content that doesn't have such a __parent__ > descriptor? Remember that the setting of __parent__ is actually done > in __setitem__ on the container, not by the object itself! Do we set > the descriptor as an instance variable? Does that even work? :)
On Thu, Feb 25, 2010 at 7:26 AM, Alec Mitchell <ap...@columbia.edu> wrote: > On Feb 25, 2010, at 1:13 AM, Hanno Schlichting <ha...@hannosch.eu> wrote:
>> On Thu, Feb 25, 2010 at 2:12 AM, Ross Patterson <m...@rpatterson.net> wrote:
>>> I was thinking that whatever code sets the __parent__ refs (I believe >>> someone said it was moved into p.a.folder?) should do so in such a way >>> that uses persistent.wref. I think it might have to involve a >>> descriptor of some sort since wrefs need to be called.
>> Having to call them rules out their use for this use-case. The >> ILocation API is pretty clear here and requires __parent__ to be an >> attribute that is the parent object - not a callable. One point of >> introducing these is getting API compatibility with all the Zope >> Toolkit code and all of that is expecting a simple attribute.
>> Descriptors and properties don't work with Acquisition. We want to get >> rid of Acquisition in the end, but we are going to have a transition >> period. In that period we'll need to support a mix of parent pointers >> and Acquisition, which rules out the use of descriptors and properties >> here as well.
> The parent pointers themselves need not be acquisition wrapped though. In > fact they currently aren't wrapped unless they've just been set. Using a > descriptor would make the behavior more consistent in that regard. Of > course it also would add additional complexity, and if this can be addressed > more straightforwardly, that would be preferable.
This is what happens when I post too early in the morning. The __parent__ pointers are wrapped. Generally, the acquisition parent of the __parent__ is the child object itself (which is strange and unintuitive of course), except when first set, at which point the parent is wrapped with its usual traversal path. Getting rid of the wrappers with a descriptor may still be a win in terms of eliminating this confusinon, unless there's a compelling case for e.g. needing to make security assertions on the __parent__ objects.
I agree with Laurence though, this feels like a huge hack. If the problem isn't with __parent__ generally, as Andi seems to indicate, then I don't think this is a rabbit hole we want to go down.
> I agree with Laurence though, this feels like a huge hack.
+1
> If the problem isn't with __parent__ generally, as Andi seems to indicate, > then I don't think this is a rabbit hole we want to go down.
well, after some ttw testing and looking at pickles, i realized that the reason plone isn't affected is because __parent__ only gets set if the object provides `ILocation` (in `plone.folder` that is), and afaics that isn't used anywhere. it might be pulled in for some objects via `zope.app.container` or something, but at least ATCT content doesn't have it.
so iow, the problem probably is due to the __parent__ pointers, but since plone's not using `ILocation` it indeed only shows with dexterity. thomas was right here, except for the pointers. they were never set on the objects themselves, but kindly provided by acquisition (what else!? :)).
good night,
andi
ps: luckily we don't need any migration/cleanup either.
On Mon, Mar 1, 2010 at 12:14 AM, Andreas Zeidler <a...@zitc.de> wrote: > well, after some ttw testing and looking at pickles, i realized that the reason plone isn't affected is because __parent__ only gets set if the object provides `ILocation` (in `plone.folder` that is), and afaics that isn't used anywhere. it might be pulled in for some objects via `zope.app.container` or something, but at least ATCT content doesn't have it.
Ah, awesome. So the only things affected are Dexterity and stuff based on those weirdo helper classes in plone.app.content - neither of which Plone uses.
> so iow, the problem probably is due to the __parent__ pointers, but since plone's not using `ILocation` it indeed only shows with dexterity. thomas was right here, except for the pointers. they were never set on the objects themselves, but kindly provided by acquisition (what else!? :)).
Aha. And indeed Acquisition wrappers have a __parent__ that is an alias for aq_parent. The actual aq_base(object) won't have a __parent__.
On 1 March 2010 07:30, Hanno Schlichting <ha...@hannosch.eu> wrote:
> On Mon, Mar 1, 2010 at 12:14 AM, Andreas Zeidler <a...@zitc.de> wrote: >> well, after some ttw testing and looking at pickles, i realized that the reason plone isn't affected is because __parent__ only gets set if the object provides `ILocation` (in `plone.folder` that is), and afaics that isn't used anywhere. it might be pulled in for some objects via `zope.app.container` or something, but at least ATCT content doesn't have it.
> Ah, awesome. So the only things affected are Dexterity and stuff based > on those weirdo helper classes in plone.app.content - neither of which > Plone uses.
True, but that still leaves the question about what to do with Dexterity. ;-)
- Remove the functionality from plone.folder?
- Leave the functionality in plone.folder and remove ILocation from Dexterity objects (seems dumb/dangerous)
- Add a __parent__ get/set property that delegates to a persistent.weakref (any acquisition problems with this?)
On 1 March 2010 01:33, Martin Aspeli <optil...@gmail.com> wrote:
> On 1 March 2010 07:30, Hanno Schlichting <ha...@hannosch.eu> wrote: >> On Mon, Mar 1, 2010 at 12:14 AM, Andreas Zeidler <a...@zitc.de> wrote: >>> well, after some ttw testing and looking at pickles, i realized that the reason plone isn't affected is because __parent__ only gets set if the object provides `ILocation` (in `plone.folder` that is), and afaics that isn't used anywhere. it might be pulled in for some objects via `zope.app.container` or something, but at least ATCT content doesn't have it.
>> Ah, awesome. So the only things affected are Dexterity and stuff based >> on those weirdo helper classes in plone.app.content - neither of which >> Plone uses.
> True, but that still leaves the question about what to do with Dexterity. ;-)
> - Remove the functionality from plone.folder?
+1. It seems silly to leave this functionality in place if it is not used.
> - Leave the functionality in plone.folder and remove ILocation from > Dexterity objects (seems dumb/dangerous)
I would remove ILocation from Dexterity objects too, as it will no longer fulfil those expectations.
> - Add a __parent__ get/set property that delegates to a > persistent.weakref (any acquisition problems with this?)