[Plone-Users] Performance Issue Profiling question

1 view
Skip to first unread message

evadmin

unread,
Jan 5, 2010, 2:14:00 AM1/5/10
to plone...@lists.sourceforge.net

Hi,

I have a constant performance issue that keeps popping up in our Plone 2.5
with ZEO setup (2 servers each, with 2 clients per server): when a user logs
into the site and tries to view our custom content type (called Papers), the
very first time they try to view any paper it takes anywhere from 30-60
seconds for the page to load. After that, the page loads fine for the
duration of the users session (and all concurrent user sessions on the same
client).

After profiling the offending action (viewing a paper) - i find that the
paper-view template is actually completing in under a second. The Call
Profiler (and PageTemplate Profiler) both indicate that the hangup is
occurring after the folder's access_rule is executed and before the
paper-view template is invoked. Profiling doesn't suggest which other
actions could be getting called in this timeframe, can anyone point me in
the right direction?

+0.0023 0.0001 0.0% +- access_rule (_exec)
+0.0031 0.0001 0.0% +- access_rule (_exec) (2nd call)
54.9572 98.2% |
+54.9604 0.0206 0.0% +- paper_view (__call__)

According to the profiling, all the global_section navigation rendering &
other main_template macros are being called after this....so I'm at a loss
to guess where I should look next. Of course, it's possible that the problem
is happening somewhere else and we're just blaming this action...but it's
the only place in our website where I can consistently repeat the behavior.

--
access rule (located in site root, but in the profiling it's being called
twice in the context of the folder at root/private/papers/ ) :

request = container.REQUEST
request.set('HTTP_ACCEPT_LANGUAGE', 'en')
--


There is an apache rule setup for the folder where all the papers are stored
( /root/private/papers can actually be called from multiple sub-folders
i.e. /root/private/someGroupspace/papers ) - but I'm not sure if this could
have something to do with the problem or not.

Any thoughts would be appreciated, thanks!

AD

ps - I also tried profiling by calling up a paper using the '/getCatInfo'
script instead of using the default 'view' method, but the script (which
just calls 'portal_catalog.manage_objectInformation' ) behaves exactly the
same as the full blown view template.

http://localhost:8080/VirtualHostBase/http/www.mysite.com:80/mysite/VirtualHostRoot/private/papers/testPaper1/getCatInfo
(total: 59.80s)
Elapsed Time Spent Percentage Action
+0.0061 0.0001 0.0% +- access_rule
+0.0069 0.0001 0.0% +- access_rule (2nd call)
54.4423 91.0% |
+54.4493 5.3387 8.9% +- getCatInfo

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4253915.html
Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev
_______________________________________________
Plone-Users mailing list
Plone...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plone-users

evadmin

unread,
Jan 5, 2010, 2:41:29 PM1/5/10
to plone...@lists.sourceforge.net

put more simply:

I'm trying to profile an object view that freezes up - can anyone tell me
what code could possibly be executing between the first "access_rule"
execution and the default 'view' template being called?

+0.0023 0.0001 0.0% +- access_rule (_exec)
+0.0031 0.0001 0.0% +- access_rule (_exec) (2nd call)
54.9572 98.2% |
+54.9604 0.0206 0.0% +- paper_view (__call__)

If the hangup occurs inside this range, it seems like the problem must lie
outside of the page template, main_template sections and the object's
index/metadata values, right?


evadmin wrote:
>
> Hi,
>
> I have a constant performance issue that keeps popping up in our Plone 2.5
> with ZEO setup (2 servers each, with 2 clients per server): when a user
> logs into the site and tries to view our custom content type (called
> Papers), the very first time they try to view any paper it takes anywhere
> from 30-60 seconds for the page to load. After that, the page loads fine
> for the duration of the users session (and all concurrent user sessions on
> the same client).

> .
> .


> .
>
> http://localhost:8080/VirtualHostBase/http/www.mysite.com:80/mysite/VirtualHostRoot/private/papers/testPaper1/getCatInfo
> (total: 59.80s)
> Elapsed Time Spent Percentage Action
> +0.0061 0.0001 0.0% +- access_rule
> +0.0069 0.0001 0.0% +- access_rule (2nd call)
> 54.4423 91.0% |
> +54.4493 5.3387 8.9% +- getCatInfo
>
>

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4256943.html

ajung

unread,
Jan 5, 2010, 11:02:42 PM1/5/10
to plone...@lists.sourceforge.net

Use ZopeProfiler - it gives you much clearer results.

-aj

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4259138.html

evadmin

unread,
Jan 6, 2010, 9:43:52 AM1/6/10
to plone...@lists.sourceforge.net

Thanks Andreas, I have it installed and will test off-hours (because of the performance hit)

 

One last question  -  if I’m seeing the delays documented in the profiling results, can I safely assume the cause of the problem does not lie in the load balancing or Apache setup  that’s handling the web requests on the ‘front end’?  I was under the impression that profiling is only internal to Plone/Zope but didn’t know for sure.

 

Alfred Dellicicchi

Senior Developer

 

Office: (617) 532-6026

Main: (617) 426-5622

 

Eduventures, Inc.

101 Federal Street, 12th Floor

Boston MA, 02110

www.eduventures.com


View message @ http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4259138.html
To unsubscribe from Performance Issue Profiling question, click here.

 



View this message in context: RE: [Plone-Users] Performance Issue Profiling question

evadmin

unread,
Jan 6, 2010, 1:06:38 PM1/6/10
to plone...@lists.sourceforge.net

Thanks for the tip! I profiled the code and it seems to coincide almost exactly with this issue reported a long time ago (but I assume was addressed?)

 

https://mail.zope.org/pipermail/zodb-dev/2003-August/005651.html

 

there is a huge delay (30-60 seconds) when we load a ‘paper’ – but only when Zope is making a Request call (and python calls connection.py to get to the ZEO server)  

 

this would match up with our observations:  the clients will hang up once (for all connected users) and then work fine for a period of time (until the client cache is updated?).

 

I need to see if any changes were made to cache settings, otherwise I’ll need someone more knowledgeable with ZEO.

 

Thanks again

 

From: ajung [via Plone] [mailto:[hidden email]]
Sent: Tuesday, January 05, 2010 11:03 PM
To: Alfred Dellicicchi
Subject: Re: [Plone-Users] Performance Issue Profiling question

 

Use ZopeProfiler - it gives you much clearer results.

-aj

View message @ http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4259138.html
To unsubscribe from Performance Issue Profiling question, click here.

 

Dieter Maurer

unread,
Jan 7, 2010, 5:11:44 AM1/7/10
to evadmin, plone...@lists.sourceforge.net
evadmin wrote at 2010-1-6 10:06 -0800:
>
>Thanks for the tip! I profiled the code and it seems to coincide almost exactly with this issue reported a long time ago (but I assume was addressed?)
>
>https://mail.zope.org/pipermail/zodb-dev/2003-August/005651.html
>
>there is a huge delay (30-60 seconds) when we load a 'paper' - but only when Zope is making a Request call (and python calls connection.py to get to the ZEO server)

The referenced message indicates an average time of about 5 ms
per object load. That does not seem unreasonable.
When I once measured the ZEO timing, I got about 3 ms per object
if the object happened to be in the OS disk cache and 30 ms otherwise.
I have seen similar timings for access to relational databases.


When you observe 30-60 s to access a "paper" this may indicate
a persistency design weakness. Accessing a single object should not
result in loading thousands of objects unless either the object
is located via a search (searching may access lots of objects)
or something may be weakly designed.

Your profile should give you some hints where the excessive loads
come from. You may (or may not) be able to avoid those excessive loads.


There was once a problem report in the ZODB mailing list.
The ZEO (client) code contains a "sleep(0.001)" but usually, the "sleep"
code cannot have a smaller time granularity that the system clock
(often 10 ms on Linux). The actual sleep time therefore could be significantly
longer than 1 ms. A workaround was to reconfigure the Linux kernel
to use a 1 ms clock. I am unsure whether this ZEO weakness has been
fixed meanwhile.


--
Dieter

evadmin

unread,
Jan 7, 2010, 3:41:03 PM1/7/10
to plone...@lists.sourceforge.net

Thank you for the response Dieter - if I may inquire further:

I guess I was reading the other forum post wrong because I was looking at
the second 'percall' column.

Below is a profiling of the issue as observed on my site: times are in the
30-60 second range consistently, but not every time

ncalls tottime percall cumtime percall filename:lineno(function)
1 45.580 45.580 51.777 51.777
meeting-agenda-al-amm-2009:0(Request)
1 4.791 4.791 6.170 6.170 paper_view:0(__call__)
1 0.735 0.735 0.742 0.742 downloadNow.gif:0(Request)
1 0.692 0.692 0.700 0.700 printThis.jpg:0(Request)
3 0.240 0.080 0.240 0.080 intraParent:0(__call__)
48/24 0.235 0.005 0.235 0.010 translate:0(__call__)
2/1 0.228 0.114 0.453 0.453
getAddableTypesInMenu:0(__call__)
2 0.138 0.069 0.138 0.069 ltab:0(Request)
1 0.116 0.116 0.116 0.116 getSections:0(__call__)

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4268651.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

Dieter Maurer

unread,
Jan 8, 2010, 2:03:05 AM1/8/10
to evadmin, plone...@lists.sourceforge.net
evadmin wrote at 2010-1-7 12:41 -0800:
>Below is a profiling of the issue as observed on my site: times are in the
>30-60 second range consistently, but not every time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 1 45.580 45.580 51.777 51.777
>meeting-agenda-al-amm-2009:0(Request)
> 1 4.791 4.791 6.170 6.170 paper_view:0(__call__)
> 1 0.735 0.735 0.742 0.742 downloadNow.gif:0(Request)
> 1 0.692 0.692 0.700 0.700 printThis.jpg:0(Request)
> 3 0.240 0.080 0.240 0.080 intraParent:0(__call__)
> 48/24 0.235 0.005 0.235 0.010 translate:0(__call__)
> 2/1 0.228 0.114 0.453 0.453
>getAddableTypesInMenu:0(__call__)
> 2 0.138 0.069 0.138 0.069 ltab:0(Request)
> 1 0.116 0.116 0.116 0.116 getSections:0(__call__)

This is "ZopeProfiler"'s high level view. It shows you timings
for whole requests, script and template calls.

For your purpose (finding out where in your system an excessive
amount of ZODB loads are performed), you need the low level view.

Interesting points in the profile are the rows where the value
in the "cumtime" column drops sharply. They correspond to
calls where the execution paths split.

evadmin

unread,
Jan 8, 2010, 7:04:08 AM1/8/10
to plone...@lists.sourceforge.net

Here's the python profile.

http://n2.nabble.com/file/n4271983/python%2Bprofiling%2B-%2Bpaper%2Bcall.txt
python+profiling+-+paper+call.txt

There ARE a lot of loads (and understatement)....hmm.

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4271983.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

evadmin

unread,
Jan 8, 2010, 7:12:24 AM1/8/10
to plone...@lists.sourceforge.net

Sorry - My last post of the python profile results looks like it included
more than the 'paper' load, it also included our home page load.

i'll have to profile it again and post just the paper load in a little
while.

Thanks again for your help.

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4272021.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

evadmin

unread,
Jan 8, 2010, 11:16:13 AM1/8/10
to plone...@lists.sourceforge.net

Here's the full profile, Zope then Python level.

http://n2.nabble.com/file/n4273397/profile.txt profile.txt

If there's a flaw in design then why doesn't this problem happen every time
I view a paper? This question shows my ignorance of the persistence concept,
i guess.

I thought it might be caused by reference fields (since different papers
have references to different objects, which in turn have references) - but
if that were the case, then I'd expect that I could recreate the performance
problem on back-to back tries....which doesn't happen.

Sounds like I need to learn more about the persistence issue or pay someone
who does know.

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4273397.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

evadmin

unread,
Jan 8, 2010, 11:41:59 AM1/8/10
to plone...@lists.sourceforge.net

Hi Andreas,

Our issue has much in common with this other one you helped with previously
- but I know that might mean nothing.

1) We store all our papers in a normal folder and the library has been
steadily growing over the past 2 years.

2) we have plone 2.5 & a custom object with a bunch of reference fields

3) those ref. fields are mostly indexed (though not in the metadata)

4) we're running on ZEO (though many people do)

http://www.opensource-archive.org/archive/index.php/t-18344.html

I'll have to review the custom object's definition files again - I'll try
putting the papers in a large folder first!

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4273551.html

Dieter Maurer

unread,
Jan 11, 2010, 3:56:16 AM1/11/10
to evadmin, plone...@lists.sourceforge.net
evadmin wrote at 2010-1-8 08:41 -0800:
>Our issue has much in common with this other one you helped with previously
>- but I know that might mean nothing.
>
>1) We store all our papers in a normal folder and the library has been
>steadily growing over the past 2 years.

This might be the problem.

When you access a normal folder, then all its persistent references
are loaded. This causes loading one (potentially) huge persistant object.

In principle, the ZODB has an optimazation such that a persistant
reference can be loaded without loading the object itself.
To this end, the reference encodes the object's class.
However, at least in some ZODB versions, a low level feature has
prevented this optimization to take effect: loading a persistent
reference caused a load of the referenced object in order to determine
its class.

Switch to a "BTreeFolder2" (or the equivalent Plone type
("LargePloneFolder")) and see whether the effect goes away.

> ....


>I'll have to review the custom object's definition files again - I'll try
>putting the papers in a large folder first!

Good.

--
Dieter

Dieter Maurer

unread,
Jan 11, 2010, 4:14:21 AM1/11/10
to evadmin, plone...@lists.sourceforge.net
evadmin wrote at 2010-1-8 08:16 -0800:
>
>Here's the full profile, Zope then Python level.
>
>http://n2.nabble.com/file/n4273397/profile.txt profile.txt

The profile tells you that almost the complete time is spent
in "BaseRequest.traverse". As the time is dominated by storage
loads, this also means that almost all loads come from this
"traverse".

Usually, your application affects traversal only marginally.
This means: it is less likely that a design flaw in your "paper"
objects themselves are responsible.

I expect that the use of a normal folder for your "paper" collection
(rather than a "large folder") together with the failing ZODB optimization
is mainly to blame for the high number of loads (see previous message).

>If there's a flaw in design then why doesn't this problem happen every time
>I view a paper?

If my assumption in the former message is correct, then
the problem occurs because the ZODB does not know the class
for the content items of your (paper containing) folder.
To determine the class, it needs to load each of your papers.
Once the class is determined, the persistent reference can
be instantiated as a so called "ghost object" inside the folder object.

Only when the folder object is flushed from the ZODB cache (e.g.
after the object has been changed) the load series happens again.


If switching to a large folder reduces the number of loads significantly,
you have found a ZODB bug. As written in the previous message,
the ZODB should optimize this case -- and fails to do so due to
some low level interference. Consider to file a ZODB bug report
in this case.

>This question shows my ignorance of the persistence concept,
>i guess.

The ZODB cache is responsible for the change in behaviour.

>I thought it might be caused by reference fields (since different papers
>have references to different objects, which in turn have references) - but
>if that were the case, then I'd expect that I could recreate the performance
>problem on back-to back tries....which doesn't happen.

Almost surely, reference fields are not to blame for what you observe.

Dieter Maurer

unread,
Jan 14, 2010, 2:55:33 AM1/14/10
to evadmin, plone...@lists.sourceforge.net
Dieter Maurer wrote at 2010-1-11 10:14 +0100:
> ...

>If switching to a large folder reduces the number of loads significantly,
>you have found a ZODB bug. As written in the previous message,
>the ZODB should optimize this case -- and fails to do so due to
>some low level interference. Consider to file a ZODB bug report
>in this case.

Filing a bug report is likely not to help.

As I found out, I had already done so back
in 2006, together with a patch to solve the problem:

https://bugs.launchpad.net/zope2/+bug/143657

Unfortunately, nothing happened: patch evaluation has never finished...
The bug report has probably been forgotten...


The bug affects containers containing
"ExtensionClass.Base" instances (almost all Zope/Plone level objects
are such instances) as direct persistent children.

When such a container is unghostified (accessed for the first
time after a fresh load from the ZODB or an invalidation),
each contained "Base" instance is fetched from the ZODB
(at the cost of in the order of 3 to 30 ms per object when ZEO is used).


As a consequence, use "Large Folder"s whenever the number of content
items exceeds the order of 100.
If a standard folder contains more then 1.000 objects, there is a risk
that (some) accesses can cost 30 s or more....


--
Dieter

------------------------------------------------------------------------------
Throughout its 18-year history, RSA Conference consistently attracts the
world's best and brightest in the field, creating opportunities for Conference
attendees to learn about information security's most important issues through
interactions with peers, luminaries and emerging and established companies.
http://p.sf.net/sfu/rsaconf-dev2dev

ajung

unread,
Jan 14, 2010, 3:31:09 AM1/14/10
to plone...@lists.sourceforge.net

A Zope committer should be able to commit its own fixes and improvements to
the
Zope repository on its own. One should not wait for someone coming along the
way doing work other peoples work.

Andreas

Dieter Maurer wrote:
>
> Dieter Maurer wrote at 2010-1-11 10:14 +0100:
>> ...
>>If switching to a large folder reduces the number of loads significantly,
>>you have found a ZODB bug. As written in the previous message,
>>the ZODB should optimize this case -- and fails to do so due to
>>some low level interference. Consider to file a ZODB bug report
>>in this case.
>
> Filing a bug report is likely not to help.
>
> As I found out, I had already done so back
> in 2006, together with a patch to solve the problem:
>
> https://bugs.launchpad.net/zope2/+bug/143657
>
> Unfortunately, nothing happened: patch evaluation has never finished...
> The bug report has probably been forgotten...
>
>

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4391331.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

Yuri

unread,
Jan 14, 2010, 4:12:30 AM1/14/10
to Plone Users List
Dieter Maurer ha scritto:

> Dieter Maurer wrote at 2010-1-11 10:14 +0100:
>
>> ...
>> If switching to a large folder reduces the number of loads significantly,
>> you have found a ZODB bug. As written in the previous message,
>> the ZODB should optimize this case -- and fails to do so due to
>> some low level interference. Consider to file a ZODB bug report
>> in this case.
>>
>
> Filing a bug report is likely not to help.
>
> As I found out, I had already done so back
> in 2006, together with a patch to solve the problem:
>
> https://bugs.launchpad.net/zope2/+bug/143657
>
> Unfortunately, nothing happened: patch evaluation has never finished...
> The bug report has probably been forgotten...
>

but we have bluebream now! :P

Large folder have the problem the not list the next/prev links, don't
ask me why... you can see the link in the contents tab, but not in the
main view.

>
> The bug affects containers containing
> "ExtensionClass.Base" instances (almost all Zope/Plone level objects
> are such instances) as direct persistent children.
>
> When such a container is unghostified (accessed for the first
> time after a fresh load from the ZODB or an invalidation),
> each contained "Base" instance is fetched from the ZODB
> (at the cost of in the order of 3 to 30 ms per object when ZEO is used).
>
>
> As a consequence, use "Large Folder"s whenever the number of content
> items exceeds the order of 100.
> If a standard folder contains more then 1.000 objects, there is a risk
> that (some) accesses can cost 30 s or more....
>
>
>
>
>

ajung

unread,
Jan 14, 2010, 4:28:35 AM1/14/10
to plone...@lists.sourceforge.net

And Bluebream relates to this whole thread how?

-aj


Yuri-11 wrote:
>
>
>
> but we have bluebream now! :P
>
>

--
View this message in context: http://n2.nabble.com/Performance-Issue-Profiling-question-tp4253915p4391546.html


Sent from the General Questions mailing list archive at Nabble.com.

------------------------------------------------------------------------------

Jon Stahl

unread,
Jan 14, 2010, 8:32:39 AM1/14/10
to ajung, plone...@lists.sourceforge.net
FYI, Plone 4 unifies large and normal folders. See
http://dev.plone.org/plone/ticket/9316

:jon

Dieter Maurer

unread,
Jan 15, 2010, 5:56:27 AM1/15/10
to ajung, plone...@lists.sourceforge.net
ajung wrote at 2010-1-14 00:31 -0800:
>A Zope committer should be able to commit its own fixes and improvements to
>the
>Zope repository on its own. One should not wait for someone coming along the
>way doing work other peoples work.

Your remark can be interpreted as:

Don't waste time with bug reports.

There is only one way to help improve Zope code: change the code
yourself.

This would miss improvement potential ;-)


By the way, back in 2006 I have not been a Zope committer.

And I am no longer a Zope committer....
After you, claiming to speak for an ever increasing number of Zope
committers, threadened to kick me off the Zope mailing lists,
I no longer want to have business with Zope committers.

>Dieter Maurer wrote:
>>
>> Dieter Maurer wrote at 2010-1-11 10:14 +0100:
>>> ...
>>>If switching to a large folder reduces the number of loads significantly,
>>>you have found a ZODB bug. As written in the previous message,
>>>the ZODB should optimize this case -- and fails to do so due to
>>>some low level interference. Consider to file a ZODB bug report
>>>in this case.
>>
>> Filing a bug report is likely not to help.
>>
>> As I found out, I had already done so back
>> in 2006, together with a patch to solve the problem:
>>
>> https://bugs.launchpad.net/zope2/+bug/143657
>>
>> Unfortunately, nothing happened: patch evaluation has never finished...
>> The bug report has probably been forgotten...

--
Dieter

Reply all
Reply to author
Forward
0 new messages