I oversee an environment that runs a Zope/ZEO cluster on some very fast HP
EVA storage (50+ disks RAID1+0 array), attached to some very fast HP blade
servers (2x dual core Opteron @3GHz with 14GB RAM).
Multiple Plone sites are served by the Zope instances running in this
cluster.
Several of the tasks that get run on the Zope instances are very
long-running and stress the storage layer considerably (e.g., load and store
every content object in the database).
Over the last few days I have had the opportunity to compare directly,
apples-to-apples, how the performance of these tasks is affected by the
CPU speed of the machines in question, as one of the machines in the
cluster has been replaced by a 2x quad core Opteron blade @2.3GHz (instead
of 3GHz). So, more CPU cores, but each one slower. We run fewer instances
than the max CPUs regardless, so there's no contention there.
Anyhow, I have been running a few of these long-running tasks, and the
results have somewhat surprised me. A certain task that consistently takes
~20 minutes to run takes ~28 minutes, just as consistently, when the *ZEO*
server is run on the quad-core machine (2.3GHz). It doesn't matter whether
the Zope instance running the job is on 3GHz or 2.3GHz; the job takes about
28 minutes regardless. If the ZEO server is moved to the 3GHz machine, the
job drops back to taking ~20 minutes like usual. Once again, it doesn't
matter whether the Zope instance running the job is on 2.3GHz or 3GHz.
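As a rough sanity check on the figures above, here is a minimal sketch that compares the observed slowdown with what clock speed alone would predict. It assumes run time scales inversely with clock speed for the CPU-bound part of the job, which is a simplification and not something measured in this thread. The function names and the `cpu_bound_fraction` parameter are illustrative only.

```python
def clock_ratio(fast_ghz, slow_ghz):
    """How many times faster the fast clock is than the slow one."""
    return fast_ghz / slow_ghz


def predicted_minutes(baseline_minutes, fast_ghz, slow_ghz,
                      cpu_bound_fraction=1.0):
    """Predicted run time on the slower clock.

    Assumption: only the CPU-bound fraction of the baseline time scales
    with clock speed; the remainder (I/O, network, waiting) is unchanged.
    """
    scaled = baseline_minutes * cpu_bound_fraction * clock_ratio(fast_ghz, slow_ghz)
    fixed = baseline_minutes * (1.0 - cpu_bound_fraction)
    return scaled + fixed


# Figures from the message: ~20 minutes at 3GHz, ~28 minutes at 2.3GHz.
observed_slowdown = 28 / 20

# Upper bound under the assumption: the whole job is bound by ZEO server CPU.
upper_bound = predicted_minutes(20, 3.0, 2.3, cpu_bound_fraction=1.0)

print("observed slowdown:     %.2fx" % observed_slowdown)
print("clock ratio:           %.2fx" % clock_ratio(3.0, 2.3))
print("predicted upper bound: %.1f minutes" % upper_bound)
```

Under that assumption, even a job bound entirely by ZEO server CPU would be predicted to take about 26 minutes, slightly less than the ~28 minutes observed. So clock speed alone may not account for the whole gap, and other differences between the two blades could be contributing.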
So my observation is simply that raw CPU speed for the ZEO server has a
direct, measurable impact on the overall performance of the cluster.
This was a little surprising to me, as I have always understood the
'received wisdom' in the Zope community to be that 'The Zope instances are
CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are
highly spec'd machines running the database on a very fast fibre-channel
SAN, but I'm still surprised to see CPU speed for object load/store turn
into a measurable bottleneck.
So I guess in terms of questions for the list:
* Is this surprising to anyone else?
* Is there anyone out there running their ZEO server on some super-fast
hardware such as IBM Power6? [2]
Best regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
[1] eg, see under Scalability:
http://plone.org/documentation/tutorial/introduction-to-the-zodb/an-introduction-to-the-zodb
[2] http://www-03.ibm.com/press/us/en/pressrelease/21580.wss
_______________________________________________
Enterprise mailing list
Enter...@lists.plone.org
http://lists.plone.org/mailman/listinfo/enterprise
On 26.03.2009 21:12 Uhr, Darryl Dixon - Winterhouse Consulting wrote:
>
> So my observation is simply that raw CPU speed for the ZEO server directly
> impacts in a measurable way on the overall performance of the cluster.
> This was a little surprising to me, as I have always understood the
> 'received wisdom' in the Zope community to be that 'The Zope instances are
> CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are
> highly spec'd machines running the database on a very fast fibre-channel
> SAN, but I'm still surprised to see CPU speed for object load/store turn
> in to a measurable bottleneck.
>
Did you encounter or have you measured IO contention?
-aj
> On 26.03.2009 21:12 Uhr, Darryl Dixon - Winterhouse Consulting wrote:
>
>>
>> So my observation is simply that raw CPU speed for the ZEO server
>> directly
>> impacts in a measurable way on the overall performance of the cluster.
>> This was a little surprising to me, as I have always understood the
>> 'received wisdom' in the Zope community to be that 'The Zope instances
>> are
>> CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are
>> highly spec'd machines running the database on a very fast fibre-channel
>> SAN, but I'm still surprised to see CPU speed for object load/store turn
>> in to a measurable bottleneck.
>>
>
> Did you encounter or have you measured IO contention?
>
The EVA is shared storage, so there is always some background IO being
performed by other users of the array, but this is very low volume, and the
overall load while these jobs run is well below the IOPS the EVA is
capable of. Actual throughput is also below the maximum
capabilities (100+ MB/sec on 2Gbit/s FC HBAs). Also, I/O wait time on the
machines in question is basically non-existent while this job is in
progress.
Interestingly, the flip-side is also true: while this job runs, the ZEO
process consistently sits at anywhere from 30% to 70% of a 3GHz core, so it
is definitely using a pretty serious amount of CPU time.
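For anyone wanting to repeat the I/O wait check, here is a minimal sketch of deriving an iowait percentage from two samples of the aggregate `cpu` line in Linux's `/proc/stat`. It assumes a Linux host; the two sample lines are made-up values for illustration, not measurements from these machines, and the helper names are hypothetical.

```python
# Field order of the aggregate "cpu" line in /proc/stat (values in jiffies).
# Guest time is left out because the kernel already counts it inside "user".
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def iowait_percent(before, after):
    """Share of elapsed CPU time spent waiting on I/O between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0
    return 100.0 * deltas["iowait"] / total


# Illustrative samples only; on a real host, read the first line of
# /proc/stat twice, a few seconds apart, while the job is running.
sample_before = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0")
sample_after = parse_cpu_line("cpu  200 0 100 1600 100 0 0 0 0 0")

print("iowait: %.1f%%" % iowait_percent(sample_before, sample_after))
```

A value that stays near zero while the job runs, alongside a busy ZEO process, would be consistent with the CPU-bound picture described above.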
regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
Absolutely agree.
> But if I've got long-running individual processes (Python scripts in my
> Zope instance that are perhaps kicked off by a cron job), then I'd rather
> have a single CPU with some serious horse power.
>
Yes, definitely.
> On a lot of the more serious multi-CPU machines these days, the hardware
> defaults aren't necessarily tweaked to defaults that lend themselves as
> well to such long-running processes. If you could disable hyperthreading,
> you should for such an application, otherwise, since your Python/Zope
> process is pegged to a single CPU at any time, since you're hyperthreaded,
> your CPUs are virtually 'doubled', but your max CPU available for a single
> process is cut in half.
>
The point is well made, but in this case it is moot: the CPUs in question
are AMD Opterons with true multi-core.
re only using 13% CPU.
>
> You're better off with faster CPU and fewer
> of them for your scenario, it seems.
>
Indeed I agree, but the interesting thing seems to be that the serious CPU
would be so useful for the ZEO server, which I found quite unexpected.
> Best of luck with your future benchmarking!
Thanks :)
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
-------- Original Message --------
Subject: RE: [Enterprise Plone] Zope / Plone speed affected by ZEO hardware
From: "Darryl Dixon - Winterhouse Consulting" <darryl...@winterhouseconsulting.com>
So what does a ZEO Server do?
- It loads objects from disk and stores them to disk.
- It performs conflict resolution if two stores occur simultaneously.
- It invalidates client storages when an object store occurs.
- Anything else?
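Conflict resolution is the item in the list above most likely to cost CPU on the server, since as far as I know the ZEO server runs the application's resolution code itself. The sketch below shows the shape of that work using ZODB's `_p_resolveConflict` hook. `HitCounter` is a hypothetical class, written as a plain object so that it runs without ZODB installed; in a real application it would subclass `persistent.Persistent`.

```python
class HitCounter(object):
    """Hypothetical counter used only to illustrate conflict resolution."""

    def __init__(self):
        self.count = 0

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        """Three-way merge of two concurrent increments.

        old_state:   state both transactions started from
        saved_state: state already committed by the other transaction
        new_state:   state this transaction is trying to commit
        Each state is a dict of instance attributes.
        """
        resolved = dict(new_state)
        saved_delta = saved_state["count"] - old_state["count"]
        new_delta = new_state["count"] - old_state["count"]
        # Apply both increments on top of the common starting point.
        resolved["count"] = old_state["count"] + saved_delta + new_delta
        return resolved


# Two writers start from count=10; one adds 2, the other adds 1.
old = {"count": 10}
saved = {"count": 12}
new = {"count": 11}

merged = HitCounter()._p_resolveConflict(old, saved, new)
print(merged["count"])
```

If the long-running job writes to objects that other clients are also updating, counting how often resolution is triggered might be one of the metrics worth gathering.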
What version of ZODB are you running?
Are you using authentication in ZEO?
Are there any metrics you can gather from the ZEO server?
- zeoserverlog: analyzing some log files with it may produce some information.
http://svn.zope.org/ZODB/trunk/src/ZEO/scripts
NOTE: You may want to re-phrase this question and ask it on zodb-dev.
It could very well be a Plone (application-level) issue, i.e. the application
thrashing the ZEO server, like a bad application beating up an RDBMS. I'm sure
zodb-dev would be interested in getting some metrics around your usage.
cheers
alan
--
Alan Runyan
Enfold Systems, Inc.
http://www.enfoldsystems.com/
phone: +1.713.942.2377x111
fax: +1.832.201.8856