large number of pages


windandwaves

Jan 11, 2009, 3:25:42 PM
to SilverStripe Development
Hi All

For a personal project (promoting sustainability issues) I am building
a website with 10,000+ pages, and it isn't working. I could just work
with plain DataObjects, but using pages has several advantages: SEO,
hierarchy, the ability to "browse" pages without having to build anything,
etc. If I were to use DataObjects then I would essentially be
re-creating the functionality of SiteTree, which seems silly.

So, I increased the PHP memory limit to 160M, but I still can't open the
CMS after creating the pages.

I have looked at several posts on this matter and I am looking for
further ideas. I have implemented http://open.silverstripe.com/ticket/826
without much noticeable improvement.

In my five cents' worth, I think this will be one of the most important
challenges for the SilverStripe CMS, and also a way to distinguish
itself from other CMSes: speed and agility.

Any thoughts? Could we write a function that checks the number of
pages in the CMS and takes some shortcut if there are over X pages?

Thank you

Nicolaas

Sigurd Magnusson

Jan 11, 2009, 4:21:03 PM
to silverst...@googlegroups.com
a) I've noticed that you can access the CMS with lower memory
requirements by autoloading a page, e.g. site.com/admin/show/1
b) In my experience you will quite quickly find the appropriate memory
limit, e.g. 256 MB
c) Agree, this memory issue is something we should resolve. The CMS is
supposed to only load pages as you browse to them, so I think the
solution is to find where there is a bug that is trying to use/load
all pages in an unnecessary memory wasting manner.

Sig

Nicolaas Thiemen Francken - Sunny Side Up

Jan 11, 2009, 4:39:08 PM
to silverst...@googlegroups.com

Thank you for the reply, Siggy. Here is an update: I now have 1900 pages and a memory limit of 400M, and I still cannot load the CMS. However, I am not getting an "out of memory" message, so perhaps it is something else (I have increased the response-time parameters to 120 seconds). The frustrating thing is that we are not dealing with a lot of data here. Using /admin/show/1, it seems that I got close, but still did not make it.

The complete data set of all pages on the site is less than one megabyte (less than 250 KB for SiteTree). This shows that SS bloats the data a lot, and that we should perhaps look for a solution that reduces this bloat in general rather than something CMS-specific, as more will be gained that way.

Thanks again

Nicolaas


 






--
Nicolaas Thiemen Francken
 Director - Sunny Side Up Ltd  
 skype: nicolaasthiemen
 within NZ phone 0800 771 777
 overseas call +64 274 771 777
 n...@sunnysideup.co.nz
 http://www.sunnysideup.co.nz
 - client login: http://www.rakau.com/
 - new quotes: http://www.sunnysideup.co.nz/services
 - please support: http://www.localorganics.net/
 - newsletter: http://www.sunnysideup.co.nz/contact



Ingo Schommer

Jan 11, 2009, 4:43:21 PM
to silverst...@googlegroups.com
I think the first challenge is identifying the bottlenecks - the big one
probably being CMS tree generation, but there might be others like link
checkers, sitetree.xml, publishall etc.
How about we use the keyword "scaling" in open.silverstripe.com to
logically group tickets related to this?

Lazy loading, which is planned for 2.4, will help here
(http://open.silverstripe.com/ticket/2986).
This will also coincide with a class-based identity map, which will make
sure a specific DataObject exists only once in memory.

I bet there's some freeing of memory (mostly destruction of unused objects)
which we can do throughout the code, so that's an area where everybody's
free to chime in with patches.

As Sig said, loading the tree through ajax on demand will greatly reduce
the number of SiteTree objects that need to be queried. That's of course
dependent on your hierarchy; it doesn't work so well for lots of "flat"
structures.
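A minimal sketch of the on-demand idea, not the actual CMS code: it assumes a simplified SiteTree table with only ID, ParentID and Title columns, and a hypothetical `load_children` helper that fetches just the direct children of the node being expanded.

```python
import sqlite3

# Simplified stand-in for the SiteTree table (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SiteTree (ID INTEGER PRIMARY KEY, ParentID INTEGER, Title TEXT)"
)
rows = [(1, 0, "Home"), (2, 0, "Topics")]
rows += [(100 + i, 2, f"Topic {i}") for i in range(50)]
conn.executemany("INSERT INTO SiteTree VALUES (?, ?, ?)", rows)


def load_children(conn, parent_id):
    """Fetch only the direct children of one node (hypothetical helper)."""
    cursor = conn.execute(
        "SELECT ID, Title FROM SiteTree WHERE ParentID = ? ORDER BY ID",
        (parent_id,),
    )
    return cursor.fetchall()


# The initial tree view only needs the top level: 2 rows instead of 52.
top_level = load_children(conn, 0)

# The 50 topic pages are fetched only when "Topics" is expanded.
topics = load_children(conn, 2)
```

With a flat structure, where every page has ParentID 0, the first call would still return every row, which is the limitation mentioned above.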

I've collected a couple of scaling-related tickets here:
http://open.silverstripe.com/query?status=assigned&status=new&status=reopened&group=status&order=priority&col=id&col=summary&col=owner&col=type&col=priority&col=component&col=version&keywords=%7Escaling


-------
Ingo Schommer | Senior Developer
SilverStripe
http://silverstripe.com

Phone: +64 4 978 7330 ext 42
Skype: chillu23

Hamish Campbell

Jan 11, 2009, 7:44:17 PM
to SilverStripe Development
Aside from the profiler, a benchmarking system might be useful too.


Sam Minnee

Jan 11, 2009, 9:11:56 PM
to SilverStripe Development
Hi everyone,

I've had a quick look at this and it looks like permission checks are
a major culprit here. I'm going to add some in-request caching, so
that the permissions are only queried once for a page view.
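A rough sketch of what in-request caching could look like, assuming a hypothetical `fetch_codes` database lookup; this illustrates the general technique and is not what the changesets do. The cache object is created per request and thrown away afterwards, so permissions are never stale across requests.

```python
class RequestPermissions:
    """Per-request cache: each member's permission codes are fetched once."""

    def __init__(self, fetch_codes):
        self._fetch_codes = fetch_codes  # hypothetical database lookup
        self._codes_by_member = {}

    def check(self, member_id, code):
        # Query the database only the first time a member is seen.
        if member_id not in self._codes_by_member:
            self._codes_by_member[member_id] = frozenset(
                self._fetch_codes(member_id)
            )
        return code in self._codes_by_member[member_id]


lookups = []


def fetch_codes_from_db(member_id):
    # Stand-in for the real permission query; records each call.
    lookups.append(member_id)
    return ["CMS_ACCESS_CMSMain"] if member_id == 1 else []


permissions = RequestPermissions(fetch_codes_from_db)  # one per request

# Checking 643 pages for the same member triggers a single lookup.
visible = [
    page_id
    for page_id in range(643)
    if permissions.check(1, "CMS_ACCESS_CMSMain")
]
```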

Sam Minnee

Jan 11, 2009, 10:43:00 PM
to SilverStripe Development
Alright, on my test install with 643 pages, I got the number of
queries down from 15,312 to 244.

See the individual commit logs for exactly what I did:

http://open.silverstripe.com/changeset/69977
http://open.silverstripe.com/changeset/69981
http://open.silverstripe.com/changeset/69982
http://open.silverstripe.com/changeset/69983
http://open.silverstripe.com/changeset/69986

There's more that can be done - 244 queries is still a lot - but I
think this will take it away from "crisis point" and we can leave
further optimisations for 2.3.1 or 2.4.0

It raises the question of how this happened. We could probably stand
to include the total number of queries in our test execution, to look
for big changes.
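One way such a check might look, as a sketch only: the `compare_query_count` helper and the 1.5x threshold are assumptions for illustration, and the baseline would have to be recorded from an earlier test run.

```python
def compare_query_count(current, baseline, max_ratio=1.5):
    """Flag a run whose query count grew by more than max_ratio."""
    if baseline <= 0:
        # Nothing to compare against yet; just report the current count.
        return True, f"no usable baseline; recorded {current} queries"
    ratio = current / baseline
    ok = ratio <= max_ratio
    verdict = "ok" if ok else "regression"
    return ok, f"{verdict}: {current} queries vs baseline {baseline} ({ratio:.1f}x)"


# Small drift passes; a jump like the one in this thread is flagged.
ok_small, msg_small = compare_query_count(250, 244)
ok_big, msg_big = compare_query_count(15312, 244)
```

A ratio rather than a fixed limit keeps the check meaningful as the expected number of queries changes over time.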

Ingo Schommer

Jan 11, 2009, 10:58:25 PM
to SilverStripe Development
Yeah, mea culpa, I added the permission checks. Database permissions
always add overhead: even without complicated ACLs you at least double
the number of (unoptimized) data queries. I didn't want to optimize this
prematurely, especially because stale permissions can cause security
issues. But Sam's right, in a use case queried as often as SiteTree
permissions, optimization is critical.

> We could probably stand
> to include the total number of queries in our test execution, to look
> for big changes.
The problem is that the query number is only meaningful relative to the
last test run, and is not a binary pass/fail scenario, unless we want to
add a hard limit to the test. I think some form of built-in debug view
that shows optional performance analysis for a specific call (including
the number of performed queries, ideally with the top 10 queries) would
work better. Have a look at the Symfony Debug Toolbar for inspiration:
http://www.symfony-project.org/book/1_2/16-Application-Management-Tools#Web%20Debug%20Toolbar
This implies manual testing, but would make this information a lot
more accessible in order to prevent these mistakes in development
rather than in review.

BTW, a nice post about framework performance testing over at the
Symfony project: http://www.symfony-project.org/blog/2007/06/11/is-symfony-too-slow-for-real-world-usage


Sigurd Magnusson

Jan 11, 2009, 11:02:04 PM
to silverst...@googlegroups.com
Sam - cool! - Nicolaas - can you let us know if it gets your CMS
loading in less than a few hundred megs of RAM! :)

Sig

Nicolaas Thiemen Francken - Sunny Side Up

Jan 11, 2009, 11:06:30 PM
to silverst...@googlegroups.com



It was loaded before the LCD light rays reached my retina.
 





Michael Gall

Jan 11, 2009, 11:06:33 PM
to silverst...@googlegroups.com
I had a look at this ages ago when I did that manifest builder rebuild. I experimented with the "no cache" DataObjectSets, which reduced memory usage, but there were a few memory leaks that I couldn't track down.

As far as permissions go, you could easily serialize all the ACLs/Groups and cache them in the filesystem or the db, and invalidate the cache when there is a Group/permission change.
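A small sketch of that suggestion, under stated assumptions: the `PermissionFileCache` class, the JSON file format and the `build_from_database` stand-in are all hypothetical, and the permission codes are example values only.

```python
import json
import os
import tempfile


class PermissionFileCache:
    """Caches a group -> permission-codes mapping as JSON on disk."""

    def __init__(self, path, build):
        self.path = path
        self.build = build  # hypothetical callable that queries the database

    def load(self):
        # Serve the cached copy when present; otherwise rebuild and store it.
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as handle:
                return json.load(handle)
        data = self.build()
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        return data

    def invalidate(self):
        # Call this whenever a Group or Permission record changes.
        if os.path.exists(self.path):
            os.remove(self.path)


build_calls = []


def build_from_database():
    # Stand-in for the real (expensive) permission queries.
    build_calls.append(1)
    return {"editors": ["CMS_ACCESS_CMSMain"], "admins": ["ADMIN"]}


with tempfile.TemporaryDirectory() as tmp:
    cache = PermissionFileCache(
        os.path.join(tmp, "permissions.json"), build_from_database
    )
    first = cache.load()   # builds the mapping and writes the file
    second = cache.load()  # read back from disk, no rebuild
    cache.invalidate()     # e.g. after a group was edited
    third = cache.load()   # rebuilt because the cache was invalidated
```

The invalidation step matters because, as noted earlier in the thread, stale permissions can become a security issue.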


Cheers,

Michael
--
Checkout my new website: http://myachinghead.net
http://wakeless.net

Mark Rickerby

Jan 11, 2009, 11:10:36 PM
to silverst...@googlegroups.com
Hey guys,

With regard to the testing / monitoring of SQL queries, I think the
most important thing to look for is not so much the exact number of
queries, or expected totals, but *changes* between commits.

For example, suppose a page view generally uses 100 queries, and someone
checks in some new code and that number shoots up to 1000 (e.g. they
added a foreach loop that contains a lurking N+1 query). The issue is
less obvious from looking at the application code, but glaringly
blatant at the SQL level.

You wouldn't know about such N+1 issues otherwise, because the
code/behavior itself would be working fine.
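To make the N+1 pattern concrete, here is a small self-contained sketch. The Page and PagePermission tables and the `CountingDB` wrapper are made up for illustration; both functions return the same titles, but the loop version issues one extra query per page.

```python
import sqlite3


class CountingDB:
    """Thin wrapper that counts every query it runs."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def make_connection(pages=100):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Page (ID INTEGER PRIMARY KEY, Title TEXT)")
    conn.execute("CREATE TABLE PagePermission (PageID INTEGER, CanView INTEGER)")
    ids = range(1, pages + 1)
    conn.executemany("INSERT INTO Page VALUES (?, ?)", [(i, f"Page {i}") for i in ids])
    conn.executemany("INSERT INTO PagePermission VALUES (?, ?)", [(i, i % 2) for i in ids])
    return conn


def viewable_titles_n_plus_one(db):
    # One query for the list, then one more query per page: N+1 in total.
    titles = []
    for page_id, title in db.query("SELECT ID, Title FROM Page ORDER BY ID"):
        rows = db.query(
            "SELECT CanView FROM PagePermission WHERE PageID = ?", (page_id,)
        )
        if rows and rows[0][0]:
            titles.append(title)
    return titles


def viewable_titles_joined(db):
    # The same result from a single joined query.
    rows = db.query(
        "SELECT p.Title FROM Page p "
        "JOIN PagePermission pp ON pp.PageID = p.ID "
        "WHERE pp.CanView = 1 ORDER BY p.ID"
    )
    return [row[0] for row in rows]


slow_db = CountingDB(make_connection())
fast_db = CountingDB(make_connection())
slow_titles = viewable_titles_n_plus_one(slow_db)
fast_titles = viewable_titles_joined(fast_db)
```

Comparing `slow_db.count` with `fast_db.count` shows the difference at the SQL level even though the returned titles are identical.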

Sigurd Magnusson

Jan 11, 2009, 11:19:47 PM
to silverst...@googlegroups.com
And I would suggest memory consumption is similar, e.g. if something
goes from 50 MB of usage to 500 MB. You might have a loop running a
function without hitting the database, and again the behaviour might
still "run fine" :)

Nicolaas Thiemen Francken - Sunny Side Up

Jan 11, 2009, 11:27:55 PM
to silverst...@googlegroups.com
To analyse the problem I took all queries (?showqueries=1) and split them by SELECT, FROM, WHERE, GROUP BY, ORDER BY, and LIMIT. Once you have a complete array of all of these, you can do counts grouped by, for example, SELECT and FROM. In doing so, you can easily identify tables that get queried more than once for different information, or other repetitive stuff. Perhaps we can include a small summary at the end of running a page with the showqueries GET parameter.

In terms of a test, you could test for the counts described above. A count of more than one for any query that has the same SELECT, FROM, GROUP BY, and ORDER BY clauses is worth investigating for redundancies. From my experience working with small hosted platforms, it is the MySQL server and the number of queries that most often cause the server to go down.
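The counting described above could be sketched roughly as follows. This assumes the logged queries are available as plain SQL strings (the actual showqueries output format is not shown here), and the regex split is deliberately naive: it does not handle subqueries or keywords inside quoted strings.

```python
import re
from collections import Counter

# Clause keywords to split on; the capturing group keeps them in the result.
_CLAUSE_SPLIT = re.compile(
    r"\b(SELECT|FROM|WHERE|GROUP BY|ORDER BY|LIMIT)\b", re.IGNORECASE
)


def split_clauses(sql):
    """Split one query into a {keyword: clause body} dictionary."""
    normalized = re.sub(r"\s+", " ", sql.strip())
    parts = _CLAUSE_SPLIT.split(normalized)
    # parts looks like [prefix, keyword, body, keyword, body, ...]
    clauses = {}
    for keyword, body in zip(parts[1::2], parts[2::2]):
        clauses.setdefault(keyword.upper(), body.strip())
    return clauses


def count_by(queries, keys=("SELECT", "FROM")):
    """Count queries that share the same values for the given clauses."""
    counter = Counter()
    for sql in queries:
        clauses = split_clauses(sql)
        counter[tuple(clauses.get(key, "") for key in keys)] += 1
    return counter


queries = [
    "SELECT ID, Title FROM SiteTree WHERE ID = 1",
    "SELECT ID, Title FROM SiteTree WHERE ID = 2",
    "SELECT ID, Title FROM SiteTree WHERE ID = 3",
    "SELECT Code FROM Permission WHERE GroupID = 7",
]

# Any SELECT/FROM combination seen more than once is worth a closer look.
repeated = {key: n for key, n in count_by(queries).items() if n > 1}
```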

Nicolaas


Keri Henare

Jan 11, 2009, 11:27:51 PM
to silverst...@googlegroups.com
> I think some form of built-in debug view
> that shows optional performance analysis for a specific call (one of
> which is number of performed queries, ideally with the top 10
> queries). Have a look at the Symfony Debug Toolbar for inspiration:
> http://www.symfony-project.org/book/1_2/16-Application-Management-Tools#Web%20Debug%20Toolbar

As a Symfony developer, I find the Web Debug Toolbar really helpful.
Something similar would be great for Silverstripe.
Something similar was created for Django: http://rob.cogit8.org/blog/2008/Sep/19/introducing-django-debug-toolbar/

---------------------------------------------------
Keri Henare

[e] ke...@henare.co.nz
[m] 021 874 552
[w] www.kerihenare.com