Umbraco For Large Data Set Sites (in Practice) - Lessons Learned, Solutions Floated, Questions Asked


richard...@gmail.com

Aug 1, 2012, 3:22:31 PM
to umbra...@googlegroups.com
Hello All,

First, let me preemptively apologize if I am treading over already well-worn ground. I joined this list early in its history, but was immediately distracted by other concerns and am still very much playing catch-up. So far I have seen some speculation about performance concerns for large sites, but I have not seen anyone share true hands-on "use case" experience, so I thought I would share ours. One of my other developers will likely weigh in on the discussion thread (assuming we get a discussion going), so we should be able to answer any questions or concerns promptly.


Background (if you want to skip a slightly long-winded explanation of our specific experience, feel free to search for the term Lessons Learned to skip this section):

We built NRAPVF.org in the early summer of 2010. It was our first major foray into Umbraco development, so plenty of mistakes were made, but we ended up with a site that both appeared and functioned beautifully. In total we had a very manageable amount of content, numbering in the low thousands of nodes. Much of the total size of the site was based on "Election Data", and at launch archive election data was held outside of Umbraco, with the plan to import it back into Umbraco at a later date. Based on the success of this site, in 2011 we began to transition the much larger website nraila.org to Umbraco. These sister sites share many features, so the decision was made to host them in the same Umbraco instance with multiple domains. NRA-ILA has news, fact sheets, and article content going back to the mid 1990s. Knowing that we would end up with thousands of nodes, we extensively utilized Examine for rendering lists of content, and site performance prior to launch was fantastic. At launch time we had somewhere in the neighborhood of 100,000 nodes of content.

The first ILA launch attempt was on January 17th of 2012. Failure of epic proportions. Dogs and cats living together, mass hysteria. Within minutes of switching the DNS, the server's CPU load spiked, the request queue started flooding with unsatisfied requests, and the site quickly started throwing 500 errors. We immediately started looking into how we built ILA and looking for ways we could make it more efficient, assuming that our construction of the site was to blame. We moved the site out onto an Amazon instance where we would have a bit more control, and set up a contract with a load testing company to start creating massive load tests to find the failure point. Assuming that our extensive use of Examine searches might be the cause of the CPU spikes, we implemented output caching, ran a 5,000 user, 5 hour long load test, and passed with flying colors. Around January 21st, 2012, we confidently tried to launch the site once again. Though it took a little longer to fail, the result was nearly identical. At this point we contacted Umbraco support to get a little expert assistance on our issue. I started learning more about memory dumps than I ever wanted to know. Casey Neehouse started looking through our macros to see which of our massive mistakes was causing the issue.

Long story short, we found our problem and were shocked to discover it wasn't something we did directly. Being a website that had been running for many years, we made sure the most popular links floating around the web resolved properly by using content aliases. One of these popular links was the NRA-ILA RSS feed. After adding 404 pages to our load test we were able to crash the server almost instantly with only a few dozen users. Commenting out the 404 handlers appeared to have zero effect. At our request, Casey looked into the 404 handler and found the following two things:

1) The reason commenting out the 404 handlers didn't affect the outcome of our tests is that the parser was set up in such a way that it still considered the commented block a valid XML item. Only by deleting the handlers did they stop executing.

2) The page alias 404 handler did not one but two wild-card XSLT searches through the entire content tree. With our enormous data set, this process took multiple seconds of processing time to satisfy every single request that made it through, whether a true 404 or just an alias. Coupled with the thousands of RSS aggregators hitting our RSS feed alias, our site was doomed to failure from the start. The silver lining is that we discovered a critical failure immediately, instead of it hiding until a later date when the client chose to delete a page and cause a small flood of 404 traffic.

Armed with this knowledge we implemented an Examine-index-based page alias handler (with caching), and re-launched the site with no issues. We weren't using the alt-template 404 handler or the user 404 handler, so we simply removed those and never implemented an Examine-based version of either.

With election season approaching we have since imported all archive election data into Umbraco, as well as continuing to add relevant content; our total node count at this point is 176,883. Traffic has increased, and recently server stability has been inconsistent. The poor server the site is running on is unfairly overburdened and we will soon be moving to a much more capable box, and I expect the move will fix any stability issues we have encountered. We are also planning to move the site to a load-balanced solution, to better handle both content publishing and front-end traffic spikes. The upcoming move to a new bank of servers and the switch to load balancing, as well as the 10:30pm phone calls from the hosting provider asking for permission to restart the app pool because the box is locking up, have prompted a fresh round of site performance sleuthing. Please understand that I am not asking anyone to solve anything for me; I'm just trying to give as much background information as possible.


Lessons Learned:

1) Assuming that, because hundreds of sites and thousands of developers use Umbraco, site failures must be entirely contained within our macros and general site setup was seriously wrong thinking on our part. We would likely have found our initial launch issue much faster had we considered internal Umbraco inefficiencies a legitimate possibility. We still use and continue to love Umbraco, so please don't consider this a shot across the bow. Umbraco is a terrific piece of software, but assuming platform infallibility was really stupid on my part.

2) Umbraco simply wasn't designed for massive sites. That isn't necessarily a platform failing; it just wasn't the primary goal of the platform. Also, doing any sort of automated testing against massive amounts of content is difficult. Algorithms that are awesome for a site with 1,000 nodes might be crippling for a site with 100,000 nodes if their Big O is O(n^2), like the page alias 404 handler's was. I know there has certainly been some internal recognition of this, with the development of Examine to begin with. Anyone who has built a site using the old-school XSLTSearch macro can see this in practice. Should that same efficiency principle be applied more to some of the internal processes?

3) Because of 2, parts of Umbraco that depend on searching the content tree can be very slow with a large number of nodes. Uncached requestHandler requests take somewhere in the neighborhood of three tenths of a second. Page aliases take multiple seconds.

4) The back office becomes very brittle with a very large number of nodes. Recursive publishing of a piece of content that has a large number of children is likely to bring everything grinding to a halt. Publishing takes between 30 seconds and 2 minutes. Sorting is touch and go. Oddly enough saving is typically very fast. I haven't dug deep enough into publishing to see exactly what part of the process eats the most time, but suffice it to say that publishing is a drag, and undisciplined actions in the back office can definitely impact site stability.


Solutions Floated:

First of all, I can't promise that I can make our dataset directly available for anyone to use for testing. We obviously don't own the content, the client does, and as politically divisive as this particular client is, they probably won't be keen on us giving it out. However, if there is enough interest I am certainly willing to either run it up the flagpole or write something that would keep the content length but "greek" any of the text nodes. I'll wait to see what the prevailing wind is before moving forward with this.

1) Examine-based 404 handlers (at least for ones that would otherwise crawl the content tree) are orders of magnitude faster. We are talking O(n^2) versus effectively O(1) lookups against Lucene's index. However, you sacrifice immediate availability because the index operates asynchronously. We know that this works fantastically well. It is roughly 0.04 seconds slower in total page render time to render from the Examine-based alias versus a cached request.

Example:
Total render time for Examine alias of a page: 0.822 s
Total render time for uncached request directly to same page: 1.279 s
Total render time for cached request to same page: 0.804 s
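The tradeoff in 1) can be sketched in a few lines. This is an illustration of the idea, not Umbraco code; the node data and alias names are made up, and a plain dict stands in for the Lucene/Examine index.

```python
# Illustrative sketch: why an index-backed alias lookup beats a
# wildcard scan of the whole content tree on every request.

# A wildcard XSLT-style search touches every node per request: O(n).
def find_by_alias_scan(nodes, alias):
    for node in nodes:
        if alias in node["aliases"]:
            return node["id"]
    return None

# An index (Lucene/Examine; here just a dict) resolves in ~O(1),
# at the cost of keeping the index up to date asynchronously.
def build_alias_index(nodes):
    return {alias: node["id"] for node in nodes for alias in node["aliases"]}

nodes = [
    {"id": 1001, "aliases": ["/old-rss-feed", "/rss.aspx"]},
    {"id": 1002, "aliases": ["/about-us"]},
]
index = build_alias_index(nodes)

assert find_by_alias_scan(nodes, "/old-rss-feed") == 1001
assert index.get("/old-rss-feed") == 1001
assert index.get("/no-such-page") is None  # a true 404 falls through
```

The scan and the index return the same answers; the difference is that the scan's cost grows with the content tree while the index lookup stays flat, which is exactly what matters at 176,883 nodes.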

2) I have read Aaron Powell's "Improving Routing" thread with great interest. I will confess I haven't tried Stephen's newly available Routing-Table based requestHandler yet, but I hope to give it a look sometime this week. After our success with our Examine based Alias handler I have started sketching out a routeHandler that functions in a similar fashion. Loose explanation of what I am looking into currently to follow:

a) Take alias handling out of the 404 handler and roll it into the initial request, so aliases and real URLs are handled in the same initial Examine search. The index includes page id, alias URLs, urlName, and path ids.
b) Check the cache for the path; if it's in the cache, return the ID to the requestHandler and proceed.
c) Query the index with the end of the request (just the urlName part), OR'd together with the full path to hit aliases as well.
d) If an alias hits, return the ID to the requestHandler and proceed.
e) If a page name hits, for each page name hit OR together weighted path IDs to construct the path, and check it against the request. If the path fits, return the ID to the requestHandler and proceed. This entire step could be removed by having the full friendly path in the index, but we would need to be sure to update the index for all children when a parent node's page name is changed. We will likely try both ways and see what the publishing impact is. I lean towards sacrificing a bit on the publish side to gain on the request side, if nothing else based on code execution volume alone.
f) If neither hits, fall back to the current XSLT search to handle pages that are newly created or updated and haven't hit the index yet.
g) Cache the result.

Best case we have a true O(1) for a page request, and at worst case (true 404 or unindexed page) we have the same cost as currently constructed plus the reasonable hit of the new O(1) algorithm.
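The steps above can be sketched as a single lookup function. Everything here is a hypothetical stand-in (the cache, indexes, and fallback are not real Umbraco APIs), just to make the control flow in a) through g) concrete:

```python
# Hypothetical sketch of the lookup flow described above (steps a-g).
cache = {}            # b/g) path -> node id
alias_index = {}      # a) alias url -> node id
name_index = {}       # a) urlName -> list of (node id, full friendly path)

def xslt_fallback(path):
    # f) stand-in for the existing XSLT tree search (the slow path)
    return None

def resolve(path):
    # b) check the cache first
    if path in cache:
        return cache[path]
    node_id = None
    # c/d) try the alias index with the full path
    if path in alias_index:
        node_id = alias_index[path]
    else:
        # c/e) try the last url segment, then verify the full path
        last = path.rstrip("/").rsplit("/", 1)[-1]
        for candidate_id, full_path in name_index.get(last, []):
            if full_path == path:
                node_id = candidate_id
                break
    # f) fall back to the tree search for not-yet-indexed pages
    if node_id is None:
        node_id = xslt_fallback(path)
    # g) cache the hit
    if node_id is not None:
        cache[path] = node_id
    return node_id

alias_index["/feed"] = 2001
name_index["contact"] = [(2002, "/about/contact")]

assert resolve("/feed") == 2001
assert resolve("/about/contact") == 2002
assert resolve("/about/contact") == 2002   # second call served from cache
assert resolve("/missing") is None          # true 404: all lookups miss
```

A true 404 is the only case that pays the full fallback cost, which matches the best-case/worst-case analysis above.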

Feel free to point out why this concept is flawed and fundamentally stupid. Would doing a Lucene search for every request be prohibitively resource intensive, even if the typical result is faster? Would it be worse than the XSLT search of a massive tree is right now?

3) INSERT PUBLISHING SPEED MODIFICATIONS HERE. Okay, let's all pretend the previous sentence was a really great idea. I haven't dug deeply enough into the publishing structure to see where I might be able to shed some time. Even if time can't be gained, could the process be made async so that content creators aren't forced to wait before moving on? Could an async process be triggered in such a way that it doesn't consume a request thread and doesn't block requests to the site until it finishes, like it does currently? If this thread has legs I promise to come back with some publishing thoughts when I have some that aren't pure conjecture.
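The "don't make the editor wait" shape of that idea is just a producer/consumer queue. A minimal sketch, with an in-process queue and a made-up stand-in for the expensive publish work (in practice this would live off the request thread entirely):

```python
# Sketch: queue the expensive publish work and return to the editor
# immediately. All names here are illustrative, not Umbraco APIs.
import queue
import threading

publish_queue = queue.Queue()
published = []

def worker():
    while True:
        node_id = publish_queue.get()
        if node_id is None:          # sentinel to shut the worker down
            break
        # stand-in for the slow part: rebuilding the XML cache,
        # notifying load-balanced nodes, firing events, etc.
        published.append(node_id)
        publish_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def publish(node_id):
    publish_queue.put(node_id)       # returns immediately; editor moves on

for nid in (1, 2, 3):
    publish(nid)
publish_queue.join()                 # the test waits; a real editor wouldn't

assert published == [1, 2, 3]
```

The open question from the thread remains the hard part: the work still has to run somewhere without blocking request threads, and event ordering for dependent publishes has to be preserved (a single worker gives that for free here).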


Random Questions Section:

Perhaps I should throw this in a different thread, but I'll stick it here just in case someone actually made it this far. Does anyone know why Umbraco loves to use static hash tables to handle many of its cache items rather than using the actual application cache? We have to lock all of those to be thread safe, and if memory gets tight the application doesn't have the option to evict them to create space. Is it purely performance based? Coding preference? I'm sure there is a good reason; I am just curious what that reason is.
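To make the locking concern in the question concrete, here is a sketch of what a locked static map gives you. This is an illustration of the pattern being questioned, not of Umbraco's actual internals; the key name is made up:

```python
# A static dictionary used as a cache needs explicit locking to be
# thread safe, and nothing can evict its entries under memory pressure.
# A runtime/application cache would handle both concerns for you.
import threading

class StaticMapCache:
    """Roughly what a locked static hash table gives you."""
    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}

    def get_or_add(self, key, factory):
        with self._lock:    # every reader and writer serializes here
            if key not in self._items:
                self._items[key] = factory()
            return self._items[key]

cache = StaticMapCache()
value = cache.get_or_add("macroXml", lambda: "<macro/>")
assert value == "<macro/>"
# the first value sticks; the second factory is never invoked
assert cache.get_or_add("macroXml", lambda: "other") == "<macro/>"
```

Under load, that single lock is contended by every request that touches the cache, which is one plausible reason the question matters for a site this size.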


I realize there are multiple topics covered here, and that some of this might be better off discussed in individual threads, but I wanted to start here and branch out when a discussion requires it, as well as give a historical overview of our specific large-dataset experiences.

Laurence Gillian

May 21, 2013, 9:32:29 AM
to umbra...@googlegroups.com
This was an interesting read. Thank you. Lau

Stephen Gay

May 22, 2013, 2:50:28 AM
to umbra...@googlegroups.com


On Wednesday, August 1, 2012 9:22:31 PM UTC+2, Tom Richardson wrote:
2) I have read Aaron Powell's "Improving Routing" thread with great interest.  I will confess I haven't tried Stephen's newly available Routing-Table based requestHandler yet, but I hope to give it a look sometime this week.  After our success with our Examine based Alias handler I have started sketching out a routeHandler that functions in a similar fashion.  Loose explanation of what I am looking into currently to follow:

Have you been able to do further testing with newer versions of Umbraco?
I'd be happy to assist with testing 6.1, should you want to.
It _should_ either be more efficient "out of the box" or provide more ways to extend and tweak it.
Stephan

sniffdk

May 28, 2013, 9:02:29 AM
to umbra...@googlegroups.com
Yes, very interesting read, thanks for sharing !

Damiaan Peeters

May 29, 2013, 2:29:41 PM
to umbra...@googlegroups.com
Thanks for sharing your experiences.  Can you share your current version number of umbraco?

I was once told that, if you are running very big umbraco sites, you should consider replacing the umbraco.config with an in-memory copy.  Have you walked this road?  No experience on this one.

Kind regards
Damiaan



Eran Meir

May 30, 2013, 7:27:23 AM
to umbra...@googlegroups.com
Someone should write a best practices article for running big sites on Umbraco.

Shannon Deminick

Jun 3, 2013, 8:25:41 PM
to umbra...@googlegroups.com
Thanks a ton for the feedback! As you said, Umbraco may not originally have been built with running extremely large sites in mind. The Umbraco codebase has also been around for quite some time, and there is a ton of legacy code in there which we are slowly upgrading and improving. We need to maintain as much backwards compatibility as possible, which is why this process is much more difficult than you might expect. As Stephen mentioned, we have worked hard on rewriting the entire request pipeline (4.10+), which will now be very easy to extend in 6.1, and we've also abstracted the entire caching layer, which we'll expose publicly soon. This means you could simply swap out the caching provider to use Examine instead of an XML file if you wanted. We'd love to release all of this now, but in reality these things take some time and testing :)

A note on Examine though: it will not natively support load-balanced environments, simply because that is the nature of Lucene. If you want to make Examine work properly in LB environments you may have to set up a specialized way to do so, depending on the data you want to index. It *might* work OOTB for published data, since events for publishing will fire on all LB nodes, but I haven't fully tested that. It will not, however, work OOTB for non-published data without some work on your end. This all depends on how your LB environment is set up: is it replicated, or are all nodes sharing a single SAN data store? Depending on the setup there are various ways to make it work ... the key is that you can only have one thread writing to the index at a time. Reading from the index is no problem, so if you are just publishing your index to your live site that is fine, but if you are editing data on your live site, then depending on your setup, the data might not make it into the index.

There are quite a few very large Umbraco sites floating around the world. I think many of them have implemented some bespoke routing/caching/lookup mechanism, and I think many of them have opted for a Lucene solution of some sort. However, I think front-end performance with OOTB Umbraco should work pretty well with a very large amount of data (in newer versions); the problem comes from having such a huge XML file and performing back office tasks. You nailed it with the concerns about publishing and sorting in the back office. These are probably the most CPU-intensive operations, but we have already been working on optimizing them in the last few versions. Again, maintaining backwards compatibility is always an issue with these changes, so we cannot just make it run ultra fast with one huge change, as this would break many packages that rely on certain events being fired.

As for making things async, we've already started doing that too :) A few operations have been changed to use async WebForms processes, so execution is passed off to a non-request thread. That said, it is not out-of-request async; the publisher will still need to wait, but they will not get timeouts. A reason for this, again, is backwards compatibility and dealing with some legacy code which relies on having a valid HttpContext. Once we move most things over to the new business logic API we won't have to worry about that anymore. In v7 everything is REST based and everything will run with WebAPI async calls... v7 is a while away, however :) In the meantime, the processor-intensive operations in the back office will be changed to be async.

Regarding caching: in 6.1 we started streamlining how all cache is handled so that we can invalidate it properly and consistently and ensure that it works in LB scenarios. Now that cache invalidation is streamlined, it is very easy to see that there is a lot of over-caching happening in the core. This is mostly due to the legacy business logic API and its performance shortcomings. There's more work to do here as well, which will come soon in the next few 6.1.x releases. HTTP cache is not particularly good for maintaining cache that must always be there; when you start getting high cache turnover, perhaps because your IIS caching limits are too low, the app becomes even slower. I'm not sure which particular hash tables you are referring to, though, as most of the caching in the core does use the HTTP application cache.

Stephen Gay

Jun 4, 2013, 5:39:04 AM
to umbra...@googlegroups.com
A few comments here...

On Tuesday, June 4, 2013 2:25:41 AM UTC+2, Shannon Deminick wrote:
[...] and we've also abstracted the entire caching layer for which we'll expose publicly soon. This means you could simply swap out the caching provider to just use Examine if you wanted instead of an XML file. We'd love to release all of this now but in reality these things take some time and testing :)
Just to confirm, I'm definitely working on abstracting the cache and already have prototypes running with a "NoCache" that does not cache anything, and a yet-to-be-named cache that caches IPublishedContent objects (no XML document) and should handle back-office changes much better. I hope to discuss it at CG13 and make a few things public right after, but... it's going to take time before it can be used in production sites.
 
It *might* work OOTB for published data since events for publishing will fire on all LB nodes but I haven't fully tested that. It however will not work OOTB for non-published data without some work on your end.
That's an interesting point. At the moment there are no OOTB distributed events for non-published content... but they will be required at some point if we want to provide true and efficient preview in an LB environment. Unless we want to restrict the back-office to one single node in the LB environment.
 
Stephan

Shannon Deminick

Jun 4, 2013, 5:46:12 AM
to umbra...@googlegroups.com
Unfortunately we cannot force a single node in LB environments in some scenarios (e.g. Azure Websites). If we start using distributed events for non-published content, the only worry is the communication between servers, though tiny JSON requests shouldn't affect much. I've thought about this quite a bit over the last few months, with feedback from various sources. In the long run, it is a better solution to have all nodes read from a central database to 'pull' changes in, so that each server works autonomously. Achieving that is a tremendous amount of work, but I think the idea is good. We'd have an 'instructions' table that each node can read from to figure out which instructions it needs to execute. We could even use SQL DB events to listen for changes automatically. Anyway, that's a totally different story and not something we can jump into straight away.

Stephen Gay

Jun 4, 2013, 7:44:48 AM
to umbra...@googlegroups.com
On Tuesday, June 4, 2013 11:46:12 AM UTC+2, Shannon Deminick wrote:
Unfortunately we cannot force a single node in LB environments in some scenarios (e.g. Azure Websites). If we start using distributed events for non-published content, the only worry is the communication between servers, though tiny JSON requests shouldn't affect much. I've thought about this quite a bit over the last few months, with feedback from various sources. In the long run, it is a better solution to have all nodes read from a central database to 'pull' changes in, so that each server works autonomously. Achieving that is a tremendous amount of work, but I think the idea is good. We'd have an 'instructions' table that each node can read from to figure out which instructions it needs to execute. We could even use SQL DB events to listen for changes automatically. Anyway, that's a totally different story and not something we can jump into straight away.

Interesting. So, one central content server, then each node subscribes to that server and receives "instructions"... so it's basically a master-slave replicating content server, right?

Ian Smedley

Jun 4, 2013, 7:59:30 AM
to umbra...@googlegroups.com
I'm not sure about Azure, but in most LB environments the load balancer will set a client cookie to ensure that a browsing session always hits the same server behind the scenes. This means that sites don't have to be completely compatible with a 'stateless' design. It also means that, currently, previewing and other functions do work, because you always go back to the same server.

The "pull" model is interesting. I've seen instances where web front ends are scaled up, but because they each pull cache from the same SQL server, that becomes a bottleneck. If servers are constantly polling the SQL database, how would this affect the ability to scale up the front-end web servers (which is often easier than scaling up SQL)? Having said this, it would be great for a new web server to 'register' itself and start to receive updates.

Damiaan Peeters

Jun 5, 2013, 6:28:52 AM
to umbra...@googlegroups.com
I can confirm that Azure Websites sets application routing cookies (well, two in fact: WAWebSiteID and ARRAffinity).

Niels Kühnel

Jun 5, 2013, 6:31:30 PM
to umbra...@googlegroups.com
If Azure does sticky load balancing now it's an amazing game changer. That makes everything quite easy. I was almost certain it only did round-robin. Are you sure?

Scaling large solutions while maintaining immediate consistency is extremely difficult. However, "perceived" immediate consistency is easier. That gives the user the impression that changes are made instantaneously, even though it may take a couple of seconds before users on other nodes see them. No one will ever notice, and the lie is complete.
This consistency model is a part of the next version of Examine, but for another purpose than LB. It's for maximizing the use of all available hardware by only letting people wait when they need to. However, the model could be used for LB. The idea is simple: every change increments a counter that marks the "generation" that will contain the change. If you need it, you can wait until the searchable data's generation reaches that. Waiting time sounds bad, but it's on the order of 10ms.
For LB I second the idea of nodes that share their primary data source but otherwise work autonomously. A reliable, persistent queue is needed for making the "generation" idea work across nodes, and a database certainly has those properties. If it's clustered it's not even a single point of failure (in Azure it's clearly not).
When a node makes a change to the database it will write down the generation. Clearly the data in that node will contain that generation, but if the generation is put in a cookie or a "?gen=755779021337" redirect, the client can tell other nodes the required generation. If we assume that nodes poll the database for changes every second, the client's request can be put to async sleep, or an extra sync can be made.
This is not the most clean solution but it will definitely work. Given that the read to write ratio is almost always very high for Umbraco solutions, it will probably even work great. This should be possible to implement without any low level architectural changes, and maybe even be straightforward.
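The generation idea above can be sketched in a few lines. This is a minimal single-process illustration of the model (writer bumps a counter, indexer marks what's visible, readers wait until caught up); the class and method names are made up, and in the LB case the counters would live in the shared database:

```python
# Sketch of the "generation" consistency model: every write bumps a
# counter, and a reader that needs its own writes waits until the
# searchable data has caught up to that generation.
import threading

class GenerationTracker:
    def __init__(self):
        self._lock = threading.Lock()
        self._cond = threading.Condition(self._lock)
        self.written = 0    # generation containing the latest change
        self.visible = 0    # generation the searchable data has reached

    def record_write(self):
        with self._lock:
            self.written += 1
            return self.written   # hand this to the client (cookie/?gen=)

    def mark_visible(self, generation):
        with self._cond:
            self.visible = generation
            self._cond.notify_all()

    def wait_for(self, generation, timeout=1.0):
        with self._cond:
            return self._cond.wait_for(
                lambda: self.visible >= generation, timeout=timeout)

tracker = GenerationTracker()
gen = tracker.record_write()          # editor saves; change is in generation 1
assert tracker.wait_for(gen, timeout=0.01) is False  # index not caught up yet
tracker.mark_visible(gen)             # indexer commits that generation
assert tracker.wait_for(gen) is True  # now the change is searchable
```

Readers that don't carry a generation requirement never wait at all, which is why the high read-to-write ratio makes the model cheap in practice.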

Btw, thanks for starting this thread! 

Cheers,
Niels K
--
You received this message because you are subscribed to the Google Groups "Umbraco development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to umbraco-dev...@googlegroups.com.
To post to this group, send email to umbra...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/umbraco-dev/81f2bfcc-7135-48bc-a2f9-cd0cbcb4c5d8%40googlegroups.com?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.

Shannon Deminick

Jun 5, 2013, 6:58:53 PM
to umbra...@googlegroups.com
The thing is, immediate consistency is never possible even with how Umbraco is structured today, with the web service calls between nodes to sync; there may be latency between those calls as well. As for the DB performance of all nodes polling for updates, I really don't see this being processor intensive on the SQL server whatsoever, even with a ton of web servers, and as Niels mentions, if you have clustering there's no single point of failure.

With Azure sticky sessions, this is how to get Examine to work in that type of environment, or any other LB environment that supplies this cookie, so long as your servers can query the public DNS name (which they should). Example/theory: send the request for an update through the front end – it'll land on *a* worker – which one is not important. You'll get a response back which will include an ARR session affinity cookie, which you can share among all the different callers that need to initiate the refresh. Using this cookie, you'll always get forwarded to the same worker. You'll also want some error handling code here in case you get an unexpected response. If you've been load balanced to a different server due to failure/farm operations, your old cookie will send you to a server which will not service your request and will throw an error. Just get a fresh cookie and try again.
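A sketch of that capture-replay-refresh flow, with the load balancer faked out so the logic is testable; in real code the "farm" would be plain HTTP requests and the cookie would be the ARRAffinity value from the response:

```python
# Sketch of the cookie-pinning idea: capture the affinity cookie from
# the first response, replay it so later calls land on the same worker,
# and fall over to a fresh cookie if the pinned worker has gone away.

class FakeFarm:
    """Stands in for a load balancer in front of several workers."""
    def __init__(self, workers):
        self.workers = set(workers)

    def request(self, affinity=None):
        # honor a valid affinity cookie, otherwise pick any live worker
        if affinity in self.workers:
            worker = affinity
        else:
            worker = sorted(self.workers)[0]
        return worker, worker   # (served-by, affinity cookie to reuse)

def call_with_affinity(farm, cookie=None):
    served_by, cookie = farm.request(cookie)
    return served_by, cookie

farm = FakeFarm({"worker-a", "worker-b"})
_, cookie = call_with_affinity(farm)              # first call: get a cookie
pinned, cookie = call_with_affinity(farm, cookie)
assert pinned == cookie                           # same worker every time

farm.workers.discard(pinned)                      # that worker is retired
served, fresh = call_with_affinity(farm, cookie)  # old cookie fails over
assert served in farm.workers and fresh == served
```

The error-handling branch Shannon describes maps to the fail-over case at the end: a stale cookie just means you fetch a fresh one and retry.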



Stephen Gay

Jun 6, 2013, 5:34:26 AM
to umbra...@googlegroups.com
On Thursday, June 6, 2013 12:31:30 AM UTC+2, Niels Kühnel wrote:
For LB I second the idea of nodes that share their primary data source but otherwise work autonomously. A reliable, persistent queue is needed for making the "generation" idea work across nodes, and a database certainly has those properties. If it's clustered it's not even a single point of failure (in Azure it's clearly not).
When a node makes a change to the database it will write down the generation. Clearly the data in that node will contain that generation, but if it's put in a cookie or a "?gen=755779021337" redirect the client can tell other nodes the required generation. If we assume that nodes polls the database for changes every second the client's request can be put to async sleep, or an extra sync can be made.

I like the idea of the "generation". It's something that could help address an issue I have with caching: when the app restarts, how do you know whether the locally stored data (umbraco.config XML file at the moment, could be anything else) is still up-to-date with the DB, or needs to be updated, without reloading the entire content, but only those content items that have changed while the LB node was offline.

Stephan

Shannon Deminick

Jun 6, 2013, 10:37:54 AM
to umbra...@googlegroups.com
One way to solve that issue: each item entered in the 'instructions' table is obviously ID'd. On each server (in a non-replicated area) we save the last synced ID. Of course there can be numerous different types of instructions in this table, and in some cases when a node first starts up it might not need to process every instruction, depending on its type.
This is all just pure theory. I have been thinking about it over the past few months, but only in my head :)
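Since this is still at the theory stage, here is one way the last-synced-ID mechanic could look, with a plain list standing in for the shared SQL table and all names invented for the sketch:

```python
# Sketch of the 'instructions' table idea: each row has an ID, each
# server remembers the last ID it processed (stored locally, not
# replicated), and pulls anything newer on each poll.
instructions = []   # stand-in for the shared SQL table: (id, action)

def append_instruction(action):
    next_id = instructions[-1][0] + 1 if instructions else 1
    instructions.append((next_id, action))

class WebNode:
    def __init__(self):
        self.last_synced_id = 0   # persisted per server in real life
        self.applied = []

    def poll(self):
        for row_id, action in instructions:
            if row_id > self.last_synced_id:
                self.applied.append(action)   # e.g. refresh cache for a node
                self.last_synced_id = row_id

append_instruction("refresh:1042")
append_instruction("remove:2077")

node = WebNode()
node.poll()
assert node.applied == ["refresh:1042", "remove:2077"]

append_instruction("refresh:1042")
node.poll()                       # only the new row is processed
assert node.last_synced_id == 3
```

The same mechanic answers Stephen's restart question: a node that comes back online simply resumes from its persisted last-synced ID instead of reloading everything.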


Shannon Deminick
s: shannondeminick
t: shazwazza



Niels Kühnel

Jun 6, 2013, 10:54:32 AM
to umbra...@googlegroups.com, umbra...@googlegroups.com
Couldn't we just use the modified date on nodes?

Are hard deletes tracked?

Lee Kelleher

Jul 25, 2013, 1:38:24 PM
to umbra...@googlegroups.com
On Wednesday, 1 August 2012 20:22:31 UTC+1, Tom Richardson wrote:
<snip/>


4) The back office becomes very brittle with a very large number of nodes.  Recursive publishing of a piece of content that has a large number of children is likely to bring everything grinding to a halt.  Publishing takes between 30 seconds and 2 minutes.  Sorting is touch and go.  Oddly enough saving is typically very fast.  I haven't dug deep enough into publishing to see exactly what part of the process eats the most time, but suffice it to say that publishing is a drag, and undisciplined actions in the back office can definitely impact site stability.

I've spent the past couple of days on-site with a client who has been experiencing these same issues. They are currently running v4.6.1, have just around 100,000 nodes (say a 60/40 split of content/media nodes) and a load-balanced set-up of 4 web servers. The interesting part is that their Umbraco install hosts about 20 websites (domains/hostnames) ... each of those has been set up as a separate IIS website on the web servers (each with its own application pool).

They experience spikes in CPU/memory whenever a content node is published (with the distributed calls); they also noticed a massive spike whenever a content editor would move or sort a node!

After some digging into the v4.6.1 source I found that both Move and Sort make a call to "umbraco.library.RefreshContent", which in turn makes a distributed call. I found that "umbraco.presentation.cache.pageRefresher" ("RefreshAll"), called/pinged on each of those 20 domains/servers, would then call "RefreshContentFromDatabaseAsync()" to grab the fresh XML and rebuild the in-memory cache.

Following the core code/logic it made sense, until I started to do the maths. All the XML is pulled from the "cmsContentXml" table, 100,000 rows, then re-organised in code into the correct hierarchy. The "umbraco.config" (on the CMS-server) is around 80MB - and assuming each domain's app-pool holds a parsed copy of that XML in memory, it could be double (maybe triple) that. Multiplied by 20 domains, across the 4 web-servers, it explains the huge CPU/memory spikes.
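Lee's back-of-envelope maths can be sketched directly. The 80MB file size, 20 app-pools and 4 servers are his figures; the 2-3x in-memory factor is his guess, and the calculation itself is purely illustrative:

```python
# Rough worst-case estimate of memory pressure when every app-pool
# rebuilds its in-memory XML cache at once (figures from the post above).
xml_on_disk_mb = 80          # size of umbraco.config on the CMS server
in_memory_factor = 3         # parsed XML can be ~2-3x the raw file (assumed)
app_pools_per_server = 20    # one IIS site/app-pool per domain
servers = 4

per_pool_mb = xml_on_disk_mb * in_memory_factor
per_server_mb = per_pool_mb * app_pools_per_server
total_mb = per_server_mb * servers

print(per_pool_mb)    # 240 MB per app-pool
print(per_server_mb)  # 4800 MB (~4.7 GB) per web-server
print(total_mb)       # 19200 MB across the farm
```

Even with a smaller in-memory factor, repeating that rebuild on every RefreshAll explains the spikes.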

It is worrying that performing a simple Sort would introduce a huge performance impact.  I've checked the v6.1.2 source and the "nodeSorter.asmx.cs" code still makes the call to "umbraco.library.RefreshContent", so it is still an issue. (I note that Shannon has commented on this in the code)

Wondering if there is a better solution to distributing the "refresh all nodes" across the web-servers? ... (and/or am I missing something obvious with the set-up?)

Thanks,
- Lee



Dirk De Grave

unread,
Jul 25, 2013, 1:58:46 PM7/25/13
to umbra...@googlegroups.com
Lee,
 
Do you experience the same issues when you publish a few times? We currently have a site with 10,000 content nodes and 13,000 media items, and publishing a node is slow, but only for the first publish. Each subsequent publish seems to perform much better, no matter which node is being published?
 
 
Cheers,
Dirk

Lee Kelleher

unread,
Jul 25, 2013, 2:06:40 PM7/25/13
to umbra...@googlegroups.com
Is your site load-balanced?

My client does experience performance hits when a content editor publishes a node - but they find it acceptable.

I'm not sure about consecutive publishing, guessing that since the XML is already generated, it's not a big hit (as the first time)?

Cheers,
- Lee



Dirk De Grave

unread,
Jul 25, 2013, 2:09:02 PM7/25/13
to umbra...@googlegroups.com
Nope, no load balancing... I might be wrong, but the XML has to be repopulated for each publish, right? So why would a first publish be so much slower than the following publishes? Don't expect an answer, just looking for clues...
 
/Dirk

Morten Christensen

unread,
Jul 25, 2013, 3:20:20 PM7/25/13
to umbra...@googlegroups.com

Just a quick note: the sorting and publishing have been greatly improved from 6.1.x, so I'm pretty sure you won't see the same impact on newer versions.

- Morten

Morten Christensen

unread,
Jul 25, 2013, 4:02:08 PM7/25/13
to umbra...@googlegroups.com

Plus Shannon has done a great deal to improve publishing in load-balanced environments. He would probably be better able to explain which parts have been updated and improved, as well as which parts are yet to be improved.

Shannon Deminick

unread,
Jul 25, 2013, 8:49:31 PM7/25/13
to umbra...@googlegroups.com
The LB improvements have been related to actually making it work, not so much about its performance in LB scenarios. Before, LB didn't actually work whatsoever for anything apart from content! If you were using 1 server for all of your editing and not having your editors load balanced you might not have noticed it (too much). But before 6.1 there was only 1 ICacheRefresher, which was only for content, which means that any media, doc types, templates (and everything else that has cache) would not have notified other servers, and therefore their cache would not have been invalidated or updated, obviously leading to all sorts of strange issues.
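As a rough illustration of why per-type cache refreshers matter in a load-balanced farm, here is a toy model in Python. All names here are invented for illustration; Umbraco's actual ICacheRefresher API is C# and differs:

```python
# Toy model of per-type distributed cache refreshers: each server keeps a
# cache per content type, and a save/publish notifies every server to
# invalidate only the matching cache type. Without a refresher for, say,
# media, stale media entries would linger on the other servers.
class Server:
    def __init__(self):
        self.caches = {"content": {}, "media": {}, "doctype": {}}

    def refresh(self, cache_type):
        # Invalidate just one cache type instead of everything.
        self.caches[cache_type].clear()

def notify_all(servers, cache_type):
    # The distributed call: ping every server in the farm.
    for s in servers:
        s.refresh(cache_type)

farm = [Server() for _ in range(4)]
farm[0].caches["media"][1001] = "stale-item"
notify_all(farm, "media")   # a media save invalidates media everywhere
assert all(not s.caches["media"] for s in farm)
```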

There's been a ton of improvements in sorting and publishing in various places, not only to do with just SQL calls but how XML is sorted internally etc... In fact one of the bottlenecks when profiling a large site with publishing and sorting was a method that was used to sort the XML nodes which was profiled at 40% of the entire publishing/sorting sequence. This was fixed in rev: 4983990d39cc2ede6c7559ed88815fb85058dbeb if you want to have a look and this fix is contained in later 4.x branches too. It's many of these types of fixes that have been implemented that would drastically improve performance from the version you are working with ( v4.6.1 ). I would strongly urge you to upgrade to the very latest version and run some tests.

There have been a few issues logged in regards to performance on large sites, so today I'm going to spend a bit of time to see if I can find any quick wins. There's a ton of places where we can improve the entire process, but unfortunately we're stuck with a lot of legacy code and maintaining backwards compatibility, which is the development bottleneck. The other issue is that having one giant XML file (80MB is HUGE) cached is a big bottleneck in itself, which is why we've been actively working on a new caching mechanism - but that won't be out for some time as it is quite complex to design and develop (but we're getting there!). I'm sure there's plenty of ways that we can improve how we deal with a very large XML file, so in the meantime I'm sure we can improve that too.


Lee Kelleher

unread,
Jul 26, 2013, 6:20:27 AM7/26/13
to umbra...@googlegroups.com
Thanks Shannon, great to hear about all the improvements!

I've already recommended to my client upgrading to v6.1.x.
If there's anything else I can feed back from them, I will do.

Cheers,
- Lee




Shannon Deminick

unread,
Jul 28, 2013, 8:16:08 PM7/28/13
to umbra...@googlegroups.com
Yup, I'd definitely recommend to anyone having performance issues to upgrade to the latest version. I was working on some new perf improvements on Friday (regarding this issue: http://issues.umbraco.org/issue/U4-2527) and confirmed that there are several optimisations already in the latest versions of 6 that don't exist in the 4.x series, mostly related to publishing, sorting and re-building the cache. There are several more improvements I need to make, so hopefully in the next version or two it will become much quicker when dealing with larger sites.

Shannon Deminick

unread,
Jul 28, 2013, 11:16:19 PM7/28/13
to umbra...@googlegroups.com
@Lee, I have now also removed that call in nodeSorter when distribution is turned on; I have verified that it will work without it. This will greatly improve the perf of sorting in LB scenarios (out in 6.1.4)

Lee Kelleher

unread,
Jul 30, 2013, 11:10:31 AM7/30/13
to umbra...@googlegroups.com
Excellent! Thanks Shannon! #h5yr

6.1.4 should be out by the time my client schedules in the upgrade, so all good.

In the meantime, if I can feedback any other details, I will do.

Cheers,
- Lee



On 29 July 2013 04:16, Shannon Deminick <sdem...@gmail.com> wrote:
@Lee, I have now also removed that call in nodeSorter when distribution is turned on; I have verified that it will work without it. This will greatly improve the perf of sorting in LB scenarios (out in 6.1.4)


Pete Duncanson

unread,
Aug 15, 2013, 7:47:08 AM8/15/13
to umbra...@googlegroups.com
I have to say the performance of V6+ on LB setups is super good, you can actually see the difference when you install each new patch. Great work :)

One idea that sprang to mind regarding the RefreshCache call was to have it wait 2 seconds before it acted. This would allow time for multiple chunks of updates to occur (which would all need to refresh the cache on each server), with each update pushing that 2-second delay back, before sending the request and doing it all in one hit. Might be a quick win with minimal changes required?
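Pete's 2-second-delay idea is a classic debounce. A minimal sketch in Python (illustrative only; a real implementation would live in Umbraco's C# distributed-call layer, and the class name here is invented):

```python
import threading

class DebouncedRefresher:
    """Coalesce rapid-fire refresh requests: each new request pushes the
    timer back, so a burst of updates triggers only one actual refresh."""
    def __init__(self, refresh_fn, delay=2.0):
        self.refresh_fn = refresh_fn
        self.delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def request_refresh(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()      # push the delay back
            self._timer = threading.Timer(self.delay, self.refresh_fn)
            self._timer.start()

calls = []
r = DebouncedRefresher(lambda: calls.append("refresh"), delay=0.1)
for _ in range(10):                       # ten updates in a quick burst...
    r.request_refresh()
threading.Event().wait(0.3)               # let the surviving timer fire
print(len(calls))                         # -> 1 (one refresh, not ten)
```

As Shannon notes below, a timer like this is fragile across app-pool restarts, which is why bundling per request is the safer variant.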

Pete

Shannon Deminick

unread,
Aug 15, 2013, 8:05:02 AM8/15/13
to umbra...@googlegroups.com

We're on our way to getting to that point, but not quite yet. We now support a JSON payload to do the sync, so we can then bundle a few calls together. The best way to implement that is to collect all payloads that need to be sent during one request, then combine them and send them all in one hit when the request ends. I think it'll be much more difficult to wait on a timeout, especially if the app pool needs to restart in between.
I think bundling all calls for the request should suffice though; it will make far fewer calls, especially when doing bulk operations!
Good thinking Pete!!
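The request-end bundling Shannon describes might look roughly like this (a Python sketch with invented names; the real implementation combines JSON payloads in C# when the ASP.NET request ends):

```python
import json

class RequestScopedNotifier:
    """Collect cache-refresh payloads during one request and send them as
    a single combined JSON call when the request ends."""
    def __init__(self, send_fn):
        self.send_fn = send_fn
        self._pending = []

    def queue(self, payload):
        self._pending.append(payload)

    def end_request(self):
        if self._pending:
            self.send_fn(json.dumps(self._pending))  # one hit, not N
            self._pending = []

sent = []
n = RequestScopedNotifier(sent.append)
for node_id in (1001, 1002, 1003):        # a bulk operation in one request
    n.queue({"op": "publish", "id": node_id})
n.end_request()
print(len(sent))                          # -> 1 combined distributed call
```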

Shannon Deminick

unread,
Aug 15, 2013, 8:37:03 PM8/15/13
to umbra...@googlegroups.com
Hey @Pete, I've created a task here:
Hopefully can get it in for 6.1.4 but we'll see.

Pete Duncanson

unread,
Aug 23, 2013, 6:44:43 AM8/23/13
to umbra...@googlegroups.com
I love it when a plan comes together :)

convertium...@gmail.com

unread,
Sep 26, 2013, 3:18:06 AM9/26/13
to umbra...@googlegroups.com
Hi,

I am also facing this issue on my 6.1.4 installation.
I have a node with about 30K nodes under it, and publishing nodes under it takes a long time (approx. 2 minutes). Does anybody have any ideas on how to resolve this issue?

Thanks


On Thursday, August 2, 2012 3:22:31 AM UTC+8, Tom Richardson wrote:
> Hello All,
>
> First let me preemptively apologize if I am treading over already well-worn ground. I joined this list early in its history, but was immediately distracted with other concerns and I am still very much in the process of playing catch-up. So far I have seen some speculation about performance concerns for large sites, but I have not seen anyone with a true "use case" hands-on experience, so I thought I would share ours. One of my other developers will likely weigh into the discussion thread (assuming we get a discussion going), so we should be able to answer any questions or concerns promptly.
>
> Background (if you want to skip a slightly long winded explanation of our specific experience, feel free to do a find on the term Lessons Learned to skip this section):
>
> We built NRAPVF.org in the early summer of 2010. It was our first major foray into Umbraco development, so plenty of mistakes were made but we ended up with a site that both appeared and functioned beautifully. In total we had a very manageable amount of content, numbering in the low thousands of nodes. Much of the total size of the site was based on "Election Data", and on launch archive election data was held outside of Umbraco, with the plan to import it back into Umbraco at a later date. Based on the success of this site, in 2011 we began to transition the much larger website nraila.org to umbraco. These sister sites share many features, so the decision was made to have them in the same Umbraco instance with multiple domains. NRA-ILA has news, fact sheets, and article content going back to the mid 1990s. Knowing that we would end up with thousands of nodes, we extensively utilized Examine for rendering lists of content, and site performance prior to launch was fantastic. At launch time we had somewhere in the neighborhood of 100,000 nodes of content.
>
> The first ILA launch attempt was on January 17th of 2012. Failure of epic proportions. Dogs and cats living together, mass hysteria. Within minutes of switching the DNS, the server's CPU load spiked, the request queue starting flooding with unsatisfied requests, and the site quickly started throwing 500 errors. We immediately started looking into how we built ILA and looking for ways we could make it more efficient, assuming that our construction of the site was to blame. We moved the site out onto an Amazon instance where we would have a bit more control and set up a contract with a load testing company to start creating massive load tests to find the failure point. Assuming that our extensive use of Examine searches might be the cause of the CPU spikes we implemented output caching, and ran a 5000 user, 5 hour long load test and passed with flying colors. Around January 21st 2012, with confidence we tried to launch the site once again. Though it took a little bit longer to fail, the result was nearly identical. At this point we contacted Umbraco support to get a little expert assistance on our issue. I started learning more about memory dumps than I ever wanted to know. Casey Neehouse started looking through our macros to see which massive mistake we made was causing our issue.
>
> Long story short, we found our problem and were shocked to find out it wasn't something we did directly. Being a website that had been running for many years, we made sure any of the most popular links that were floating out on the web resolved properly by using content Aliases. One of these popular links was the NRA ILA RSS feed. After adding 404 pages into our load test we were able to crash the server almost instantly with only a few dozen users. Commenting out the 404 handlers appeared to have zero effect. On our request Casey looked into the 404 handler and found the following two things:
>
> 1) The reason commenting out the 404 handlers didn't affect the outcome of our tests is that the parser was set up in such a way that it considered the commented block to still be a valid XML item. Only by deleting them did they stop executing.
>
> 2) The page alias 404handler did not one but two wild-card XSL searches through the entire content tree. With our enormous data set this process was taking multiple seconds of processing time to satisfy every single request that made it through, whether a true 404 or just an Alias. Coupled with the thousands of RSS feed aggregates hitting our RSS feed alias, our site was doomed to failure from the start. The silver lining is that we discovered a critical failure immediately instead of it hiding until a later date when the client chose to delete a page and cause a small flood of 404 traffic.
>
> Armed with this knowledge we implemented an Examine index based page alias handler (with caching), and re-launched the site with no issues. We weren't using the alt-template 404 handler, or the user 404 handler, so we just chose to remove those and never implemented an Examine based version of either of those.
>
> With election season approaching we have since imported all archive election data into Umbraco, as well as continuing to add relevant content, our total node count at this point is 176883. Traffic has increased, and recently server stability has been inconsistent. The poor server that the site is running on is unfairly overburdened and we will soon be moving to a much more capable box, and I expect the move will fix any stability issues we have encountered. We are also planning on moving the site to a load balanced solution, to better handle both content publishing and front-end traffic spikes. The upcoming move to a new bank of servers and the switch to load balancing, as well as the 10:30pm phone calls from the hosting provider asking for permission to restart the app pool because the box is locking up has prompted a fresh round of site performance sleuthing. Please understand that I am not asking anyone to solve anything for me, I'm just trying to give as much background information as possible.
>
> Lessons Learned:
>
> 1) Assuming that hundreds of sites and thousands of developers are using Umbraco, so site failures must be entirely contained within our macros and general site setup, was seriously wrong thinking on our part. We would likely have found our initial launch issue much faster had we considered internal Umbraco inefficiencies as a legitimate possibility. We still are using and continue to love Umbraco, so please don't consider this to be a shot across the bow. Umbraco is a terrific piece of software, but assuming platform infallibility was really stupid on my part.
>
> 2) Umbraco simply wasn't designed for massive sites. It isn't necessarily a platform failing; it just wasn't the primary goal of the platform. Also, doing some sort of unit testing for massive amounts of content is difficult. Algorithms that are awesome for a site with 1000 nodes might be crippling for a site with 100,000 nodes, if the Big O for them is O(n^2) like the page alias 404 handler was. I know that there has certainly been some internal recognition of this with the development of Examine to begin with. Anyone that made a site that used the old school XSLTSearch macro can see this in practice. Should that same efficiency principle be applied more back towards some of the internal processes?
>
> 3) Because of 2, parts of Umbraco dependent on searching the content tree can be very slow with a large number of nodes. Uncached requestHandler requests take somewhere in the neighborhood of 3-tenths of a second. Page aliases take multiple seconds.
>
> 4) The back office becomes very brittle with a very large number of nodes. Recursive publishing of a piece of content that has a large number of children is likely to bring everything grinding to a halt. Publishing takes between 30 seconds and 2 minutes. Sorting is touch and go. Oddly enough saving is typically very fast. I haven't dug deep enough into publishing to see exactly what part of the process eats the most time, but suffice it to say that publishing is a drag, and undisciplined actions in the back office can definitely impact site stability.
>

Randy McCluer

unread,
Nov 21, 2013, 1:45:34 PM11/21/13
to umbra...@googlegroups.com
Re: "80MB is huge", so you're saying that the site I'm launching with a 185MB umbraco config scares you as much as it scares me. Been telling these guys forever that I need more time for testing, but they just didn't listen. I'm thinking I may set up a load-balanced config even though I only have one server, so that admins are hitting a site w/ the CPU usage capped. Any other recommendations for such a precarious position are welcome.

Tom Richardson

unread,
Nov 21, 2013, 3:58:11 PM11/21/13
to umbra...@googlegroups.com
Ours is over 300MB, so though you might run into some issues, it is possible to run a very large dataset site.

I highly suggest running in a load balanced config.



Randy McCluer

unread,
Nov 21, 2013, 4:02:57 PM11/21/13
to umbra...@googlegroups.com
That's good to hear, Tom, and I'd heard of folks doing it, which is why I've spent most of the past 2 yrs building this in u5, then now in u6. Starting on the load balanced config now, so any tips you have would be great to hear.



Shannon Deminick

unread,
Nov 21, 2013, 5:22:44 PM11/21/13
to umbra...@googlegroups.com
Wooww! that's huge :)

Here's some docs on LB - might help a little bit: http://our.umbraco.org/documentation/Installation/load-balancing

Would be really great to get some documentation (would probably just be 'tips') on how best to configure Umbraco for ultra huge data sets.

Randy McCluer

unread,
Nov 21, 2013, 5:56:07 PM11/21/13
to umbra...@googlegroups.com
It seems that right now my biggest issue is with publishing content from the admin, so I don't think the load balancing would help much. I've got 6-8 editors in it right now & I think they're just queuing up write locks on the umbraco.config. I turned off ContinouslyUpdateXmlDiskCache which brought CPU way down, of course, but now even a Republish Entire Site isn't recreating the xmlcache. Any idea if I could rig up a Windows Service to recreate the file every 5 mins or something?



Shannon Deminick

unread,
Nov 21, 2013, 6:01:59 PM11/21/13
to umbra...@googlegroups.com
What umb version are you using (sorry you might have said previously but it's an old thread)?

Randy McCluer

unread,
Nov 21, 2013, 6:14:10 PM11/21/13
to umbra...@googlegroups.com
Sorry, I'm in a nightly of 6.2. I'm a crazy person building his first live Umbraco site on a monster like this. I deserve and accept any derision you want to throw my way. A few months ago I was hopeful that Stephan's binary cache would be ready, and lost sight of it in the crazy last few weeks. Any help is appreciated.


On Thu, Nov 21, 2013 at 5:01 PM, Shannon Deminick <sdem...@gmail.com> wrote:
What umb version are you using (sorry you might have said previously but it's an old thread)?


Tom Richardson

unread,
Nov 21, 2013, 10:53:16 PM11/21/13
to umbra...@googlegroups.com
We aren't on 6.2, so take all of this with a grain of salt, but here goes:

At some point your content managers will perform a task that will cause the site to chew itself to pieces.  We have handled this by having a load balanced solution where there is a "single point of truth" for the content administration (cms.foo.com) where all content management takes place, while 2 other servers handle traffic.  This keeps requests from log jamming behind a long database heavy back office call.  We don't use the disc cache at all because it causes frequent locking issues, which it sounds like you have experienced already.  We implemented a simple template based output caching with a cache dependency that clears on publish to keep page renders fast.  With the XmlCacheEnabled and ContinouslyUpdateXmlDiskCache turned off we very rarely get the locking issue.
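The publish-cleared output cache Tom describes can be modelled with a generation counter, making invalidation O(1). This is a Python sketch of the general pattern, not Tom's actual code; the names and mechanism are illustrative:

```python
class OutputCache:
    """Toy model of template output caching with a publish dependency:
    rendered pages are keyed by URL and stamped with a generation number
    that a publish bumps, invalidating every cached page at once."""
    def __init__(self):
        self.generation = 0
        self._store = {}  # url -> (generation, html)

    def get_or_render(self, url, render_fn):
        hit = self._store.get(url)
        if hit and hit[0] == self.generation:
            return hit[1]                       # fast path: cached render
        html = render_fn()
        self._store[url] = (self.generation, html)
        return html

    def on_publish(self):
        self.generation += 1                    # cheap O(1) invalidation

cache = OutputCache()
renders = []
def render_page():
    renders.append(1)
    return "<html>news</html>"

cache.get_or_render("/news", render_page)       # rendered
cache.get_or_render("/news", render_page)       # served from cache
cache.on_publish()                              # editor publishes
cache.get_or_render("/news", render_page)       # re-rendered
print(len(renders))                             # -> 2 renders for 3 requests
```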

I don't know about your nightly build version, but be very wary of the 404 handlers. Multiple 404 handlers (which are used for url aliasing, template aliasing, etc.) perform depth-first searches, and with a large tree anything that drops through to the 404 handlers will crash your server almost instantly. This is especially a concern if you are replacing an existing site with an abundance of existing links. We had multiple abortive launches because a pre-existing RSS feed was dropping through to the 404 handlers and taking long enough to resolve that other requests would stack up until 500s were thrown, eventually spiking all resources and locking up the box. We replaced the depth searches with a Lucene-based alias handler (so that 404s resolve with hash-table O(1) speed). I know the main url handling was replaced for 6, but I don't know if the 404 handlers were also modified or replaced to mitigate those depth searches. In the case of the alias handler it actually searched the entire tree twice. We didn't replace the other handlers because we weren't using them; we just removed them instead.

If the 404 handlers have been replaced/rewritten for faster execution with large datasets, then you might be fine to keep using them.  Otherwise I suggest replacing them.  If you are using Alias and not the others I can provide you with our handler that uses Lucene instead, just let me know.
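The difference between the stock alias handler and a lookup-based one can be sketched like this (Python stand-in: a plain dict plays the role of Tom's Lucene index, and all the data is invented for illustration):

```python
# Resolving an alias by scanning every node is O(n) per 404 - and per the
# post above the stock handler scanned the tree twice. An index keyed by
# alias, built once at startup, resolves in O(1) per request.
nodes = [{"id": i, "alias": f"/old/page-{i}"} for i in range(100_000)]

def resolve_by_scan(alias):
    for node in nodes:              # wildcard search over the whole tree
        if node["alias"] == alias:
            return node["id"]
    return None

alias_index = {n["alias"]: n["id"] for n in nodes}  # built once, up front

def resolve_by_index(alias):
    return alias_index.get(alias)   # hash lookup, O(1) per request

assert resolve_by_scan("/old/page-99999") == 99999
assert resolve_by_index("/old/page-99999") == 99999
```

Under a flood of 404s (RSS aggregators hammering a moved feed URL, say), the difference between these two is the difference between a healthy server and a locked-up box.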





Randy McCluer

unread,
Nov 22, 2013, 12:26:31 PM11/22/13
to umbra...@googlegroups.com
Thanks, Tom. Turning off the xml on the admin site & the child has helped a lot. I'm fine with 30-60s output caching on my pages as well. Of course, I have some slow bits here & there in my custom stuff which aren't helping either, but squashing those as fast as I can.

Shannon, I'm wondering if I can also disable the external Examine indexing of content on the child site, since I'm not using Examine searches anywhere in my code/templates. The umbraco logs seem to indicate some lag in that pipeline on the child site when publishing.


Tom Richardson

unread,
Nov 22, 2013, 1:54:00 PM11/22/13
to umbra...@googlegroups.com

Examine indexes in a separate process and shouldn't affect publishing speed.

Also, you really should be using Examine.  Examine is awesome.

Randy McCluer

unread,
Nov 22, 2013, 1:55:35 PM11/22/13
to umbra...@googlegroups.com
Yeah, I know, Examine seems cool, but I'm integrating this site into an ecosystem that's already running a big SOLR install. So all my searching goes through that. Hence not needing the external examine indexes.


Randy McCluer

unread,
Dec 9, 2013, 5:06:45 PM12/9/13
to umbra...@googlegroups.com
I'm up & running now at www.dmagazine.com. Still a couple of quirks, but thought I'd post some quick details. I'll try to post more in a few weeks after we've got our post-launch list knocked out.

Size & Traffic
Site has about 25k documents & 42k+ cmsNode entries.
Total site (including blogs & custom mvc apps) does about 3MM monthly pageviews, but the umbraco part will probably average around 1-1.5MM.

Umbraco mods & plugins used
Custom widget system that allows editors to place content on any of the landing pages or sidebars. Very simple, just uses a RenderAction call under the hood. Widget tree looks like ...

DAMP with cropUp for image handling. Being a print mag, obviously images are very important to the site.

Performance quirks that had to be dealt with
-Turned off XML file. At ~200MB, it was just too unwieldy, the admin would hang all the time.
-Moved forward-facing site to its own box, but still having some occasional quirks there.
-Set up an assets subdomain to handle the cropUp images for the front-end to allow me to turn off runAllManagedModulesForAllRequests. It still points its /media and /imagecache paths to the same location as the admin site virt dirs, but with runAllManagedModulesForAllRequests on, serving those images was KILLING my cpu. I'd highly recommend the core team figure out some way to remove that config requirement, because even the overhead on CSS & JS files is a bit unpleasant.

That's it in a nutshell, but I'd be happy to answer any questions.

Adam Nelson

unread,
Dec 10, 2013, 6:04:25 AM12/10/13
to umbra...@googlegroups.com
-Turned off XML file. At ~200MB, it was just too unwieldy, the admin would hang all the time.

What's your app startup time like with the XML umbraco.config disabled (I'm guessing <ContinouslyUpdateXmlDiskCache>False</ContinouslyUpdateXmlDiskCache>) ?

I'm considering doing the same with a ~250MB umbraco.config. But a little wary as I already have a ~8-10 minute startup time from when the app starts to when it serves requests (still on 4.11.10) without giving macro errors.

Cheers, Adam.

Randy McCluer

unread,
Dec 10, 2013, 1:09:15 PM12/10/13
to umbra...@googlegroups.com
I'm seeing about a 40s restart. I'm on a brand new box at Rackspace (albeit the lowest end), so it's pretty damn fast. Dual hex-cores with 64GB RAM.

Randy McCluer

unread,
Dec 20, 2013, 11:25:18 AM12/20/13
to umbra...@googlegroups.com
A follow-up regarding traffic limits. We just had a story blow up to the tune of 4200 concurrent visitors (almost all on the one page), and our single web server held up just fine. We are serving images off of a 2nd server, which probably helped. We have aspnet output caching at 60s, as well. Just a reference point for everyone else.

Anti_Peter

unread,
Dec 22, 2013, 6:52:16 AM12/22/13
to umbra...@googlegroups.com
Hey Randy (and everyone else who has chipped in) 

I've been following this topic with interest and it's great knowledge that you are sharing.

Thank you all and happy holidays!

P