How scalable is Joomla


Martingale

Jun 12, 2012, 12:18:43 PM
to joomla-de...@googlegroups.com
We are in the process of moving more than 500,000 unique visitors and a lot of content onto our own Joomla platform and I am trying to find more data on how scalable Joomla is.  One of the mods on joomla.org shared this interview with us http://community.joomla.org/blogs/community/1219-mitch.html - anybody else seen more reports, benchmark tests or anything?

Christian

Gary Jay Brooks

Jun 12, 2012, 3:03:56 PM
to joomla-de...@googlegroups.com
Are you using any extensions? 
Are you planning to use core Joomla only? 
Do you have a server setup already?  
Are you planning to lease/buy hosting gear? Or do you plan to use the cloud?

Martingale

Jun 12, 2012, 4:43:28 PM
to joomla-de...@googlegroups.com
Gary, good questions.  I should have explained that I meant core Joomla with some mainstream components (we use JomSocial, Kunena, JReview, SOBI and Zoo) and a reasonably large IT budget (we currently have a set of quite powerful servers and the assumption that we can add to it) - so more a theoretical discussion than one about our specific current set-up.  Should I expect to see problems at 100,000 users, 1 million or 50 million?  Are there Joomla installations with 1 million or even 10 million users?  What is the service that will create problems first?

Gary Brooks

Jun 12, 2012, 5:53:40 PM
to joomla-de...@googlegroups.com
When you say millions of users, the question is: do you have the staff to support the project, or are you talking about a table with millions of records and only a few hundred people accessing the site at a time?  If the answer is that you're going to have thousands or millions of "ACTIVE" users, you're going to need an awesome DevOps team, PHP ninjas, and someone who understands scaling in general.  This goes for any application framework, not just Joomla.  This is a special team that has experience in scaling web applications.  A group that understands how to boot and install an OS on a server is not the right team.  You need someone who actually understands how to tune a LAMP stack and how to push PHP to its max.

Joomla itself is a PHP application that you can load balance and cache to the extreme of what your coders understand.  Joomla can handle millions of hits a second with the right setup under the hood.  Remember Facebook is using MySQL and PHP (compiled to C++).  Can Joomla do it?  :) Yes, Joomla can do it, but are you really prepared to take the hit on how you have to build the network?  Are you ready to fork the database drivers?  Do you have coders who understand the Joomla! database drivers?  Are you going to separate your application so that it understands how to talk to a REST-based object store?  Facebook separates reads and writes to deal with large volumes of data.  They also segment the application so it can read and write to different types of databases.  If you want to do this in Joomla you could do it.  If your cluster had enough memory and enough storage space for a custom-built caching system, not a problem at all, YOU CAN DO IT :D

The problem I see is that you're going to use a series of 3rd party extensions.  Extensions will be the reason you will not be able to scale.  Time will kill you if you're trying to scale and deal with the user base at the same time.  My guess is you want to use extensions to shorten time to market.  Some Joomla extension developers went outside of the core Joomla database driver and are doing direct connects to MySQL.  What happens when you need to have a master-master database setup instantly because everything is crashing?  You end up having to fork the database driver on all your applications or cobble together a half-baked system that always has to be monitored.  Once a 3rd party extension reaches its limits, you're down to the wire dealing with read and write traffic on an already live system.  How do you solve this?  Load test what you expect to see in real life before you go to market.  You can scale Joomla and meet your go-to-market goals, saving programming time, money, and resources.  Try to keep the 3rd party extension count low.  If you have the budget, build your own application that is built to scale with Joomla.
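
To make the read/write split discussed above a little more concrete, here is a minimal sketch in Python (Joomla itself is PHP, so this is only an illustration of the idea, not Joomla code). The class name `ReadWriteRouter`, the host names and the simple check on the first SQL keyword are all assumptions made for the example; a real setup would also have to deal with transactions and replication lag.

```python
import itertools


# Hypothetical router: the class and host names are illustrative only,
# not part of Joomla or of any database driver.
class ReadWriteRouter:
    # Statements starting with these verbs are treated as reads.
    READ_VERBS = ("SELECT", "SHOW", "EXPLAIN")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        """Return the host that should run this statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.route("SELECT * FROM jos_content"))       # goes to a replica
print(router.route("UPDATE jos_users SET block = 0"))  # goes to the primary
```

Routing like this only helps if every query passes through it, which is why an extension that opens its own direct MySQL connection is a problem: it bypasses the one place where the split can be made.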

If you could provide a real live case study of your Joomla scaling needs, I can provide you a picture of what you will need to scale.  My suggestion for your admin team is to try to stay away from hypervisors.  Bare metal and 15K SAS disks would be your best bet.

We have a Joomla setup with http://www.ramsan.com/   << this is a machine ready to scale a large scale Joomla database.


Gary Brooks 
garyb...@cloudaccess.net
Phone:  +1-231-421-7160 Ext: 7161
Direct Office: +1-231-421-7161
Skype id: garyjaybrooks2000
Fax:    313-899-7032
Web:    http://www.cloudaccess.net
Address:  10850 Traverse Hwy, Suite 4480 | Traverse City, Michigan 49684




--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To view this discussion on the web, visit https://groups.google.com/d/msg/joomla-dev-general/-/2yCWuDkJE5wJ.
To post to this group, send an email to joomla-de...@googlegroups.com.
To unsubscribe from this group, send email to joomla-dev-gene...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/joomla-dev-general?hl=en-GB.

Martingale

Jun 14, 2012, 4:42:06 AM
to joomla-de...@googlegroups.com
Thanks, Gary, that is very helpful and food for thought.

Does anybody else have views or, even better, experience on how scalable Joomla really is?

brian teeman

Jun 14, 2012, 5:08:48 AM
to joomla-de...@googlegroups.com
Can I suggest you contact Fotis of joomlaworks.gr - he recently gave a presentation on this (in Greek) at JoomlaDay in Athens and manages some very, very large Joomla websites.

Martingale

Jun 14, 2012, 5:46:53 AM
to joomla-de...@googlegroups.com
Thanks, Brian, I have contacted Fotis through their contact form, and if I receive the presentation I may embed it here.

Anybody else with brilliant ideas?

Mark Dexter

Jun 14, 2012, 11:15:32 AM
to joomla-de...@googlegroups.com
I think you need to differentiate between a site with a lot of visitors vs. a site with a large number of database items (e.g, 100k articles or unusually large number of menu items -- not sure of the exact size or other types of content in the database). 

On the database side, we have had various reports for version 2.5 of bottlenecks when these numbers are higher than the "normal" range.  We believe this is due to some inefficiencies in the MySQL queries. For example, if I recall correctly, the category blog layout is unusable (too slow) in one example database with like 30k articles. Some people have looked at this issue, but to my knowledge we haven't posted any fixes into trunk for this type of issue. My guess is that there are at least a few of this type of bottleneck hiding in the code.

On the visitors side, I am not aware of any known issues in the actual code (but of course there could be). We still use 1.5 for most of the Joomla sites (joomla.org and the JED).

Hope that helps a bit. Mark


Janich

Jun 14, 2012, 5:56:45 PM
to joomla-de...@googlegroups.com
Some two years ago I did a migration with 150k users to a Joomla CMS (1.5). We used only a handful of virtual machines for the project, and once we had eliminated the bad queries, we never had any problems with it, so my best guess is that 2.5 will rock even more in that sense.
You mentioned a customised Joomla platform, but anyway... I tried to compile a short list of things to look out for, based on my own experience with that project.

Caching
The CMS is quite capable and you get a lot of functionality, fast! But it also has some drawbacks, as Mark mentioned.
If you plan on using it, make sure you fully understand the caching mechanisms. Caching can be quite powerful, but it's almost worthless if you have 100k logged-in users at the same time.
Also, use something like APC, memcache or something else for cache storage... Just not the database!
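
As a rough illustration of the caching advice above, the following Python sketch shows the cache-aside pattern with a time-to-live. The `TTLCache` class is a tiny in-process stand-in for APC or memcache and `cached_fetch` is a hypothetical helper; neither is a Joomla API, and the names and the 300-second default are assumptions for the example.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for APC/memcache (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._store[key] = (value, self._clock() + ttl)


def cached_fetch(cache, key, loader, ttl=300):
    """Cache-aside: try the cache first, call the loader only on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()  # e.g. the expensive database query
        cache.set(key, value, ttl)
    return value
```

The key has to include everything that makes the output differ. For guests a single key per page is enough, but for logged-in users the key usually needs the user id as well, which is one way to see why the hit rate collapses with 100k users logged in at once.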

Database structures
3rd party extensions may not have optimized tables/indexes, so what are you going to do when you hit that roof? Analyze the structures carefully beforehand(!) Maybe there are alternatives?

Be realistic - not optimistic
Load test as much as possible and duplicate real scenarios (e.g. traffic at peak times, inserts vs updates vs selects, etc.)
Also - when you end up with 7500 pages in the User Manager - don't choose "All". THAT will kill it for sure... ;-)

Bad queries
Use the debug function and a good MySQL query logger to identify bad queries and tmp tables while load testing. (I used 'Jet Profiler' amongst others - it's commercial, but good)
Especially watch out for the non-cached pages, and any pseudo-cron stuff that puts all the load into one page view (not sure JomSocial does this anymore though).

Contribute
If you find something bad, fix it and give it back. You learn a lot, and at the same time you strengthen your network for the bad days (the days when you REALLY need to identify the bugs and put fixes out in prod asap).
For example: I found a lot of bad queries in Kunena during load testing, joined the team and helped fix them, so now Kunena works with at least 3 million posts - and it got incremental upgrading too. :-)

The list is far from complete and it's not very scientific, but it is very practical for any semi-sized Joomla CMS.
I hope you (or someone else) can use it.


/ Janich

Brad Gies

Jun 15, 2012, 7:49:15 PM
to joomla-de...@googlegroups.com

Well... I certainly don't have better experience scaling Joomla, mainly
because I haven't even thought of doing it for the simple reason that
Joomla! is not built to be scalable :).

First, when most people ask questions about scalability they are really
thinking about the ability to handle large volumes of traffic, not true
scalability.

A truly scalable CMS would allow you to pop up extra websites/servers
(and round robin them or ??? ) with a click of a button or two, or
automatically. It would also allow you to use multiple databases easily
(a master/slave type scenario). A scalable CMS would be able to go from
1 page view a day to a million page views a second without missing a
beat :). Joomla! does not allow that, hence it is not truly scalable.

As Gary mentions there are things you can do to overcome some of that,
but you shouldn't be thinking that Joomla is truly scalable. Joomla! is
designed to handle very large volume websites without being scalable.

FYI... the above is not a knock on Joomla!. Joomla! is not built to be
everything for everybody. Joomla! does what it does very well. Joomla!
can handle very large websites and a lot of users very easily. It is
just not built to be infinitely scalable. So, as good as Joomla! is,
Google/Facebook etc. have not even considered using it :).

Do you really need a scalable CMS or just a CMS that can handle large
volumes of users?

My thoughts,

Brad


--
Sincerely,

Brad Gies
----------------------------------------------
bgies.com maxhomevalue.com
idailythought.com greenfarminvest.com
----------------------------------------------

Russ Winter

Jun 16, 2012, 7:18:11 AM
to joomla-de...@googlegroups.com
Not sure I understand Brad's definitions of scalability and round-robin'ing....  So I can only speak from my own experiences of "scalability" and performance...

As already mentioned, Yes, there will always be a need to custom tune MySQL queries, tables and indexes and the use of an external caching mechanism is advisable.

I have a single site running on a "Parallel, Cloud Based, 4-Node Cluster" using VMware and Veritas Cluster Server under RH Enterprise, Apache etc.  It is primarily used for a custom Secure Document Management System using J!2.5, ACLs and LDAP authentication.  There are 10K registered users in LDAP, but I've never seen more than 1K active at a time, and more likely no more than 500 actually accessing something simultaneously. The repository currently holds around 20K documents (Technical Drawings and Procedures, mainly in PDF format) in 20 top-level categories and 200 2nd-4th level categories.  Document lists are kept to a maximum of 50 per page for performance reasons. The new Search is left disabled for performance reasons. Caching is through memcache on the server and 16GB hardware memory on the disk array, set to 60% read caching.

The hardware is very specific and high-end...   Each node is a virtual 8 Core, 32GB RAM, 4x 8 Lane PCIe bus, 8GB Fibre Channel SAN attached storage, 15K RPM 500GB drives at the EMC Clariion storage array in 4x 8+1 RAID5 Logical Units (16TB), virtualised at the host with Veritas Volume Manager (stripe/mirror for performance) and Veritas File System (Journal, Extent Based), backed up to SATA disk through Mirror-Break-Off once a day (staged to tape once a week), with hardware snapshots every 60 minutes.

Hope this gives you some idea of what is possible with J!, no public access though, this is an internal secure client site.

Cheers
Russ







Sam Moffatt

Jun 17, 2012, 5:07:11 PM
to joomla-de...@googlegroups.com
Scale is one of those weasel words of our industry because it means
different things to different people. I'm presently on a project where
what Joomla does will have relatively little impact upon scalability
and we're looking towards pre-computing and caching huge amounts of
data to get it to return data in a reasonable period of time. Number
of users in this system: 5000. That's not concurrent users, that's our
total. We're not putting this on slouches of servers either, from
memory our three DB servers are 72GB Oracle T4 boxes.

A while ago I went to a presentation at a conference from a Google
engineer. It was about how hard it would be to put a real time hit
counter on the Google home page. It went through doing your most basic
implementation and then working through adding in load, forcing to
scale out, latency of having to ask multiple servers for responses as
you scale up and eventually you shift incrementing the counter to a
backend process that instead reads the log files, pushes updates to
the cluster, but then that gets beyond the limits so you have many of
those so when they're done counting a set, they ask for the max counter
value, add theirs to it and push it back out to everything and then
on the front end side you cheat and increment the counter artificially
using a crude algorithm (n hits per second, cache is x seconds old) or
an internal cache that fakes incrementing with session affinity on the
load balancers to help maintain the illusion. You can see a piece of
this with YouTube where the counter doesn't get incremented in real
time. Something as simple as a hit counter at "scale" is far from
simple.
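
The "crude algorithm" on the front end can be sketched in a few lines of
Python. This is only an illustration of the extrapolation idea as
described above (n hits per second, cache is x seconds old); the function
name and the sample numbers are made up for the example and do not come
from the presentation.

```python
def estimated_hits(cached_total, hits_per_second, cache_age_seconds):
    """Extrapolate a displayed counter from a stale cached total."""
    if cache_age_seconds < 0:
        raise ValueError("cache age cannot be negative")
    # Assume traffic kept arriving at the recent average rate.
    return cached_total + int(hits_per_second * cache_age_seconds)


# Cached total of 1,000,000, about 250 hits/s, cache is 12 seconds old.
print(estimated_hits(1_000_000, 250, 12))  # 1003000
```

The displayed number is a guess that gets corrected whenever the backend
pushes out a fresh total, which is the illusion being maintained.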

I disagree that it is the CMS' job to handle infrastructure tasks.
It's the job of the monitoring system to keep a check on performance
and be reactive to load. Being able to 'popup' extra nodes in a
cluster is not a trivial job for anyone who has written bare metal
deploy scripts or built out template VM images (more recently
particularly with VMWare, AWS and to an extent Azure; really x86
virtualisation that doesn't suck). Instead a proper management system
should be used which handles provisioning these environments and
monitoring load. The CMS shouldn't need to be aware that you've added
extra web nodes to that cluster - it should just handle the requests.
The CMS shouldn't have an awareness of the F5 in front of it or need
to know how to configure it. It shouldn't need to know that one of the
backend database servers fell over and it needs to provision a new
one, there are plenty of HA solutions that handle that. It should just
continue to connect to the VIP and let the other layer handle itself
while handling an absolute failure as gracefully as it can. Similar
deal for the caching layer, it should handle failure gracefully which
may mean your DB layer gets more load than it was expecting (vice
versa caching layer may be able to keep up parts of the site as well).
Only with recent virtualisation advances are we able to spot deploy
environments and even then if you don't have the underlying hardware
to support it then you're stuck anyway (again, less of an issue for
"cloud" VM solutions). However if your CMS/app level is aware of all
of that then you've done something wrong - or you're trying to avoid
re-using the myriad of tools that will help you out. In fact I'd argue
that if the CMS is doing all of that from a click of a button or two
in its UI - that's a recipe for much pain.

Whether an unmodified Joomla instance will handle all of that at the
level of traffic you're asking about requires a deeper understanding of
exactly what you're doing, what caching options you can utilise and
the level of personalisation that you are expecting to deliver. My
personal suggestion is that you take the time to build up a system,
import your base data and then replay traffic from your live site back
onto the Joomla site and see how it goes. Capture traffic streams and
map them onto Joomla to see what performance would be like for flows
of your sample users, look at how many concurrent users you have and
then ramp up the Joomla site with the traffic streams to emulate it.
Also depends on how much hardware you can put out there and how
skilled you are at tuning the low level stuff. I've run into strange
low level bugs[1] of various systems even with some of the simpler
stuff I've worked on. Suffice to say that Joomla is a piece of the
puzzle, perhaps the largest piece, but not the only piece. And likely
you're going to need some tuning, though I don't understand why none
of these fixed SQL queries have made it back as patches.
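
A first step towards the traffic replay suggested above could look like
the Python sketch below. It assumes the live site writes access logs in
the common/combined log format; the function names, the sample log lines
and the ramp numbers are all hypothetical. It only extracts GET paths and
plans concurrency levels, and leaves the actual request sending to
whatever load tool you use. POST requests are skipped on purpose, since
replaying writes usually needs valid sessions and form tokens.

```python
import re

# Matches the request part of a common/combined access-log line,
# e.g. "GET /index.php?option=com_content HTTP/1.1".
REQUEST_RE = re.compile(r'"(GET|POST|HEAD) (\S+) HTTP/[\d.]+"')


def extract_get_paths(log_lines):
    """Return the paths of GET requests, in the order they were logged."""
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(1) == "GET":
            paths.append(match.group(2))
    return paths


def ramp_plan(start_users, peak_users, steps):
    """Evenly spaced concurrency levels from start_users up to peak_users."""
    if steps < 2:
        return [peak_users]
    span = peak_users - start_users
    return [start_users + round(span * i / (steps - 1)) for i in range(steps)]


sample = [
    '10.0.0.1 - - [12/Jun/2012:12:18:43 +0000] "GET /index.php HTTP/1.1" 200 5120',
    '10.0.0.2 - - [12/Jun/2012:12:18:44 +0000] "POST /index.php?option=com_users HTTP/1.1" 303 0',
    '10.0.0.3 - - [12/Jun/2012:12:18:45 +0000] "GET /forum HTTP/1.1" 200 2048',
]
print(extract_get_paths(sample))  # ['/index.php', '/forum']
print(ramp_plan(10, 100, 4))      # [10, 40, 70, 100]
```

The old site's URLs still have to be mapped onto their Joomla
equivalents before replaying, which is the part that depends entirely on
your own content.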

Cheers,

Sam Moffatt
http://pasamio.id.au

[1] http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/

Russ Winter

Jun 17, 2012, 9:10:05 PM
to joomla-de...@googlegroups.com
If it's purely load testing you want to generate, I use the HP (ex-Mercury) LoadRunner and QTP packages to generate large amounts of different traffic from any chosen number of users (logged-in, backend and frontend simultaneously - and guest frontend users).

Gary Brooks

Jun 18, 2012, 11:00:52 AM
to joomla-de...@googlegroups.com
Great read Sam. 

Gary Brooks 

Markus Edholm

Jun 19, 2012, 9:05:16 AM
to joomla-de...@googlegroups.com
I haven't found out the exact number, but if you increase the original 8 groups for access levels to over 50, editing in the backend (articles, categories, global settings etc.) will time out during validation when you submit changes. I haven't found any solution for this "error" other than clicking "wait" in the popup and forcing the script to continue.

So there is a limit. At least for administration of sites with a large number of groups.

It would be interesting to know if any dev has thought about limiting groups or changing the way the forms are posted.

/Markus 

Richard

Jun 19, 2012, 10:09:21 AM
to joomla-de...@googlegroups.com
We ran into that as well. It locks up the browser. I think that is an issue with MooTools. If you disable that bit of JavaScript, it'll work fine.

Martingale

Jun 19, 2012, 4:44:04 PM
to joomla-de...@googlegroups.com
Hi Chris, I hope you do not mind if I ask you to post your question re multi-tenant in a separate thread (and ask others not to reply here).  This discussion is only about how scalable Joomla is and it is becoming a resource for folks (there is just very little on this subject elsewhere) and I want to ensure it stays on track.

elin

Jun 21, 2012, 9:02:28 PM
to joomla-de...@googlegroups.com
Actually one thing we knew from the beginning is that we would need some kind of pagination for access levels and groups (in the user manager). Probably the issue you are seeing could be somewhat addressed if someone took on rewriting that field to use a modal like the other fields that need pagination. I think the idea of searching through 50 items in a drop down to find one specific one is a usability problem, at least it would be for this user.
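
The pagination idea is simple to sketch. The Python below is only an illustration of paging through a long list of groups instead of rendering all of them in one field; the `paginate` helper, the page size of 20 and the 55 sample groups are assumptions for the example, not how the Joomla field works.

```python
import math


def paginate(items, page, per_page=20):
    """Return (page_items, total_pages) for a 1-based page number."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    total_pages = max(1, math.ceil(len(items) / per_page))
    page = min(max(page, 1), total_pages)  # clamp out-of-range requests
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages


groups = [f"Group {n}" for n in range(1, 56)]  # 55 hypothetical user groups
page_items, total_pages = paginate(groups, page=3)
print(total_pages, page_items[0], page_items[-1])  # 3 Group 41 Group 55
```

With a modal that loads one page at a time, the browser only has to render and validate a small slice of the groups on each request.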

Elin  