Slow loading for large (~70k articles) site

1,543 views
Skip to first unread message

Jindan Zhou

unread,
Jul 8, 2013, 3:57:08 PM7/8/13
to joomla-de...@googlegroups.com
Hi all,

My site has about 70k articles in about 10 categories, when I need to load the article list somewhere, say, category list in frontpage, latest article module, it takes long to load the page, usually in the 45+ seconds. My observation is that those pages need to query all rows in a category to compute the correct pagination (I am still a beginner in Joomla, so correct me if I am wrong).

I have also noticed that most of the query time are spent on building tmp tables, i.e., `Copying tmp tables to...`, so my immediate solution was to let MySQL to use tmpfs as its tmpdir, and that indeed improved, roughly the Joomla debug info shows:

Before tmpfs:
 Application 43.734 seconds (+43.653); 1.84 MB (+0.741) - afterDispatch
After tmpfs:
 Application 2.224 seconds (+2.137); 1.84 MB (+0.706) - afterDispatch

I believe if my assumption on pagination was true, then we could even improve the large site performance by change the way how we paginate listings, in my case, I think just have a link to the next page would suffice, that way we don't have to query all rows every time, am I right?

To back up my observations, in one of my category I have about 30k articles, when I click on a page number in the listing page, MySQL needs roughly 800M to over 1G for writing the tmp table, imaging all those files are written/read into/from physical hard drive, it explains very well why it takes so long to load that page.

Any thoughts?



Donald Gilbert

unread,
Jul 8, 2013, 4:05:10 PM7/8/13
to joomla-de...@googlegroups.com
Great post. Presenting a problem, discussing causes, and offering potential solutions - the community could use more of this. :thumbs up:

There are a lot of optimizations that can be done under the hood to speed things up in Joomla, but some of them may present changes to the public API of the code, which would break the backwards compatibility promise. I suppose that's probably why these issues haven't been fully addressed previously. Definitely something to look into as the cause of slow loading on large sites.






--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.
To post to this group, send an email to joomla-de...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-general.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Nikolaos K. Dionysopoulos

unread,
Jul 8, 2013, 4:16:20 PM7/8/13
to joomla-de...@googlegroups.com
Hello Jindan,

You are correct and this is something I had complained about years ago (around the 1.7 era, if I recall correctly). You can trace the problem down to JModelLegacy, namely the _getListCount method.

Here's what happens. JModelList::getPagination() needs to create a JPagination object. In order to do that it needs the total number of items. So it calls $this->getTotal(). In turn it does this:

// Load the total.
$query = $this->_getListQuery();
try
{
$total = (int) $this->_getListCount($query);
}
catch (RuntimeException $e)
{
$this->setError($e->getMessage());
return false;
}

Now take a look at _getListCount. What we're doing is asking MySQL to load AN ENTIRE TABLE (OK, filtered by access, but still!!!!) INTO THE MEMORY and then just tell us how many items it loaded.

Loading ~70k records through a multi-join query. As you all know, a JOIN operation is a matrix multiplication. Do the math. Hundreds of thousands to millions of rows loaded to end up with about ~70k entries. Mind. Blown.

The correct approach would be to provide a different query for the count. For example, if _getListQuery returns this:

SELECT d, i, s FROM #__cuckoo c INNER JOIN #__nest n WHERE n.bird_id = c.cukoo_id

we could have a _getListCountQuery which returns this:

SELECT COUNT(*) FROM #__cuckoo c INNER JOIN #__nest n WHERE n.bird_id = c.cukoo_id

And, yes, COUNT(*) is actually faster than COUNT(d). This is something I've already done in FOF, even when it would simply extend FOFModelLegacy.

If you guys dig up my old posts, from the 1.6 to 1.7 era, you'll see that I've already reported that. I was repeatedly told that I don't understand how MySQL works so I decided to stop fighting and eventually got to writing my own solution. At some point during that era I also write com_overload to create tons of test data in com_content to demonstrate my point, but I digress.

Anyway, here's why it happens and here's how to solve it. Anyone interested can write the trivial code to do it and do the easy (but plenty of) work to getting core components' code up to par.

PS: I believe that Eli and Beat have also said pretty much the same stuff over and over. It's not quite a new discovery, it's that nobody actually ever wanted to get it fixed. I would volunteer, but until the release of Joomla! 3.2 I will have my hands full with integrating the RAD layer (FOF) into the core. I don't want to overpromise and underdeliver, so I won't volunteer for this at all.

Cheers,

Nicholas K. Dionysopoulos

Jindan Zhou

unread,
Jul 8, 2013, 4:27:39 PM7/8/13
to joomla-de...@googlegroups.com
Thanks for the reply and explanation;-), I don't think I am qualified
(yet) to rewrite the pagination code, but if any developers need my
data for a test case, I don't mind at all to share them, except that
the main language of the data is in Chinese (should be a problem), the
data is in json, and I have a working code to load them info Joomla.

Drop me a line if interested.
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



--
.--- .. -. -.. .- -. --.. .... --- ..-

Nikolaos K. Dionysopoulos

unread,
Jul 8, 2013, 4:30:59 PM7/8/13
to joomla-de...@googlegroups.com
It goes without saying that I can provide my com_overload component to anyone interested. It can create a massive number (~300K) of com_content articles for debugging this sort of issues.

Nicholas K. Dionysopoulos

Amy Stephen

unread,
Jul 8, 2013, 4:39:39 PM7/8/13
to joomla-de...@googlegroups.com
+1 on Nic's post. (Although I would suggest adding a parameter to deactivate the scrolling entirely, or query for + or - 1 in order to use a much more performance friendly next prev instead).

In addition, it would be helpful to be able to deactivate the view access check (if it's all public view, why slow it down?).

Remove join to contact - I seriously doubt anyone with significant volume is using that - or if they are, they should stop. This should not have been done as there is a N:N relationship between articles and contacts. There is a hack to "fix" that - but the JOIN should come out - a plugin should be used for those who want this.

GROUP BY - WHY? Still have yet to hear anyone who understands why all select values are part of a group by. (I have many times heard that it must have something to do with PostgreSQL, but, I wonder)

If nothing else, being able to override the model more easily would help those trying to use Joomla with a lot of data. I've always had to override with any serious load.

It does not work, as it is, for anything over a 1,000 articles, or so. (if that) Just too slow.

Wonder what Fotis does? He builds some big sites.


On Mon, Jul 8, 2013 at 3:16 PM, Nikolaos K. Dionysopoulos <niko...@gmail.com> wrote:

Michael Babker

unread,
Jul 8, 2013, 4:49:11 PM7/8/13
to joomla-de...@googlegroups.com
The group by changes are indeed for PostgreSQL and SQL Server support.  Any suggestions on how to handle that at a higher level would be nice IMO, there's places we've made MySQL suffer to add other support I'm willing to bet.
--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.
To post to this group, send an email to joomla-de...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-general.
For more options, visit https://groups.google.com/groups/opt_out.
 
 


--
- Michael

Please pardon any errors, this message was sent from my iPhone.

Nikolaos K. Dionysopoulos

unread,
Jul 8, 2013, 4:52:30 PM7/8/13
to joomla-de...@googlegroups.com
Hi Amy,

+1 on Nic's post. (Although I would suggest adding a parameter to deactivate the scrolling entirely, or query for + or - 1 in order to use a much more performance friendly next prev instead).

I've thought of that and toyed around with the idea. It's easy to not allow scrolling before page #1. But it's really hard figuring out when you're in the last page. I tried applying a rule like "if we show 20 items and there are less than 20 items shown on the page we're on the last page". There's one small issue. What if the number of records is divisible by the number of items per page, e.g. exactly 400 items with 20 shown per page? Bummer :(

In addition, it would be helpful to be able to deactivate the view access check (if it's all public view, why slow it down?).

That would be a great thing to have, yes. With a big fat warning to prevent an accidental information leak.

Remove join to contact - I seriously doubt anyone with significant volume is using that - or if they are, they should stop.

+1

This should not have been done as there is a N:N relationship between articles and contacts. There is a hack to "fix" that - but the JOIN should come out - a plugin should be used for those who want this.

There is actually a way to speed up the join without an N:N glue table. If both columns are indexed and present in the WHERE clause in the right order MySQL will optimise data access. Of course getting rid of the contact information if it's not really used is better.

GROUP BY - WHY? Still have yet to hear anyone who understands why all select values are part of a group by. (I have many times heard that it must have something to do with PostgreSQL, but, I wonder)

I'll pass on that as I am a novice on MS SQL Server and PostgreSQL.

If nothing else, being able to override the model more easily would help those trying to use Joomla with a lot of data. I've always had to override with any serious load.

Well, that's one way, but I wouldn't really recommend it. I can imagine several misguided developers screwing up the model and Joomla! (or other innocent developers) taking the flak. Besides, we should really optimise the core queries. More so when we are talking about Joomla! in the enterprise and stuff like that. Not to mention that every site would benefit from a speed up, even small sites on lowest end servers (or should I say especially those?).

It does not work, as it is, for anything over a 1,000 articles, or so. (if that) Just too slow.

Given obscene amounts of RAM and SSDs it does. But, well, it's a ridiculous way to deal with it.

Wonder what Fotis does? He builds some big sites.

He uses K2 and boasts about how K2 can handle hundreds of thousands of articles with millions of comments when Joomla! struggles with a few hundred articles. That's not a conjecture, I've talked to him about this subject in many occasions.

Denis Ryabov

unread,
Jul 8, 2013, 5:36:02 PM7/8/13
to joomla-de...@googlegroups.com
What about to override getTotal() in components/com_content/models/articles.php (and category.php):

public function getTotal()
{
$store = $this->getStoreId('getTotal');
if (isset($this->cache[$store]))
return $this->cache[$store];

$query = $this->_getListQuery();
try {
$query->clear('select')->select('COUNT(*)')->clear('group')->clear('order');
$total = $this->_db->setQuery($query)->loadResult();
} catch (RuntimeException $e) {
$this->setError($e->getMessage());
return false;
}

$this->cache[$store] = $total;
return $this->cache[$store];
}

PS. I'm not sure about COUNT(*), maybe COUNT(DISTINCT a.id) is more correct because of grouping is removed.


09.07.2013, 00:27, "Jindan Zhou" <jin...@gmail.com>:
С уважением,
Денис Рябов

Jindan Zhou

unread,
Jul 9, 2013, 11:59:57 AM7/9/13
to joomla-de...@googlegroups.com
So I was going to give Denis' code a shot today, but I couldn't locate the method in the path, nor in the administrator's folder.

Per Nic's comment, my tmpfs approach would soon become infeasible as I am adding hundreds more article per day and my site start to generate traffic.

I have also experimented putting a single article on frontpage, then just the prev and next links under the article (does not make sense if you have 20~30k articles in a category;-), I feel even in that case, showing next page still takes long, does that still load the entire table as Nic described?

Thanks,
>>  email to joomla-dev-general+unsub...@googlegroups.com.
>>
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>>
>>  --
>>  You received this message because you are subscribed to a topic in the
>>  Google Groups "Joomla! General Development" group.
>>  To unsubscribe from this topic, visit
>>  https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
>>  To unsubscribe from this group and all of its topics, send an email to
>>  joomla-dev-general+unsub...@googlegroups.com.
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>
> --
> .--- .. -. -.. .- -.    --.. .... --- ..-
>
> --
> You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

Denis Ryabov

unread,
Jul 9, 2013, 12:26:59 PM7/9/13
to joomla-de...@googlegroups.com
I mean to add implementation of getTotal() method at the end of ContentModelArticles class (this method is not contained in that class, but it is inherited from JModelList). The idea is to remove GROUP/ORDER clauses and COUNT(*) from original query to speed up get of total articles (it's not ideal solution, but better than nothing).
 
09.07.2013, 20:00, "Jindan Zhou" <jin...@gmail.com>:
>>  email to joomla-dev-gene...@googlegroups.com.
>>
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>>
>>  --
>>  You received this message because you are subscribed to a topic in the
>>  Google Groups "Joomla! General Development" group.
>>  To unsubscribe from this topic, visit
>>  https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
>>  To unsubscribe from this group and all of its topics, send an email to
>>  joomla-dev-gene...@googlegroups.com.
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>
> --
> .--- .. -. -.. .- -.    --.. .... --- ..-
>
> --
> You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.

--
С уважением,
Денис Рябов

 

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

To post to this group, send an email to joomla-de...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-general.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jindan Zhou

unread,
Jul 9, 2013, 2:06:39 PM7/9/13
to joomla-de...@googlegroups.com
Thanks! I tried it both way (* and distinct a.id), intuitive they don't seem to help, although I didn't profile the scripts.
>>  email to joomla-dev-general+unsub...@googlegroups.com.
>>
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>>
>>  --
>>  You received this message because you are subscribed to a topic in the
>>  Google Groups "Joomla! General Development" group.
>>  To unsubscribe from this topic, visit
>>  https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
>>  To unsubscribe from this group and all of its topics, send an email to
>>  joomla-dev-general+unsub...@googlegroups.com.
>>  To post to this group, send an email to joomla-de...@googlegroups.com.
>>  Visit this group at http://groups.google.com/group/joomla-dev-general.
>>  For more options, visit https://groups.google.com/groups/opt_out.
>
> --
> .--- .. -. -.. .- -.    --.. .... --- ..-
>
> --
> You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.

--
С уважением,
Денис Рябов

 

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

To post to this group, send an email to joomla-de...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-general.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Beat

unread,
Jul 10, 2013, 12:20:42 PM7/10/13
to joomla-de...@googlegroups.com

This has been solved in latest GitHub version for Joomla 3.1.2. (using COUNT instead of loading all)

Patch by Matias (Thanks Matias) in FR:
 # [#31278] Improve counting of results in queries. Thanks Matias Griese (Fix #1274):

https://github.com/joomla/joomla-cms/pull/1274

You can either apply patch, or get github version (not for live site, which also includes the SQL query profiler that I added with Peter), or wait for the upcoming Joomla 3.1.2 release.

Best Regards,
Beat
http://www.joomlapolis.com/

Nikolaos K. Dionysopoulos

unread,
Jul 10, 2013, 12:22:43 PM7/10/13
to joomla-de...@googlegroups.com
Awesome! I missed that update :)

Nicholas K. Dionysopoulos

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Gary Mort

unread,
Jul 10, 2013, 9:57:03 PM7/10/13
to joomla-de...@googlegroups.com


On Monday, July 8, 2013 4:39:39 PM UTC-4, Amy Stephen wrote:
+1 on Nic's post. (Although I would suggest adding a parameter to deactivate the scrolling entirely, or query for + or - 1 in order to use a much more performance friendly next prev instead).


You can do that in a view override.  Disable pagination entirely[so there will never be any checks for number of rows] and set your limit to 1 more then your true limit.

Ie if you want to show 20 articles at a time, set the limit to 21.

If limitstart = 0, then don't show the prev button.
If count($items) < 21, then don't show the next button
In your for loop for displaying the items, use some sort of count so that when you reach the 21st item[if you do] it is not displayed.

Since the view layout is building the next button internally, it is a simple matter to use $nextStart = $limitstart + $limit -1 and $prevStart = $limitstart - $limit +1

I think this is a much better solution than a bunch of options because this sort of thing[optimizing for thousands or articles] is going to very site dependent.  Joomla! provides a sane Pagination strategy for /most/ websites.  Websites that are very large can decide just how they want to display data in their view override and provide alternate methods.

For example, using this strategy you also have the option of providing partial pagination.  For example, if you want to show 20 articles per page, with up to 3 pages on either side, then set your true limit to 100.  Again, you only display up to the first 20 results.  After that, you can use some the div and mod operations to determine how many pages[maximum of 3 of "next" results to display links for.  The same div and mod operations applied to limitstart gives you the number of previous pages.

Ie 
If limitstart is 140, then you know that there are at least 7 pages
If your result set is 56 then you know that there 2 additional pages
Pagination would this be:  4 5 6 7 8 9 


It's not that hard to optimize by disabling Pagination and just rolling your own.

Gary Mort

unread,
Jul 10, 2013, 9:58:57 PM7/10/13
to joomla-de...@googlegroups.com


On Monday, July 8, 2013 4:52:30 PM UTC-4, Nicholas Dionysopoulos wrote:
Hi Amy,

+1 on Nic's post. (Although I would suggest adding a parameter to deactivate the scrolling entirely, or query for + or - 1 in order to use a much more performance friendly next prev instead).

I've thought of that and toyed around with the idea. It's easy to not allow scrolling before page #1. But it's really hard figuring out when you're in the last page. I tried applying a rule like "if we show 20 items and there are less than 20 items shown on the page we're on the last page". There's one small issue. What if the number of records is divisible by the number of items per page, e.g. exactly 400 items with 20 shown per page? Bummer :(


Set the limit to 1 more then you actually want and use a view layout override to suppress the last row from displaying if the number of rows equals the limit.
 

Jindan Zhou

unread,
Jul 11, 2013, 12:27:19 AM7/11/13
to joomla-de...@googlegroups.com
Once again I tried both way (apply the patch and install latest git version), they don't seem to help a lot: the site still takes 1 minute or so (without caching) to load each article listings page, and I'm still seeing about 1GB of mysql tmp files being written and read, besides the pagination area varnished;-) didn't have time to investigate that yet.
I've also tried to load my dataset into a fresh install of Wordpress, the load time for post listings are much, much faster, I wish in Joomla's next major release we can see this improve for large site...

Allon Moritz

unread,
Jul 11, 2013, 12:46:39 AM7/11/13
to joomla-de...@googlegroups.com

Large tmp files are mostly created trough an order by....do you see one in the query. Sorry but sending this from my phone....

On Jul 11, 2013 6:27 AM, "Jindan Zhou" <jin...@gmail.com> wrote:
Once again I tried both way (apply the patch and install latest git version), they don't seem to help a lot: the site still takes 1 minute or so (without caching) to load each article listings page, and I'm still seeing about 1GB of mysql tmp files being written and read, besides the pagination area varnished;-) didn't have time to investigate that yet.
I've also tried to load my dataset into a fresh install of Wordpress, the load time for post listings are much, much faster, I wish in Joomla's next major release we can see this improve for large site...

--

Amy Stephen

unread,
Jul 11, 2013, 2:16:54 AM7/11/13
to joomla-de...@googlegroups.com
Can you turn the profiler on in the admin and then share those queries?

Seems like there are three list queries - one for the links (prev, next, current set), one for the total, and then the third for the actual query.

I'd like to see the profiler results tho.

And, yea, there are order by - and group by - that will create a nice temporary file.

Gary - that's exactly what I meant by a better way to do pagination for large result sets +1.


On Wed, Jul 10, 2013 at 11:27 PM, Jindan Zhou <jin...@gmail.com> wrote:
Once again I tried both way (apply the patch and install latest git version), they don't seem to help a lot: the site still takes 1 minute or so (without caching) to load each article listings page, and I'm still seeing about 1GB of mysql tmp files being written and read, besides the pagination area varnished;-) didn't have time to investigate that yet.
I've also tried to load my dataset into a fresh install of Wordpress, the load time for post listings are much, much faster, I wish in Joomla's next major release we can see this improve for large site...

Gary Mort

unread,
Jul 11, 2013, 9:07:59 AM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 2:16:54 AM UTC-4, Amy Stephen wrote:

Gary - that's exactly what I meant by a better way to do pagination for large result sets +1.


I figured as much, my main point which probably was buried in verbosity is that it doesn't require changing the model or overriding the model to implement - it can be pushed into a layout override for the template by turning the Joomla pagination off[which I believe is possible in 3 places, globally for the entire site, for specific categories via category options, and for specific menu items via menu options - so it is possible to target disabling pagination as precisely as one wishes]

Jindan Zhou

unread,
Jul 11, 2013, 9:47:08 AM7/11/13
to joomla-de...@googlegroups.com
Amy, I put the full debug info on pastebin:
http://pastebin.com/Qnrkzzn2 (Beat's git install)
http://pastebin.com/i90WPKmG (Vanilla 3.1.1)

Also I see 2 duplicate queries in the git install, have not read into that yet;-) Thanks,

Short version:
Profile Information (After Beat's patch):

Time

Memory

Time: 0.1 ms / 0.1 ms Memory: 0.182 MB / 0.18 MB Application: afterLoad
Time: 64.0 ms / 64.0 ms Memory: 0.775 MB / 0.96 MB Application: afterInitialise
Time: 28.0 ms / 92.1 ms Memory: 0.351 MB / 1.31 MB Application: afterRoute
Time: 9573.4 ms / 9665.5 ms Memory: 1.930 MB / 3.24 MB Application: afterDispatch
Time: 8.5 ms / 9674.0 ms Memory: 0.064 MB / 3.30 MB Application: beforeRenderModule mod_articles_latest (Latest Article)
Time: 4470.5 ms / 14144.5 ms Memory: 0.084 MB / 3.39 MB Application: afterRenderModule mod_articles_latest (Latest Article)
Time: 0.2 ms / 14144.7 ms Memory: 0.000 MB / 3.38 MB Application: beforeRenderModule mod_login (Login Form)
Time: 7.9 ms / 14152.6 ms Memory: 0.051 MB / 3.43 MB Application: afterRenderModule mod_login (Login Form)
Time: 0.4 ms / 14153.0 ms Memory: 0.001 MB / 3.43 MB Application: beforeRenderModule mod_breadcrumbs (Breadcrumbs)
Time: 3.7 ms / 14156.7 ms Memory: 0.019 MB / 3.45 MB Application: afterRenderModule mod_breadcrumbs (Breadcrumbs)
Time: 0.7 ms / 14157.3 ms Memory: 0.000 MB / 3.45 MB Application: beforeRenderModule mod_menu (Main Menu)
Time: 6.6 ms / 14163.9 ms Memory: 0.046 MB / 3.50 MB Application: afterRenderModule mod_menu (Main Menu)
Time: 4.5 ms / 14168.5 ms Memory: 0.018 MB / 3.52 MB Application: afterRender

Database queries total: 13977.9 ms


Application 0.000 seconds (+0.000); 0.19 MB (+0.185) - afterLoad
Application 0.059 seconds (+0.059); 0.92 MB (+0.731) - afterInitialise
Application 0.079 seconds (+0.019); 1.10 MB (+0.181) - afterRoute
Application 11.155 seconds (+11.076); 1.84 MB (+0.744) - afterDispatch
Application 11.161 seconds (+0.006); 1.94 MB (+0.098) - beforeRenderModule mod_articles_latest (Latest Article)
Application 15.597 seconds (+4.436); 1.96 MB (+0.023) - afterRenderModule mod_articles_latest (Latest Article)
Application 15.597 seconds (+0.000); 1.96 MB (-0.005) - beforeRenderModule mod_login (Login Form)
Application 15.603 seconds (+0.006); 1.97 MB (+0.014) - afterRenderModule mod_login (Login Form)
Application 15.603 seconds (+0.000); 1.97 MB (+0.001) - beforeRenderModule mod_breadcrumbs (Breadcrumbs)
Application 15.606 seconds (+0.003); 1.98 MB (+0.008) - afterRenderModule mod_breadcrumbs (Breadcrumbs)
Application 15.607 seconds (+0.001); 1.98 MB (-0.001) - beforeRenderModule mod_menu (Main Menu)
Application 15.618 seconds (+0.011); 2.07 MB (+0.092) - afterRenderModule mod_menu (Main Menu)
Application 15.623 seconds (+0.005); 2.09 MB (+0.020) - afterRender


On Thursday, July 11, 2013 1:16:54 AM UTC-5, Amy Stephen wrote:
Can you turn the profiler on in the admin and then share those queries?

Seems like there are three list queries - one for the links (prev, next, current set), one for the total, and then the third for the actual query.

I'd like to see the profiler results tho.

And, yea, there are order by - and group by - that will create a nice temporary file.

Gary - that's exactly what I meant by a better way to do pagination for large result sets +1.
On Wed, Jul 10, 2013 at 11:27 PM, Jindan Zhou <jin...@gmail.com> wrote:
Once again I tried both way (apply the patch and install latest git version), they don't seem to help a lot: the site still takes 1 minute or so (without caching) to load each article listings page, and I'm still seeing about 1GB of mysql tmp files being written and read, besides the pagination area varnished;-) didn't have time to investigate that yet.
I've also tried to load my dataset into a fresh install of Wordpress, the load time for post listings are much, much faster, I wish in Joomla's next major release we can see this improve for large site...

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

Amy Stephen

unread,
Jul 11, 2013, 11:05:07 AM7/11/13
to joomla-de...@googlegroups.com
It is not good. I do not like it. It is bad. It makes me sad.

Can you rerun the comparisons, both using the same debug options? Trying to figure out why the one with Beat's fix has 95 queries while the current core one has 34.

I believe you have different debug options on -- if possible -- go with the options in place for the current core dump.

Also, I'll give you two queries I would like you to run in phpMyAdmin with stats on -- I'd like to see what that contact table join is costing.

Won't say more on that until we know the cost.

Be a few minutes.




To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Jindan Zhou

unread,
Jul 11, 2013, 11:57:17 AM7/11/13
to joomla-de...@googlegroups.com
Other than on Administration -> System -> Global Configuration -> System Tab: Debug System=>On; Debug Language => Off, where else do I set debug options?

The configuration.php file is copied from vanilla installation folder to git's, so they are identical. Only extension that I have installed on current core (not on git version) is Kunena forum. Didn't touch php.ini, just changed root dir. in nginx.conf to run the debug.

Thanks,
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsubscribe@googlegroups.com.
To post to this group, send an email to joomla-de...@googlegroups.com.

Amy Stephen

unread,
Jul 11, 2013, 12:05:47 PM7/11/13
to joomla-de...@googlegroups.com
Hm. That's weird.  We'll figure that out before we're done -- Beat's output has three times the queries so something is different. The output looks different, too, more detail and it is formatted differently. Do you agree?

++++

Here are two queries - if you can run them in something like phpMyAdmin - and get performance stats for both - we can see what the cost is for the contact join.

Core - no changes

https://gist.github.com/AmyStephen/5975740

Core - removing contact join

https://gist.github.com/AmyStephen/5976751


To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Gary Mort

unread,
Jul 11, 2013, 12:38:12 PM7/11/13
to joomla-de...@googlegroups.com
Just for fun, try replacing
components\com_content\models\articles.php

With the following file as a test: https://gist.github.com/garyamort/5976972


It eliminates most of the joins when getting the list of articles[which is NOT GOOD for general usage but from looking at your queries would not affect this particular use case].

Instead it only looks up the article ID's that match your filters[but skips over a bunch of category checks for published categories and archived categories, which is why it is not good].

It then attempts to use the article model to build the list of article data by pulling in each id.  My hope here is that because each join is a much smaller amount of data, it should work much more quickly despite exchanging 1 query with 70,001 queries.

This is not a "solution" for the problem, but more a stab at figuring out if this path could be extended to provide a true solution.

Jindan Zhou

unread,
Jul 11, 2013, 12:44:59 PM7/11/13
to joomla-de...@googlegroups.com
Shame on me, ran two queries in no time, but have been trying to find
performance statistics for the query (phpmyadmin) ever since... so if
you could let me know....;-)
>>>>>> an email to joomla-dev-gene...@googlegroups.com.
>>>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>>>>
>>>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Joomla! General Development" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to joomla-dev-gene...@googlegroups.com.
>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>>
>>>
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Joomla! General Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to joomla-dev-gene...@googlegroups.com.
>> To post to this group, send an email to
>> joomla-de...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Jindan Zhou

unread,
Jul 11, 2013, 12:57:01 PM7/11/13
to joomla-de...@googlegroups.com
Gary thanks!

Your articles.php did not pull articles from the category: `There are
no articles in this category. If subcategories display on this page,
they may contain articles.`


Profile Information
Application 0.000 seconds (+0.000); 0.19 MB (+0.185) - afterLoad
Application 0.055 seconds (+0.055); 0.92 MB (+0.731) - afterInitialise
Application 0.065 seconds (+0.010); 1.10 MB (+0.181) - afterRoute
Application 5.872 seconds (+5.808); 1.53 MB (+0.431) - afterDispatch
Application 5.879 seconds (+0.007); 1.63 MB (+0.104) -
beforeRenderModule mod_articles_latest (Latest Article)
Application 10.427 seconds (+4.548); 1.65 MB (+0.019) -
afterRenderModule mod_articles_latest (Latest Article)
Application 10.428 seconds (+0.000); 1.65 MB (-0.005) -
beforeRenderModule mod_login (Login Form)
Application 10.434 seconds (+0.006); 1.66 MB (+0.014) -
afterRenderModule mod_login (Login Form)
Application 10.434 seconds (+0.000); 1.66 MB (+0.001) -
beforeRenderModule mod_breadcrumbs (Breadcrumbs)
Application 10.437 seconds (+0.003); 1.67 MB (+0.009) -
afterRenderModule mod_breadcrumbs (Breadcrumbs)
Application 10.438 seconds (+0.001); 1.67 MB (-0.001) -
beforeRenderModule mod_menu (Main Menu)
Application 10.448 seconds (+0.011); 1.77 MB (+0.099) -
afterRenderModule mod_menu (Main Menu)
Application 10.451 seconds (+0.003); 1.78 MB (+0.011) - afterRender
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Jindan Zhou

unread,
Jul 11, 2013, 1:02:38 PM7/11/13
to joomla-de...@googlegroups.com
BTW, also Beat mentioned that git version` ...which also includes the
SQL query profiler that I added with Peter)`, that probably explains
something, I should've ran comparison just on the patched legacy.php,
not on git version, hang on few hours...

On Thu, Jul 11, 2013 at 11:05 AM, Amy Stephen <amyst...@gmail.com> wrote:
>>>>>> an email to joomla-dev-gene...@googlegroups.com.
>>>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>>>>
>>>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Joomla! General Development" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to joomla-dev-gene...@googlegroups.com.
>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>>
>>>
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Joomla! General Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to joomla-dev-gene...@googlegroups.com.
>> To post to this group, send an email to
>> joomla-de...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Gary Mort

unread,
Jul 11, 2013, 1:24:34 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 12:57:01 PM UTC-4, Jindan Zhou wrote:
Gary thanks!

Your articles.php did not pull articles from the category: `There are
no articles in this category. If subcategories display on this page,
they may contain articles.`


Just to clarify:
1) You viewing the homepage/frontpage right?
2) Are you logged on or accessing as a guest user?  Since it worked as a guest user and there was a bug when logged on as super admin, I'm guessing you were logged on?

In any case, this gist fixes the bug:

If you can dump a copy of the queries it would be most excellent.  At the very least now you know that part of that big time jump is coming from somewhere in getItems. :-)

Jindan Zhou

unread,
Jul 11, 2013, 1:35:24 PM7/11/13
to joomla-de...@googlegroups.com
Here's the new comparison:
With Beat's patched legacy.php: http://pastebin.com/v6m0KyJz
Original legacy.php (shipped with Joomla 3.1.1): http://pastebin.com/ABrEk3S4

Jindan Zhou

unread,
Jul 11, 2013, 1:42:21 PM7/11/13
to joomla-de...@googlegroups.com
1) You viewing the homepage/frontpage right?
Yes, on frontpage. Home menu is set to display category list.
2) Are you logged on or accessing as a guest user?  Since it worked as a guest user and there was a bug when logged on as super admin, I'm guessing you were logged on?
Yes, accessing as guest. Also line #835 gave me some trouble;-) needed to use linux style (forward) slash `/`

Amy Stephen

unread,
Jul 11, 2013, 2:34:28 PM7/11/13
to joomla-de...@googlegroups.com
Gary - I'm just taking it one step at a time, removing some of the more likely problems (which I can retrieve the data in a less intensive way), get some good stats on exact changes. We definitely can get speed removing all the joins but that doesn't do us much good in the end -- that damned query needs to be fixed!


On Thu, Jul 11, 2013 at 11:38 AM, Gary Mort <jooml...@gary.mort.net> wrote:

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Amy Stephen

unread,
Jul 11, 2013, 2:39:17 PM7/11/13
to joomla-de...@googlegroups.com
Thanks - will review. Am looking for instructions on configuration needed to get query stats - can't seem to quickly find either. Will respond soon.


To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Gary Mort

unread,
Jul 11, 2013, 2:40:48 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 1:42:21 PM UTC-4, Jindan Zhou wrote:
1) You viewing the homepage/frontpage right?
Yes, on frontpage. Home menu is set to display category list.


Ahh!!  Yeah, that code won't work at all!  I was using the featured/frontpage view not the Category List.

Here is a new gist tweaked so it works with Category list:
 
2) Are you logged on or accessing as a guest user?  Since it worked as a guest user and there was a bug when logged on as super admin, I'm guessing you were logged on?
Yes, accessing as guest. Also line #835 gave me some trouble;-) needed to use linux style (forward) slash `/`


Yeah, that was an ugly hack because I couldn't remember how to load an extra model from within a model properly.  Now it's fixed since the Category model actually has some code to do this already in order to leverage the Articles model.   Give it a stab again and let me know what happens.

Jindan Zhou

unread,
Jul 11, 2013, 2:41:18 PM7/11/13
to joomla-de...@googlegroups.com
i am good at command line, too, so if you have syntax on mysql shell,
that shall do. asked in #mysql, found i don't even have voice there;-(
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Gary Mort

unread,
Jul 11, 2013, 2:56:39 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 2:34:28 PM UTC-4, Amy Stephen wrote:
Gary - I'm just taking it one step at a time, removing some of the more likely problems (which I can retrieve the data in a less intensive way), get some good stats on exact changes. We definitely can get speed removing all the joins but that doesn't do us much good in the end -- that damned query needs to be fixed!

My own thinking is to fix the process rather than just the query.

On his page, as an example, he actually has 2 calls to the monstrous getList - one for the category list and then one for the latest list.  Because of the way it is processed, there is very little chance for any caching of the results by Joomla or by MySQL[mysql has a built in query cache].

If the process is actually broken into two stages, first get all the id's that match the selection criteria - followed by an individual query to get each item - then Joomla has the possibility of caching each of the individual return values - so the first time it has to run 70,000 queries, but for subsequent pulls it only needs to run say 100 as most of the data has not changed.  

Even if Joomla! doesn't cache the results, MySQL itself has a good chance to cache the query results.

Finally, recall that there are a number of massive temporary tables currently being built due to the joins.  While the joins are still there in the getItem for the Article model, because those are being filtered down to single rows by the article ID I'm guessing that with a large number of records it will actually perform much better running in this manner.

Now for the downside of this solution: By skipping the join into the Category table I think there will be a signifigant performance boost.  The downside though is that the number of results[the getListCount function] will only be an estimate.  IE there might be 10,000 results initially but after running the data through getItem there would only be 8967 results - so it would also require a shift in thinking on pagination.  IE don't say X results, just say "over 1000 results".  Don't display all the pages, just the forward/back buttons.  Not a big deal.


I think the query itself can't really be "fixed" in a one size fits all manner.  IE what works best for 100 records will be different than best for 1000 and best for 10,000 - so better to have a big dataset and small dataset path with different ways of behaving then one path where one type of website is penalized performance wise to cater to others[sort of how linux has slowly been degrading performance on desktops in order to accommodate high performance on large cpu core servers]. 

Jindan Zhou

unread,
Jul 11, 2013, 2:59:13 PM7/11/13
to joomla-de...@googlegroups.com
Gary's new tweaked articles.php (the results are too cryptic to me;-):
http://pastebin.com/FP83RPKv

Gary Mort

unread,
Jul 11, 2013, 3:04:26 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 2:41:18 PM UTC-4, Jindan Zhou wrote:
i am good at command line, too, so if you have syntax on mysql shell,
that shall do. asked in #mysql, found i don't even have voice there;-(

You could try:
SET GLOBAL general_log = 'ON';

That will log some basic information on every query being run, giving you an idea of where the queries are hitting issues.  Then with the slower queries, from the mysql command line, first do a 'use database <joomladb>;' followed up by 'explain <query>;' and you might spot some parts that can be tuned. 

Jindan Zhou

unread,
Jul 11, 2013, 3:21:18 PM7/11/13
to joomla-de...@googlegroups.com
When using explain, should I revise the query somehow? Result for Core - no changes

mysql> explain  SELECT a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, CASE WHEN a.modified = '0000-00-00 00:00:00' THEN a.created ELSE a.modified END AS modified, a.modified_by, uam.name AS modified_by_name, CASE WHEN a.publish_up = '0000-00-00 00:00:00' THEN a.created ELSE a.publish_up END AS publish_up,a.publish_down, a.images, a.urls, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, LENGTH(a.fulltext) AS readmore, CASE WHEN badcats.id IS NOT NULL THEN 0 ELSE a.state END AS state,c.title AS category_title, c.path AS category_route, c.access AS category_access, c.alias AS category_alias, CASE WHEN a.created_by_alias > ' ' THEN a.created_by_alias ELSE ua.name END AS author,ua.email AS author_email, parent.title AS parent_title, parent.id AS parent_id, parent.path AS parent_route, parent.alias AS parent_alias,ROUND(v.rating_sum / v.rating_count, 0) AS rating, v.rating_count AS rating_count,c.published, CASE WHEN badcats.id IS NULL THEN c.published ELSE 0 END AS parents_published FROM w0z9v_content AS a LEFT JOIN w0z9v_content_frontpage AS fp ON fp.content_id = a.id LEFT JOIN w0z9v_categories AS c ON c.id = a.catid LEFT JOIN w0z9v_users AS ua ON ua.id = a.created_by LEFT JOIN w0z9v_users AS uam ON uam.id = a.modified_by  LEFT JOIN w0z9v_categories AS parent ON parent.id = c.parent_id LEFT JOIN w0z9v_content_rating AS v ON a.id = v.content_id LEFT OUTER JOIN (SELECT cat.id AS id FROM w0z9v_categories AS cat JOIN w0z9v_categories AS parent ON cat.lft BETWEEN parent.lft AND parent.rgt WHERE parent.extension = 'com_content' AND parent.published != 1 GROUP BY cat.id ) AS badcats ON badcats.id = c.id WHERE a.access IN (1,1,5) AND c.access IN (1,1,5) AND CASE WHEN badcats.id IS NULL THEN a.state ELSE 0 END = 1 AND a.catid = 8 AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2013-07-11 13:29:10') AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2013-07-11 13:29:10') GROUP BY a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, a.created, a.modified, a.modified_by, uam.name, a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down, badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name, ua.email, parent.title, parent.id, parent.path, parent.alias, v.rating_sum, v.rating_count, c.published, c.lft, a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls ORDER BY c.lft, CASE WHEN a.publish_up = '0000-00-00 00:00:00' THEN a.created ELSE a.publish_up END DESC , a.created LIMIT 0, 10
    -> ;
+----+-------------+------------+--------+------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
| id | select_type | table      | type   | possible_keys          | key            | key_len | ref               | rows  | Extra                                        |
+----+-------------+------------+--------+------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
|  1 | PRIMARY     | <derived2> | system | NULL                   | NULL           | NULL    | NULL              |     0 | const row not found                          |
|  1 | PRIMARY     | c          | const  | PRIMARY,idx_access     | PRIMARY        | 4       | const             |     1 | Using temporary; Using filesort              |
|  1 | PRIMARY     | a          | ref    | idx_access,idx_catid   | idx_catid      | 4       | const             | 33260 | Using where                                  |
|  1 | PRIMARY     | fp         | eq_ref | PRIMARY                | PRIMARY        | 4       | m31.a.id          |     1 |                                              |
|  1 | PRIMARY     | ua         | eq_ref | PRIMARY                | PRIMARY        | 4       | m31.a.created_by  |     1 |                                              |
|  1 | PRIMARY     | uam        | eq_ref | PRIMARY                | PRIMARY        | 4       | m31.a.modified_by |     1 |                                              |
|  1 | PRIMARY     | parent     | eq_ref | PRIMARY                | PRIMARY        | 4       | const             |     1 |                                              |
|  1 | PRIMARY     | v          | eq_ref | PRIMARY                | PRIMARY        | 4       | m31.a.id          |     1 |                                              |
|  2 | DERIVED     | parent     | range  | cat_idx,idx_left_right | cat_idx        | 153     | NULL              |     2 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | cat        | index  | idx_left_right         | idx_left_right | 8       | NULL              |    13 | Using where; Using index; Using join buffer  |
+----+-------------+------------+--------+------------------------+----------------+---------+-------------------+-------+----------------------------------------------+
10 rows in set (0.00 sec)

Amy Stephen

unread,
Jul 11, 2013, 3:24:24 PM7/11/13
to joomla-de...@googlegroups.com
Are you good at github? I forked the cms, went ahead and made that change to the query - you can clone it https://github.com/AmyStephen/joomla-cms

I was afraid the change for the count might not make a big difference.  Looks like might have gained a second out of 13 seconds. Gonna have to dig in a little deeper.


If you are able, give this a  test and see if it's noticeable or not. There are a number of fixes we can try - but want to try this one first.

Amy Stephen

unread,
Jul 11, 2013, 3:36:42 PM7/11/13
to joomla-de...@googlegroups.com
Jindan -

If github is a problem, you can use this https://gist.github.com/AmyStephen/5978524

Just replace the entire file located at components/com_content/articles.php with that gist. (Meaning, just copy it into the existing file - and overwrite what is there.)

Amy

Jindan Zhou

unread,
Jul 11, 2013, 3:42:45 PM7/11/13
to joomla-de...@googlegroups.com
Did:
1. git clone https://github.com/AmyStephen/joomla-cms
2. cp ../htdocs/configuration.php joomla-cms/.
3. point nginx.conf root to joomla-cms, and restart nginx
4. here's the profile: http://pastebin.com/SwVEh3pv

Am I missing something?;-)

>>>>>>>> To post to this group, send an email to
>>>>>>>> joomla-de...@googlegroups.com.
>>>>>>>>
>>>>>>>> Visit this group at
>>>>>>>> http://groups.google.com/group/joomla-dev-general.
>>>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Joomla! General Development" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send

>>>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Joomla! General Development" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send

>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>>
>>>
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Joomla! General Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an

>> To post to this group, send an email to
>> joomla-de...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to

> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



--
.--- .. -. -.. .- -.    --.. .... --- ..-

--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

Amy Stephen

unread,
Jul 11, 2013, 4:02:53 PM7/11/13
to joomla-de...@googlegroups.com
See how different that looks again? And - like before - there are 107 queries. (Instead of 33-ish).

Wonder - is it just the first time it ran? Maybe try it again and see if that brings the queries down?

I'll look at the detail more.

Gary - just saw your response - missed it - give me a couple of minutes to do this then read/response/tahnks!


To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Jindan Zhou

unread,
Jul 11, 2013, 4:06:01 PM7/11/13
to joomla-de...@googlegroups.com
Did run few times before sending the profile, true for all profiles sent.
>>>> > --
>>>> > You received this message because you are subscribed to a topic in the
>>>> > Google Groups "Joomla! General Development" group.
>>>> > To unsubscribe from this topic, visit
>>>> >
>>>> > https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
>>>> > To unsubscribe from this group and all of its topics, send an email to
>>>> > joomla-dev-gene...@googlegroups.com.
>>>> > To post to this group, send an email to joomla-de...@googlegroups.com.
>>>>
>>>> > Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>> > For more options, visit https://groups.google.com/groups/opt_out.
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> .--- .. -. -.. .- -. --.. .... --- ..-
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Joomla! General Development" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to joomla-dev-gene...@googlegroups.com.
>>>> To post to this group, send an email to joomla-de...@googlegroups.com.
>>>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>>
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Joomla! General Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to joomla-dev-gene...@googlegroups.com.
>> To post to this group, send an email to
>> joomla-de...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/joomla-dev-general.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Amy Stephen

unread,
Jul 11, 2013, 4:08:58 PM7/11/13
to joomla-de...@googlegroups.com
i think it's the explain stuff that is bumping those queries way high.

Here's your latest test - on my fork

DATABASE queries total: 14074.9 ms
Memory Usage
6.6 MB (6,920,584 Bytes)
DATABASE Queries
105 Queries Logged 14074.9 ms

Here's the core vanilla

Memory Usage
2.11 MB (2,213,636 Bytes)
DATABASE Queries
34 Queries Logged


Can you turn that explain stuff off and try it again?

Thanks!




Jindan Zhou

unread,
Jul 11, 2013, 4:14:31 PM7/11/13
to joomla-de...@googlegroups.com

I believe there's a setting some where in the git version, as well as your fork, let me see if I can find it...

>>> Sent from mobile device

Gary Mort

unread,
Jul 11, 2013, 4:16:33 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 3:42:45 PM UTC-4, Jindan Zhou wrote:
Did:
1. git clone https://github.com/AmyStephen/joomla-cms
2. cp ../htdocs/configuration.php joomla-cms/.
3. point nginx.conf root to joomla-cms, and restart nginx
4. here's the profile: http://pastebin.com/SwVEh3pv

Am I missing something?;-)


That is sooo cool......and unless I am wrong shows that we were completely misreading the situation - the problem is not that massive join AT ALL.  Those queries are running very fast.  The problem is with some of the OTHER queries.

For example, from the mysql command line, try running:
 CREATE INDEX `tmp_idx_type_alias` on w0z9v_contentitem_tag_map (`type_alias`,`type_id`) USING BTREE; 

Then load your page again.  I'd bet money you will see at LEAST a 4 second increase in performance.

Amy Stephen

unread,
Jul 11, 2013, 4:17:51 PM7/11/13
to joomla-de...@googlegroups.com
Jindan -

Can you deactivate this option Gary had you turn on?

I believe that's when it came into motion.

On Thu, Jul 11, 2013 at 2:04 PM, Gary Mort <jooml...@gary.mort.net> wrote:


On Thursday, July 11, 2013 2:41:18 PM UTC-4, Jindan Zhou wrote:
i am good at command line, too, so if you have syntax on mysql shell,
that shall do. asked in #mysql, found i don't even have voice there;-(

You could try:
SET GLOBAL general_log = 'ON';

That will log some basic information on every query being run, giving you an idea of where the queries are hitting issues.  Then with the slower queries, from the mysql command line, first do a 'use database <joomladb>;' followed up by 'explain <query>;' and you might spot some parts that can be tuned. 

Jindan Zhou

unread,
Jul 11, 2013, 4:19:24 PM7/11/13
to joomla-de...@googlegroups.com

Not only the explain stuff, the output format changed quite a bit, they now have an HTML color bar to indicate the query time, stack that can link the source file in your ide.

>>> Sent from mobile device

Amy Stephen

unread,
Jul 11, 2013, 4:21:20 PM7/11/13
to joomla-de...@googlegroups.com
In that case, can you just run the baseline query again? (No changes?) Or is that impossible? Do you need me to revert my change? Or give you the original file again?

I don't really care if there are extra queries but do need the baseline on the same basis.

Gary - no indexes (yet). That's cheating.

Amy Stephen

unread,
Jul 11, 2013, 4:25:41 PM7/11/13
to joomla-de...@googlegroups.com
Big jump in After Initialise likely due to the new logging - that's the kind of change we want to isolate so it doesn't reflect on the query changes.

Jindan Zhou

unread,
Jul 11, 2013, 4:26:57 PM7/11/13
to joomla-de...@googlegroups.com

Give me some moment (an hour or so), also I had not turn on general log, I remember at one time I turned that on and forgot to turn off later, my hard drive were filled up like crazy, lol...

>>> Sent from mobile device

Gary Mort

unread,
Jul 11, 2013, 4:30:08 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 4:21:20 PM UTC-4, Amy Stephen wrote:
In that case, can you just run the baseline query again? (No changes?) Or is that impossible? Do you need me to revert my change? Or give you the original file again?

I don't really care if there are extra queries but do need the baseline on the same basis.

Gary - no indexes (yet). That's cheating.

But that's where the problems are...not only that their in some of the new features of Joomla 3.  :-)

For example, that one is on the new "tags" feature.  Seems like somehow one of the columns of the tags table doesn't have an index - so joining that with the each and every article is a killer!

I got so hung up with the "it must be that big messy query" but the explains on those queries are coming back with fractions of a millisecond.  Its the tags and the assets table that seem to be killing performance.

 

Amy Stephen

unread,
Jul 11, 2013, 4:30:37 PM7/11/13
to joomla-de...@googlegroups.com
Appreciate the work you are doing. This needs to be fixed and it's not easy or fast. If you have Skype, please add me - AmyStephen - might make it go more quickly.

Amy Stephen

unread,
Jul 11, 2013, 4:53:30 PM7/11/13
to joomla-de...@googlegroups.com
Gary -

There are some problems in the query that should be fixed. That's the type of thing I am going after.

Here's my list:

1. Contact join (if this is needed, and if it is found to the performance hog I believe it is, then for the 20 rows, or so returned, it will be far quicker to join with a 2nd query).

2. Three queries in the core model to get links, totals, and the query results. We can *at least* drop that to one.  The query that Beat optimized saved almost a second - but we should be able to eliminated completely. That will be another big hunk of time.

3. This piece of garbage -->group('a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, a.created, a.modified, a.modified_by, uam.name, a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down, badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name, ua.email, parent.title, parent.id, parent.path, parent.alias, v.rating_sum, v.rating_count, c.published, c.lft, a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls');

I want to understand the impact of this ridiculous group by crap. From what I am hearing, it was found to be needed for SQL Server and PostgreSQL. I've used SQL Server since it was a 4.21a port of Sybase to NT. You do *not* need group by for select - but those inner joins that are aggregating things might be creating the requirement. Don't have a SQL Server box handy. But, we can add a check that if it is MySQL - don't do it.  But I want to see what it costs.

4. Several joins that can be conditional for ratings, etc.

5. Alternative scrolling - as I mentioned above and you described in more detail. Get rid of 2 out of 3 queries. Save more time.

Now -- each and every single one of the items I am looking at are problems that can be fixed (and for crying out loud really need to be.) NONE of those items will be fixed by moar indexing. See what I mean?

When we get the crap cleared away -- then -- it's time to look at what other tweaks can be made in terms of the physical structure. But first, problems. If you do indexing first, you lose your baseline for performance testing when removing each problem.

It's slow going, I realize, but we are trying to get a good baseline. The CMS master branch has some kind of changes for the logging that have literally tripled the queries. Those first two tests need to be rerun on that basis. Regardless, even with triple the queries - step 1 is proving successful at 4 second savings (and we know it will be more.)

https://gist.github.com/AmyStephen/5979057

Make sense? I've tried to make some of these changes since 1.6 and frankly I have been shot down or ignored. So, we have a good test case here and dammit - we're at least going to look at some numbers this time. This is insanity to have queries like that.





 

--

Gary Mort

unread,
Jul 11, 2013, 5:25:20 PM7/11/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 4:53:30 PM UTC-4, Amy Stephen wrote:
Gary -

There are some problems in the query that should be fixed. That's the type of thing I am going after.

Here's my list:

1. Contact join (if this is needed, and if it is found to the performance hog I believe it is, then for the 20 rows, or so returned, it will be far quicker to join with a 2nd query).

2. Three queries in the core model to get links, totals, and the query results. We can *at least* drop that to one.  The query that Beat optimized saved almost a second - but we should be able to eliminated completely. That will be another big hunk of time.

3. This piece of garbage -->group('a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, a.created, a.modified, a.modified_by, uam.name, a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down, badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name, ua.email, parent.title, parent.id, parent.path, parent.alias, v.rating_sum, v.rating_count, c.published, c.lft, a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls');

I want to understand the impact of this ridiculous group by crap. From what I am hearing, it was found to be needed for SQL Server and PostgreSQL. I've used SQL Server since it was a 4.21a port of Sybase to NT. You do *not* need group by for select - but those inner joins that are aggregating things might be creating the requirement. Don't have a SQL Server box handy. But, we can add a check that if it is MySQL - don't do it.  But I want to see what it costs.

4. Several joins that can be conditional for ratings, etc.

5. Alternative scrolling - as I mentioned above and you described in more detail. Get rid of 2 out of 3 queries. Save more time.

Now -- each and every single one of the items I am looking at are problems that can be fixed (and for crying out loud really need to be.) NONE of those items will be fixed by moar indexing. See what I mean?

When we get the crap cleared away -- then -- it's time to look at what other tweaks can be made in terms of the physical structure. But first, problems. If you do indexing first, you lose your baseline for performance testing when removing each problem.

It's slow going, I realize, but we are trying to get a good baseline. The CMS master branch has some kind of changes for the logging that have literally tripled the queries. Those first two tests need to be rerun on that basis. Regardless, even with triple the queries - step 1 is proving successful at 4 second savings (and we know it will be more.)

The problem is, according to the explain, none of those changes will save more than 1000ms.   This page load is taking 10 seconds!  All those enhancements would be lost in the noise of the long running queries that are happening outside of all that.

There was a 4000ms jump on the when getting for a completely seperate query which loads tags:
    SELECT `m`.`tag_id`,`t`.*

      FROM `w0z9v_contentitem_tag_map` AS m 

      INNER JOIN `w0z9v_tags` AS t  
      ON `m`.`tag_id` = `t`.`id`

      WHERE `m`.`type_alias` = 'com_content.article' 
      AND `m`.`content_item_id` = 9162 
      AND `t`.`published` = 1 
      AND t.access IN (1,1,5)

    Explain
    id select_type table type possible_keys key key_len ref rows Extra
    1 SIMPLE t ALL PRIMARY,tag_idx,idx_access NO INDEX KEY COULD BE USED NULL NULL 35 Using where
    1 SIMPLE m ref idx_tag_type,idx_tag idx_tag_type 4 m31.t.id 113 Using where
    Profile
    Status Duration
    starting 0.0 ms
    Waiting for query cache lock 0.0 ms
    checking query cache for query 0.0 ms
    checking privileges on cached 0.0 ms
    checking permissions 0.0 ms
    checking permissions 0.0 ms
    sending cached result to clien 0.0 ms
    logging slow query 0.0 ms
    cleaning up 0.0 ms
    Call Stack
    JROOT/libraries/cms/helper/tags.php:411
    JROOT/components/com_content/models/articles.php:630
    JROOT/components/com_content/models/category.php:234
    JROOT/libraries/legacy/view/legacy.php:398
    JROOT/components/com_content/views/category/view.html.php:61
    JROOT/libraries/legacy/controller/legacy.php:685
    JROOT/components/com_content/controller.php:79
    JROOT/libraries/legacy/controller/legacy.php:722
    JROOT/components/com_content/content.php:16
    JROOT/libraries/legacy/component/helper.php:355
    JROOT/libraries/legacy/component/helper.php:335
    JROOT/includes/application.php:220
    JROOT/index.php:52
    [Add xdebug.file_link_format directive to your php.ini file to have links for files]
    Query Time: 4607.48 ms After last query: 3.46 ms Query memory: 0.011 MB Memory before query: 4.965 MB

Of course, this assumes that the last line is part of the above report and not for the below query.

If this is the case, then we have 8-12 seconds of query issues due to some of the NEW features in Joomla 3[the tags, the extensions, and the assets tables]

Now, if I'm reading this wrong than the Query Time is pointing to the lines following rather than proceeding them. So adding an index to tags makes a good test - if the problem is in the new features it will be immediately apparent with a 4 second improvement in page load.  If instead it makes no change then it is likely to be the following query:

    SELECT COUNT(*)

      FROM w0z9v_content AS a

      LEFT JOIN w0z9v_content_frontpage AS fp 
      ON fp.content_id = a.id

      LEFT JOIN w0z9v_categories AS c 
      ON c.id = a.catid

      LEFT JOIN w0z9v_users AS ua 
      ON ua.id = a.created_by

      LEFT JOIN w0z9v_users AS uam 
      ON uam.id = a.modified_by

      LEFT JOIN w0z9v_categories as parent 
      ON parent.id = c.parent_id

      LEFT JOIN w0z9v_content_rating AS v 
      ON a.id = v.content_id

      LEFT 
      OUTER JOIN (SELECT cat.id as id 
      FROM w0z9v_categories AS cat JOIN w0z9v_categories AS parent 
      ON cat.lft BETWEEN parent.lft 
      AND parent.rgt 
      WHERE parent.extension = 'com_content' 
      AND parent.published != 1 
      GROUP BY cat.id ) AS badcats 
      ON badcats.id = c.id

      WHERE a.access IN (1,1,5) 
      AND c.access IN (1,1,5) 
      AND 
      CASE WHEN badcats.id is null THEN a.state ELSE 0 END = 1 
      AND a.catid = 8 
      AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2013-07-11 19:28:29') 
      AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2013-07-11 19:28:29')

      GROUP BY a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, a.created, a.modified, a.modified_by, uam.name, a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down, badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name, ua.email, parent.title, parent.id, parent.path, parent.alias, v.rating_sum, v.rating_count, c.published, c.lft, a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls

Which would indicate that the problem is in that ugly query.

It's not that we don't have the same goals, it's just that my process differs from yours.  Your looking at an iterative subtractive process - remove one piece at a time until you find the problem.  My process tends to be additive - strip all the extraneous pieces and get the 'ideal' performance - then iteratively add pieces back in to find which one causes big jumps.

Poor Jindhan...getting advice from 2 conflicting processes....  I think it's best if I bow out because the two processes will cause conflicts. :-)

Amy Stephen

unread,
Jul 11, 2013, 7:53:00 PM7/11/13
to joomla-de...@googlegroups.com
Gary - there *are* *many* problems.

I discouraged your first test because it just removed all the joins - that just doesn't get us anywhere - of course it's faster, but what does it mean?

I shared with you my assumptions and how I will be testing them so that you understood my approach. It's okay that you disagree, determining your position was the point of my sharing. =)

Please understand, though, no matter how passionately you suggest anything your approach is "better" than mine, my response to you will be the same response I give my own assumptions - and that is "Prove it."

It's not that we don't have the same goals, it's just that my process differs from yours.  Your looking at an iterative subtractive process - remove one piece at a time until you find the problem.  My process tends to be additive - strip all the extraneous pieces and get the 'ideal' performance - then iteratively add pieces back in to find which one causes big jumps.

Which means, Gary, I will always force you into my iterative process of taking one item at a time and verifying it. You can count on it, my dear.

While, I still hold to my conviction that looking at indexing should be a last step after all known problems are first corrected, you are still challenging my approach and I am cool with that, we'll head you're direction - with the caveat that it will also have to be proven - just like I was doing with my assumptions.

And, no, you may not "bow out."
You are my Articles list soul partner. < evil laugh >

So, to your concerns. Indexes. Tags.

Yes, I am aware of the query results from my test.

You understand that test seems to show a 4 second saving for the change I made? I say "seems" because we ran into a snag. The format of the testing for my change is different than the formatting for the core test and for the test of Beat's change. More importantly, my test has about three times as many queries.

Core test - http://pastebin.com/ABrEk3S4 (Note: 34ish queries; no individual query data)

My test - http://pastebin.com/SwVEh3pv (Note: 120ish queries; data on each query - cool!)

WTF, right? Can't compare that, now, can we?

We finally realized there are changes to the logging. So, Jindhan will run the other two tests (Core and Beat's test referenced here as 1 and 2 => https://gist.github.com/AmyStephen/5979057) again on that platform - then we have a solid comparison.

Remember I said that it looks like the change we are testing was 4 seconds faster -- it's likely to be more than that when comparing apples and apples. Surely, you can understand my wanting to see that? That's a nice savings.

Tags. Gary says Moar Indexes.
 

So, you have looked the test of my changes and you are responding to the data on the tags queries.

Here's your assertion and evidence - Gary said:

  Query Time: 4607.48 ms After last query: 3.46 ms Query memory: 0.011 MB Memory before query: 4.965 MB

Of course, this assumes that the last line is part of the above report and not for the below query.

If this is the case, then we have 8-12 seconds of query issues due to some of the NEW features in Joomla 3[the tags, the extensions, and the assets tables]

Now, if I'm reading this wrong than the Query Time is pointing to the lines following rather than proceeding them. So adding an index to tags makes a good test - if the problem is in the new features it will be immediately apparent with a 4 second improvement in page load.

Your assumption: Adding an index to tags will result in 4 second page improvement.

My position for any assumption: Explore the data and develop a test -- Prove it.

So, Gary - first question. What table? Tags table? Tag map table? Articles Table?

Second question - what columns do you want to index?

My exploration of this topic:

1. There are eleven of these queries.

11 × SELECT `m`.`tag_id`,`t`.*

      FROM `w0z9v_contentitem_tag_map` AS m
      INNER JOIN `w0z9v_tags` AS t
      ON `m`.`tag_id` = `t`.`id`

Each query adds something of this nature:

WHERE `m`.`type_alias` = 'com_content.article' 
  AND `m`.`content_item_id` = 9162 
  AND `t`.`published` = 1 
  AND t.access IN (1,1,5)

2. What is the join?

`m`.`tag_id` = `t`.`id`

`m`.`type_alias` = 'com_content.article'
`m`.`content_item_id` = 9162
`t`.`published` = 1
`t`.`access` IN (1,1,5)

3.  What is t.id?

It is the primary key to the tags table. That is an auto increment clustered index. It is the best column to join on for this table. Agree?

4.  What is m.tag_id?

It is a column in contentitem_tag_map. It is not the primary key. Every content item that uses that tag will have a row in that table so it is very possible that tag_id is not highly selective. But there is more joining...

5.  What is m.type_alias and m.content_item_id ?

type_alias is a 255 varchar field that should never be used to do a join. So, one WTF. And, why aren't database people making these calls for the project?

`m`.`type_alias` = 'com_content.article'

content_item_id - the table description says it is a primary key to the content type table (10 rows). But, I'm going to say that's wrong since it has a value of 9162 and instead assume it's a primary key to the article table.

`m`.`content_item_id` = 9162
 


6.  Your assertion is that we need an index and we will save 4 seconds on a page load. (Except I don't know what index you want, on what table, and on what columns).

So, here are the table structure for the tag_map table. I'm going to assume you believe the tags table is fine for indexing, since I think it is and you didn't say.

This maps table, well, it's not so great and it's going to be the big one. So, I'll focus here - here's the DDL - what are the indexes now? Look at all those indexes, Gary. There are six of them. That's a lot - someone was thinking about speed. Too bad most of them are of now value.

CREATE TABLE `w0z9v_contentitem_tag_map` (
  `type_alias` varchar(255) NOT NULL DEFAULT '',
  `core_content_id` int(10) unsigned NOT NULL COMMENT 'PK from the core content table',
  `content_item_id` int(11) NOT NULL COMMENT 'PK from the content type table',
  `tag_id` int(10) unsigned NOT NULL COMMENT 'PK from the tag table',
  `tag_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Date of most recent save for this tag-item',
  `type_id` mediumint(8) NOT NULL COMMENT 'PK from the content_type table',
  UNIQUE KEY `uc_ItemnameTagid` (`type_id`,`content_item_id`,`tag_id`),
  KEY `idx_tag_type` (`tag_id`,`type_id`),
  KEY `idx_date_id` (`tag_date`,`tag_id`),
  KEY `idx_tag` (`tag_id`),
  KEY `idx_type` (`type_id`),
  KEY `idx_core_content_id` (`core_content_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Maps items from content tables to tags';

Only three columns matter, the ones in the join:

1. type_alias column - So, what's better than joining on a 255 varchar field? Doing so when it's not indexed. And such is the case here - so this has the potential to be a problem.

2. tag_id - Yea! We have an index, but, there is one row for each item linked to tag_id and it is likely this index is not highly selective (since many items could be linked to it) - so here were have an index (Yea!) that is not likely to be used (Boo!). Bummer. So, more potential to be a problem.

3.
content_item_id is a primary key to the content type table but - it's stuck in the middle of the clustered index - and the first column of the primary index is not even specified in the where so it's highly unlikely it will be used. So, more potential to be a problem.

(`type_id`,`content_item_id`,`tag_id`),

Note: Look at the lovely tag_id to it's right. If only this index were used. Damnation! Why isn't type_id used and what is it?

Oh. My. God. Look! type_id is a numeric value for com_content.article! Why isn't that being used? WTF? Why is there a type_alias value denormalized into every single row? 255 characters of WTF for each and every tag assignment. Where as this lovely primary key type_id is numeric --- IF IT WERE SELECTED THE PRIMARY INDEX WOULD BE USED. The extra join to the content_types table is only 10 rows, so, surely it's worth the trade.

You want to add an index, but I still don't know what index you want.

What I would do is this:

1. REMOVE THE type_alias column from tag map.

Go thru the code, add the join to the nice little type table and use the value from there.

2. Change the query to be like this:

11 × SELECT `m`.`tag_id`,`t`.*
      FROM `w0z9v_contentitem_tag_map` AS m,
      `w0z9v_tags` AS t,

      `w0z9v_content_types` AS c
      WHERE `m`.`tag_id` = `t`.`id`
        AND `m`.`type_id` = `c`.`type_id`

Below I would change the WHERE to AND and m alias to c alias (for each of the 11 queries).

  AND `c`.`type_alias` = 'com_content.article' 
  AND `m`.`content_item_id` = 9162 
  AND `t`.`published` = 1 
  AND t.access IN (1,1,5)


3.  I'd not add any indexes, but I would remove these and I'd go through my queries to ensure the clustered index was selected.

KEY `idx_tag_type` (`tag_id`,`type_id`),
  KEY `idx_date_id` (`tag_date`,`tag_id`),
  KEY `idx_tag` (`tag_id`),
  KEY `idx_type` (`type_id`),
  KEY `idx_core_content_id` (`core_content_id`)



4.  I'll bet you $20 that the core_content_id is either nothing - or - the same value as the content_item_id - so maybe slash that column too.


Now - that's what I would do. But this is your baby. Would you please fork my repo - put up a change and then ask Jindhan to test?

If you are right about 4 seconds that could be gained here - combine that with the other 4 seconds from the first test and we have at least 8 out of 18 recovered.

With tests, too. =)

Joomla needs a database person and you're a likely candidate. Test your assumptions. Someone has to step up in this area, Gary, and it's not going to be me.

Let me know what you find and do.

Thanks.


--

Jindan Zhou

unread,
Jul 11, 2013, 8:25:14 PM7/11/13
to joomla-de...@googlegroups.com
Okay was in the run now back into business. Amy, you guys are the
awesome ones that deserve appreciation;-) Gary, also consider the
4000ms is run in my precious tmpfs;-) No I am not *more* confused for
the conflicts as I am not seeing, lol...

So to make a more consistent call, the following two profiles are
based on Amy's fork:
Amy fork: http://pastebin.com/uf7P72Y4
Amy fork with original articles.php from Joomla 3.1.1: http://pastebin.com/cZK7CREa

Also I ran Gary's code
mysql> CREATE INDEX `tmp_idx_type_alias` on w0z9v_contentitem_tag_map (`type_alias`,`type_id`) USING BTREE;
Query OK, 0 rows affected (0.50 sec)
Records: 0  Duplicates: 0  Warnings: 0
But I don't see that affects page loading time...

I am still in the middle of something, but if I need to run any thing, I'll try my best, thank you all!


On Monday, July 8, 2013 2:57:08 PM UTC-5, Jindan Zhou wrote:
Hi all,

My site has about 70k articles in about 10 categories, when I need to load the article list somewhere, say, category list in frontpage, latest article module, it takes long to load the page, usually in the 45+ seconds. My observation is that those pages need to query all rows in a category to compute the correct pagination (I am still a beginner in Joomla, so correct me if I am wrong).

I have also noticed that most of the query time are spent on building tmp tables, i.e., `Copying tmp tables to...`, so my immediate solution was to let MySQL to use tmpfs as its tmpdir, and that indeed improved, roughly the Joomla debug info shows:

Before tmpfs:
 Application 43.734 seconds (+43.653); 1.84 MB (+0.741) - afterDispatch
After tmpfs:
 Application 2.224 seconds (+2.137); 1.84 MB (+0.706) - afterDispatch

I believe if my assumption on pagination was true, then we could even improve the large site performance by change the way how we paginate listings, in my case, I think just have a link to the next page would suffice, that way we don't have to query all rows every time, am I right?

To back up my observations, in one of my category I have about 30k articles, when I click on a page number in the listing page, MySQL needs roughly 800M to over 1G for writing the tmp table, imaging all those files are written/read into/from physical hard drive, it explains very well why it takes so long to load that page.

Any thoughts?



Amy Stephen

unread,
Jul 11, 2013, 9:05:21 PM7/11/13
to joomla-de...@googlegroups.com
On Thu, Jul 11, 2013 at 7:25 PM, Jindan Zhou <jin...@gmail.com> wrote:
Okay was in the run now back into business. Amy, you guys are the
awesome ones that deserve appreciation;-) Gary, also consider the
4000ms is run in my precious tmpfs;-) No I am not *more* confused for
the conflicts as I am not seeing, lol...

So to make a more consistent call, the following two profiles are
based on Amy's fork:
Amy fork: http://pastebin.com/uf7P72Y4
Amy fork with original articles.php from Joomla 3.1.1: http://pastebin.com/cZK7CREa

Those are exactly the same numbers - plus - the 2nd one (which is core, right?) doesn't have anything contact in it - so - I don't think the test worked. You should be able to search that page and find the world contact.

Try again? (she asked kindly!)
 

Also I ran Gary's code
mysql> CREATE INDEX `tmp_idx_type_alias` on w0z9v_contentitem_tag_map (`type_alias`,`type_id`) USING BTREE;
Query OK, 0 rows affected (0.50 sec)
Records: 0  Duplicates: 0  Warnings: 0
But I don't see that affects page loading time...

There is a problem there -- it's just not what Gary thought, but he's on the right track. Hopefully, he'll have a test soon.

After we get the first test done (and possibly Gary's), I'll be looking at those 3 queries for scrolling. Don Gilbert mentioned that those queries should be cached, and he's right. For 1.6, there was a little code that saved those results -- if I remember correctly, using "context" (which I do not love) to avoid another query - so - my next job is to try to see why that's not working.

I appreciate your patience. Joomla is much better than this - no sense in this type of query time.
 

I am still in the middle of something, but if I need to run any thing, I'll try my best, thank you all!



On Monday, July 8, 2013 2:57:08 PM UTC-5, Jindan Zhou wrote:
Hi all,

My site has about 70k articles in about 10 categories, when I need to load the article list somewhere, say, category list in frontpage, latest article module, it takes long to load the page, usually in the 45+ seconds. My observation is that those pages need to query all rows in a category to compute the correct pagination (I am still a beginner in Joomla, so correct me if I am wrong).

I have also noticed that most of the query time are spent on building tmp tables, i.e., `Copying tmp tables to...`, so my immediate solution was to let MySQL to use tmpfs as its tmpdir, and that indeed improved, roughly the Joomla debug info shows:

Before tmpfs:
 Application 43.734 seconds (+43.653); 1.84 MB (+0.741) - afterDispatch
After tmpfs:
 Application 2.224 seconds (+2.137); 1.84 MB (+0.706) - afterDispatch

I believe if my assumption on pagination was true, then we could even improve the large site performance by change the way how we paginate listings, in my case, I think just have a link to the next page would suffice, that way we don't have to query all rows every time, am I right?

To back up my observations, in one of my category I have about 30k articles, when I click on a page number in the listing page, MySQL needs roughly 800M to over 1G for writing the tmp table, imaging all those files are written/read into/from physical hard drive, it explains very well why it takes so long to load that page.

Any thoughts?



Jindan Zhou

unread,
Jul 11, 2013, 9:11:26 PM7/11/13
to joomla-de...@googlegroups.com
No problem, I might have messed up files. Please confirm the
experiment scheme is correct...
> You received this message because you are subscribed to a topic in the
> Google Groups "Joomla! General Development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/joomla-dev-general/sVFw9reqGGw/unsubscribe.
> To unsubscribe from this group and all of its topics, send an email to
> joomla-dev-gene...@googlegroups.com.
> To post to this group, send an email to joomla-de...@googlegroups.com.
> Visit this group at http://groups.google.com/group/joomla-dev-general.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>



Amy Stephen

unread,
Jul 11, 2013, 9:16:22 PM7/11/13
to joomla-de...@googlegroups.com
Your plan is *perfect*. That should give us the apples to apples comparison we need.

Jindan Zhou

unread,
Jul 11, 2013, 9:32:00 PM7/11/13
to joomla-de...@googlegroups.com
Re-cloned Amy's fork: http://pastebin.com/eR7YdBkL
Amy fork, re-cloned, USING component/com_content/model/articles.php
FROM Joomla 3.1.1 http://pastebin.com/PRyhe10G

This is weird, we're still seeing similar result, here's the diff of
the two articles.php

213,222d212
< // Join on contact table
< $subQuery = $db->getQuery(true)
< ->select('contact.user_id, MAX(contact.id) AS id, contact.language')
< ->from('#__contact_details AS contact')
< ->where('contact.published = 1')
< ->group('contact.user_id, contact.language');
<
< $query->select('contact.id as contactid')
< ->join('LEFT', '(' . $subQuery . ') AS contact ON contact.user_id
= a.created_by');
<
495,496c485
< $query->where('a.language in (' .
$db->quote(JFactory::getLanguage()->getTag()) . ',' . $db->quote('*')
. ')')
< ->where('(contact.language in (' .
$db->quote(JFactory::getLanguage()->getTag()) . ',' . $db->quote('*')
. ') OR contact.language IS NULL)');
---
> $query->where('a.language in (' . $db->quote(JFactory::getLanguage()->getTag()) . ',' . $db->quote('*') . ')');
501c490
< ->group('a.id, a.title, a.alias, a.introtext, a.checked_out,
a.checked_out_time, a.catid, a.created, a.created_by,
a.created_by_alias, a.created, a.modified, a.modified_by, uam.name,
a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access,
a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down,
badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name,
ua.email, contact.id, parent.title, parent.id, parent.path,
parent.alias, v.rating_sum, v.rating_count, c.published, c.lft,
a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls');
---
> ->group('a.id, a.title, a.alias, a.introtext, a.checked_out, a.checked_out_time, a.catid, a.created, a.created_by, a.created_by_alias, a.created, a.modified, a.modified_by, uam.name, a.publish_up, a.attribs, a.metadata, a.metakey, a.metadesc, a.access, a.hits, a.xreference, a.featured, a.fulltext, a.state, a.publish_down, badcats.id, c.title, c.path, c.access, c.alias, uam.id, ua.name, ua.email, parent.title, parent.id, parent.path, parent.alias, v.rating_sum, v.rating_count, c.published, c.lft, a.ordering, parent.lft, fp.ordering, c.id, a.images, a.urls');


Amy Stephen

unread,
Jul 11, 2013, 9:57:49 PM7/11/13
to joomla-de...@googlegroups.com
Well, it's a good test - contact is in that second group. Just not showing significant improvement.

Do you have any contacts? Are you trying to use the Contact Component for author info?

Regardless, I think we can rule that out. On to the next one!

Going to revert those changes. Thanks!

Jindan Zhou

unread,
Jul 11, 2013, 10:03:23 PM7/11/13
to joomla-de...@googlegroups.com

No I don't have any contacts, yet

>>> Sent from mobile device

Jindan Zhou

unread,
Jul 11, 2013, 10:06:36 PM7/11/13
to joomla-de...@googlegroups.com

Nor I need contact component...

>>> Sent from mobile device

Amy Stephen

unread,
Jul 11, 2013, 10:20:09 PM7/11/13
to joomla-de...@googlegroups.com
Put up the join change for the tag map - at least that way, Gary can see if that's the kind of change he was hoping for.


Test when you can =)

Amy Stephen

unread,
Jul 11, 2013, 10:26:01 PM7/11/13
to joomla-de...@googlegroups.com
Jindan - can you tell us about your usage of tags -- are you using tags? how many tags do you have? how many articles are tagged? how many tags does each article have (roughly?)


Jindan Zhou

unread,
Jul 11, 2013, 10:43:17 PM7/11/13
to joomla-de...@googlegroups.com

I have been waiting for someone to ask this question :-)
Tags component is vital for my site, I favored the com_tags since I tend to use as less as possible third party extensions, both for the consideration of code compatibility and future maintainabilty. Currently tags are generated by scanning content for a list of keywords, roughly 2 to 3 tags on average. When the site goes online l plan to gave privileges to most users to freely tag an article, need to override front page editor for that purpose.

>>> Sent from mobile device

Amy Stephen

unread,
Jul 11, 2013, 11:05:35 PM7/11/13
to joomla-de...@googlegroups.com
Cool. Good tag test. Did you get a chance to test my change on github? Curious if that helps.

Jindan Zhou

unread,
Jul 11, 2013, 11:24:21 PM7/11/13
to joomla-de...@googlegroups.com

Hold on wasn't aware you've send new thing the same link? Or if you could re send I'd appreciate thanks

>>> Sent from mobile device

Amy Stephen

unread,
Jul 11, 2013, 11:30:11 PM7/11/13
to joomla-de...@googlegroups.com
It rejected my change. Forced it out there. https://github.com/AmyStephen/joomla-cms

Might be the wrong spot - it tests the tag join. (You can try it -- see what happens)

Thanks

Jindan Zhou

unread,
Jul 11, 2013, 11:31:58 PM7/11/13
to joomla-de...@googlegroups.com
never mind testing, code error
1146 Table 'm31.w0z9v_contenttypes' doesn't exist SQL=SELECT
`m`.`tag_id`,`t`.* FROM `w0z9v_contentitem_tag_map` AS m
,`w0z9v_contenttypes` AS c INNER JOIN `w0z9v_tags` AS t ON
`m`.`tag_id` = `t`.`id` WHERE `c`.`type_id` = `m`.`type_id` AND
`c`.`type_alias` = 'com_content.article' AND `m`.`content_item_id` =
5913 AND `t`.`published` = 1 AND t.access IN (1,1,5)
i think i can fix that one or i can wait to re-pull

Amy Stephen

unread,
Jul 11, 2013, 11:35:32 PM7/11/13
to joomla-de...@googlegroups.com
repull - thanks!

Amy Stephen

unread,
Jul 11, 2013, 11:35:39 PM7/11/13
to joomla-de...@googlegroups.com

Jindan Zhou

unread,
Jul 11, 2013, 11:37:56 PM7/11/13
to joomla-de...@googlegroups.com
still
1054 Unknown column 'm.tag_id' in 'on clause' SQL=SELECT
`m`.`tag_id`,`t`.* FROM `w0z9v_contentitem_tag_map` AS m
,`w0z9v_content_types` AS ct INNER JOIN `w0z9v_tags` AS t ON
`m`.`tag_id` = `t`.`id` WHERE `ct`.`type_id` = `m`.`type_id` AND
`ct`.`type_alias` = 'com_content.article' AND `m`.`content_item_id` =
5913 AND `t`.`published` = 1 AND t.access IN (1,1,5)

Amy Stephen

unread,
Jul 11, 2013, 11:40:43 PM7/11/13
to joomla-de...@googlegroups.com
Are you able to get the entire query? Don't see the problem on that.

Jindan Zhou

unread,
Jul 11, 2013, 11:55:13 PM7/11/13
to joomla-de...@googlegroups.com
weird, don't understand why m.tag_id is unknown, but here's the profle
http://pastebin.com/LcsFRA1c

and the page shows:
The requested page cannot be found.

An error has occurred while processing your request.

You may not be able to visit this page because of:

an out-of-date bookmark/favourite
a mistyped address
a search engine that has an out-of-date listing for this site
you have no access to this page

Go to the Home Page

Home Page

If difficulties persist, please contact the System Administrator of
this site and report the error below.

1054 Unknown column 'm.tag_id' in 'on clause' SQL=SELECT
`m`.`tag_id`,`t`.* FROM `w0z9v_contentitem_tag_map` AS m
,`w0z9v_content_types` AS ct INNER JOIN `w0z9v_tags` AS t ON
`m`.`tag_id` = `t`.`id` WHERE `ct`.`type_id` = `m`.`type_id` AND
`ct`.`type_alias` = 'com_content.article' AND `m`.`content_item_id` =
5913 AND `t`.`published` = 1 AND t.access IN (1,1,5)



Amy Stephen

unread,
Jul 12, 2013, 12:09:37 AM7/12/13
to joomla-de...@googlegroups.com
OK, have a better test setup now. Please try again. =) (pushed changes; please pull)

Jindan Zhou

unread,
Jul 12, 2013, 12:17:56 AM7/12/13
to joomla-de...@googlegroups.com
this time the browser show plain text: (how could it happen?)
SELECT `m`.`tag_id`,`t`.* FROM `#__tags` AS t
,`#__contentitem_tag_map` AS m ,`#__content_types` AS ct WHERE
`m`.`tag_id` = `t`.`id` AND `ct`.`type_id` = `m`.`type_id` AND
`ct`.`type_alias` = 'com_content.article' AND `m`.`content_item_id` =
5913 AND `t`.`published` = 1 AND t.access IN (1,1,5)

Amy Stephen

unread,
Jul 12, 2013, 12:22:22 AM7/12/13
to joomla-de...@googlegroups.com
OOOPS! That's my echo to grab the SQL and make sure it works so I don't trouble you. Talk about a good plan turning out bad.

Re-pushed.

Jindan Zhou

unread,
Jul 12, 2013, 12:31:08 AM7/12/13
to joomla-de...@googlegroups.com
don't see changes, yet;-)
http://pastebin.com/xLS7WzVd

Jindan Zhou

unread,
Jul 12, 2013, 12:41:42 AM7/12/13
to joomla-de...@googlegroups.com
I am going to call it a day now;-) good night Amy and all! Will
continue to work on this tomorrow...

Amy Stephen

unread,
Jul 12, 2013, 12:48:01 AM7/12/13
to joomla-de...@googlegroups.com
Yea, the change is in there.

SELECT `m`.`tag_id`,`t`.*
 
      FROM `w0z9v_tags` AS t ,`w0z9v_contentitem_tag_map` AS m ,`w0z9v_content_types` AS ct

 
      WHERE `m`.`tag_id` = `t`.`id`
      AND `ct`.`type_id` = `m`.`type_id`
      AND `ct`.`type_alias` = 'com_content.article'
      AND `m`.`content_item_id` = 5913
      AND `t`.`published` = 1
      AND t.access IN (1,1,5)


In the morning - drop that index you added earlier and try it again. It's still not taking the primary index for the tag_map and I wonder if the new index is creating a problem.

I sure would like to have a test environment. Anyone have a script to create a jinormous 3.1 article site with tags?

BTW - you don't have any cache on - right?

Thanks Jindan

Jindan Zhou

unread,
Jul 12, 2013, 12:58:58 AM7/12/13
to joomla-de...@googlegroups.com

Yeah will do but not sure which file to change. Also I don't mind to send my data file to you, two things,
It's in Chinese that should not be a problem; it sure takes time to load: on my server, 30 hours:-( BTW WordPress took 12 hours
No all these tests cache off

>>> Sent from mobile device

...

Amy Stephen

unread,
Jul 12, 2013, 1:11:07 AM7/12/13
to joomla-de...@googlegroups.com
30 hours - wow. Nah - I'll figure something out.

I can see it is starting to get that clustered index now - but I do NOT want MySQL to use the red ones.

1   SIMPLE  m       REF     uc_ItemnameTagid,idx_tag_type,idx_tag,idx_type  

This picture:

http://twitpic.com/d26011

Go to the contentitem_tag_map table - structure tab - click the -Index link beneath the structure.

Delete these three indexes:
- idx_tag_type
- idx_tag
- idx_type

Then, let's see if performance is better.

Hope you are sleeping (have no idea where you are in the world - hope it's not 5 am!)

Jindan Zhou

unread,
Jul 12, 2013, 1:17:24 AM7/12/13
to joomla-de...@googlegroups.com

Yeah that's the initial data, once the site is up, my scrapy bot will pass each harvested item to the php side, then it becomes a routine job done in real time

>>> Sent from mobile device

...

Amy Stephen

unread,
Jul 12, 2013, 2:39:06 AM7/12/13
to joomla-de...@googlegroups.com
Al of the data is duplicated in the ucm_content tables? And that is queried too many times, too?

And there it is == that type_alias 255 column there, too joining the UCM table to the map -- just like it does for the content table to the map.

The content_types table is actually there in this case and so it could join on type_id.

But, type_id is not in the query at all.  Instead of the numeric key, it's using that 255 characters.

So - here's what it boils down too - duplication of detail content in two tables - unnecessary queries running for the 2nd set of data - bad joins everywhere where the 255 key is used instead of a numeric value - which leads to bad index selection (table scans) - and we already know about too many queries for the pagination.

Everything doubled - done too many times - horrible index selectivity. -- and 40k rowset? I suppose it will be slow.


Inline image 1

Well, not sure what to say. It is now 1:30 - good night.

Screen shot 2013-07-12 at 12.48.01 AM.png

Matt Thomas

unread,
Jul 12, 2013, 7:35:16 AM7/12/13
to Joomla! General Development
Hello Jindan,

I'd like to try to reproduce this issue. I am using https://github.com/nikosdion/com_overload, which generates content for testing, and the latest version of Joomla from Github.

Can you tell me exactly how to reproduce the issue? That is, I see that you have changed the default menu item to be a category list, and are displaying articles with the latest news module. Can you tell me the settings for that menu item and module? Is there anything else that I need to set up to test this, other than 70K articles?



Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain

Screen shot 2013-07-12 at 12.48.01 AM.png

Amy Stephen

unread,
Jul 12, 2013, 9:06:24 AM7/12/13
to joomla-de...@googlegroups.com
Matt - I have a test implementation with sample data - most of it copied once - three tags - one tag on each article - and with this FANTASTIC debugging tool, I can see the problems. It's not just pagination, either, it's across the board. So, I think it's probably a good idea to see what the plan is moving forward with UCM, maybe in the CMS list discussion.

Jindan - I'm not sure what is best, right now. I think it's fair to say using tags with a lot of data right now will be slow. Fixing that might take time, I'm just not sure what the plans are. You might want to try another tagging solution, see if it's better. If you are in no hurry and willing to play along, cool. Let's see where things lead. Feel free to add me on Skype or email anytime.

Matt Thomas

unread,
Jul 12, 2013, 9:25:54 AM7/12/13
to Joomla! General Development
Amy - Thanks. Not to get off topic, but at some point, I'd love to hear how you added tags programmatically. I was looking at adding tag creation to com_overload, but it is a seemingly a daunting task. It seems like newly tagged items need to be checked for, and added to if not already, the #__ucm_base table for a ucm_id, which seems to duplicate that content in #__ucm_content, and then check for if the tag exists in #__tags, and create that tag if it doesn't exist, and then associate the tag with the item in #__ucm_content in the #__contentitem_tag_map table. Phew!

Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain



On Fri, Jul 12, 2013 at 9:06 AM, Amy Stephen <amyst...@gmail.com> wrote:
Matt - I have a test implementation with sample data - most of it copied once - three tags - one tag on each article - and with this FANTASTIC debugging tool, I can see the problems. It's not just pagination, either, it's across the board. So, I think it's probably a good idea to see what the plan is moving forward with UCM, maybe in the CMS list discussion.

Jindan - I'm not sure what is best, right now. I think it's fair to say using tags with a lot of data right now will be slow. Fixing that might take time, I'm just not sure what the plans are. You might want to try another tagging solution, see if it's better. If you are in no hurry and willing to play along, cool. Let's see where things lead. Feel free to add me on Skype or email anytime.

Amy Stephen

unread,
Jul 12, 2013, 9:58:39 AM7/12/13
to joomla-de...@googlegroups.com


On Friday, July 12, 2013 8:25:54 AM UTC-5, betweenbrain wrote:
Amy - Thanks. Not to get off topic, but at some point, I'd love to hear how you added tags programmatically. I was looking at adding tag creation to com_overload, but it is a seemingly a daunting task. It seems like newly tagged items need to be checked for, and added to if not already, the #__ucm_base table for a ucm_id, which seems to duplicate that content in #__ucm_content, and then check for if the tag exists in #__tags, and create that tag if it doesn't exist, and then associate the tag with the item in #__ucm_content in the #__contentitem_tag_map table. Phew!

I hope that's not the case. Developers should not be directly accessing these tables.

Maybe this?
$tagsHelper = new JHelperTags;
$tagsHelper->postStoreProcess($my_table_object, $tag_array);

$my_table_object needs to be your JTable object.

$tag_array - tags you want to add.

The problem is - you have to have all your content in the UCM, too.

I'm guessing there is documentation somewhere.

Amy Stephen

unread,
Jul 12, 2013, 10:00:13 AM7/12/13
to joomla-de...@googlegroups.com
Good grief - that's what you are saying. Yes, you are right.

Bakual

unread,
Jul 12, 2013, 10:21:00 AM7/12/13
to joomla-de...@googlegroups.com
I had to do it when I moved the tags in my component from my own implementation to the new JTags. What I did was iterating over each item, loading the table, setting the tags into the metadata column and store it.
Like this:

$db = JFactory::getDBO();
$table = new MycomponentTableMytable($db);
foreach ($items as $item)
{
$table->load($item->id);
$table->metadata = '{"tags":["'.$item->tagtitles.'"]}';
$table->store();
}

Am Freitag, 12. Juli 2013 15:25:54 UTC+2 schrieb betweenbrain:
Amy - Thanks. Not to get off topic, but at some point, I'd love to hear how you added tags programmatically. I was looking at adding tag creation to com_overload, but it is a seemingly a daunting task. It seems like newly tagged items need to be checked for, and added to if not already, the #__ucm_base table for a ucm_id, which seems to duplicate that content in #__ucm_content, and then check for if the tag exists in #__tags, and create that tag if it doesn't exist, and then associate the tag with the item in #__ucm_content in the #__contentitem_tag_map table. Phew!

Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain



On Fri, Jul 12, 2013 at 9:06 AM, Amy Stephen <amyst...@gmail.com> wrote:
Matt - I have a test implementation with sample data - most of it copied once - three tags - one tag on each article - and with this FANTASTIC debugging tool, I can see the problems. It's not just pagination, either, it's across the board. So, I think it's probably a good idea to see what the plan is moving forward with UCM, maybe in the CMS list discussion.

Jindan - I'm not sure what is best, right now. I think it's fair to say using tags with a lot of data right now will be slow. Fixing that might take time, I'm just not sure what the plans are. You might want to try another tagging solution, see if it's better. If you are in no hurry and willing to play along, cool. Let's see where things lead. Feel free to add me on Skype or email anytime.


--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

Amy Stephen

unread,
Jul 12, 2013, 10:28:06 AM7/12/13
to joomla-de...@googlegroups.com
When you say "loading the table" - what table? The UCM table?

Question being - how are you managing data in your source table and still getting it into the UCM tables? (So, before that tags).

Also, am seeing another new tables __associations - what is that - how does it fit in (or does it) with the UCM? (and therefore tags).

Am I correct in assuming anything UCM at this point is there simply in support of tags?

Is there documentation to read?

Roberto Segura

unread,
Jul 12, 2013, 10:45:48 AM7/12/13
to joomla-de...@googlegroups.com
I'll try to help you a little. As we all know tags creation is not an easy thing. I gave a talk at JAB about it and it wasn't really usable.

My sample component using it all is at:
https://github.com/phproberto/ucmtags

Base documentation:
http://docs.joomla.org/J3.1:Using_Tags_in_an_Extension

The system was changed recently to not use the metadata field but TBH I haven't tested it. I started to work on a really simple system but Elin didn't like it. My proposal was create a tag table with all the complex things that the system requires ( https://github.com/phproberto/joomla-cms/blob/d425a0f7cfe84dc15df349010d3f8df84021dff5/libraries/cms/table/tag.php ). You only define:

$contentTypeAlias = 'com_jab.speaker';

And you are done. The base table does the rest for you.

For the UCM content creation the documentation includes a really terrible JSON code. Not easy to understand / modify. I transformed it into something usable for my sample component:

https://github.com/phproberto/ucmtags/blob/master/install.php#L248

I hope it's now better easier for you. If not ping me ;)



El viernes, 12 de julio de 2013 15:25:54 UTC+2, betweenbrain escribió:
Amy - Thanks. Not to get off topic, but at some point, I'd love to hear how you added tags programmatically. I was looking at adding tag creation to com_overload, but it is a seemingly a daunting task. It seems like newly tagged items need to be checked for, and added to if not already, the #__ucm_base table for a ucm_id, which seems to duplicate that content in #__ucm_content, and then check for if the tag exists in #__tags, and create that tag if it doesn't exist, and then associate the tag with the item in #__ucm_content in the #__contentitem_tag_map table. Phew!

Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain



On Fri, Jul 12, 2013 at 9:06 AM, Amy Stephen <amyst...@gmail.com> wrote:
Matt - I have a test implementation with sample data - most of it copied once - three tags - one tag on each article - and with this FANTASTIC debugging tool, I can see the problems. It's not just pagination, either, it's across the board. So, I think it's probably a good idea to see what the plan is moving forward with UCM, maybe in the CMS list discussion.

Jindan - I'm not sure what is best, right now. I think it's fair to say using tags with a lot of data right now will be slow. Fixing that might take time, I'm just not sure what the plans are. You might want to try another tagging solution, see if it's better. If you are in no hurry and willing to play along, cool. Let's see where things lead. Feel free to add me on Skype or email anytime.


--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-general+unsub...@googlegroups.com.

Bakual

unread,
Jul 12, 2013, 11:34:11 AM7/12/13
to joomla-de...@googlegroups.com
I'm using my own table from my component. It had to be adapted to include this around the parent::store()
 
  $this->tagsHelper->preStoreProcess($this);
  $result = parent::store($updateNulls);
  return $result && $this->tagsHelper->postStoreProcess($this);
So it properly creates the ucm entries and tags. The tag data is saved in the metadata column.
 
I understood this as a workaround for a bug related to the batch copy function which does not process postSaveHook.
 
The table #__associations is used for the language (item) associations and doesn't belong to neither ucm nor tags.
 
Currently tags are the only thing using the ucm tables. And I think it's also meant as a way to migrate stuff into ucm once it's completed. But I'd say it's a long way till ucm and most likely a lot needs to be changed again till then. I'm not convinced yet that it was a good decision to include ucm into tags. But it for sure gave a headstart to the ucm project :-)

Gary Mort

unread,
Jul 12, 2013, 11:43:25 AM7/12/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 7:53:00 PM UTC-4, Amy Stephen wrote:
Gary - there *are* *many* problems.

I discouraged your first test because it just removed all the joins - that just doesn't get us anywhere - of course it's faster, but what does it mean?



It get's me somewhere based on the way I default to approaching troubleshooting performance.  It doesn't help you because it doesn't give you information that is of use to your process....  
 

Please understand, though, no matter how passionately you suggest anything your approach is "better" than mine, my response to you will be the same response I give my own assumptions - and that is "Prove it."

Erm, I never said my approach was better, just that we have different approaches.  I find it fun and enjoyable to fiddle with performance issues - so when I am stuck on some bit of code, I often troll through the forums and look for a problem which is diverting and fun to me, such as this one.

My preferred approach is to strip things done to the basics and establish the best possible performance, then incrementally add back complexity and find where things break.  So if I am doing something for 'fun' then that is the way I'm going to do it.  No need to prove myself to anyone because it is for fun and thus I can do things my way.

When approaching a problem professionally, I will use my process by default but have no objection to using alternative processes if the employer prefers them.  Since I view it as a matter of taste, I have no vested ego in doing things "my way"...just a preference to maximize fun when it doesn't interfere with business.

Also when doing things for "fun" I don't have much interest in following through with a complete fix.   I'll push through to identify the problem and identify a hot-fix for the person having the problem so they can get past it.   After that, I've extracted all the "fun" from the process so I'll leave it alone[as opposed to when *I* get bitten by a bug, where for that I will provide a full patch and submit a pull request].

At the moment I need to get back to doing some work professionally - however if you don't want me to stop then when I get to a break/block point and need some fun I'll devise a few more tests and see if Jindan's patience is not exhausted and he's willing to run them.  :-)


Gary Mort

unread,
Jul 12, 2013, 11:53:58 AM7/12/13
to joomla-de...@googlegroups.com


On Thursday, July 11, 2013 8:25:14 PM UTC-4, Jindan Zhou wrote:
Okay was in the run now back into business. Amy, you guys are the
awesome ones that deserve appreciation;-) Gary, also consider the
4000ms is run in my precious tmpfs;-) No I am not *more* confused for
the conflicts as I am not seeing, lol...

Personally, from a DB Admin point of view, 70,000 records is just not very large.  Page load time in excess of 2 seconds is bad.  Page load time in excess of 10 seconds means there is something seriously broken somewhere.

Thanks for all your patience...   It is especially nice that your comfortable with the mysql command line as it allows me to bypass entirely trying to come up with test cases in PHP code that will keep everything working and instead skip directly to troubleshooting the problem query.  Depending on how things have progressed - if your still having problems I'll put together an SQL file of queries later today that if you can run and paste the response from will help me considerably in locating problem areas.  Are you familiar with using redirection with the mysql command to run queries and grab the results?  Ie 'mysql <<somefile.sql >>output.txt'   That way I can just set it up so you can run it all from a single command and then paste the output buffer.
 
Also I ran Gary's code
mysql> CREATE INDEX `tmp_idx_type_alias` on w0z9v_contentitem_tag_map (`type_alias`,`type_id`) USING BTREE;
Query OK, 0 rows affected (0.50 sec)
Records: 0  Duplicates: 0  Warnings: 0
But I don't see that affects page loading time...

Thanks, that means I was wrong and the 4second query length was for the query AFTER the tag lookup and not for the tag lookup via an unindexed field.  I always have to remind myself that eliminating possibilities is progress, even if it doesn't look like progress. :-)

-Gary

Amy Stephen

unread,
Jul 12, 2013, 11:56:43 AM7/12/13
to joomla-de...@googlegroups.com
Understood. You can follow/participate in https://groups.google.com/forum/#!topic/joomla-dev-cms/MWzBgYd2Lgw as you see fit.

If we can get to the point that we have a plan and agreement with the project to move forward, if you are able to dedicate some time to the specific tasks or at least to helping with the testing, it would be appreciated.


--
You received this message because you are subscribed to the Google Groups "Joomla! General Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-gene...@googlegroups.com.

Gary Brooks

unread,
Jul 12, 2013, 12:05:53 PM7/12/13
to joomla-de...@googlegroups.com
@cloudaccess willing to donate some admin time and a dedicated server to help with fixing this issue.

- XDbug Report (Run a profile on code based on URL)
- We would need a copy of the site for our admins to give us back a report.

Possibly we can provide better case study data.  Let me know if the owner/developers helping have interest. We would hope we could get some coder that has the skill set to fix this issue and get the commit in place.  

Gary Brooks
@garyjaybrooks @cloudaccess skypeid: garyjaybrooks2000 
It is loading more messages.
0 new messages