type_alias: #__contentitem_tag_map and #__content_types


Amy Stephen

Jul 11, 2013, 11:08:46 PM
to joomla-...@googlegroups.com
In reviewing the Articles List query speed issues, Gary Mort noticed query times around 4000ms (4 seconds) -- just for the tag queries. The page takes about 18 seconds to load so there are likely other issues and Gary might recommend additional changes, but this is one that must be fixed.

Briefly - this query runs for each list item, retrieving tags for this item:

SELECT `m`.`tag_id`, `t`.*
  FROM `w0z9v_contentitem_tag_map` AS m
  INNER JOIN `w0z9v_tags` AS t
    ON `m`.`tag_id` = `t`.`id`
 WHERE `m`.`type_alias` = 'com_content.article'
   AND `m`.`content_item_id` = 9162
   AND `t`.`published` = 1
   AND `t`.`access` IN (1,1,5)

Suggested change: add a join from #__contentitem_tag_map to #__content_types on type_id, and compare that table's type_alias to 'com_content.article' instead.

Here's the contentitem_tag_map table:

CREATE TABLE `w0z9v_contentitem_tag_map` (
  `type_alias` varchar(255) NOT NULL DEFAULT '', -- REMOVE
  `core_content_id` int(10) unsigned NOT NULL COMMENT 'PK from the core content table',
  `content_item_id` int(11) NOT NULL COMMENT 'PK from the content type table',
  `tag_id` int(10) unsigned NOT NULL COMMENT 'PK from the tag table',
  `tag_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Date of most recent save for this tag-item',
  `type_id` mediumint(8) NOT NULL COMMENT 'PK from the content_type table',
  UNIQUE KEY `uc_ItemnameTagid` (`type_id`,`content_item_id`,`tag_id`),
  KEY `idx_tag_type` (`tag_id`,`type_id`),
  KEY `idx_date_id` (`tag_date`,`tag_id`),
  KEY `idx_tag` (`tag_id`),
  KEY `idx_type` (`type_id`),
  KEY `idx_core_content_id` (`core_content_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Maps items from content tables to tags';

The fastest index is going to be the primary (clustered) index, which in this case is the unique key (`type_id`,`content_item_id`,`tag_id`), since the table has no explicit primary key.

While the query does join on content_item_id and tag_id, it doesn't use the type_id.

The type_id should be a join to the content_types table (below).

CREATE TABLE `w0z9v_content_types` (
  `type_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `type_title` varchar(255) NOT NULL DEFAULT '',
  `type_alias` varchar(255) NOT NULL DEFAULT '',
  `table` varchar(255) NOT NULL DEFAULT '',
  `rules` text NOT NULL,
  `field_mappings` text NOT NULL,
  `router` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`type_id`),
  KEY `idx_alias` (`type_alias`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=10000 ;


Rather than join to that table using the type id and then comparing to the text value there, the map table has been denormalized and it includes the 255 character field. So, for every single item/tag combination - a row is added and that 255 column is populated and used in queries. Not good. Really bad, in fact. To add insult to injury, there is no index on that 255 character field either, so essentially, a table scan happens for each query. (Every row is read for every combination of joins every time.)

Going forward:

- type_alias needs to be removed from the contentitem_tag_map table.
- everywhere in the system, queries that join to the contentitem_tag_map table need to be modified to join to content_types on type_id and then use the content_types type_alias column for comparison.
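A minimal sketch of what those two changes might look like, reusing the `w0z9v_` prefix and the literal values from the query above. The join order and the timing of the `ALTER TABLE` are assumptions, not a tested patch.

```sql
-- Sketch only: the tag lookup rewritten so the map table is matched on the
-- integer type_id, and the alias comparison happens in the small
-- content_types table (which already has idx_alias on type_alias).
SELECT `m`.`tag_id`, `t`.*
  FROM `w0z9v_contentitem_tag_map` AS m
  INNER JOIN `w0z9v_content_types` AS ct
    ON `ct`.`type_id` = `m`.`type_id`
  INNER JOIN `w0z9v_tags` AS t
    ON `t`.`id` = `m`.`tag_id`
 WHERE `ct`.`type_alias` = 'com_content.article'
   AND `m`.`content_item_id` = 9162
   AND `t`.`published` = 1
   AND `t`.`access` IN (1,1,5);

-- Assumption: run this only after every query that reads the column
-- has been migrated to the join above.
ALTER TABLE `w0z9v_contentitem_tag_map` DROP COLUMN `type_alias`;
```

Note that the two `type_id` columns are declared differently in the schemas shown (`mediumint(8)` in the map table, `int(10) unsigned` in content_types); the join still works, but aligning the types may be worth considering as part of the same change.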

I don't know how you want to deal with the API issues but if it were up to me, I'd break the API and get the fix in and let Nic yell at me ;-) . The longer this problem is out there, the more extensions use that bad join. This error has the capacity to bury any site that uses tags and has more than a few items of content.

What do extension devs think should happen? Not sure who all worked on tags, but does this seem reasonable? Am I missing something?
 

Chad Windnagle

Jul 11, 2013, 11:31:13 PM
to joomla-...@googlegroups.com
Reading your post has convinced me that a break now and a fix before 3.5 is better than living with this nasty one for the next LTS! I agree with trying to fix it unless there's some more info about why the current implementation was necessary.
--
You received this message because you are subscribed to the Google Groups "Joomla! CMS Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-cm...@googlegroups.com.
To post to this group, send an email to joomla-...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-cms.
For more options, visit https://groups.google.com/groups/opt_out.
 
 


--
Regards,
Chad Windnagle

Matt Thomas

Jul 12, 2013, 7:43:47 AM
to joomla-...@googlegroups.com
I agree that a break in an STS release is better than living with it in an LTS release. I do agree that more information on the current implementation would be helpful.

Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain

Amy Stephen

Jul 12, 2013, 9:24:32 AM
to joomla-...@googlegroups.com
First, kudos to Beat on that new debugging environment. Being able to see index selectivity on each query is handy. =)

There is more than just this index.

I have three concerns:

1. The technical issues. Content is now in two tables (duplicate data is not easy to maintain and it should not happen - ever - IMO), queries that run but shouldn't (pagination is one example), this column and its use throughout the application as a key field (why isn't the real key used?), too many indexes that end up getting selected instead of or in addition to the primary index, which leads to table scans during queries.

It might be said that of course someone with 40k data will have a slow environment, and that's fair. But, I'm testing with sample data, copied once into another category, three tags, each article has one, and the problems are visible using this new tool.

2. That said, my second concern is -- does the project (the powers that be, those who are working on this, people who make decisions) -- even see this as a problem?  Because if it's just perceived to be the pain that comes with a system of incremental improvement, then, no sense in putting a lot of time in this as the fixes would likely not be accepted.

3. That said, my third concern is -- I have no idea what the plans are for UCM, what work is currently underway, where is this work taking place, are these problems already being fixed?

Right now, it's like the CMS is between two architectures with feet in both. My guess is the plan is to move towards the UCM tables (in which case, fixing this key is important.)

One recommendation I have for the Joomla project is to identify a database expert (I mean a real one) and empower them with authority over every query, any schema changes, and ask them to pull a plan together for how to get from here to where things are humming better.

A second recommendation is to stop adding stuff to the CMS but focus on planning and publishing detailed plans of where things are going. We have Nic's effort, UCM, those are going different directions. Then, the framework, yet another direction. This has to stop. Joomla - all together - as a whole - one community - one plan - everyone should work from it.

Matt Thomas

Jul 12, 2013, 9:28:26 AM
to joomla-...@googlegroups.com
+1 Amy. I don't see any other way to maintain long-term sustainability of the project without addressing these issues. 

Best,

Matt Thomas
Founder betweenbrain
Phone: 203.632.9322
Twitter: @betweenbrain



Amy Stephen

Jul 12, 2013, 11:45:02 AM
to joomla-...@googlegroups.com
So, back to #1) technical issues.

In addition to fixing the use of the varchar(255) key - we need to remove anything using the ucm_content and ucm_base tables.

Only the content_types, tag table and the tag associations table are needed. The other data is not needed. It's just creating a lot of extra complication and slow down.

The views which display data associated from tags should display that data from the REAL source data, not a copy of that data.

The tag map table can use type_id to join to the content_types table. Within the types table, the type alias can be used to connect to the source table. The field mappings can be used to search and access data - just use the data logically, instead of physically.

Would the project consider these types of changes?

1. use type_id - not type_alias in joins throughout the application (remove type_alias from map table) 
2. remove ucm_base and ucm_content entirely - instead use the real content in conjunction with the content type alias and the content_types field_mappings column.

Mark Dexter

Jul 12, 2013, 11:54:57 AM
to joomla-...@googlegroups.com
If I recall correctly, there are values from the source tables that we need in those queries, such as title, author, dates, published status, etc. We could have accomplished this without using the ucm table, but it was much, much easier to get everything from the one table. For example, ordering the tagged items by title or date would have been difficult otherwise (requiring UNION for each content table, for example).
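For illustration, the UNION-per-table alternative mentioned above might look roughly like the sketch below for two components. The tag id (`2`), the `'com_weblinks.weblink'` alias, and the `title`/`created` columns on the component tables are assumptions made for the example; every additional content type would need its own branch.

```sql
-- Sketch only: items carrying one tag, drawn from two component tables,
-- ordered and paginated across both.
(SELECT 'com_content.article' AS type_alias, c.id, c.title, c.created
   FROM w0z9v_contentitem_tag_map AS m
   INNER JOIN w0z9v_content_types AS ct ON ct.type_id = m.type_id
   INNER JOIN w0z9v_content AS c ON c.id = m.content_item_id
  WHERE m.tag_id = 2                                -- hypothetical tag id
    AND ct.type_alias = 'com_content.article')
UNION ALL
(SELECT 'com_weblinks.weblink' AS type_alias, w.id, w.title, w.created
   FROM w0z9v_contentitem_tag_map AS m
   INNER JOIN w0z9v_content_types AS ct ON ct.type_id = m.type_id
   INNER JOIN w0z9v_weblinks AS w ON w.id = m.content_item_id
  WHERE m.tag_id = 2
    AND ct.type_alias = 'com_weblinks.weblink')     -- assumed alias
ORDER BY title
LIMIT 0, 20;
```

The trailing `ORDER BY` and `LIMIT` apply to the combined result, so the server has to materialize every branch before it can sort and paginate.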

Since we wanted to move in that direction anyway and combine the existing components into one table, that was the decision. 

If you want to propose code to improve the query while preserving the existing functionality, that would be great. At this point, I think we will move forward from where we are. If we could get help moving the existing components over to the UCM tables and drop the separate component tables, that would be great and will eliminate the duplicate content in the db.

Mark

Gary Mort

Jul 12, 2013, 12:26:00 PM
to joomla-...@googlegroups.com


On Thursday, July 11, 2013 11:08:46 PM UTC-4, Amy Stephen wrote:
In reviewing the Articles List query speed issues, Gary Mort noticed query times around 4000ms (4 seconds) -- just for the tag queries. The page takes about 18 seconds to load so there are likely other issues and Gary might recommend additional changes, but this is one that must be fixed.


Note: I misread the explain file and attributed query load timing for the article list lookup to the tag lookup [it wasn't clear on the data dump if it applied to the query above or below the line... turns out it was below!]

The execution time to verify that a single article has no tags was 0.26 ms 

Not saying that this shouldn't be fixed [what if all 70,000 articles in that table were tagged in some manner?] - just being upfront that for this use case the missing index [or poorly designed query?] is not a factor for the excessive page load time.

Amy Stephen

Jul 12, 2013, 12:40:59 PM
to joomla-...@googlegroups.com

On Friday, July 12, 2013 10:54:57 AM UTC-5, Mark Dexter wrote:
If I recall correctly, there are values from the source tables that we need in those queries, such as title, author, dates, published status, etc. We could have accomplished this without using the ucm table,

That is what I was describing - the mapping will provide that in a logical sense.
 
but it was much, much easier to get everything from the one table. For example, ordering the tagged items by title or date would have been difficult otherwise (requiring UNION for each content table, for example).

Sure - same problem as the search.
 

Since we wanted to move in that direction anyway and combine the existing components into one table, that was the decision. 

Right, all that makes sense - except you and I both have years in the business so we understand that's not ever going to be a great approach. Already, I know that there will be holes in the replicated data if there is no tag.

For example, unless there is a tag, there are no data integrity processes between the two. When a tag comes off - the data stays there - but it is no longer updated. When a new article is added, without a tag, then no row is added to UCM. That alone means the existing table can't just be dropped: both an update and insert process and a field-by-field review would be required to move the data correctly. That reason alone means it will be better to migrate again.
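One way to see the first kind of hole is to look for articles that have no counterpart row in the UCM table. This is a rough sketch only; the `core_type_alias`, `core_content_item_id`, and `core_content_id` column names on `ucm_content` are assumptions about the 3.1 schema and should be checked against the actual table.

```sql
-- Sketch only: articles with no matching row in ucm_content
-- (for example, articles that were never tagged).
SELECT c.id, c.title
  FROM w0z9v_content AS c
  LEFT JOIN w0z9v_ucm_content AS u
    ON u.core_type_alias = 'com_content.article'   -- assumed column name
   AND u.core_content_item_id = c.id               -- assumed column name
 WHERE u.core_content_id IS NULL;                  -- no UCM copy exists
```

Stale copies (rows left behind after a tag was removed) would need a separate field-by-field comparison, which this query does not attempt.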

Also, the truth is, using it for the integration right now is problematic. There is absolutely no way of knowing what plugins are doing to this data, for example. More than likely those plugins would be directly modifying the data rather than interfacing through an API; that's a common process in Joomla since there is no good CRUD API for interacting with data. So, whatever changes might be leaking in that way are not being carried forward. So even its intended use to avoid the joins is not foolproof.

If I spent a few hours thinking about legitimate ways of updating the data that do not get passed through the tag helper - I am certain I could come up with other examples.

The problem starts to spread because as soon as there is a data source -- and one that is good for a specific reason, as you are saying -- consolidated data good for the tags page - good for searches -- developers begin to use it. So, the current situation is two piles of the same data - no way of ensuring with 100% confidence (or 80% confidence) that these data will stay intact. There is a clear possibility of problems with the manner in which it is used now, it is not likely to be a great source of data for the manner in which it might be used in the future, and developers are using it too - this problem starts to spread.


If you want to propose code to improve the query while preserving the existing functionality, that would be great.

+1
 
At this point, I think we will move forward from where we are. If we could get help moving the existing components over to the UCM tables and drop the separate component tables, that would be great and will eliminate the duplicate content in the db.

I understand, with respect, though I believe the duplicated data should be eliminated now in order to ensure a clean move to the new tables.

However, this is where more information would help. What does it mean to move existing components over to the UCM tables? What needs to be done to get rid of com_content, for example. Maybe it is easier to finish the migration but I have no idea what those steps are or what the plans are to finish the UCM implementation. Is there a road map or any communication that I can study to get up-to-date?
 


Mark Dexter

Jul 12, 2013, 12:46:42 PM
to joomla-...@googlegroups.com
I believe it means dropping #__content, #__weblinks, #__contact_details, #__newsfeeds and using only #__ucm_content as the point of truth for the information in these tables. (Perhaps keeping the old names as views, dunno.) There is a link here: http://docs.joomla.org/Unified_Content_Model_Working_Group.

If you contact Elin, she probably has more information on the next steps. If someone could convert one component (say weblinks) that would be a good start. I believe the infrastructure for doing this is in place, although until we actually code it we can't be sure that everything we need is there. 

Mark

Amy Stephen

Jul 12, 2013, 12:47:00 PM
to joomla-...@googlegroups.com
Like I said in the other thread, Gary, your conclusion that an index was needed was wrong, but your sense of where the problem lies was correct.

The explain does point to the tag queries. The indexing is an issue. But, no new index is needed, rather indexes should be removed and the primary index already in place needs to be used. That means we move away from the 255 character field, remove those alternate indexes so they don't get selected and get those queries running on that primary index.  When you get a moment, fire up 3.1 and look at the new debugging tool - easier to read that explain data with the interface.
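As a rough sketch of that check outside the debugging tool: `EXPLAIN` reports which index the optimizer picks for a lookup on the map table. The `type_id` value below is hypothetical. In the schema shown earlier, `idx_type` is a left prefix of `uc_ItemnameTagid` and `idx_tag` is a left prefix of `idx_tag_type`, so those two are the most obvious candidates for removal; whether to drop them, and any others, is an assumption that should be confirmed against real `EXPLAIN` output.

```sql
-- Sketch only: the `key` column of the output shows the chosen index.
EXPLAIN
SELECT m.tag_id
  FROM w0z9v_contentitem_tag_map AS m
 WHERE m.type_id = 1                 -- hypothetical type_id for articles
   AND m.content_item_id = 9162;

-- Candidate cleanup of the left-prefix duplicates.
ALTER TABLE w0z9v_contentitem_tag_map
  DROP INDEX idx_type,
  DROP INDEX idx_tag;
```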

Thanks.

Amy Stephen

Jul 12, 2013, 1:47:54 PM
to joomla-...@googlegroups.com
I'll drop Elin a note about the thread, hopefully, she'll have some time to respond here. It would be good to understand what needs to happen to drop those tables, as you are saying. I have been on that page and there is absolutely nothing there. So, unless there is another resource, I suppose we'll have to get the plan from Elin.

Thanks Mark.

Herman Peeren

Jul 12, 2013, 2:24:24 PM
to joomla-...@googlegroups.com
The plan with UCM is not to drop those specific tables, but use the #__ucm_content table to store their common data (like author, creation date, locking info etc. etc.). The specific tables will still be used for data that is specific for that content type.

Some confusion, I think, has come from fields like core_images and core_body, which could be expected in the specific ("special") table. Probably that has been done because many content types need some images and have a larger text-field. Therefore for instance from com_contacts the address is stored in the common table (#__ucm_content), not the suburb-field. By putting more fields that can be generally used in the common table, some content types even won't need a specific table, so less joins are necessary.

Next week I hope to show a bit different solution for UCM without the common table... But I'm the only one thinking in that direction. The UCM workgroup is definitely going into the direction of a common table now, trying to implement that the coming weeks: before the feature freeze for Joomla 3.2 (because new features were not planned for 3.5, so it is now .... or 4.0).

Herman

Amy Stephen

Jul 12, 2013, 2:49:15 PM
to Joomla! CMS Development
Herman - is there a roadmap I can review from your team? Or a link to discussions? Better yet, a repo where folks are working?

Thanks!

Sent note to Elin re: this thread.


Herman Peeren

Jul 12, 2013, 3:07:59 PM
to joomla-...@googlegroups.com
Well, it's not exactly "my team" and as far as I know there is no roadmap. There is some more info about the UCM specification; I'll post it here if not yet done by someone.

One of the conclusions at #jab13 was that the efforts to get some UCM, a RAD-layer (FOF) and RESTful webservices into Joomla! 3.x have quite some overlap (and also my main interest: ORM) . So there has been and still is some informal exchange of information, the FOF-workgroup made a REST/HAL implementation etc. The FOF-repo (development branch) is the best to see for that. For the rest most code is a bit scattered, some in a Kunena branch etc. Maybe not all plans will make it before the feature freeze (which was originally by July 15 and is probably postponed 'till the end of the month, as the FOF workgroup will be doing a code sprint at the end of this month in Italy).

I'm, as usual, a bit from the sideline, focussing more on ORM (this summer implementing a huge object model into Joomla with Doctrine2 for an international NGO, so that can be a big step forward).

- Herman

Amy Stephen

Jul 12, 2013, 3:31:33 PM
to Joomla! CMS Development
Thanks Herman.



Herman Peeren

Jul 12, 2013, 3:36:10 PM
to joomla-...@googlegroups.com
Some recent UCM specs:

The feature-UCM branch in the joomla-projects repo was planned to be used for implementation.
https://github.com/joomla-projects/joomla-cms/tree/feature-ucm/administrator/components

At the moment I see the most recent UCM-code in Joomla 3.1: /libraries/cms/ucm

Amy Stephen

Jul 12, 2013, 3:49:14 PM
to Joomla! CMS Development
I'm not trying to be rude when I say that these links do not help me, they just don't have any real information or plans or specs. Basically, it says it's going to be unified, integrated, users will like it.

The repo is two months old - no new code - just an old copy of core.

There are some serious performance problems right now and I have described in some detail what is happening. What it gets down to is using tags means duplicate content, joins made on varchar(255) fields that are not indexed, redundant queries -- and massively slow results.

I have suggested these fixes:
1. change from the varchar joins to the integer key.
2. get rid of the ucm base and content tables UNTIL it's time to use them for wherever this is leading. Use the logical associations defined in the content types table to provide the integrated tag page.
3. get the queries directed to the clustered index (in part by removing some of the secondary indexes that are being used instead.)

But, it's not worth going through this effort if that work will not be accepted.

Mark has said Elin is the one. So, we'll see.

Meanwhile, understand -- people are having big problems today because of the UCM/Tags issues I have described in this post above and in the general thread.



Mark Dexter

Jul 12, 2013, 4:11:57 PM
to joomla-...@googlegroups.com
Amy, it's up to the community to do this, not any one person. Elin has graciously volunteered to be the coordinator of the UCM working group. But that, as you well know, does not mean it is on her to be a one-person spec-writing, coding, bug-fixing machine. It's up to the community. 

The working groups are only as good as the people in them. I agree with you that it would be great to have some stuff written in one place (for example, on that wiki page or in a linked Google doc). Perhaps a great way to get started with the group and to be quite useful would be to pull together all the information that is floating around and organize it in one place. That way one could (a) get informed on where things stand and (b) make it much easier for others to catch up.

Mark


Herman Peeren

Jul 12, 2013, 4:12:39 PM
to joomla-...@googlegroups.com
Yes, I've followed the "Slow-loading..."-thread. And I just answered some related tags/UCM-related problems on the Dutch Joomlacommunity forum. I completely agree with you. In fact the way tags are implemented, using so much of the original UCM idea  Louis proposed in december 2011, was accepted, because many people wanted tags and nobody else coded it. As you know Elin is a fan of UCM and she saw some possibilities and had some good reasons to use it. There had been some talking about UCM and there was a general positive feeling about it. We don't have a code review in Joomla, just style-rules. If I have a better idea I'll have to code it. No time = no code = not accepted. The code we have now is mainly Elin's idea of what the UCM should be. Because she was the only one having coded it.  It's just as simple as that.

I just wouldn't worry about if your code will be accepted or not. At the moment in the CMS almost any idea is accepted. If it is just coded. And if you have good reasons to change things (and I think you have), you should just do a pull request and after testing it will almost certainly be accepted. And maybe lobby a bit for people testing it.

Herman

Amy Stephen

Jul 12, 2013, 4:17:30 PM
to Joomla! CMS Development
Wait a minute, Mark. You asked me to contact her. What's the problem?

Richard McDaniel

Jul 12, 2013, 4:23:00 PM
to joomla-...@googlegroups.com
As someone that's been waiting three months for a tested bug fix to get approved, I disagree.

Herman Peeren

Jul 12, 2013, 4:24:27 PM
to joomla-...@googlegroups.com
BTW, your suggested changes match nicely in some respects to what I hope to show next week (an UCM implementation using existing tables and without a #__ucm_content table).


Mark Dexter

Jul 12, 2013, 4:26:57 PM
to joomla-...@googlegroups.com
It's fine to contact her. Just don't "voluntell" her to do a bunch of stuff. Mark

Herman Peeren

Jul 12, 2013, 4:31:38 PM
to joomla-...@googlegroups.com
On Friday, 12 July 2013 22:23:00 UTC+2, Richard McDaniel wrote:
As someone that's been waiting three months for a tested bug fix to get approved, I disagree.

OK, sorry, thanks for the info. That is sad. I based my words on the observation that much is accepted that is not all that good. I only thought getting enough (2) testers was the bottleneck. But your experience is a bit different. Then it might be that some people are more equal in this project than others... and that is sad.
 

Amy Stephen

Jul 12, 2013, 4:31:37 PM
to Joomla! CMS Development
On Fri, Jul 12, 2013 at 3:12 PM, Herman Peeren <herman...@gmail.com> wrote:
Yes, I've followed the "Slow-loading..."-thread. And I just answered some related tags/UCM-related problems on the Dutch Joomlacommunity forum. I completely agree with you. In fact the way tags are implemented, using so much of the original UCM idea  Louis proposed in december 2011, was accepted



For clarity, none of the issues I am talking about are related to the UCM, more deviations from that plan.

I have no problem with the prototype they shared. If Joomla ends up implementing that, I think life will be good.

MOST of the code involved is very good but there are some problems.


GOOD STUFF

Tables: tags, contentitem_tag_map, content_types

BAD STUFF

Tables: ucm_base and ucm_content (until they are in full use); the column type_alias in the contentitem_tag_map

I AM ASKING IF THE PROJECT WOULD ACCEPT THIS WORK:

1. Remove the type_alias column from the contentitem_tag_map table - it is 255 characters -- it is used as a key for a join -- damn thing is not even indexed.

2. Instead, join the contentitem_tag_map table to the content_types table on a nice numeric key named type_id -- the 255 character field lives in the 10 row content_types table. THAT can be used for the comparison. =)

3. Remove the redundant data and usage for ucm_base and ucm_content UNTIL the ucm is ready

4. Instead, use the content_types table field mappings field to link to the original source

There are serious performance issues that these fixes should help to resolve. There will be other issues but these are some of the more important.


 

Mark Dexter

Jul 12, 2013, 4:36:38 PM
to joomla-...@googlegroups.com
Maybe others are better at visualizing code than I am, but I can't really say until I see a patch and have a chance to test it. As a general statement, I think it is very safe to say that if we can improve the query performance without changing the query results, how could anyone be against that? If we are talking about re-designing something and changing a lot of tables, that gets to be harder to say.

Mark

 


Herman Peeren

Jul 12, 2013, 4:38:07 PM
to joomla-...@googlegroups.com
I understand you not wanting to code something that will not be accepted, but I am afraid it doesn't work that way. You code something and only after that it will be accepted (or not). For what I see now, I think these are brilliant ideas that should be coded as soon as possible. So one vote from one community member.

Amy Stephen

Jul 12, 2013, 4:39:11 PM
to Joomla! CMS Development
On Fri, Jul 12, 2013 at 3:26 PM, Mark Dexter <dexter...@gmail.com> wrote:
It's fine to contact her. Just don't "voluntell" her to do a bunch of stuff. Mark

Where are you getting this? I have not asked Elin to do anything. The only involvement I have had with Elin is contacting her, as per your request, to point her to this thread. I copied you on the note.




 

Mark Dexter

Jul 12, 2013, 4:42:49 PM
to joomla-...@googlegroups.com
I apologize. I got the impression that you were somehow (a) expecting that Elin would have all the answers and (b) that you couldn't do anything until/unless you heard from her. Both of these are incorrect. Contacting Elin is one possible step. If she is able to point you to some additional docs or provide some other help, that's great. If contacting Elin doesn't work out, there are other things you could do -- for example to follow up on the links that were posted earlier. I just don't want to "voluntell" Elin for something. Hopefully that clarifies it. Again, I apologize for misunderstanding your comments.

Mark

Herman Peeren

Jul 12, 2013, 4:53:36 PM
to joomla-...@googlegroups.com
Let's keep things clean and not get too personal. I expect that if you just propose to get rid of some half-baked tables, which are someone's "child", then that person could object to that. But if you just stick to the rational arguments (measured in seconds of query time), then it will be no problem. The main thing we should try when proposing code is to keep tags working, for otherwise not many people will be enthusiastic about it.

Amy Stephen

Jul 12, 2013, 4:58:45 PM
to Joomla! CMS Development
No problem, Mark. But, again, the only reason I contacted Elin was because you suggested she could explain what was needed to address the duplicate content issue differently than I am proposing.

My suggestion is to remove the base and content UCM tables until those tables are needed and the others gone. Your point, as I understand it, was there are other plans that Elin might be able to share. I believe your point was we could be close to dealing with the duplication with existing plans and that direction might be faster.

And, no, I'm sorry, but given the fact that the tables are in place now and that movement is in that direction, it would be rather rude, in my opinion, to put up code that I know will create problems for those who have been working in this area if it's not first discussed and agreed to be an appropriate direction.  Collaboration requires communication.

Thank you for understanding.



Mark Dexter

Jul 12, 2013, 5:07:50 PM
to joomla-...@googlegroups.com
Actually, those tables are an integral part of the functioning of tags. I would be interested to see a detailed approach for doing tags with the current functionality without a unified table. The only approach I could come up with was using a number of UNIONs, which I think we would agree would be much slower than the present implementation. 

One could have done tags without a unified table and changed the functionality (for example, don't allow ordering and pagination across components). But I think, given the current functionality, the overall implementation is reasonable. 

I would like to say that I strongly agree with Herman's earlier point. With any feature like tags, there are many reasonable ways to implement it. The person who writes the code gets to make a lot of decisions about that. As long as the code works and is a reasonable solution to the problem, it will likely be accepted. In the case of Tags, Elin did the bulk of the coding, but there were several of us helping with it, including David Hurley, Robert Segura, Michael Babker, and myself. (I apologize if I'm forgetting anyone.)  I hope none of us would be opposed to improvements to the current implementation, as long as we don't break anything we have now.

We all fully expect that there will be much room to improve it as we move forward, and code that does this is most welcome.  

Mark

Herman Peeren

Jul 12, 2013, 5:10:40 PM
to joomla-...@googlegroups.com
Today on a Dutch forum I helped a developer using that #__ucm_content table, so that is indeed a problem: once it is in the core, it will be used, and that makes it harder to get rid of. Maybe we let too much unfinished stuff slip into Joomla (some specific UCM implementation, a "new" MVC, etc.). Maybe we should leave some things in a development branch until they are mature enough.

Amy Stephen

Jul 12, 2013, 5:19:45 PM
to Joomla! CMS Development
On Fri, Jul 12, 2013 at 4:07 PM, Mark Dexter <dexter...@gmail.com> wrote:
Actually, those tables are an integral part of the functioning of tags. I would be interested to see a detailed approach for doing tags with the current functionality without a unified table. The only approach I could come up with was using a number of UNIONs, which I think we would agree would be much slower than the present implementation. 
 
I explained the approach I would use several times above. I would use the content types table to get the table name, and I would logically map to the data using the mapping data. It would be a solution more like the search.

I understand -- and even share -- a desire for more, but the reality is, while the current approach is ambitious, it is not sound. I explained real problems that exist right now with the duplicate data above. For me, the integrity of the data is much more important than returning weblinks and articles in a single call.

When the UCM is in place, then that integration can be shared. Until then, we live in the world of reality, what can I say?


One could have done tags without a unified table and changed the functionality (for example, don't allow ordering and pagination across components). But I think, given the current functionality, the overall implementation is reasonable. 

I can accept we feel differently about that. I can even agree that you might be right. Now, do you understand why I am asking if the project will accept this work? What sense is there for me to fix these issues if it's not going to be considered?
 

I would like to say that I strongly agree with Herman's earlier point. With any feature like tags, there are many reasonable ways to implement it. The person who writes the code gets to make a lot of decisions about that.

OK, well, I'll go write the code then. Be right back.

Herman Peeren

Jul 12, 2013, 5:19:59 PM
to joomla-...@googlegroups.com
On Friday, 12 July 2013 23:07:50 UTC+2, Mark Dexter wrote:
Actually, those tables are an integral part of the functioning of tags. I would be interested to see a detailed approach for doing tags with the current functionality without a unified table. The only approach I could come up with was using a number of UNIONs, which I think we would agree would be much slower than the present implementation. 

No, sorry Mark, I don't just agree with that. A combination of UNIONs and some clever caching might be much more efficient. I'll code some things so we can measure the difference, for we can only be sure if we measure it. And the measurements of those queries in the other thread are a warning that something is very bad now. I'll stop the talking. PHP is the only language we need here. Let's rethink some of that UCM now, while we still have time...
 

Michael Babker

Jul 12, 2013, 5:27:32 PM
to joomla-...@googlegroups.com
Folks, please keep in mind also that this API needs to support three separate database drivers (MySQL, PostgreSQL, and SQL Server).  What may optimize things for MySQL may break PostgreSQL compatibility, and that's an issue we've been dealing with in the CMS since the introduction of this level of database compatibility.


 


Amy Stephen

Jul 12, 2013, 5:41:22 PM
to Joomla! CMS Development
On Fri, Jul 12, 2013 at 4:27 PM, Michael Babker <michael...@gmail.com> wrote:
Folks, please keep in mind also that this API needs to support three separate database drivers (MySQL, PostgreSQL, and SQL Server).  What may optimize things for MySQL may break PostgreSQL compatibility, and that's an issue we've been dealing with in the CMS since the introduction of this level of database compatibility.


Michael - if there is anything I have recommended that would create this concern, please share those specifics. Otherwise, understand I assume this does not relate to what I am suggesting.

Michael Babker

Jul 12, 2013, 6:20:31 PM
to joomla-...@googlegroups.com
I haven't seen anything specific as of yet, it's just more of a general reminder than anything.
--
You received this message because you are subscribed to the Google Groups "Joomla! CMS Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joomla-dev-cm...@googlegroups.com.
To post to this group, send an email to joomla-...@googlegroups.com.
Visit this group at http://groups.google.com/group/joomla-dev-cms.
For more options, visit https://groups.google.com/groups/opt_out.
 
 


--
- Michael

Please pardon any errors, this message was sent from my iPhone.

Amy Stephen

Jul 12, 2013, 6:34:10 PM
to Joomla! CMS Development
The GROUP BY is the one I know of; we might be able to at least check whether it's a MySQL implementation and, if not, hold it out.

Thanks Michael.



Amy Stephen

Jul 13, 2013, 3:17:55 AM
to joomla-...@googlegroups.com
Patch 1 - Type Alias column in contentitem_tag_map table

https://github.com/AmyStephen/joomla-cms/commit/82c3af96a01649a62c5cd6460b0fd00bfbefe5

Change:

1. Removed the contentitem_tag_map.type_alias 255-character column
2. Changed all relevant queries to join to the content_types table on type_id and to filter on the type_alias already stored in the content_types table

These three tables are the primary tables used in tag processing:

1. tags_table - 1 row per tag with metadata
2. contentitem_tag_map - 1 row per item / tag -- joins to tags_table on tag_id; joins to content_types table on type_id
3. content_types table - 1 row per component, links to the tag_map table on type_id; its type_alias maps to the component.

No changes in functionality.
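
The claim of unchanged functionality can be checked on a toy dataset. The following is a minimal sketch, not the patch itself: it uses Python's built-in SQLite as a stand-in for MySQL, unprefixed table names, and invented rows. The `type_alias` column is kept in the map table here only so that the old and new query shapes can be run side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content_types (type_id INTEGER PRIMARY KEY, type_alias TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, title TEXT NOT NULL, published INTEGER NOT NULL);
-- type_alias is kept here only so both query shapes can be compared
CREATE TABLE contentitem_tag_map (
    type_alias TEXT NOT NULL,
    type_id INTEGER NOT NULL,
    content_item_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    UNIQUE (type_id, content_item_id, tag_id)
);
-- Invented sample rows (assumptions for illustration only)
INSERT INTO content_types VALUES (1, 'com_content.article'), (2, 'com_weblinks.weblink');
INSERT INTO tags VALUES (10, 'php', 1), (11, 'mysql', 1), (12, 'draft', 0);
INSERT INTO contentitem_tag_map VALUES
    ('com_content.article', 1, 9162, 10),
    ('com_content.article', 1, 9162, 11),
    ('com_content.article', 1, 9162, 12),
    ('com_weblinks.weblink', 2, 9162, 10);
""")

# Old shape: filter on the 255-character alias stored in the map table.
OLD_QUERY = """
SELECT t.id FROM contentitem_tag_map AS m
JOIN tags AS t ON m.tag_id = t.id
WHERE m.type_alias = ? AND m.content_item_id = ? AND t.published = 1
ORDER BY t.id
"""

# New shape: filter on the alias in content_types, join to the map on type_id.
NEW_QUERY = """
SELECT t.id FROM contentitem_tag_map AS m
JOIN content_types AS ct ON ct.type_id = m.type_id
JOIN tags AS t ON m.tag_id = t.id
WHERE ct.type_alias = ? AND m.content_item_id = ? AND t.published = 1
ORDER BY t.id
"""

params = ("com_content.article", 9162)
old_ids = [row[0] for row in conn.execute(OLD_QUERY, params)]
new_ids = [row[0] for row in conn.execute(NEW_QUERY, params)]
print(old_ids, new_ids)  # both lists hold the published article tags: [10, 11] [10, 11]
```

Both queries take the same natural key as input; only the table that key is matched against differs.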

This tool Beat shared is a life saver -- you can see from the image that the primary key for that query was selected. http://twitpic.com/d2cwkf

Problem:
Queries used contentitem_tag_map.type_alias, a 255-character field that had no index, to join to tags and content.
This resulted in the primary index not being selected and in "table scans" when linking tags and content.


A table scan, in this case, is a situation where each item in one set of data must be compared individually with each item in the other to see if a join results. This has a multiplying effect, like a Cartesian product.

In most websites, these problems are not recognizable given the low volume of data. If there are 100 articles and 3 tags, a Cartesian product results in 300 comparisons. Given the processing speed of most hardware, the human mind will not notice this. It might not even be a problem with 1,000 articles and 100 tags, or 100,000 comparisons.

But when you have 40,000 rows and 3 tags per row (120,000 map items) that means 4,800,000,000 comparisons. Now, it's impossible not to notice the page load.
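
The arithmetic above can be reproduced in a few lines. This is only a sketch: `comparisons` mirrors the multiplication in the text, while `indexed_lookups` is a deliberately crude model added for contrast (one binary-search-style descent per outer row), an assumption rather than a measurement of any real engine.

```python
import math


def comparisons(outer_rows: int, inner_rows: int) -> int:
    """Row pairs examined when every outer row is checked against every inner row."""
    return outer_rows * inner_rows


def indexed_lookups(outer_rows: int, inner_rows: int) -> int:
    """Crude model (assumption): about log2(inner_rows) steps per outer row."""
    return outer_rows * math.ceil(math.log2(inner_rows))


small = comparisons(100, 3)             # 100 articles, 3 tags
medium = comparisons(1_000, 100)        # 1,000 articles, 100 tags
map_rows = 40_000 * 3                   # 3 tags per row -> 120,000 map items
large = comparisons(40_000, map_rows)   # 40,000 rows against 120,000 map items
indexed_steps = indexed_lookups(40_000, map_rows)

print(f"{small:,} {medium:,} {large:,} {indexed_steps:,}")
# 300 100,000 4,800,000,000 680,000
```

Real B-tree indexes have a much higher fan-out than a binary search, so the indexed figure is, if anything, pessimistic; the point is the orders of magnitude between the two approaches.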

Considering the tag query can run several times on the page, this can seriously impact large sites. (Note: caching obviously is going to help, but not the first visitor for a page.)

When indexing data, there are two types of indexes: a primary index, which defines how the data is written to the hard drive, and secondary indexes, which link together data that is spread out on the machine.

When working with large data sets and listing data, the only index that is ever going to be helpful is the primary index. The read starts at one location and continues until complete. With a secondary index, each read requires determining the location of the next item, so it's get the data, figure out what's next, get the next data, figure out what's next. Much slower.

Rarely do you want a secondary index to be used. The data must be highly selective, and the query must retrieve a small result set for the index to be helpful. So, for example, using the primary key is the highest selectivity possible. In that case, a secondary index is good. Gender, on the other hand, has two values, so it would have poor selectivity on a large employee table. The important point is - you can slow queries down with indexes if the RDBMS is "tricked" into thinking selectivity is high and the query results will be low but the reverse is actually true.
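
Selectivity, as used above, can be expressed as the share of distinct values in a column. The sketch below uses an invented 10,000-row employee table; the helper name `selectivity` and the data are assumptions for illustration, not anything an RDBMS exposes under that name.

```python
def selectivity(values):
    """Distinct values divided by total rows: 1.0 means every row is unique."""
    return len(set(values)) / len(values)


# Invented employee table: a unique ID column and a two-valued gender column.
employee_ids = list(range(1, 10_001))
genders = ["F" if i % 2 else "M" for i in employee_ids]

print(selectivity(employee_ids))  # 1.0
print(selectivity(genders))       # 0.0002
```

A lookup on the ID column narrows 10,000 rows to one, while a lookup on gender still leaves about half the table to read, which is why an index on the latter is unlikely to pay for itself.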

When working with enormous datasets, something I have spent years of my life doing with data warehousing, a large part of introducing new data is to help instruct an RDBMS about your data to improve its query path determination. There are some very complex strategies to tune performance. But most of that time is spent training it to take the primary path.

The best way to build in performance is to focus on the primary index and make certain your queries for that data use joins that consistently hit the primary index from left to right with no missing columns.

In the case of the `jos_contentitem_tag_map` table -- a 255-character field was added and that is used in nearly every query. There is no index on that column. It resulted in a Cartesian product when querying. That is as slow as it will ever get.

The primary index for `jos_contentitem_tag_map` is `uc_ItemnameTagid` (`type_id`,`content_item_id`,`tag_id`). The type_id is the numeric value that goes with the type_alias character field, but none of the queries were using it. Given that it is the first column of the primary index, right away we know an RDBMS is not going to use that index, nor would we want it to, since we don't join on that field.

In order to train a database to take that primary index, we need to use type_id, content_item_id, and tag_id in the joins. This patch trades out the 255-character field for the numeric value, all queries now use those three fields. I need to go through each query and prove it is using that primary index but my early testing says it is doing so.
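
The left-to-right rule for a composite index can be observed with a query planner. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN`; SQLite's planner and output differ from MySQL's, so this illustrates the principle only and says nothing about timings on a real Joomla site. The table is reduced to the columns that matter and carries only the one unique index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE contentitem_tag_map (
    type_alias TEXT NOT NULL,
    type_id INTEGER NOT NULL,
    content_item_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL
)""")
conn.execute("""
CREATE UNIQUE INDEX uc_ItemnameTagid
ON contentitem_tag_map (type_id, content_item_id, tag_id)""")


def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 is the detail text


# Filtering on the unindexed alias skips the first index column.
by_alias = plan(
    "SELECT tag_id FROM contentitem_tag_map "
    "WHERE type_alias = 'com_content.article' AND content_item_id = 9162"
)

# Filtering on type_id + content_item_id matches the index from the left.
by_type_id = plan(
    "SELECT tag_id FROM contentitem_tag_map "
    "WHERE type_id = 1 AND content_item_id = 9162"
)

print(by_alias)
print(by_type_id)
```

In this setup the first plan reports a scan of the whole table, and the second reports a search using `uc_ItemnameTagid`. Running MySQL's own `EXPLAIN` against the real queries remains the authoritative check.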

Over the years I have raised a number of issues on the database and performance with the project and, to be honest, it has been very hard to get a good response. I don't think I have shared much about the specifics of my experience, and I hope that this lengthy description will serve to clarify that this is an area in which I have spent much of my professional career. I am good at this work. I am trying to help.

I started with this specific problem because it has the least impact on approach, it's logical, it should make sense and be accepted as helpful. So, chances of getting it in are higher. There's more to discuss but we'll start here.

Amy Stephen

Jul 13, 2013, 3:30:05 AM
to joomla-...@googlegroups.com
One last point: you might notice that the patch removes a secondary index on type_id.

Since type_id is the first column of the primary index, a secondary index on that value can do nothing but create a problem. For queries on that data element, there will never be a better choice than the primary index.
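
Whether a composite index can stand in for a single-column index on its first column can also be checked with a planner. Again, this is a sketch with SQLite standing in for MySQL, on an empty, reduced table; it shows only that the lookup is still served by an index once `idx_type` is gone, not which index MySQL would prefer while both exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contentitem_tag_map (
    type_id INTEGER NOT NULL,
    content_item_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL
);
CREATE UNIQUE INDEX uc_ItemnameTagid
    ON contentitem_tag_map (type_id, content_item_id, tag_id);
CREATE INDEX idx_type ON contentitem_tag_map (type_id);
""")

QUERY = "SELECT content_item_id, tag_id FROM contentitem_tag_map WHERE type_id = 1"


def plan_detail(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


with_both = plan_detail(QUERY)           # planner may pick either index here
conn.execute("DROP INDEX idx_type")      # remove the single-column index
after_drop = plan_detail(QUERY)          # only the composite index remains

print(with_both)
print(after_drop)
```

After the drop, the plan still reports an index search on `type_id`, now through `uc_ItemnameTagid`, so the extra index adds write cost without enabling any lookup the composite index cannot already serve.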

I will likely recommend more secondary index removal following more testing.

https://github.com/AmyStephen/joomla-cms/commit/82c3af96a01649a62c5cd6460b0fd00bfbefe5#L0L44

Herman Peeren

Jul 13, 2013, 4:58:20 AM
to joomla-...@googlegroups.com
I was thinking in an opposite direction for one detail for some time (not only for UCM's content_type table but also for ACL's assets table and probably idem for our "smart search").

An integer value will have a slightly better performance than a 255 varchar, but if you first need a lookup in a table to find the varchar key, then the performance gain is annulled. The varchar key is a "natural" key, known in some situations (a component knows its own name, for instance). So it depends on the context whether that numerical key is quicker overall. Often the performance gain of a numerical key is in retrieving the joined information, and the trade-off is a bit worse performance while storing that information. That would plead for your proposed solution to use a numeric key, as the performance of retrieval of large amounts of information is more important than the performance of storing it: storage mostly involves a single item and a few tags.

Maybe we should discriminate between some internal use of a key and the use of a key in an API. For the assets table we can limit the use of the asset_id to internal use and simplify the API of it. I'm not sure if this tags-thing is the same situation... figure it out. Hope it is not too vague what I'm saying here. Will look for some concrete code-examples.

- Herman

Herman Peeren

Jul 13, 2013, 5:10:57 AM
to joomla-...@googlegroups.com
Another UCM-link, not that there is so very much to read, but there were some discussions and to be more complete:
https://groups.google.com/forum/#!forum/joomla-ucm-pwg

Chad Windnagle

Jul 13, 2013, 8:45:16 AM
to joomla-...@googlegroups.com
Amy:

Awesome patch, even more awesome explanation. I really enjoyed reading that. I hope we get an item in the tracker so this change can be tested and pulled into the core. 
 
 


--
Regards,
Chad Windnagle

Amy Stephen

Jul 13, 2013, 11:55:08 AM
to Joomla! CMS Development
On Sat, Jul 13, 2013 at 3:58 AM, Herman Peeren <herman...@gmail.com> wrote:
I was thinking in an opposite direction for one detail for some time (not only for UCM's content_type table but also for ACL's assets table and probably idem for our "smart search").

The ACL issue is very different, that has relationship issues.

The UCM content_types table is correctly normalized.
 

An integer value will have a slightly better performance than a 255 varchar, but if you first need a lookup in a table to find the varchar key, then the performance gain is annulled.

That's part of the picture but there is more to the story.

First, let's agree to use tools that measure results for us. That way, our opinions on data modeling and denormalization vs normalization techniques can be just that -- but EXPLAIN will tell us like it is. So, all things measured.

Second, your thinking needs a couple of other points.

1. The size of the joined table impacts performance. The smaller the table, the smaller the impact. In this case, 10 rows are nothing. After the first read, this data is in MySQL buffers and it stays there till it's done.

2. The total time required to evaluate the join. Compare matching 255-character values across 10 rows with matching them across 40,000 rows. See? A little more swing in your step that way, right? That's the impact.

 
The varchar key is a "natural" key, known in some situations (a component knows its own name for instance).

Of course! That's why I only change to the key in the joins. Please review the commit to see. Humans need natural keys - Machines like numbers.

Throughout the application, the natural key is used. It is passed into all methods. It is split apart for the URL. Only during the join do I use the numeric key.
 
So it depends on the context when that numerical key is quicker overall.
Often the performance gain of a numerical key is in retrieving the joined information and the trade off is to have a bit worse performance while storing that information.

In data warehousing speak, that is called denormalization. But, denormalization is for viewing and reporting. The joins still benefit from numeric values.
 
That would plead for your proposed solution to use a numeric key, as the performance of retrieval of large amounts of information is more important than the performance of storing it: storage mostly involves a single item and a few tags.

Maybe we should discriminate between some internal use of a key and the use of a key in an API.

I would propose we always discriminate on the basis of correct results first, followed by mechanically produced performance metrics. Then there is no room for debate. Agree?
 
For the assets table we can limit the use of the asset_id to internal use and simplify the API of it. I'm not sure if this tags-thing is the same situation... figure it out.
 
Please, please, please, please - look at the patch. Your concerns are not related to the patch. I am doing what you are suggesting is needed. I am not exposing numeric values outside of a query. All interfaces with human beings, including developers and the API, use the natural key. We're good, right? Check the patch.
 
Hope it is not too vague what I'm saying here. Will look for some concrete code-examples.

Assets are a whole different issue. It is a very different issue and not something I am prepared to discuss today.

Stay tuned - there are a number of little fixes I'd like to roll past people. Please keep reviewing - need your mind.
 

Amy Stephen

Jul 13, 2013, 11:59:20 AM
to Joomla! CMS Development
Thanks Chad -- I'm hoping that a little explanation helps. It would be so great if we could use tools like Beat provided us and then we have common goals automatically and clear ways to describe if it is better or not. Should save us some heartache. Beat might not be fully aware of just how much he helped performance with just sharing a tool that measures individual queries. I believe we'll look back and realize how helpful that was.

Hope to finish planning the testing for the patch today. Got the next patch specced out - it'll be a little more tricky to get everyone to rally around -- but I think we can figure it out.

There is no reason Joomla can't be smoking fast.

Herman Peeren

Jul 13, 2013, 1:21:33 PM
to joomla-...@googlegroups.com
Hi Amy,

To be clear: I was not giving any critique to your patch, I just looked at it from a different angle: you look at performance and at the level of database operations, I merely look at the API. I look at the black box, you are looking inside it. Of course, it is very necessary to look at how things are implemented, at the lowest level. And you're doing a great job at improving performance. My main interest is the overall architecture. Those two are not opposite, but I think it is good to split those two views.

From the outside the problem of an asset_id versus an asset_name is in so far similar to this issue, that the asset_name is a "natural" key, similar to that type_alias. I think it should be used as much as possible for the API, for communication with everything outside the ACL (or the UCM when talking about the type_alias). If internally things are going smoother with an integer-key, that is something inside the black box. What I see now, is that we are often using such an internal optimisation in the API. That troubles things a lot. It makes implementing ACL in your extensions (and custom extensions are my focus) needlessly complicated (specifying a parent asset_id etc.). The same holds for the UCM, hence my comparison. At the database-level it might be a totally unrelated problem, but at the API-level I see some relations.

The way many things in Joomla!, like UCM and ACL, are implemented, is starting with a data structure and then adding software to operate upon it. It is the classic way, like we learned it 30 years ago. I try to train myself in thinking from an API and only after that looking at an implementation (and last: at a data structure). That's why I said at my Doctrine presentation at jab13 about UCM: "you could, I don't say you should, but you could also implement a UCM without a separate common table". It is more a matter of bundling the common behaviour than the common data. The mapping of the persistent data is a separate thing from the implementation of a common behaviour.

One last remark to your writing: two things have changed a bit in our thinking about databases the last 10 years:

1. normalisation is not always holy. Putting JSON in a field can be useful (like it can be useful to put an unindexed field in a MongoDB document). It really depends on the situation.
2. redundancy was a dirty word in "our" days, but can now be very handy to boost performance and scaling (as long as data integrity is guaranteed).

This all not as critique or rant or negativism, just my honest views coming from my daily experiences. Keep up the good work!

Cheers,
Herman

Amy Stephen

Jul 13, 2013, 3:33:59 PM
to Joomla! CMS Development
On Sat, Jul 13, 2013 at 12:21 PM, Herman Peeren <herman...@gmail.com> wrote:

From the outside the problem of an asset_id versus an asset_name is in so far similar to this issue, that the asset_name is a "natural" key, similar to that type_alias. I think it should be used as much as possible for the API, for communication with everything outside the ACL (or the UCM when talking about the type_alias). If internally things are going smoother with an integer-key, that is something inside the black box. What I see now, is that we are often using such an internal optimisation in the API. That troubles things a lot. It makes implementing ACL in your extensions (and custom extensions are my focus) needlessly complicated (specifying a parent asset_id etc.). The same holds for the UCM, hence my comparison. At the database-level it might be a totally unrelated problem, but at the API-level I see some relations.
 

Herman - I want to know that you and I agree or, if we don't, to be clear about where we differ.

I believe this patch is doing what you are saying. But, if I am wrong about that, then, I need to understand specifically where the problem is. So, I'm going to walk you through examples from the patch - and your comments.


To get tags for an article, the API is used like this:

The developer continues to use the natural key of type_alias, a concatenation of the option and the view. The article ID and the type alias are input to the method.

$type_alias = $this->input->option . '.' . $this->input->view;

$helper = new JHelperTags();
$tags = $helper->getTagIds($this->row->article_id, $type_alias);


Buried within the bowels, the underlayer where most dare not tread - you'll see a query like this (come on click me, you know you want to):

Note two things:

Point 1: the "natural key" continues to be used for the selection - that is what the dev sends in - that is what the API requires. I did not change the API - I changed the table the natural key links to.
->where($db->quoteName('ct.type_alias') . ' = ' . $db->quote($type_alias))

Point 2: the numeric key value is never identified; the numeric key is only used to join, like this. I added this join, but it is not visible to the developer. It is not input to the API. It is hidden.
->where($db->quoteName('ct.type_id') . ' = ' . $db->quoteName('m.type_id'))

So, when you say this:


the asset_name is a "natural" key, similar to that type_alias. I think it should be used as much as possible for the API, for communication with everything outside the ACL (or the UCM when talking about the type_alias).

My response is: I agree, and that is what the API requires, and I did not change it. The API signature and my query change continue to support the natural key, right?
public function getTagIds($article_id, $type_alias)

->where($db->quoteName('ct.type_alias') . ' = ' . $db->quote($type_alias))


When you say this:

If internally things are going smoother with an integer-key, that is something inside the black box.

My response is: Yes, exactly, that's why I bury those things in the black box.


BUT, when you say this:

What I see now, is that we are often using such an internal optimisation in the API. That troubles things a lot. It makes implementing ACL in your extensions (and custom extensions are my focus) needlessly complicated (specifying a parent asset_id etc.). The same holds for the UCM, hence my comparison. At the database-level it might be a totally unrelated problem, but at the API-level I see some relations.


My response is: I have no idea what you are talking about. The UCM API does use the natural key. The patch supports that and only uses the numeric value in a hidden way. I want so much to understand what you see *specifically* as a concern so that I can look at it, think about it, and then let you know how I see it or changes I made because of your point. But, I can't understand how those words relate to the patch or the UCM. That's why I am asking that you look at the patch -- if I have a link -- or a line number -- or a coded example like I provided above -- then, I understand (or at least can ask follow up questions.)


Now, in all honesty, Herman, all I have looked at is the little bit that goes with Tags. Overall, I think it looks pretty good. This one field join thing was added to the map and content detail tables, and I disagree with that. I also have a concern about a state issue that I will raise later. And I have a concern about the data duplication that I will raise later, too, but I am waiting to hear from Elin what the plans are for finishing the migration, to see if it would be quicker to get rid of duplication by going forward, as she is proposing, or if we should get rid of the UCM duplication, like I am proposing.

Other than those things, I don't have a lot of concern and certainly do not see any parallels at all to the ACL. Maybe I have to see more of it, I don't know?

OK, please forgive my specificness, but I do want to get your points into useful form for the patch -- I realize in part you are sharing an approach, but I want to make certain you agree the patch is in line with your thinking.


Herman Peeren

Jul 13, 2013, 6:00:29 PM
to joomla-...@googlegroups.com
Ah, I see... I think some misunderstanding was about the term "UCM API". That is not the same as "Tags API".

You are completely right that your patch leaves the Tags API completely unchanged. It uses the "natural key" (= $prefix = type_alias = the varchar key you were talking about) when using it from outside the Tags-blackbox. No misunderstanding about that and as far as people would be concerned about changing the API for tags: no, you don't change any Tags API at all. We agree upon that.

My concern was about the UCM API. Now that is a somewhat vague term, for that API is still in development, in fact still has to be invented... The Tags extension uses the UCM. I see them as two different things. Much of that UCM is now inside the Tags extension, but it still has to be separated. Not meant with any negativity, I have a smile on my face when saying this, but in a way the Tags were used to "smuggle" the UCM on board and once there, it has to be "unpacked" to stand on its own legs. There is no doubt you pointed at something important, that the "natural key" of the content_type should be avoided in those JOINs, but I think that problem should be solved in the UCM extension. Once it stands on its own, the Tags extension (component, helpers and the whole shebang) should use the UCM API. That means: no type_id outside the UCM and so no type_id inside the Tags. But that is a step further; at the moment Tags is still pregnant with UCM, she is bearing this little baby in her womb, umbilical cord still attached. Your proposal is: optimise that situation and don't wait 'till it is born. Another approach would be to first separate the UCM a bit more from the Tags now (premature birth?), define a UCM API and use that inside the Tags (and elsewhere). The low level joins etc. would still be optimised as you suggest.

The idea, a month ago, was to get the UCM basically up and running before the feature freeze of 3.2, then work it out and stabilise it for 3.5. Part of your proposed optimisation would then probably be part of UCM and part of Tags. But if we don't succeed in separating those two and defining a UCM API very, very soon, then your proposal to "Remove the redundant data and usage for ucm_base and ucm_content UNTIL the ucm is ready" is probably better.

kisswebdesign

Jul 13, 2013, 7:23:22 PM
to joomla-...@googlegroups.com
Great work Amy, and clear explanations too - nice!

That new debugging environment by Beat looks really good, can't wait to try it out myself.

Chris.