Refactoring request: TiddlyWiki.prototype.getTags

4 views
Skip to first unread message

Saq Imtiaz

unread,
Aug 24, 2006, 7:00:10 AM8/24/06
to TiddlyWikiDev
Hey guys,
I've got another refactoring request. The code for TiddlyWiki.prototype.getTags has some code that does what I call an incremental push.
(For a given array where each member is another array in which [0] is the value and [1] the number of occurances.) I'd like to suggest factoring out this code into a separate function so it can be re-used.

With the new extended fields with TW 2.1, it can be useful to treat fields like tags, and write functions like getFields which behaves similar to  getTags but for a given field. And in such cases, it will be handy to be able to re-use the above mentioned code to get not only an array of field values, but the number of occurances as well. I realize this isnt going to be an oft ocurring scenario, but there is nothing to loose by factoring it out. If it's not a good idea for some reason, I'll accept that. But I figured I should at least bring it up.

Here is the refactored code:

Array.prototype.incrementalPush = function (value)
{
         var f = false;
         for(var c=0; c<this.length; c++)
         if(this[c][0] == value)
               {
                f = true;
                this[c][1]++;
               }
         if(!f)
               this.push([fieldValue,1]);
}


TiddlyWiki.prototype.getTags = function()
{
    var results = [];
    this.forEachTiddler(function(title,tiddler) {
        for(var g=0; g<tiddler.tags.length; g++)
            {
            var tag = tiddler.tags[g];
            results.incrementalPush(tag);
            }
        });
    results.sort(function(a,b) {return a[0].toLowerCase() < b[0].toLowerCase() ? -1 : (a[0].toLowerCase() == b[0].toLowerCase() ? 0 : +1);});
    return results;
}

Of course, since I wrote the code, it goes without saying that we probably need a better function name than incrementalPush.
Cheers,
Saq

Kalle Alm

unread,
Aug 24, 2006, 7:24:20 AM8/24/06
to Tiddly...@googlegroups.com
Hi,

Pardon for taking this possibly slightly off-topic, but is there a
reason why this code is not using maps á [tag : occurances]?

JS has some wonderful ways to manipulate maps like the above, which
would provide a natural way to map a key (tag) to a value (occurances).
Rewriting one of the functions using the getTags() feature, it might
look like this (untested):

config.macros.allTags.handler = function(place,macroName,params)
{
var tags = store.getTags();
var theDateList =
createTiddlyElement(place,"ul",null,null,null);
if(tags.length == 0)

createTiddlyElement(theDateList,"li",null,"listTitle",this.noTags);
for (var t in tags)
{
var theListItem
=createTiddlyElement(theDateList,"li",null,null,null);
var theTag = createTiddlyButton(theListItem,t + " (" +
tags[t] + ")",this.tooltip.format([t]),onClickTag);
theTag.setAttribute("tag",t);
}
}

The actual getTags() code might look like this:

// Return an array of all the tags in use. Each member of the array is
another array where [0] is the name of the tag and [1] is the number of
occurances


TiddlyWiki.prototype.getTags = function()
{
var results = {};
this.forEachTiddler(function(title,tiddler) {
for(var g=0; g<tiddler.tags.length; g++)
{
var tag = tiddler.tags[g];

var f = false;
if (results[c])
results[c]++;
else
results[c] = 1;
}
});
return results;
}

The above would of course require refactoring in all places where
getTags() is used, but it may be worth it. Or I'm missing something. :)

-Kalle.

Saq Imtiaz

unread,
Aug 24, 2006, 7:38:50 AM8/24/06
to Tiddly...@googlegroups.com
On 8/24/06, Kalle Alm <kall...@gmail.com> wrote:

Hi,

Pardon for taking this possibly slightly off-topic, but is there a
reason why this code is not using maps á [tag : occurances]?

The above would of course require refactoring in all places where
getTags() is used, but it may be worth it. Or I'm missing something. :)

Hey Kalle,
I am not sure as to why we arent using maps for this, but I  dont think its worth the change now. The benefits are minimal at best,  and not only would all the macros in the core that use the getTags function have to be changed, but it would also break a lot of plugins out there. Considering that maintaining backwards compatibility is always a big priority, I would think this is a no go. But lets wait and see what Jeremy has to say.
Cheers,

Saq

-Kalle.



Udo Borkowski

unread,
Aug 24, 2006, 7:43:05 AM8/24/06
to Tiddly...@googlegroups.com
Hi Saq,

as you mentioned the getTags function I had a more detailed look at the original function and noticed it is a quite "expensive" function (in terms of speed) because it involves three nested loops.

So we should optimize this first. One way to accomplish this is using a hash instead of an array to collect the result. The code may then look like this:
// Return an array of all the tags in use. Each member of the array is another array where [0] is the name of the tag and [1] is the number of occurances
TiddlyWiki.prototype.getTags = function()
{
    var tagCount = {};

    this.forEachTiddler(function(title,tiddler) {
        for(var g=0; g<tiddler.tags.length; g++)
            {
            var tag = tiddler.tags[g];
            var n = tagCount[tag];
            tagCount[tag] = 1+(n?n:0);
            }
        });
    var results = [];
    for(var tag in tagCount)
        results.push([tag,tagCount[tag]]);

    results.sort(function(a,b) {return a[0].toLowerCase() < b[0].toLowerCase() ? -1 : (a[0].toLowerCase() == b[0].toLowerCase() ? 0 : +1);});
    return results;
}
I would then extract this code into a new class "MultiSet" (i.e. a set that may contain an item more than once, sometimes also called "Bag").

function MultiSet()
{
    var items = {};

    this.add = function(item)
    {
        var n = items[item];
        items[item] = 1+(n?n:0);
    };
    
    this.toArray = function()
    {
        var result = [];
        for(var i in items)
            result.push([i,items[i]]);
        return result;
    };       
};

With this new class the getTags function would look like this:

// Return an array of all the tags in use. Each member of the array is another array where [0] is the name of the tag and [1] is the number of occurances
TiddlyWiki.prototype.getTags = function()
{
    var tags = new MultiSet();

    this.forEachTiddler(function(title,tiddler) {
        for(var g=0; g<tiddler.tags.length; g++)
            tags.add(tiddler.tags[g]);
        });
    var results = tags.toArray();

    results.sort(function(a,b) {return a[0].toLowerCase() < b[0].toLowerCase() ? -1 : (a[0].toLowerCase() == b[0].toLowerCase() ? 0 : +1);});
    return results;
}
(all code is untested)

Udo

Kalle Alm

unread,
Aug 24, 2006, 7:54:04 AM8/24/06
to Tiddly...@googlegroups.com
On Thu, 2006-08-24 at 13:38 +0200, Saq Imtiaz wrote:

> Hey Kalle,
> I am not sure as to why we arent using maps for this, but I dont
> think its worth the change now. The benefits are minimal at best, and
> not only would all the macros in the core that use the getTags
> function have to be changed, but it would also break a lot of plugins
> out there. Considering that maintaining backwards compatibility is
> always a big priority, I would think this is a no go. But lets wait
> and see what Jeremy has to say.

Hrm, no, I think you're right on there. If a lot of plugins are using
getTags and the return value changes completely, it's definitely not
worth the trouble.

-Kalle.

Saq Imtiaz

unread,
Aug 24, 2006, 7:54:47 AM8/24/06
to Tiddly...@googlegroups.com
On 8/24/06, Udo Borkowski <Udo.Bo...@gmx.de> wrote:
Hi Saq,

as you mentioned the getTags function I had a more detailed look at the original function and noticed it is a quite "expensive" function (in terms of speed) because it involves three nested loops.

So we should optimize this first. One way to accomplish this is using a hash instead of an array to collect the result.

That looks very good Udo. It seems like a good compromise since it somewhat implements Kalle's and your suggestion for better performance, maintains backwards compatibility, and satisfies my requests as well in terms of refactoring. So this definitely gets a thumbs up from me.

I'll test the code and report back.
Cheers,

Saq

Udo Borkowski

unread,
Aug 24, 2006, 7:54:28 AM8/24/06
to Tiddly...@googlegroups.com
Hi Kalle,

because of compatibility issues it is not possible to change the signature of the TiddlyWiki.prototype.getTags and return a hash/map instead of the array (this would break all plugins currently using the function).

Nevertheless it is possible to perform the optimization, and convert the map to an array (and sort it) before it is returned by getTags. (See my recent post that I send before reading your suggestion).

We may even consider introducing a new function (e.g.) TiddlyWiki.prototype.getTagsCount that returns the hash/map (as suggested). The getTags function would just call getTagsCount and does the conversion and sorting:

// Return an array of all the tags in use. Each member of the array is another array where [0] is the name of the tag and [1] is the number of occurances
TiddlyWiki.prototype.getTags = function()
{
    var tagCounts = this.getTagsCount();
    var result = [];
    for(var i in tagCounts)
        result.push([i,tagCounts[i]]);
    result.sort(function(a,b) {return a[0].toLowerCase() < b[0].toLowerCase() ? -1 : (a[0].toLowerCase() == b[0].toLowerCase() ? 0 : +1);});
    return result;
}


Udo

Saq Imtiaz

unread,
Aug 24, 2006, 7:58:48 AM8/24/06
to Tiddly...@googlegroups.com
On 8/24/06, Udo Borkowski <Udo.Bo...@gmx.de> wrote:

function MultiSet()
{
    var items = {};

    this.add = function(item)
    {
        var n = items[item];
        items[item] = 1+(n?n:0);
    };
    
    this.toArray = function()
    {
        var result = [];
        for(var i in items)
            result.push([i,items[i]]);
        return result;
    };       
};

With this new class the getTags function would look like this:

// Return an array of all the tags in use. Each member of the array is another array where [0] is the name of the tag and [1] is the number of occurances
TiddlyWiki.prototype.getTags = function()
{
    var tags = new MultiSet();

    this.forEachTiddler(function(title,tiddler) {
        for(var g=0; g<tiddler.tags.length; g++)
            tags.add(tiddler.tags[g]);
        });
    var results = tags.toArray();

    results.sort(function(a,b) {return a[0].toLowerCase() < b[0].toLowerCase() ? -1 : (a[0].toLowerCase() == b[0].toLowerCase() ? 0 : +1);});
    return results;
}
(all code is untested)

Not anymore. I've  tested it and it works well.
Cheers,
Saq

Udo Borkowski

unread,
Aug 24, 2006, 8:00:31 AM8/24/06
to Tiddly...@googlegroups.com
I added a ticket (http://trac.tiddlywiki.org/tiddlywiki/ticket/145) for this issue.

Udo

Saq Imtiaz

unread,
Aug 24, 2006, 8:06:12 AM8/24/06
to Tiddly...@googlegroups.com
On 8/24/06, Udo Borkowski <Udo.Bo...@gmx.de> wrote:


We may even consider introducing a new function (e.g.) TiddlyWiki.prototype.getTagsCount that returns the hash/map (as suggested). The getTags function would just call getTagsCount and does the conversion and sorting:

If we do this, can we still extract the MultiSet for re-use please?
Thanks for adding the ticket!
Cheers,

Saq

Kalle Alm

unread,
Aug 24, 2006, 8:07:06 AM8/24/06
to Tiddly...@googlegroups.com
Udo,

On Thu, 2006-08-24 at 13:54 +0200, Udo Borkowski wrote:
> Hi Kalle,
>
> because of compatibility issues it is not possible to change the
> signature of the TiddlyWiki.prototype.getTags and return a hash/map
> instead of the array (this would break all plugins currently using the
> function).

Yeah, I didn't quite realize that there were a lot of plugins using the
function out there. Now I know. :)

> Nevertheless it is possible to perform the optimization, and convert
> the map to an array (and sort it) before it is returned by getTags.
> (See my recent post that I send before reading your suggestion).
>
> We may even consider introducing a new function (e.g.)
> TiddlyWiki.prototype.getTagsCount that returns the hash/map (as
> suggested). The getTags function would just call getTagsCount and does
> the conversion and sorting:

I think that's a great idea (getTagsCount). It would let existing
plugins choose whether to convert to using getTagsCount which would be
more optimized, or choose to stay with the regular getTags.

-Kalle.


Jeremy Ruston

unread,
Aug 24, 2006, 10:14:07 AM8/24/06
to Tiddly...@googlegroups.com
To answer Kalle's original question, the reason that I used an array
and not a map was because I wanted the tags to have a sort order.

The refactoring discussion is cool, and I back Udo's suggestions.
Probably won't get this into beta 5 though.

Cheers

Jeremy

On 8/24/06, Kalle Alm <kall...@gmail.com> wrote:
>


--
Jeremy Ruston
mailto:jer...@osmosoft.com
http://www.tiddlywiki.com

Reply all
Reply to author
Forward
0 new messages