Seeking way to *enlist* but with duplicate preservation

David Nebauer

Apr 24, 2019, 8:09:20 AM
to TiddlyWiki
I'm processing dictionary tiddler rows that each contain a sequence of values I need to handle individually. I'd love to use enlist in a filter to break apart each row for consumption by a <$list>. The problem with that approach is the rows may contain duplicate values that need to be preserved, and core filters/operators currently remove duplicates.

In looking for a solution I explored both the split:list filter from tobibeer and the split operator being introduced in TW v5.1.20, but neither is entirely suitable as a straight drop-in replacement for enlist. In short, split:list does not preserve duplicates despite its documentation appearing to say it should, and v5.1.20 split does not respect the double square bracket convention -- for values which include spaces -- as enlist does. (See this topic for more details, including test scripts and example output.)
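
To make the bracket problem concrete, here is a minimal sketch. It uses plain JavaScript String.prototype.split as a stand-in for any whitespace-only split; it is not the TiddlyWiki split operator itself, and the sample row is made up for illustration.

```javascript
// A row containing a duplicate ("two") and a value with a space in it.
var row = "one two two [[twenty one]]";

// Splitting on a single space keeps the duplicate, but it also cuts the
// bracketed value in half, because the split knows nothing about [[ ]].
var parts = row.split(" ");

console.log(parts.join(" | ")); // one | two | two | [[twenty | one]]
```

The duplicate survives, but "twenty one" comes back as two broken fragments, which is why a plain split is not a drop-in replacement for enlist.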

There is a GitHub issue (#3757) from early February that discussed the issue of v5.1.20 math operators not accepting duplicates, but it did not reach any resolution (at least, not yet).

Has anyone else grappled with this problem and found a solution or workaround? Preferably one that does not involve hacking on core tiddlers... :-)

Regards,
David.

P.S. I'm using TW 5.1.19/nodejs.

Mark S.

Apr 24, 2019, 10:57:50 AM
to TiddlyWiki
You'll hate this idea, but you could put a uniqueness enforcer in front of each row item (e.g. A_MyTiddler B_MyTiddler) then, after whatever processing you do, split off the uniqueness bit using the splitbefore and removeprefix operators.
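
A rough sketch of that idea in plain JavaScript, outside TiddlyWiki. The helper names and the "<index>_" prefix format are assumptions made for this illustration, and dedupe() merely stands in for any processing step that silently drops repeated titles.

```javascript
// Hypothetical helper: make every item unique by prefixing its position.
function addUniquePrefixes(items) {
    return items.map(function(item, index) {
        return index + "_" + item;
    });
}

// Hypothetical helper: remove everything up to and including the first underscore.
function stripUniquePrefix(item) {
    return item.slice(item.indexOf("_") + 1);
}

// Stand-in for a step that removes duplicates (keeps first occurrences only).
function dedupe(items) {
    return items.filter(function(item, index) {
        return items.indexOf(item) === index;
    });
}

var row = ["MyTiddler", "Other", "MyTiddler"];
var survived = dedupe(addUniquePrefixes(row)).map(stripUniquePrefix);

console.log(dedupe(row).join(" "));  // MyTiddler Other
console.log(survived.join(" "));     // MyTiddler Other MyTiddler
```

Because only the first underscore is treated as the separator, titles that themselves contain underscores pass through unchanged.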

What's really needed is a uniqueness filter, like "pragma[keep-dupes]", that would tell all downstream operators to preserve duplicates until flagged the other way with "pragma[no-dupes]". But of course every single filter operator that handles dupes would have to be rewritten to support this, which wouldn't be easy because (I suspect) the duplicate behavior is based on object property behaviour in JavaScript.

-- Mark

TonyM

Apr 24, 2019, 10:25:01 PM
to TiddlyWiki
Fyi

This has been discussed in other threads, and I have added more to the maths GitHub issue. This treatment of titles makes sense when handling sets of titles, and deduplication is built into most operators; this is where the challenge lies.

I think we need Jeremy, Mario, or other gurus to resolve this in the long run, but in the meantime we need a few well-documented workarounds.

If the deduplication was removed from the operators and moved into the filter run it would be possible to define a different run without deduplication.

It should also be possible to uniquify items in a list by making them entries in a data tiddler (each with a unique index), then incrementing a number to retrieve the nth item.

Regards
Tony

David Nebauer

Apr 25, 2019, 10:57:32 AM
to TiddlyWiki
Mark S., I'll keep that as a backup strategy. I'm still hoping I'll stumble across a more elegant, less fiddly solution.

Regards,
David.

David Nebauer

Apr 25, 2019, 11:29:43 AM
to TiddlyWiki
Thanks, Tony. With your suggestion about incrementing a data tiddler, did you mean something like this for a 3-column table?

1: Row 1 Col 1 value
2: Row 1 Col 2 value
3: Row 1 Col 3 value
4: Row 2 Col 1 value
5: Row 2 Col 2 value
6: Row 2 Col 3 value
7: Row 3 Col 1 value
8: Row 3 Col 2 value
9: Row 3 Col 3 value
and so on...

If I've misinterpreted you I'd appreciate you correcting my understanding. If I did interpret you correctly, what you proposed is certainly a valid approach, but maintaining dynamic table data would be fragile for any table of non-trivial size -- it would be so easy to introduce "off by 1" type errors.

Also, I've looked at incrementing variables before. That appears to be another one of those things that is trivial everywhere else but impossible in TW without installing a non-core plugin. (If I'm wrong about that I'd love a pointer in the right direction.)

Regards,
David.

Mark S.

Apr 25, 2019, 4:05:54 PM
to TiddlyWiki
Gosh, no wonder I always have this sense of Deja vu.

It turns out that I wrote an "enlist2" filter in 2017 that can split on spaces without removing duplicates. Unfortunately, this means your tiddler titles can't have spaces. I can imagine an enhanced version that could split on [[ ]] as well.


Why do I always have this sense of Deja vu?

-- Mark

TonyM

Apr 25, 2019, 5:46:41 PM
to TiddlyWiki
Mark

The split operator in the 5.1.20 prerelease behaves exactly like that when splitting on a space.

Tony

TonyM

Apr 25, 2019, 10:04:26 PM
to TiddlyWiki
David,

As an aside, when it comes to numbers or a list of numbers, it is safe to use a space to delimit them, because numbers do not typically contain spaces the way tiddler titles may, EXCEPT IN YOUR EXAMPLE WHERE NUMBERS ARE EXPRESSED IN WORDS.

What I was thinking was to use a method such as split[ ] to separate each of the numbers/titles into a non-deduplicated list, and then save them in a data tiddler where the value can be "repeated" but the index is always unique.
So a set of numbers such as "1 2 3 1 2" can be saved in a data tiddler as:
item1: 1
item2: 2
item3: 3
item4: 1
item5: 2

They will no longer be deduped, because each number is effectively unique: it is referenced by its unique item key. There is also the "select" attribute now on the $set widget that allows you to address the nth item, and you can use the range operator to "iterate" over n.
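
The mapping from a flat list to uniquely keyed entries can be sketched in plain JavaScript. The function names are hypothetical and not part of TiddlyWiki; the "item" + n key format simply mirrors the listing above.

```javascript
// Hypothetical helper: give every value its own key, so repeated values
// stay distinct because the keys (not the values) are unique.
function toIndexedEntries(values) {
    var entries = {};
    values.forEach(function(value, i) {
        entries["item" + (i + 1)] = value;
    });
    return entries;
}

// Hypothetical helper: render entries as "key: value" lines, the shape
// used for dictionary (data) tiddler text in this thread.
function toDictionaryText(entries) {
    return Object.keys(entries).map(function(key) {
        return key + ": " + entries[key];
    }).join("\n");
}

var entries = toIndexedEntries("1 2 3 1 2".split(" "));
console.log(toDictionaryText(entries));
console.log(entries.item4); // 1 (the 4th value, a repeat of the 1st)
```

Looking up entries.item4 plays the role of "retrieve the nth item": the repeated value is reachable because its key is unique.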

In your example you are using titles for numbers, is that correct?

<$set name="HasDuplicates" value="one two two three [[twenty one]] [[twenty one]] [[forty six]]">
<$list filter="[subfilter<HasDuplicates>]">

</$list>
</$set>

item1: one
item2: two
item3: two
item4: three
item5: twenty one
item6: twenty one
item7: forty six


I hope that is clearer.

Regards
Tony

David Nebauer

Apr 26, 2019, 1:54:05 AM
to TiddlyWiki
Solved. In fact, the solution was so simple I'm worried it's a trap! I have created a new filter operator -- enlistallowduplicates -- which differs from enlist by a single parameter.

Here is the non-comment part of enlist.js:

(function(){

/*jslint node: true, browser: true */
/*global $tw: false */
"use strict";

/*
Export our filter function
*/
exports.enlist = function(source,operator,options) {
    var list = $tw.utils.parseStringArray(operator.operand);
    if(operator.prefix === "!") {
        var results = [];
        source(function(tiddler,title) {
            if(list.indexOf(title) === -1) {
                results.push(title);
            }
        });
        return results;
    } else {
        return list;
    }
};

})();

Here is the non-comment part of enlistallowduplicates.js. Apart from the operator name, the only change is the added second argument (true) in the call to $tw.utils.parseStringArray:

(function(){

/*jslint node: true, browser: true */
/*global $tw: false */
"use strict";

/*
Export our filter function
*/
exports.enlistallowduplicates = function(source,operator,options) {
    var list = $tw.utils.parseStringArray(operator.operand,true);
    if(operator.prefix === "!") {
        var results = [];
        source(function(tiddler,title) {
            if(list.indexOf(title) === -1) {
                results.push(title);
            }
        });
        return results;
    } else {
        return list;
    }
};

})();

This works because $tw.utils.parseStringArray is already defined with a second boolean parameter called AllowDuplicates which, unsurprisingly, causes the return value (a list) to include duplicates. The new operator -- enlistallowduplicates -- also respects the double-square-brackets convention. So, this meets all my criteria for a drop-in replacement for enlist that allows duplicates.
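
For readers who want to see the effect of such a flag in isolation, here is a simplified stand-in written for this illustration. It is NOT the code from boot/boot.js: the function name parseList, its regular expression, and the sample row are all assumptions, and the real parser may treat edge cases differently.

```javascript
// Simplified bracketed-list parser (illustrative only, not TiddlyWiki core code).
// Items are either [[double bracketed]] or runs of non-whitespace characters.
function parseList(text, allowDuplicates) {
    var memberRegExp = /\[\[(.*?)\]\]|(\S+)/g,
        results = [],
        seen = Object.create(null), // object properties used as a "seen" set
        match;
    while((match = memberRegExp.exec(text)) !== null) {
        var item = match[1] !== undefined ? match[1] : match[2];
        // Without the flag, an item already recorded in "seen" is skipped.
        if(allowDuplicates || !(item in seen)) {
            results.push(item);
            seen[item] = true;
        }
    }
    return results;
}

var row = "one two two [[twenty one]] [[twenty one]]";
console.log(parseList(row).join(" | "));       // one | two | twenty one
console.log(parseList(row, true).join(" | ")); // one | two | two | twenty one | twenty one
```

The sketch also shows the kind of object-property bookkeeping that makes deduplication the default: a single boolean check is all that separates the two behaviours.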

I checked the definition of $tw.utils.parseStringArray in version 5.1.20-prerelease (it's in boot/boot.js) and it remains unchanged, so this solution should survive the next upgrade (unless these parts change between now and then).

My thanks to everyone who responded to my question, and especially to Mark S. who nudged me toward a replacement filter operator.

Regards,
David.

David Nebauer

Apr 26, 2019, 1:59:00 AM
to tiddl...@googlegroups.com
I should add that my real-world project handles duplicate values flawlessly now that it uses enlistallowduplicates instead of enlist.

I hope this topic is of assistance to others who have, or will have, the same need as I did.

Regards,
David.

David Nebauer

Apr 26, 2019, 8:18:22 AM
to TiddlyWiki
I've been looking further into this feature. The alteration to $tw.utils.parseStringArray to enable it to return a list with duplicate items was a contribution from inmysocks, aka Jed Carty, in October 2018 (see TW github pull request 2027). This change was part of 5.1.18 and is mentioned in its release notes as a developer improvement.

Thanks, Jed, for providing this functionality that I could use to solve a very frustrating problem.

Regards,
David.