How to extract part of a field and sort extracted stuff

Ton Gerner

Jul 13, 2018, 4:21:06 AM
to TiddlyWiki

Hi all,

Use case

All relevant tiddlers are tagged with tid and contain the field veld.
The content of veld can vary as shown in the following table.
The codes that interest me are the ones consisting of a letter followed by a 3-digit number (shown in bold in the original table below).

Tiddler | Content of veld
tiddler1 | C804, A878
tiddler2 | G16, 1945- J5, 1947- JT5, 1950- D806
tiddler3 | M832
tiddler4 | C801
tiddler5 | D 20, 1946- KL1, 1947- KV1, 1950- C803, A879
tiddler6 | J6, 1947- JT6, 1950- D804, 1958- F812
tiddler7 | P336, 1946- O2, 1948- O28, S812
tiddler8 |
tiddler9 | FY54, 1946- MV7, 1950- M863
tiddler10 | A835

Goal

  1. Filter tiddlers according to e.g. Cxyz (C followed by a 3-digit number)
  2. Display the results in a table with 3 columns: Cxyz, Name of tiddler, Content of veld
  3. Sort the first column in ascending order

Example:

Cxyz | Name of tiddler | Content of veld
C801 | tiddler4 | C801
C803 | tiddler5 | D 20, 1946- KL1, 1947- KV1, 1950- C803, A879
C804 | tiddler1 | C804, A878

Filtering

With [regexp:veld[C\d\d\d]] I can filter all tiddlers containing Cxyz.
Result - as given in Advanced search, tab Filter:

tiddler1
tiddler4
tiddler5

which means these are the 3 tiddlers containing a Cxyz code.
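For readers less familiar with filters, here is a minimal Python sketch of what the regexp step does. It is only a model, not TiddlyWiki code: the dictionary `tiddlers` and the function `titles_matching` are names made up for this illustration, with the data copied from the table above.

```python
import re

# Sample data copied from the table above: tiddler title -> content of veld.
tiddlers = {
    "tiddler1": "C804, A878",
    "tiddler2": "G16, 1945- J5, 1947- JT5, 1950- D806",
    "tiddler3": "M832",
    "tiddler4": "C801",
    "tiddler5": "D 20, 1946- KL1, 1947- KV1, 1950- C803, A879",
    "tiddler6": "J6, 1947- JT6, 1950- D804, 1958- F812",
    "tiddler7": "P336, 1946- O2, 1948- O28, S812",
    "tiddler8": "",
    "tiddler9": "FY54, 1946- MV7, 1950- M863",
    "tiddler10": "A835",
}

def titles_matching(pattern, data):
    """Return the titles whose veld content contains a match for pattern."""
    rx = re.compile(pattern)
    return [title for title, veld in data.items() if rx.search(veld)]

matches = titles_matching(r"C\d\d\d", tiddlers)
print(matches)  # ['tiddler1', 'tiddler4', 'tiddler5']
```

Note that the result holds only titles. The matched code itself (C804, C801, C803) is thrown away, which is exactly the gap the helper macro has to fill.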

Code (missing helper macro xyz)

\define xyz()
???
\end

\define llinks(filter)
<$list filter="$filter$">
  <tr>
    <td>
      <<xyz>>
    </td>
    <td>
      <$link to={{!!title}}>
      <$view field="title"/>
      </$link>
    </td>
    <td>
      <$view field="veld"/>
    </td>
  </tr>
</$list>
\end

<table>
  <tr>
    <th>Cxyz</th>
    <th>Name of tiddler</th>
    <th>Content of `veld`</th>
  </tr>
<<llinks "[tag[tid]] +[regexp:veld[C\d\d\d]]">>
</table>

renders as:

Cxyz | Name of tiddler | Content of veld
??? | tiddler1 | C804, A878
??? | tiddler4 | C801
??? | tiddler5 | D 20, 1946- KL1, 1947- KV1, 1950- C803, A879

Problems

Column 1 contains ??? because the helper macro is missing.

How do I create this helper macro? It needs to:

  1. Extract the Cxyz part from the field veld and store it in a tiddler / field / variable so it can be transcluded into column 1 of the table
  2. Sort column 1

Minimal Test Case

MTC with the information given here, see http://tw5tongerner.tiddlyspot.com/

Help would be greatly appreciated!

Cheers,

Ton

Mark S.

Jul 13, 2018, 8:41:00 AM
to TiddlyWiki

You need a filter that can actually split out the contents you want in Cxyz. Perhaps:


Then you can have an outer list that finds the elements, and an inner list that uses those elements to find the tiddlers that match the elements.
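The outer-list / inner-list idea can be sketched in plain Python as follows. This is only a model under assumptions: `code_table` and the sample dictionary are invented for the illustration, and the real work would be done by nested `$list` widgets and a filter that returns the matched strings.

```python
import re

# A few rows from the sample table: tiddler title -> content of veld.
tiddlers = {
    "tiddler1": "C804, A878",
    "tiddler4": "C801",
    "tiddler5": "D 20, 1946- KL1, 1947- KV1, 1950- C803, A879",
    "tiddler9": "FY54, 1946- MV7, 1950- M863",
}

def code_table(prefix, data):
    """Build (code, title, veld) rows, sorted by code."""
    rx = re.compile(re.escape(prefix) + r"\d\d\d")
    # Outer list: collect every matching code string from every veld field.
    codes = sorted(m for veld in data.values() for m in rx.findall(veld))
    # Inner list: for each code, find the tiddlers whose veld contains it.
    return [(code, title, veld)
            for code in codes
            for title, veld in data.items()
            if code in veld]

rows = code_table("C", tiddlers)
for row in rows:
    print(" | ".join(row))
```

Because the codes are sorted before the inner lookup, the first column comes out in ascending order without any separate sorting step.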

-- Mark

Ton Gerner

Jul 13, 2018, 10:57:31 AM
to TiddlyWiki
Hi Mark,

Thanks for answering.

This is the first time I have used regexp; until now it was only abracadabra to me (as is JavaScript). I am not a coder, just a longtime user of TW.

Maybe I am wrong, but I do not see how your regexps filter gives me more than `[regexp:veld[C\d\d\d]]` already does: a list of tiddlers that contain Cxyz in the field `veld`.

But how can I extract that Cxyz part from the field content and store it somewhere (tiddler, variable, field, ...) so that I can then filter and sort on it?

Or is there any other way to get to my wanted result?

Cheers,

Ton

Mark S.

Jul 13, 2018, 1:53:24 PM
to TiddlyWiki
The regexps filter not only finds matches, but turns the matches into output strings.

Something like that is necessary when you want to split up lines.

The only way (that I can think of) to do it without a special filter is to have an outer list that contains ALL possible special codes. That is not infeasible if you happen to also be using a data dictionary of codes.

JavaScript may be abracadabra, but at least it's documented abracadabra ;-)

-- Mark

Mark S.

Jul 13, 2018, 2:14:23 PM
to TiddlyWiki
With the regexps filter, the table can be created like this:

<table>
<$list filter="[has[veld]regexps:veld[(C\d\d\d)]sort[]]" variable="code">
<$list filter="[search:veld<code>]">
  <tr>
    <td>
      <<code>>
    </td>
    <td>
      <$link to={{!!title}}>
      <$view field="title"/>
      </$link>
    </td>
    <td>
      <$view field="veld"/>
    </td>
  </tr>
</$list>
</$list>
</table>


The output looks like:

[inline image not preserved]

-- Mark

Ton Gerner

Jul 13, 2018, 3:10:29 PM
to TiddlyWiki
Mark,

You've made my day! This is exactly what I wanted.

Thanks again,

Ton

Ton Gerner

Jul 15, 2018, 1:55:11 PM
to TiddlyWiki
Hi Mark,

The wikitext + regexps.js filter code you gave does an awful job and saves me a lot of manual 'writing'.
But (there is always a but) I forgot to mention that sometimes more than one tiddler can contain the same code e.g. `D801`.
In such a case I expected to see 2 tiddlers with D801 in the result list. But what is displayed contains 4 tiddlers: 2x the expected 2 tiddlers. In case 3 tiddlers contain the same 'code', the result displays 3x the expected 3 tiddlers, and so on.



I tried to understand/change your wikitext code to no avail.
Would you be so kind as to look into this problem?

For a MTC with the information given here, see http://tw5tongerner.tiddlyspot.com/

Cheers,

Ton

Mark S.

Jul 15, 2018, 3:43:24 PM
to TiddlyWiki
An awful job?

I can tell you how to fix it, but you may not like it. The pieces of text that are currently split off into "titles" do not represent tiddlers (at least not in your sample TW). But many of the filter operators want  to work on real tiddlers.

There is an operator that can make sure that only one of each code is represented in the outer layer. It is the "each" filter operator. So the outer filter code would look like:

<$list filter="[has[veld]regexps:veld[(C\d\d\d)]each[]sort[]]" variable="code">

BUT -- for this to work, each of the codes has to have a real-world tiddler. That is, you need to make a C801, C803, C804 etc. tiddler in order for the each operator to work on the result of the regexps. Of course, there are ways to make these very quickly. The only question is whether you mind having your space littered with the extra tiddlers.

Hopefully that's not too much of a stretch.

Good luck!
-- Mark

passingby

Jul 15, 2018, 4:02:09 PM
to TiddlyWiki


On Sunday, July 15, 2018 at 1:43:24 PM UTC-6, Mark S. wrote:
> An awful job?

I suspect it's a typo and was meant to be 'awesome'.

Ton Gerner

Jul 15, 2018, 4:25:59 PM
to TiddlyWiki
Hi Mark

> An awful job?

As passingby wrote, I meant awesome. I am grateful for what you already did for me.

> I can tell you how to fix it, but you may not like it. The pieces of text that are currently split off into "titles" do not represent tiddlers (at least not in your sample TW). But many of the filter operators want to work on real tiddlers.

Yes, that is where my first problem started: filtering gave me tiddlers.

> There is an operator that can make sure that only one of each code is represented in the outer layer. It is the "each" filter operator. So the outer filter code would look like:
>
> <$list filter="[has[veld]regexps:veld[(C\d\d\d)]each[]sort[]]" variable="code">
>
> BUT -- for this to work, each of the codes has to have a real-world tiddler. That is, you need to make a C801, C803, C804 etc. tiddler in order for the each operator to work on the result of the regexps. Of course, there are ways to make these very quickly. The only question is whether you mind having your space littered with the extra tiddlers.

Knowing there is a solution helps me. Tomorrow I will have a look at your solution and at the impact it has on my TW.
Thanks for looking into this.

Cheers,

Ton

Ton Gerner

Jul 16, 2018, 11:24:31 AM
to TiddlyWiki
Hi Mark,

Your solution using the each filter operator works great.
The disadvantage of creating (empty) tiddlers is not that important to me. My 4 MB TW already contains about 1800 tiddlers, and not all series of 'codes' contain duplicates. In practice it means I need about 100 extra tiddlers. I can live with that: the advantage of easy filtering outweighs the disadvantage of creating empty tiddlers.

Thanks again.

Cheers,

Ton

Mark S.

Jul 16, 2018, 12:10:25 PM
to TiddlyWiki
Stop the presses!

I was looking at the each code to see if I could make an "each string" macro. It turns out that there is an undocumented feature in the each filter. A list like this:

<$list filter="[has[veld]regexps:veld[(C\d\d\d)]each:value[title]sort[]]" variable="code">

May do what you want without having to create those extra tiddlers.
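To see why the rows were multiplied and why keeping each code only once cures it, here is a small Python model. It assumes the extraction step emits one string per match, so a code shared by two tiddlers reaches the outer list twice. The names `shipA`, `shipB`, `shipC` and `rows_for` are hypothetical and exist only for this sketch.

```python
import re

# Hypothetical data: shipA and shipB share the code D801.
tiddlers = {
    "shipA": "J1, 1947- JT1, 1950- D801",
    "shipB": "D801, A835",
    "shipC": "J6, 1947- JT6, 1950- D804",
}

def rows_for(codes, data):
    """Inner list: pair each code with every tiddler whose veld contains it."""
    return [(code, title)
            for code in codes
            for title, veld in data.items()
            if code in veld]

# Outer list without deduplication: D801 is emitted once per tiddler.
all_codes = sorted(m for veld in tiddlers.values()
                   for m in re.findall(r"D\d\d\d", veld))
duplicated = rows_for(all_codes, tiddlers)

# Outer list with deduplication: each distinct code string is kept once.
unique_codes = sorted(set(all_codes))
deduplicated = rows_for(unique_codes, tiddlers)

print(len(duplicated), len(deduplicated))  # prints: 5 3
```

With two tiddlers sharing D801, the undeduplicated version lists the pair twice (4 rows for D801), which matches the "2x the expected 2 tiddlers" symptom reported earlier in the thread.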

Good luck!
-- Mark

Ton Gerner

Jul 16, 2018, 12:20:56 PM
to TiddlyWiki
Hi Mark,


> I was looking at the each code to see if I could make an "each string" macro. It turns out that there is an undocumented feature in the each filter. A list like this:
>
> <$list filter="[has[veld]regexps:veld[(C\d\d\d)]each:value[title]sort[]]" variable="code">
>
> May do what you want without having to create those extra tiddlers.

Yes! It works without the extra tiddlers.

Thank you so much.

Cheers,

Ton

mauloop

Jul 17, 2018, 7:09:08 AM
to TiddlyWiki
Recently I had a need quite close to your use case, so I tried to adapt my solution to it. It uses only vanilla TW features. Take it just as an exercise: Mark S.'s solution is more elegant than mine.

\define list-all-codes()
<$set name="f" filter="""[has[veld]get[veld]]""">
<$wikify name="ac" text=<<f>>>
<$list filter="""[enlist<ac>!suffix[,]] [enlist<ac>removesuffix[,]] +[sort[]]""" variable="c">
<<c>>
</$list>
</$wikify>
</$set>
\end

\define get-codes-by-prefix(p)
<$wikify name="ac" text=<<list-all-codes>> >
<$list filter="""[enlist<ac>regexp[^$p$\d\d\d,*]]""" variable="pc">
<<pc>>
</$list>
</$wikify>
\end

\define build-code-table(p)
<table>
<tr>
<th>Cxyz</th>
<th>Name of tiddler</th>
<th>Content of `veld`</th>
</tr>
<$wikify name="pc" text=<<get-codes-by-prefix $p$>> >
<$list filter="""[enlist<pc>]""" variable="c">
<$list filter="""[search:veld<c>]""" variable="t">
<tr>
<td><<c>></td>
<td><<t>></td>
<td>{{{[<t>get[veld]]}}}</td>
</tr>
</$list>
</$list>
</$wikify> 
</table>
\end
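As a rough guide to what the first two macros compute, here is a plain-Python model. It is a sketch under assumptions, not a translation: the function names only mirror the macro names, the data is a subset of the sample table, and a Python set stands in for the way filter results drop duplicate titles.

```python
import re

# A few rows from the sample table: tiddler title -> content of veld.
tiddlers = {
    "tiddler1": "C804, A878",
    "tiddler4": "C801",
    "tiddler5": "D 20, 1946- KL1, 1947- KV1, 1950- C803, A879",
}

def list_all_codes(data):
    """Split every veld on whitespace, strip trailing commas, dedupe, sort."""
    tokens = {tok.rstrip(",") for veld in data.values() for tok in veld.split()}
    return sorted(tokens)

def get_codes_by_prefix(prefix, data):
    """Keep only tokens that start with the prefix plus three digits."""
    rx = re.compile("^" + re.escape(prefix) + r"\d\d\d")
    return [tok for tok in list_all_codes(data) if rx.match(tok)]

print(list_all_codes(tiddlers))
print(get_codes_by_prefix("C", tiddlers))  # ['C801', 'C803', 'C804']
```

Splitting on whitespace is what turns an entry such as "D 20" into the two separate tokens "D" and "20", so codes that contain a space cannot be recovered with this approach.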

Here is an example output for macro <<list-all-codes>>:

1945- 1946- 1947- 1948- 1950- 1958- 20 A835 A878 A879 C801 C803 C804 D D801 D804D806 F812 FY54 G16 J5 J6 JT5 JT6 KL1 KV1 M832 M863 MV7 O2 O28 P336 S812

Here is an example output for macro <<get-codes-by-prefix C>>:

C801 C803 C804

Here is an example output for macro <<build-code-table C>>:

Cxyz | Name of tiddler | Content of veld
C801 | tiddler4 | C801
C803 | tiddler5 | D 20, 1946- KL1, 1947- KV1, 1950- C803, A879
C804 | tiddler1 | C804, A878

Here is an example output for macro <<build-code-table D>>:

Cxyz | Name of tiddler | Content of veld
D801 | tiddler10 | D801 A835
D801 | tiddler3 | M832 D801
D804 | |
D806 | |

Mark S.

Jul 17, 2018, 10:00:28 AM
to TiddlyWiki
That looks great! Something of a digression, but ideally neither of our approaches would be needed if the veld field were formatted as a TW list. And as long as the entries in the list had no spaces in their names, it could have been as simple as:


I'm guessing the original had commas because they were pulled out of a CSV listing?

Have fun!

-- Mark

Ton Gerner

Jul 17, 2018, 1:47:46 PM
to TiddlyWiki
@mauloop,

Nice to see another approach that could help me. As they say: There is more than one way to skin a cat.

@Mark.S


> I'm guessing the original had commas because they were pulled out of a CSV listing?

No, everything was added by hand.

Some more background. It all started with a bunch of photographs of historical Dutch navy ships. Names and codes changed over time; since 1950 the so-called (NATO) pennant numbers have been used (C for cruisers, D for destroyers and so on, while the following 3-digit number is 800-899 for the Netherlands).

Distinguishing ships on the photographs was difficult for several reasons:
1) Over time there were several different ships with the same name (as an example: since 1831 there must always be a ship named Van Speijk)
2) Ship codes changed over time from purely Dutch designations to the pennant numbers (e.g. J1, 1947- JT1, 1950- D801, which means J1 before 1947, JT1 during 1947-1950, D801 from 1950 on)
3) Over time the function of a ship could change or a ship could be reclassified, e.g. from destroyer (D) to frigate (F)
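The notation described in point 2 can be read mechanically. The following Python sketch assumes that entries are separated by commas and that an optional "YYYY- " prefix gives the year from which a code applies; `parse_history` is a hypothetical helper written for this illustration and is not part of the wiki.

```python
import re

# Optional four-digit year followed by "-", then the code itself.
ENTRY = re.compile(r"^(?:(\d{4})-\s*)?(.+)$")

def parse_history(veld):
    """Split a veld value into (start_year or None, code) pairs."""
    result = []
    for part in veld.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty pieces, e.g. from an empty field
        year, code = ENTRY.match(part).groups()
        result.append((int(year) if year else None, code))
    return result

history = parse_history("J1, 1947- JT1, 1950- D801")
print(history)  # [(None, 'J1'), (1947, 'JT1'), (1950, 'D801')]
```

Entries without a year, such as the two codes in "C804, A878", simply come back with None as the start year.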

To help me identify ships on old photographs I started a small 'database'.
But that database grew and grew, and so did the need to filter in different ways.

I implemented Mark's solution. It works great.
If somebody is interested, you can find it here (in Dutch):

http://www.tongerner.nl/ tab 'Techniek', click the link 'Techniek'.
In the 'Techniek' wiki choose 'Marineschepen' from the 'Techniek' menu.
Tab 'Databank' > 'Pennantnummers' shows listings of different kinds of ships filtered by pennant number. The only exception is Ondersteuningsvaartuigen (A for Auxiliary), which is done manually (it contains elements that could not be filtered).

Cheers,

Ton


JD

Jul 18, 2018, 5:34:48 AM
to TiddlyWiki
> http://www.tongerner.nl/ tab 'Techniek', click the link 'Techniek'.
> In the 'Techniek' wiki choose 'Marineschepen' from the 'Techniek' menu.
> Tab 'Databank' > 'Pennantnummers' shows listings of different kinds of ships filtered by pennant number. The only exception is Ondersteuningsvaartuigen (A for Auxiliary), which is done manually (it contains elements that could not be filtered).

This is truly interesting. I could lose hours in this wiki just looking at the diagrams and admiring the ships' technical wonders. Thanks for sharing, Ton!

-jd

Ton Gerner

Jul 18, 2018, 2:45:46 PM
to TiddlyWiki
Nice that you like it. It's a hobby that got out of hand ;-)

Cheers,

Ton
