Persist list filter to DataTiddler

Thomas Stone

Dec 16, 2020, 12:48:44 PM
to TiddlyWiki
Would it be desirable to add a new button action to persist a list filter directly into a DataTiddler, with the row number as the index?

I can construct a DataTiddler like this manually using:

* A count of the entries returned by a filter
* A range up to the count, subtracting one from each value to make it zero-based
* Re-running the first filter with butfirst<current range value>, then taking the first entry
* A set-field action to save the filter result to a DataTiddler, using the current range value as the index

If the range filter exceeds 10,000 records, this process cannot run. Even when I break the process into smaller chunks, the throughput gets very slow after 20,000 records.

Does anyone else think this could be useful when trying to pre-process a large amount of data?

TW Tones

Dec 16, 2020, 5:49:47 PM
to TiddlyWiki
Is your primary purpose to assign a number to a set of tiddlers and retain those numbers indefinitely?
Do you need to process the large number of items only the first time, or on each import? We could assign "serial numbers" as tiddlers are created or as they are used.

I suppose I need to understand why you need this, or for what purpose, before presenting one of a number of different methods.
  • The created date is almost, if not actually, unique, and there is a way to make it unique in retrospect if necessary.
  • It is possible to store tiddler titles in a list that is itself in the required order, without introducing indexes.
  • List items can be generated automatically.
  • It is common in TiddlyWiki to access tiddlers in a list by more than one sort method; when the list appears in different orders, the serial number is less important.
  • I have built serial number assignment for tiddlers, if that would satisfy your needs.
Tones

TW Tones

Dec 16, 2020, 6:07:06 PM
to TiddlyWiki
FYI:

We can get an "index" number for items in a list, and we could use this in a button to set an index field in each tiddler, or in a data tiddler for later reference, using allbefore and count, plus 1:
<$list filter="[tag[TableOfContents]]">
   {{{ [tag[TableOfContents]allbefore<currentTiddler>count[]add[1]] }}} <$link/><br>
</$list><br>

However, this requires the whole list to be processed for each item in the list, which could take a lot of time.
  • With a little trickery you may be able to make the numbering "restartable", i.e. get the last number issued and add more from there.
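
To illustrate the "restartable" idea outside TiddlyWiki, here is a minimal Python sketch. It is not wikitext and not anyone's actual implementation; the function name and the sample titles are hypothetical. It keeps the numbers already issued, finds the last one, and numbers only the new titles in a single pass instead of recounting the whole list for each item.

```python
from typing import Dict, List


def extend_numbering(issued: Dict[str, int], titles: List[str]) -> Dict[str, int]:
    """Give serial numbers to titles that do not have one yet.

    Existing numbers are kept; new numbers continue after the last one issued.
    """
    result = dict(issued)
    # Last number issued so far (0 if nothing has been numbered yet).
    next_number = max(result.values(), default=0) + 1
    for title in titles:
        if title not in result:
            result[title] = next_number
            next_number += 1
    return result


if __name__ == "__main__":
    first_run = extend_numbering({}, ["Intro", "Setup"])
    # A later run sees two new titles and only numbers those.
    second_run = extend_numbering(first_run, ["Intro", "Setup", "Usage", "FAQ"])
    print(second_run)
```

Each title is visited once per run, so the cost grows with the length of the list rather than with its square.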
Tones

Thomas Stone

Dec 16, 2020, 7:16:46 PM
to TiddlyWiki
I am trying to parse a large text file using TW in the browser. I already know how to process text files using .Net, but I enjoy learning what programmatic capabilities are available in every environment I pick up.

The text file is simple enough that I can find records using splitregexp[\n]. However, I know this particular file has a unique key for each record. I am able to convert the file using .Net into a JSON file describing a DataTiddler that I can import. This way I can reference each record by its primary key. If there were a way to auto-number the list entries and persist them to a DataTiddler, this would be close enough for me to selectively process the records further without having to re-split the original list over and over.
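
As a rough illustration of the auto-numbered DataTiddler described here, this is a minimal Python sketch of an external conversion step. It stands in for the .Net converter, whose code is not shown in this thread. The function name, the title RecordsByRow and the sample records are hypothetical. It assumes one record per line, and it relies on TiddlyWiki importing a JSON array of tiddler objects, with the DataTiddler stored as type application/json.

```python
import json


def build_data_tiddler(raw_text: str, title: str = "RecordsByRow") -> str:
    """Return importable JSON for one DataTiddler keyed by zero-based row number."""
    # One record per line; blank lines are skipped (an assumption about the file).
    records = [line for line in raw_text.splitlines() if line.strip()]
    # The row number, as a string, becomes the DataTiddler index.
    data = {str(row): record for row, record in enumerate(records)}
    tiddler = {
        "title": title,  # hypothetical title
        "type": "application/json",
        "text": json.dumps(data),
    }
    # A JSON array of tiddler objects can be imported into a wiki.
    return json.dumps([tiddler], indent=2)


if __name__ == "__main__":
    sample = "K001,alpha\nK002,beta\nK003,gamma\n"
    print(build_data_tiddler(sample))
```

After import, a single record should be reachable with a filter along the lines of [[RecordsByRow]getindex[2]], without re-splitting the original text.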

I saw several requests for adding a variable to the ListWidget, calling it an iterator (Tobias) or addposition (Evans) on GitHub issue 3384. It was closed as duplicate of issue 1523. Both issues mention concerns over how the ListWidget creating a variable would actually cause re-rendering.

Instead of modifying the ListWidget directly, I was proposing an intermediate step of preprocessing a list into a DataTiddler with auto-numbered indexes. Then you can iterate over the DataTiddler indexes to get the list position and use getindex to return the record.

When trying to use the iterator feature in a loop to call a set-field action on a DataTiddler for each record, the button code generated can end up being huge. I would just like to internally side-step the generation of so many individual actions in the button. It would be like a 'copy tiddler text to DataTiddler' action with a list process in the middle and auto-numbering on the insertions.

Just saying that out loud sounds like I'm probably barking up the wrong tree, because TW lists are not designed as an imperative language. DataTiddlers as an internal database table seem to work well as long as the tables are generated outside of TW.

Mark S.

Dec 16, 2020, 7:38:19 PM
to TiddlyWiki
On Wednesday, December 16, 2020 at 4:16:46 PM UTC-8  wrote:

The text file is simple enough that I can find records using splitregexp[\n]. However, I know this particular file has a unique key for each record. I am able to convert the file using .Net into a JSON file describing a DataTiddler that I can import. This way I can reference each record by its primary key. If there were a way to auto-number the list entries and persist them to a DataTiddler, this would be close enough for me to selectively process the records further without having to re-split the original list over and over.

I'm confused. If you've made your text file into an importable JSON file, and if each record has a primary key, why do you need to add an index? You're adding an index to something that already has an index. 

There's a temptation when you first start using TW to perceive data dictionaries/DataTiddlers as the primary method of data storage and manipulation, because that is what they would be in other technologies. But in TW, DataTiddlers are there more as a convenience tool. They are very limited. The basic data entity in TW is the tiddler. There are lots of filter operators and widgets dealing with tiddlers, and very few dealing with DataTiddlers. Consider converting your text file into multiple tiddlers (either individual tiddler files or a JSON file containing them all). You will likely have a better experience down the road manipulating those records than stowing them in a DataTiddler.
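
A minimal Python sketch of the "JSON file containing them all" option, for illustration only. The function name, the Record/ title prefix, the Record tag and the row field are hypothetical, and it assumes the primary key is the first comma-separated field of each line, which the thread does not actually specify.

```python
import json


def records_to_tiddlers(raw_text: str, prefix: str = "Record/") -> str:
    """Return importable JSON with one tiddler per record."""
    # One record per line; blank lines are skipped (an assumption about the file).
    lines = [line for line in raw_text.splitlines() if line.strip()]
    tiddlers = []
    for row, line in enumerate(lines):
        # Assumed layout: the primary key is the first comma-separated field.
        key = line.split(",", 1)[0].strip()
        tiddlers.append({
            "title": prefix + key,  # hypothetical naming scheme
            "text": line,
            "tags": "Record",       # hypothetical tag for filtering later
            "row": str(row),        # field values are stored as strings
        })
    return json.dumps(tiddlers, indent=2)


if __name__ == "__main__":
    sample = "K001,alpha\nK002,beta\n"
    print(records_to_tiddlers(sample))
```

With one tiddler per record, the ordinary tag, field and sort filter operators apply directly to the records.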

TW Tones

Dec 16, 2020, 9:18:21 PM
to TiddlyWiki
positiv,

Notwithstanding Mark's wise words, I have a separate wiki with the JSON mangler plugin for processing files such as CSV, or files with other delimiters. You have full control over auto-indexing or tiddler naming; the result is stored in a JSON file in the tiddler format and becomes a plugin. This is my dataset processing and conversion utility. I then drop the data plugin on a wiki and build the code, view templates, etc. for processing the data as data tiddlers using native TiddlyWiki. Another advantage of this approach is that it's easy to save, annotate, regenerate and export your data, including capturing changes when a shadow tiddler is overridden.

JSON mangler is also able to process deeper third-party JSON files.

Of course, ultimately you may want to build a solution that includes JSON mangler or other import methods in your wiki, but until that day a utility wiki for dataset import and production is wise.

Despite using JSON mangler for some sophisticated datasets, I often build my own parsing methods if the delimiters are simple, starting with \n, then perhaps a comma, etc. CSV also has the advantage that you can export it from TiddlyWiki.

If someone gives me a few useful, publicly available data sources, I would be happy to build a range of "test data" "data plugins",
e.g.:
  • countries
  • country international codes
  • Airport codes
  • Geological epochs
  • The Planets
  • etc...

Regards
Tones
