Better indexes through semantic modeling


Soren Bjornstad

Aug 9, 2021, 9:17:00 PM
to TiddlyWiki
Some of you all might be interested in this new post on my blog:

https://controlaltbackspace.org/notes/better-indexes-through-semantic-modeling/

It's a proposal for a system for indexing large documents based on a hypertext graph, including a discussion of a possible TiddlyWiki prototype. Warning: 6,000+ words.

TiddlyTweeter

Aug 10, 2021, 3:02:15 AM
to TiddlyWiki
Ciao Soren

VERY interesting write-up of a common practical puzzle with a well conceived SOLUTION.

I'm gonna look at it more and comment in more detail later.

ONE thing worth mentioning already is that TiddlyWiki, I think, is very good for DEMONSTRATING STRATAGEMS of linkage.
Its open architecture, which does NOT commit any user in advance to any specific theory of linkage (i.e. any theory of knowledge), makes it brilliant for clearly illustrating otherwise obscure concepts of informational page design.

Later 'gator
TT

David Gifford

Aug 10, 2021, 6:56:59 AM
to TiddlyWiki
Hi Soren

You make me very eager for the semantic modeling plugin! And boy that looks like a challenging list for your sabbatical! Enjoy it. Blessings to you.

TiddlyTweeter

Aug 10, 2021, 8:36:57 AM
to TiddlyWiki
Soren

Before I comment further, I do want to compliment you on including serious visual clues to your intent ...

[image: nodes.png]
That is BRILLIANT! It brings the discussion to earth. Any TW fan can understand that. And THIS brings it together and expands the outcome already ...

[image: edges.png]
I'll comment a bit more, later.
Best wishes, TT

Mohammad Rahmani

Aug 10, 2021, 2:34:41 PM
to tiddl...@googlegroups.com
Hi Soren,
Very interesting! Especially the Vision part, which gives solutions using TiddlyWiki.

Best wishes
Mohammad


--
You received this message because you are subscribed to the Google Groups "TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tiddlywiki+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tiddlywiki/99f6944e-75a7-4e40-a768-795c6cd98934n%40googlegroups.com.

springer

Aug 11, 2021, 7:31:00 PM
to TiddlyWiki
Soren,

In your "better indexes" essay you write:
If we do have the full text included in each locus, we may want to write a summary anyway and store it along with the full text: this way, we’ll be able to create an outline later and more easily see what parts of the document we’re hopping between.

And it reminds me how certain enlightenment texts were printed with a running outer-margin summary distilling key points (and of course the cognitive work of spelling out those side-notes is considerable!). For example, see the side-notes starting at p 49 (pdf-pagination) on this Adam Smith manuscript: https://oll-resources.s3.us-east-2.amazonaws.com/oll3/store/titles/237/0206-01_Bk.pdf

For some tiddlywiki projects, I've started to employ a super-condensed summary field (call it, say, the tldr field) that can be displayed for certain purposes. Unlike the main body of the tiddler, the tldr is text-only, maximum of a single sentence. (And if I can't summarize the tiddler in one sentence, then it needs to be more than one tiddler. ;) ) Of course, the fact that tw's standard search interface doesn't peek beyond title and text field means this solution requires some building-around to be useful. 

Overall, I'm enjoying your essay and its questions!

-Springer
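
A rough sketch of the point about search scope, written in Python rather than wikitext so that it can be run anywhere. It assumes tiddlers are plain dictionaries and uses a hypothetical `search` helper and a hypothetical `tldr` field; it is not TiddlyWiki's actual search code. It only shows why a one-sentence summary field stays invisible until the list of searched fields is widened.

```python
def search(tiddlers, term, fields=("title", "text")):
    """Return titles of tiddlers whose chosen fields contain the term."""
    needle = term.lower()
    return [
        t["title"]
        for t in tiddlers
        if any(needle in t.get(field, "").lower() for field in fields)
    ]

# Hypothetical sample data: only the first tiddler has a tldr field.
tiddlers = [
    {"title": "Wealth of Nations", "text": "Long excerpt ...",
     "tldr": "Division of labour drives productivity."},
    {"title": "Side-notes", "text": "Running margin summaries in old books."},
]

# Default scope (title and text only) misses the summary.
default_hits = search(tiddlers, "labour")
# Widening the scope to include the tldr field finds it.
extended_hits = search(tiddlers, "labour", fields=("title", "text", "tldr"))
```

Making the field list a parameter keeps the default behaviour unchanged while letting a wiki opt in to extra fields.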

TW Tones

Aug 11, 2021, 7:41:12 PM
to TiddlyWiki
Soren,

I have read much of this and will continue to do so (and reread). I think it is important for TiddlyWiki itself that we explore knowledge and information modeling/management, and this is a sizable contribution to the discussion. As I read, I have seen a few alternate paths both converging with and diverging from your ideas. Perhaps I will discuss these in detail later, but here are a few high-level concepts:
  • Indexes can be built automatically from non-trivial words and can capture frequency information.
    • Perhaps this is how one generates potential keywords.
    • I imagine that when saving a tiddler, a keyword field opens with suggestions you can accept, reject or add to.
    • Comparing the keywords the author selected with those found in the text could reveal new relationships.
    • There are algorithms, such as those behind Google's language translation, that determine statistically how closely related one word is to another in order to learn about relationships.
      • Is a "related" word found in the same sentence, paragraph, tiddler or other semantic structure?
  • In fact, one way to store text is to store every word in an index and then store a link to the word rather than the word itself. When we think like this, we can see new possibilities.
  • Access to simple synonyms and antonyms in the language used, extending searches to cover uses of the synonyms in addition to the named word.
  • It is possible to expose content to the Google search engine and use Google to search your own content.
  • TOCs and outlines (hierarchies, in effect): I am a fan but also aware of their apparent limitations. These ease, however, if you recognise that the same data can exist in multiple hierarchies, or in no hierarchy at all.
  • Learning how to flag "missing data" or tentative knowledge/links etc. helps exploration and consistency.
  • We should find ways to ensure that, for all content, we know the semantic structure, from tiddler title to tag hierarchies and HTML headings, paragraphs, sections...
  • It is easy to extend TiddlyWiki's search into other fields by changing the scope of a search; even comparing the results of different searches can produce useful insights.
  • ...
Very interesting
Tones
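
The first bullet above (an index built automatically from non-trivial words, with frequencies, feeding keyword suggestions) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the stop-word list, the sample tiddlers and the function names `tokenize`, `build_index` and `suggest_keywords` are all hypothetical, and nothing here is an existing TiddlyWiki feature.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "that", "we"}

def tokenize(text):
    """Lower-case the text and return its non-trivial words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS and len(w) > 2]

def build_index(tiddlers):
    """Map each word to a Counter of {tiddler title: occurrences}."""
    index = defaultdict(Counter)
    for title, text in tiddlers.items():
        for word in tokenize(text):
            index[word][title] += 1
    return index

def suggest_keywords(text, limit=3):
    """Most frequent non-trivial words, offered as keyword suggestions."""
    return [word for word, _ in Counter(tokenize(text)).most_common(limit)]

# Hypothetical sample content.
tiddlers = {
    "Locus 1": "An index maps an entry to a locus. The index is a graph.",
    "Locus 2": "A summary of the locus helps when we build an outline.",
}
index = build_index(tiddlers)
```

The suggestions would only be a starting point: as the bullets note, the author still accepts, rejects or adds keywords by hand.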

PMario

Aug 12, 2021, 3:03:32 AM
to TiddlyWiki
On Thursday, August 12, 2021 at 1:31:00 AM UTC+2 springer wrote:
Soren,

In your "better indexes" essay you write:
If we do have the full text included in each locus, we may want to write a summary anyway and store it along with the full text: this way, we’ll be able to create an outline later and more easily see what parts of the document we’re hopping between.

And it reminds me how certain enlightenment texts were printed with a running outer-margin summary distilling key points (and of course the cognitive work of spelling out those side-notes is considerable!). For example, see the side-notes starting at p 49 (pdf-pagination) on this Adam Smith manuscript: https://oll-resources.s3.us-east-2.amazonaws.com/oll3/store/titles/237/0206-01_Bk.pdf

For some tiddlywiki projects, I've started to employ a super-condensed summary field (call it, say, the tldr field) that can be displayed for certain purposes. Unlike the main body of the tiddler, the tldr is text-only, maximum of a single sentence. (And if I can't summarize the tiddler in one sentence, then it needs to be more than one tiddler. ;) ) Of course, the fact that tw's standard search interface doesn't peek beyond title and text field means this solution requires some building-around to be useful. 

As I read that section, I was thinking along similar lines. I was thinking about a "teaser" text that is shown in "Blog overview listings". Sometimes it's the first paragraph of the text itself and sometimes it is a short summary that comes later in the text.

----------

I created a field-search plugin that lets you configure additional fields that are shown in the search results. See: https://wikilabs.github.io/editions/field-search/

-mario

Jeremy Ruston

Aug 12, 2021, 11:13:04 AM
to tiddl...@googlegroups.com
Hi Soren

Thank you for sharing. As ever, your writing is admirably clear and concise, and it's a pleasure to follow your thought processes.

I tweeted your post as TiddlyWiki:


Jeremy



TiddlyTweeter

Aug 12, 2021, 1:34:08 PM
to TiddlyWiki
Soren Bjornstad wrote:
Some of you all might be interested in this new post on my blog:

https://controlaltbackspace.org/notes/better-indexes-through-semantic-modeling/

 It was well worth reading!

THE GOOD

Great example of logical working through to a satisfactory outcome.

I complimented you before that your use of visual illustrations helps earth the discussion really well!
I think they definitely help folk who are not so versed in the conceptual matrix you lay out.

THE BAD

TBH your comments about the Old Media of Books are simply inaccurate. 

The Book has had (when required) very good indexing where authors chose to do it.
Think about the richness of the indices of Roget's Thesaurus. 
Think about all those Biblical things that Dave Gifford and several million other Christians sweat over. 
Their CONCORDANCES have been a venerable partner to print works for a very long time.

THE UGLY

Nothing. Your basic thing is additive.



Best wishes 
TT

Soren Bjornstad

Aug 12, 2021, 2:10:11 PM
to TiddlyWiki
Springer, that's really cool, I don't think I've ever seen that particular layout before! I do a similar thing to your “tldr” in my Zettelkasten, extending the “description” field to non-system tiddlers and displaying it at the top next to the gem icon if present:

[image: ksnip_20210812-125810.png]

TT, I'm puzzled where you got the idea that I think book indexes were/are a poor tool, or lacking in either authorial effort or utility. Are you able to point to location(s) in the post which are “inaccurate” or give you that impression? If so, I would like to correct it, as that is the exact opposite of what I think. I've been compiling and using keyword indexes almost daily myself for about twelve years (almost half my life), and they are a powerful tool – not to mention thesauruses, concordances, encyclopedias, etc., as you point out. I wrote a guide on using indexes for your notes back in 2013. It's exactly because they're so good (and, I think, neglected nowadays) that I'm interested in expanding them.

And I think many of us today are too “computer exceptionalist”. Good ideas are mostly independent of medium, it's just that sometimes they're really hard, or a comparatively bad intuitive fit, in one medium, so they don't make a lot of sense there. Or to put it another way, a system like what I'm proposing would be totally doable even on paper...just way more time-consuming to create and maintain.

Mohammad Rahmani

Aug 13, 2021, 4:08:36 AM
to tiddl...@googlegroups.com
Soren,
What do you think about using a small section of the text field (the tiddler body) to store the tiddler's summary or description?

Pros:
1. we can find it using standard search
2. no need to create an extra field
3. no need to duplicate text

Cons:
1. need a script to extract the description/summary (note that tools are available; see Tobias Beer, kookma, Thomas Elmiger, ...)



Best wishes
Mohammad



Mohammad Rahmani

Aug 13, 2021, 7:04:05 AM
to tiddl...@googlegroups.com
Some minor comments

1. error in filter

<ul>
<$list filter="[tag[Locus]sortan[]]">
    <li><$link/> (difficulty level: {{!!difficulty}}) – {{!!text}}</li>
</$list>
</ul>



Best wishes
Mohammad



Mohammad Rahmani

Aug 13, 2021, 7:39:04 AM
to tiddl...@googlegroups.com
Soren
Please check the filters in the given examples; there are some other cases of imbalanced square brackets!


Questions
  1. Why is there nothing about tags? Entries replicate tags in TiddlyWiki (though not necessarily).
  2. Some concepts with the given scripts are confusing, like Nearby lists.
Example:

We can similarly replicate Tabularium’s Nearby list on loci. On the locus view template, we simply list all of the index entries that link to the current locus:

<<list-links "[all[current]backlinks[]tag[Entry]]">>


So this means we have a view template for loci, which means the command only works with tiddlers tagged Locus.
Then [all[current]backlinks[]tag[Entry]] will list all entries (i.e. tiddlers tagged Entry) that refer (have a link) to the current Locus!
This means an Entry must have a link to a Locus?! That is, in a tag tiddler we refer to a Locus tiddler! This is confusing!


By the way, I enjoyed your writeup, and I think it needs some working examples to be better understood!



Best wishes
Mohammad



Mohammad Rahmani

Aug 13, 2021, 7:43:09 AM
to tiddl...@googlegroups.com
Springer


On Thu, Aug 12, 2021 at 4:01 AM springer <springer...@gmail.com> wrote:
Soren,

In your "better indexes" essay you write:
If we do have the full text included in each locus, we may want to write a summary anyway and store it along with the full text: this way, we’ll be able to create an outline later and more easily see what parts of the document we’re hopping between.

And it reminds me how certain enlightenment texts were printed with a running outer-margin summary distilling key points (and of course the cognitive work of spelling out those side-notes is considerable!). For example, see the side-notes starting at p 49 (pdf-pagination) on this Adam Smith manuscript: https://oll-resources.s3.us-east-2.amazonaws.com/oll3/store/titles/237/0206-01_Bk.pdf

For some tiddlywiki projects, I've started to employ a super-condensed summary field (call it, say, the tldr field) that can be displayed for certain purposes. Unlike the main body of the tiddler, the tldr is text-only, maximum of a single sentence. (And if I can't summarize the tiddler in one sentence, then it needs to be more than one tiddler. ;) ) Of course, the fact that tw's standard search interface doesn't peek beyond title and text field means this solution requires some building-around to be useful. 

I would recommend Tobias Beer's method for creating a description or summary!
Also, the kookma utility has a find macro (it has been published separately; see the find macro on the kookma GitHub page) which can simply extract the summary/description part, so you can still use TiddlyWiki's standard search box.


 

Overall, I'm enjoying your essay and its questions!

-Springer
On Monday, August 9, 2021 at 9:17:00 PM UTC-4 Soren Bjornstad wrote:
Some of you all might be interested in this new post on my blog:

https://controlaltbackspace.org/notes/better-indexes-through-semantic-modeling/

It's a proposal for a system for indexing large documents based on a hypertext graph, including a discussion of a possible TiddlyWiki prototype. Warning: 6,000+ words.


Soren Bjornstad

Aug 13, 2021, 8:41:37 AM
to TiddlyWiki
On Friday, August 13, 2021 at 6:39:04 AM UTC-5 Mohammad wrote:
Soren
Please check the filters in the given examples; there are some other cases of imbalanced square brackets!

Thanks, I found two (the $list one you quoted in another post and one other one) and these will be fixed when I rebuild the site later today. To be clear, these are intended to give a rough idea of how the system would work, not be fully functional examples, so that's why I obviously haven't tested them. :)
  1. Why is there nothing about tags? Entries replicate tags in TiddlyWiki (though not necessarily).
You certainly could use tags to implement entries, but I think links are easier to type and provide more context about why we're linking to the entry, which is part of the benefit of this system. Plus there could be thousands of entries in a large document, and nobody likes having thousands of tags in one TiddlyWiki.

Other options would be totally feasible, though! But the post isn't really about TiddlyWiki, and it was already 6,000 words, so I didn't think it was a good place to get into this kind of implementation detail.
  2. Some concepts with the given scripts are confusing, like Nearby lists.
Example:

We can similarly replicate Tabularium’s Nearby list on loci. On the locus view template, we simply list all of the index entries that link to the current locus:

<<list-links "[all[current]backlinks[]tag[Entry]]">>


So this means we have a view template for loci, which means the command only works with tiddlers tagged Locus.
Then [all[current]backlinks[]tag[Entry]] will list all entries (i.e. tiddlers tagged Entry) that refer (have a link) to the current Locus!
This means an Entry must have a link to a Locus?! That is, in a tag tiddler we refer to a Locus tiddler! This is confusing!

Yep, this example is wrong – it was supposed to say links[]. I will correct that as well.
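
To make the direction of the correction concrete, here is a small Python model of the two lookups. It assumes, as the reply above implies, that a locus contains links to its entries, so the intended filter reads [all[current]links[]tag[Entry]]. The `wiki` dictionary, the tiddler titles and the helper names `links`, `backlinks` and `tagged` are hypothetical stand-ins for illustration, not TiddlyWiki's implementation.

```python
# Hypothetical miniature wiki: each tiddler has tags and outgoing links.
wiki = {
    "Locus: Chapter 1": {"tags": {"Locus"},
                         "links": ["Entry: indexes", "Entry: hypertext"]},
    "Locus: Chapter 2": {"tags": {"Locus"},
                         "links": ["Entry: indexes", "Locus: Chapter 1"]},
    "Entry: indexes": {"tags": {"Entry"}, "links": []},
    "Entry: hypertext": {"tags": {"Entry"}, "links": []},
}

def links(title):
    """Tiddlers that the given tiddler links TO."""
    return list(wiki[title]["links"])

def backlinks(title):
    """Tiddlers that link to the given tiddler."""
    return [t for t, data in wiki.items() if title in data["links"]]

def tagged(titles, tag):
    """Keep only the titles carrying the given tag."""
    return [t for t in titles if tag in wiki[t]["tags"]]

# Corrected direction: entries the current locus links to.
nearby_entries = tagged(links("Locus: Chapter 2"), "Entry")
# The direction in the original example: entries linking to the locus.
wrong_direction = tagged(backlinks("Locus: Chapter 2"), "Entry")
# Backlinks are still useful from the entry side: which loci mention it?
loci_for_entry = tagged(backlinks("Entry: indexes"), "Locus")
```

Under this assumption the backlinks version returns nothing on a locus, because entries never link back to loci, which matches the confusion raised in the question.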
 
By the way, I enjoyed your writeup and I think it needs some working example to be better understood!

I hope this will be a preview for a working system later!

Soren Bjornstad

Aug 13, 2021, 8:44:57 AM
to TiddlyWiki
On Friday, August 13, 2021 at 3:08:36 AM UTC-5 Mohammad wrote:
What do you think about using a small section of the text field (the tiddler body) to store the tiddler's summary or description?

I guess I have too much of a programmer mind to like manually yanking things out of somewhere instead of just putting it in a separate field. But it's great to have the option.
 
3. no need to duplicate text

Maybe I'm being dense, but how would this help us avoid duplicating text? Couldn't we just as easily transclude the separate field into the text field if the summary sentence made sense as part of that text? I do that all the time.

Mohammad Rahmani

Aug 13, 2021, 11:02:06 AM
to tiddl...@googlegroups.com
Hi Soren, thank you for the clarification!
For the summary:
I have seen books, especially textbooks, that have a paragraph (usually in a colored box or in a slightly different font) on the first page of each chapter explaining what the chapter is about and what we will learn in it!

So I thought the same could work for a tiddler! But this is meaningful only if we have a long text in that chapter!

By the way, having fields for the summary has its own pros and cons!

Thank you

Best wishes
Mohammad



springer

Aug 13, 2021, 12:01:02 PM
to TiddlyWiki
Mohammad, By "Tobias Beer's method" do you mean the one here:
?

That's quite similar to what I use. In fact, I like to create visually distinctive ViewTemplate sections for various fields, iff the fields are populated. Since I often use tiddlers to hold excerpts, and want to put nothing in the text field there apart from the excerpt itself, a conditional view template for the "notes" field -- displaying in a contrasting box below the text content if and only if I've added notes -- was the first such field-based auto-include that I found indispensable.

-Springer

Mohammad Rahmani

Aug 13, 2021, 12:08:24 PM
to tiddl...@googlegroups.com
Hi Springer!
The conditional summary needs a summary field! This is what Soren explained and is a good approach!

It lets you use the same standard searchbox!


Best wishes
Mohammad


springer

Aug 13, 2021, 12:21:46 PM
to TiddlyWiki
Ah, I believe that techniques like "extracting introductions" might even have developed in tandem with a need I had, way back -- to have a list of "quick definitions" that displayed only the title and first line of each definition-tagged tiddler while dropping any further content of text field (examples, discussion). That technique started back in TW Classic days, though, before we *had* such powerful use of fields!

Mostly I'm leaning toward using fields now for any kind of structured info (such as summary), and transcluding those fields (or using distinctive ViewTemplate sections) as needed. 

Still, there's something to be said for the writing practice of crafting the first line of an entry to serve as a helpful summary (often revised after writing out the actual content). For certain use-cases, that kind of discipline helps both readers and writers, and of course it's especially compatible with off-the-shelf search behavior.

-Springer
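
The two approaches discussed here (a dedicated summary field versus a carefully written first line) can coexist. A minimal Python sketch, assuming tiddlers are plain dictionaries: the `summary` field name, the sample definitions and the helper `summary_of` are hypothetical, chosen only to illustrate a "quick definitions" list that prefers the field and falls back to the first line.

```python
def summary_of(tiddler):
    """Prefer an explicit summary field; otherwise use the first
    non-empty line of the text field."""
    explicit = tiddler.get("summary", "").strip()
    if explicit:
        return explicit
    for line in tiddler.get("text", "").splitlines():
        if line.strip():
            return line.strip()
    return ""

# Hypothetical definition tiddlers.
definitions = [
    {"title": "Locus",
     "text": "A chunk of the document.\n\nExamples and discussion follow."},
    {"title": "Entry",
     "summary": "A keyword that loci link to.",
     "text": "Longer notes."},
]

# Title plus one-line summary, dropping examples and discussion.
quick_list = [(t["title"], summary_of(t)) for t in definitions]
```

The fallback keeps older tiddlers, written with the first-line discipline, usable in the same list as newer ones that carry a field.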

Joshua Fontany

Aug 14, 2021, 8:13:02 PM
to TiddlyWiki
Good discussion. I have been experimenting with dynamically generating Indexes from individual Tiddlers in my Martial-Arts wiki.

https://silat.chronicles.wiki/#Glossary

I had to set that aside for my real-time multiplayer experiments, but hope to get back to UI work soon.

Best,
Joshua Fontany