Tiddlywiki and regexp examples


Mohammad

Aug 23, 2019, 3:11:07 AM
to tiddl...@googlegroups.com
I am looking for examples and use cases of regexp in TiddlyWiki!
Things that can be done with current filter operators like prefix, search, ... are not recommended to be done with regexp.

I appreciate your help, use cases, and examples on this. Just say what you want to do.

Some cases

Give a regexp pattern in TiddlyWiki to match all tiddlers whose names

  1. are only digits
  2. are only lowercase letters
  3. are only uppercase letters
  4. are only alphanumeric, underscore and hyphen
  5. are only alphanumeric with length between 3 and 15
  6. start with a capital letter
  7. start with a digit
  8. have an extension like mytiddler.ext
  9. have a jpg or jpeg extension like mytiddler.jpg or mytiddler.jpeg
  10. are a date in a format like Jan 06 2019
  11. are a date in a format like 2019.08.25
  12. have duplicate words
  13. are a valid URL


[This list will grow by more examples]


Please give your use case.

-- Mohammad

Mark S.

Aug 23, 2019, 10:50:33 AM
to tiddl...@googlegroups.com
Here's code for the first easy 9 (assuming my def of extension matches yours).

#10 and #11 should be doable, it would just take a little more time which I don't have at the moment.

I don't think 12 is doable with the current regexp kit. To find duplicates it would have to loop back on itself.

#13 should be 90+% doable. I would start by searching the net. It depends on how picky you are about the
structure.

Good luck!


<$select tiddler="myregexp">
<option value="^[0-9]*$">Only digits</option>
<option value="^[a-z]*$">Only lower case</option>
<option value="^[A-Z]*$">Only upper case</option>
<option value="^[\w-_]*$">Only alphanumeric, _, and -</option>
<option value="^[\w]{3,15}$">Only alphanum len 3-15</option>
<option value="^[A-Z]+.*$">Starts with capital</option>
<option value="^[0-9]+.*$">Starts with digit</option>
<option value="^.+\.[a-zA-Z]{3,4}$">Extensions only</option>
<option value="^.+(\.jpg|\.jpeg)$">Extension jpg jpeg</option>

</$select>

<$list filter="[regexp{myregexp}sort[]]">

</$list>


Mark S.

Aug 23, 2019, 11:12:40 AM
to tiddl...@googlegroups.com
One more for duplicate words, where a word is defined as something with at least two characters and separated by at least one space.

<option value="^\s*(\w{2,})\s+.*\1.*$">Duplicate words</option>

Edit -- it needs to be tweaked a bit more -- the 2nd word might not be a perfect duplicate. No time.

Mohammad

Aug 23, 2019, 12:29:52 PM
to TiddlyWiki
WOW,
 Thanks Mark. These are great. 

Cheers
Mohammad

@TiddlyTweeter

Aug 23, 2019, 2:16:21 PM
to TiddlyWiki
Ciao Mark & Mohammad a few tweaks for Mark's solutions ... 

     (I'm happy to be aiding Mark, as it's usually the other way round, lol :-)!

<option value="^[0-9]*$">Only digits</option>
<option value="^[a-z]*$">Only lower case</option>
<option value="^[A-Z]*$">Only upper case</option>

These will work, but would also match "the empty string".
Probably better to use "+" (one or more), not "*" (zero or more), so ...

<option value="^[0-9]+$">Only digits</option>
 or use "\d", shorthand for [0-9]
<option value="^\d+$">Only digits</option>

<option value="^[a-z]+$">Only lower case</option>
<option value="^[A-Z]+$">Only upper case</option>
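
The "*" versus "+" point can be checked outside TiddlyWiki in plain JavaScript, the regex flavour that the MDN guide linked further down describes. This is only a minimal sketch; the sample titles are made up for illustration.

```javascript
// Patterns from the thread: "*" allows zero characters, "+" requires at least one.
const digitsStar = new RegExp("^[0-9]*$");
const digitsPlus = new RegExp("^[0-9]+$");
const digitsShorthand = new RegExp("^\\d+$"); // "\d" is shorthand for [0-9]

// Hypothetical tiddler titles, including the empty string.
const titles = ["2019", "", "42a"];

for (const title of titles) {
  console.log(
    JSON.stringify(title),
    digitsStar.test(title),
    digitsPlus.test(title),
    digitsShorthand.test(title)
  );
}
```

Only the "*" version accepts the empty string; all three reject "42a".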

<option value="^[\w-_]*$">Only alphanumeric, _, and -</option>

It's not ideal to put "\w" inside a character class as it is a shorthand character class for [A-Za-z0-9_].

"\w" already includes "_" so you don't need to add it. 

It's not ideal to put "-" in the middle of a character class as it can sometimes act as a "range marker".

Any of these should work ...

<option value="^(\w|-)+$">Only alphanumeric, _, and -</option>
 or
<option value="^[-a-zA-Z0-9_]+$">Only alphanumeric, _, and -</option>
 or, it's good practice to make explicit the need for literal "-", like so ...
<option value="^[\-a-zA-Z0-9_]+$">Only alphanumeric, _, and -</option>
 or, less kosher
<option value="^[\-\w]+$">Only alphanumeric, _, and -</option>
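
To see that these spellings behave the same, and what the "range marker" reading of "-" looks like, here is a small JavaScript sketch. The sample titles and the helper name `matchesAll` are made up for illustration.

```javascript
// Three spellings suggested above for "alphanumeric, _ and -".
const variants = {
  alternation: new RegExp("^(\\w|-)+$"),
  hyphenFirst: new RegExp("^[-a-zA-Z0-9_]+$"),
  hyphenEscaped: new RegExp("^[\\-a-zA-Z0-9_]+$"),
};

// A hyphen between two characters is read as a range: [a-c] means a, b or c,
// so a literal "-" is NOT part of this class.
const range = new RegExp("^[a-c]+$");

// Hypothetical helper: true only if every variant accepts the title.
function matchesAll(title) {
  return Object.values(variants).every((re) => re.test(title));
}

console.log(matchesAll("my-tiddler_01"), matchesAll("my tiddler"));
console.log(range.test("abc"), range.test("a-c"));
```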

<option value="^[\w]{3,15}$">Only alphanum len 3-15</option>

This simpler regex should work just as well ...

<option value="^\w{3,15}$">Only alphanum len 3-15</option>

<option value="^[A-Z]+.*$">Starts with capital</option>
<option value="^[0-9]+.*$">Starts with digit</option>

These will work fine. But "match the first character" means they can be slightly simpler ...

<option value="^[A-Z].*$">Starts with capital</option>
<option value="^[0-9].*$">Starts with digit</option>


<option value="^.+\.[a-zA-Z]{3,4}$">Extensions only</option>
<option value="^.+(\.jpg|\.jpeg)$">Extension jpg jpeg</option>

Those look fine.

There is a guide to JavaScript regular expressions at: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

TT 

Mark S.

Aug 23, 2019, 2:21:06 PM
to TiddlyWiki
Ok, with duplicates and date formats. Note that the date formats only check for the format. You could still create
nonsensical dates that actually match the formats (Jan 55 9999, 1111.15.35). Actual validation of dates would take
real code massaging.

<$vars digonly="^[0-9]*$">
<$vars useme=<<digonly>>>
</$vars>
</$vars>

<$select tiddler="myregexp">
<option value="^[0-9]*$">Only digits</option>
<option value="^[a-z]*$">Only lower case</option>
<option value="^[A-Z]*$">Only upper case</option>
<option value="^[\w-_]*$">Only alphanumeric, _, and -</option>
<option value="^[\w]{3,15}$">Only alphanum len 3-15</option>
<option value="^[A-Z]+.*$">Starts with capital</option>
<option value="^[0-9]+.*$">Starts with digit</option>
<option value="^.+\.[a-zA-Z]{3,4}$">Extensions only</option>
<option value="^.+(\.jpg|\.jpeg)$">Extension jpg jpeg</option>
<option value="^\b(\w{2,})\b.*\b\1\b.*$">Duplicate words</option>
<option value="^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s{1}\d{2}\s\d{4}$">Date like Jan 06 2019</option>
<option value="^\d{4}\.[0-1]\d\.[0-3]\d$">Date like 2019.08.25</option>

</$select>

<$list filter="[regexp{myregexp}sort[]]">

</$list>
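
As a rough illustration of the difference between checking the format and validating the date, here is a JavaScript sketch that runs outside TiddlyWiki. The helper name `isRealDotDate` is hypothetical, and it relies on JavaScript's `Date`, which uses the Gregorian calendar throughout and treats years below 100 specially, so it is not a general solution.

```javascript
// Format-only pattern from the list above: yyyy.mm.dd
const dotDate = new RegExp("^\\d{4}\\.[0-1]\\d\\.[0-3]\\d$");

// Hypothetical helper: accept a title only if the format matches AND the
// calendar date exists. Date.UTC rolls impossible days over into the next
// month, so we compare the parts back against the input.
function isRealDotDate(title) {
  if (!dotDate.test(title)) return false;
  const [y, m, d] = title.split(".").map(Number);
  const date = new Date(Date.UTC(y, m - 1, d));
  return (
    date.getUTCFullYear() === y &&
    date.getUTCMonth() === m - 1 &&
    date.getUTCDate() === d
  );
}

console.log(dotDate.test("2019.02.31"), isRealDotDate("2019.02.31"));
console.log(dotDate.test("2019.08.25"), isRealDotDate("2019.08.25"));
```

The pattern alone accepts 2019.02.31; the extra check rejects it.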


Mat

Aug 23, 2019, 3:27:46 PM
to TiddlyWiki
Great stuff guys!

Please give your use case.

I'd like to access 
  1. from X until the end of the sentence
  2. ...until the end of the paragraph
  3. ...until the end of the text field
  4. The smallest of the above, i.e. a kind of OR that first looks for everything from X until the sentence end; if this doesn't exist, then from X until the paragraph end; and if that doesn't exist, then from X until the text field end
  5. all possible characters that typically follow a word in common text i.e:
    . OR , OR (space) OR ! OR ? OR : OR ; OR (spacechar) OR (end of text field) OR .......I don't even know
  6. Anything between "@@" and "@@"
  7. Anything between "xxxx {" and "}" (i.e css style defs) including some kind of split up for each item therein i.e to reach e.g "color" and "red".
  8. Anything between html "<tag>"  and "</tag>"
<:-)


@TiddlyTweeter

Aug 23, 2019, 3:34:37 PM
to TiddlyWiki
Mohammad:

are a date in format like Jan 06 2019  
are a date in format like 2019.08.25 

Mark:
Actual validation of dates would take
real code massaging. 

Mark, your solutions look good on dates, with a good basic match that excludes a lot of things that would not be proper dates.
It's probably all that is needed practically.

FYI, it is actually possible to use regex to correctly match dates. I know because I've done it to accurately match dates, including leap years, under both Gregorian & Julian calendars. It's just enormously complex :-). Yeah, coding is better suited for that.

TT

Mohammad

Aug 23, 2019, 3:36:20 PM
to tiddl...@googlegroups.com
Thank you all,

 Please keep going on.

For future reference, I have started a new wiki on tiddlyspot.com to document the questions and answers from this discussion.


I hope I have time to add explanations to all solutions. As Mat has given some new use cases, I hope other people in the group send us their examples.


Cheers
Mohammad

Mohammad

Aug 23, 2019, 3:48:33 PM
to TiddlyWiki
It seems the "only uppercase" pattern does not work correctly!

because the pattern
matches: TIDDLERA
but NOT: TIDDLER B

Mohammad

Aug 23, 2019, 3:50:04 PM
to TiddlyWiki
Noted Josiah!
 I will add these modifications!

I hope you have time to add more explanation to the solutions that have been given!

@TiddlyTweeter

Aug 23, 2019, 3:52:16 PM
to TiddlyWiki
Try this. That includes spaces.

<option value="^[A-Z ]+$">Only upper case and spaces</option>

Mohammad

Aug 23, 2019, 3:55:01 PM
to TiddlyWiki
Noted Josiah!

Question: why did you not use \s?

@TiddlyTweeter

Aug 23, 2019, 4:09:02 PM
to TiddlyWiki
For the use case it just looked like a "literal space" so " ". Keep it explicit. "\s" can have additional matches (space, tab, newline).

But "\s" for the use case would work fine too.
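
The difference between a literal space and "\s" only shows up when a title contains other whitespace. A small JavaScript sketch, with made-up titles:

```javascript
// Literal space versus the "\s" shorthand inside the character class.
const literalSpace = new RegExp("^[A-Z ]+$");
const anyWhitespace = new RegExp("^[A-Z\\s]+$");

// Hypothetical titles: one with a space, one with a tab.
const withSpace = "TIDDLER B";
const withTab = "TIDDLER\tB";

console.log(literalSpace.test(withSpace), anyWhitespace.test(withSpace));
console.log(literalSpace.test(withTab), anyWhitespace.test(withTab));
```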

Mark S.

Aug 24, 2019, 12:25:57 AM
to TiddlyWiki

Mark, your solutions look good on dates, with a good basic match that excludes a lot of things that would not be proper dates.
It's probably all that is needed practically.

FYI, it is actually possible to use regex to correctly match dates. I know because I've done it to accurately match dates, including leap years, under both Gregorian & Julian calendars. It's just enormously complex :-). Yeah, coding is better suited for that.

Well, that got me thinking. Here's a probably inefficient date VALIDATOR for dates starting 0000-01-01 following format yyyy-mm-dd (which I find to be the most generally useful). Ok, I didn't check the rules. I think there's something about a surprise
leap year every 400 years, so there's probably more tweaking to be done to match the Gregorian calendar precisely.

<option value="^(?=\d{4})(((?!\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-8]))))|((?=\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-9])))))">Experimental VALIDATE yyyy-mm-dd</option>





Mohammad

Aug 24, 2019, 12:46:20 AM
to tiddl...@googlegroups.com
Thanks Mark!

I started documenting the solutions!

Partial implementation can be found here



--Mohammad

TonyM

Aug 24, 2019, 12:55:34 AM
to TiddlyWiki
Folks

Love your work. I suggest we make a small bundle of tiddlers and macros, perhaps even some filters for use in subfilter operator for this collection.

Of course these would often be used for searching but they can be used for validation.

It is however important to remember the html tag can be set on the edit-text widget for rudimentary validation as well.

A supporting macro that allows you to test a variable or text reference against one or more of these tests would be helpful, especially when using the edit-text widget. It may be as simple as displaying a message when the result is not what the regex tests for: search a single value for the regex pattern and indicate when it is not found, e.g. not number only.

Regards
Tony

Mohammad

Aug 24, 2019, 1:00:15 AM
to TiddlyWiki
Mark,
 If you agree, first we add patterns to recognize dates only! I mean, cover some common cases like

- 2019.10.29
- 2019-10-29
- Jan 18, 2019

As we are looking for tiddler titles. That is it! 

Then you introduce patterns to validate some date format!

What do you think!

--Mohammad

Mohammad

Aug 24, 2019, 1:01:34 AM
to TiddlyWiki
Hi Tony!
 
That is great! This is on the documentation side! I appreciate your help and contribution.
 

--Mohammad

TonyM

Aug 24, 2019, 1:15:47 AM
to TiddlyWiki
Mark/Josiah,

Is there a simple way to test whether a number is in a range and/or greater than or less than another?

It would be nice to have a pattern to test if a number lies between, or is equal to, given numbers, even if we simply follow it with the new then or else operators and/or make use of the emptyMessage on the list. Sadly, the reveal widget's greater-than, less-than and equal-to tests are somewhat limited, and we do not yet have greater-than or less-than filter operators, although match is now a form of equals.

We may be able to have some tests like this

{{{ [<number>regex[input>A$<B]else[out of range]] }}} 

Regards
Tony

@TiddlyTweeter

Aug 24, 2019, 6:02:55 AM
to TiddlyWiki
Ciao TonyM

Regex is best developed with concrete data. It has no maths ability. Everything is just a string of characters to it. 
But it's often possible to match using a pattern. It depends on working with example test data to ensure where it might work.

So could you give a paragraph or two of test data?

TT

@TiddlyTweeter

Aug 24, 2019, 6:47:25 AM
to TiddlyWiki
Ciao Mat 
  1. from X until the end of the sentence
  2. ...until the end of the paragraph
  3. ...until the end of the text field
The exact regex may differ according to the context (type of field, does it have line-breaks?, what are the regex settings?) ... 

Test data for a field with no line breaks ...

This is the content of a field containing X and the rest of it.

This will match from the first "X" to the end ...

<option value="X(.+)$">Capture match after FIRST "X"</option>

This will match from the first occurrence of "X" to the end of the string
The (...) creates a "capture group". 

This is the content of a field X containing X again and the rest of it.

<option value="^.*X(.+)$">Capture match after LAST "X"</option>

This will match from the last occurrence of "X" to the end of the string ("*" greedy match ON)
The (...) creates a "capture group". What is captured is in green ...


Annotation 2019-08-24 124212.jpg
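
For anyone who cannot see the screenshot, the two captures can be reproduced in plain JavaScript. This is only a sketch using the test sentence from above; the variable names are made up.

```javascript
// Test data from the post above, with two occurrences of "X".
const text = "This is the content of a field X containing X again and the rest of it.";

// Capture everything after the FIRST "X".
const afterFirstX = text.match(new RegExp("X(.+)$"));

// The leading greedy ".*" swallows as much as it can, so the capture
// starts after the LAST "X".
const afterLastX = text.match(new RegExp("^.*X(.+)$"));

console.log(JSON.stringify(afterFirstX[1]));
console.log(JSON.stringify(afterLastX[1]));
```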




TT

@TiddlyTweeter

Aug 24, 2019, 7:04:36 AM
to TiddlyWiki
  1. all possible characters that typically follow a word in common text i.e:
    . OR , OR (space) OR ! OR ? OR : OR ; OR (spacechar) OR (end of text field) OR .......I don't even know
To simply match those characters create a character class "[...]" of them ...

([ .,;:!?]|$)

 Note that in character classes meta-characters "." and "?" no longer have any special meaning so do not need to be escaped.
"|" is "alternation" within a "capture group" "(...)"
"$" is "end-of-scope" (depends on regex setting whether that is end-of-field or end-of-line)

Match results ...

Annotation 2019-08-24 130045.jpg
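
As a text alternative to the screenshot, here is a small JavaScript sketch of the same character class. The helper `whatFollows` is hypothetical, and it assumes the word itself contains no regex special characters.

```javascript
// Hypothetical helper: report which character follows the first place where
// `word` is directly followed by one of the listed characters (or the end).
// Assumes `word` contains no regex metacharacters.
function whatFollows(text, word) {
  const re = new RegExp(word + "([ .,;:!?]|$)");
  const m = text.match(re);
  return m ? m[1] : null;
}

console.log(JSON.stringify(whatFollows("Hello, world", "Hello")));
console.log(JSON.stringify(whatFollows("Hello world", "world")));
console.log(JSON.stringify(whatFollows("Helloworld", "Hello")));
```

An empty string in the result means the word sat at the end of the text; null means none of the listed followers was found.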

TT 


@TiddlyTweeter

Aug 24, 2019, 9:36:54 AM
to tiddl...@googlegroups.com
Mark S. wrote:
Here's a probably inefficient date VALIDATOR for dates starting 0000-01-01 following format yyyy-mm-dd  (which I find to be the most generally useful). Ok, I didn't check the rules. I think there's something about a surprise
leap year every 400 years, so there's probably more tweaking to be done to match the Gregorian calendar precisely.

<option value="^(?=\d{4})(((?!\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-8]))))|((?=\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-9])))))">Experimental VALIDATE yyyy-mm-dd</option>

 
Lol! Nice one! I'd put that in the REALLY ADVANCED category for regex!
Most users would have no clue how that works!
Anyway I think you got it off the net? It's not bad, but faulty :-).

Practically speaking, matching dates with specific calendars from 0001 is prone to error. This is nothing to do with regex or computers. It's that the transition between the Julian and Gregorian calendars meant that (for English-speaking countries) 11 days were lost from history. So really you need more than one regex to cover Julian & Gregorian, and it will vary between countries (Catholic countries adopted Gregorian dates first, and there 10 days were dropped).

Here is a regex for Gregorian dates that matches (century) leap years accurately.

Match Gregorian dates 1800 -> 9999

<option value="^((?:(?:1[8-9]|[2-9]\d)\d{2}([-/.]))(?:(?:(?:0[13578]|1[02])\2(?:0[1-9]|[12][0-9]|3[01]))|(?:(?:0[469]|11)\2(?:0[1-9]|[12][0-9]|30))|(?:(?:02)\2(?:0[1-9]|1[1-9]|2[1-8])))|(?:(?:1[8-9]|[2-9]\d(?:04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96)([-/.])(?:02)\3(?:29))|(?:(?:[2468][048]|[3579][26])(?:00)([-/.])(?:02)\4(?:29)))$">Match dates 1800-9999 in "yyyy[-/.]mm[-/.]dd" format</option>

The "([-/.])" means the date separator can be "-", "." or "/"

"(?: ...)" starting "?:" means "group this, but don't capture it" (a non-capturing group). It makes complex regular expressions easier to work with.

Note: the regex needs to be one line, not broken by line-breaks as google does.
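
The separator trick and the non-capturing group are easier to see on a cut-down pattern. The JavaScript sketch below is a simplified format check written for illustration, not the full validator above: it does not check month lengths or leap years.

```javascript
// The captured separator must be reused between month and day
// via the backreference \1.
const sameSeparator = new RegExp("^\\d{4}([-/.])\\d{2}\\1\\d{2}$");

// Same idea with a non-capturing group restricting the year to 1800-9999.
// Because "(?:...)" is not numbered, the separator is still group 1.
const withNonCapturing = new RegExp("^(?:1[89]|[2-9]\\d)\\d{2}([-/.])\\d{2}\\1\\d{2}$");

console.log(sameSeparator.test("2019-08-25"), sameSeparator.test("2019-08.25"));
console.log(withNonCapturing.test("1799-01-01"), withNonCapturing.test("1800-01-01"));
```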

The regex could have utility in some cases (for historians?, calanderians?), but I think they would be rare??

But it's a good example that regex can sometimes be quite clever.

TT

@TiddlyTweeter

Aug 24, 2019, 10:05:51 AM
to TiddlyWiki
Mat

Anything between "@@" and "@@"

Simply ...

<option value="@@(.+?)@@">Match between "@@" pairs</option>

The "+?" is to make the match "lazy" so it won't extend beyond the second @@ to a third pair of @@.
The content between them is passed to a "capturing group".

If the match needs to span lines regex settings may need tweaking.

The same approach would apply to other "pairings".
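
A quick JavaScript sketch of lazy versus greedy matching on a single line, with a made-up sample string containing two "@@" pairs:

```javascript
// Hypothetical single-line text with two @@ pairs.
const text = "plain @@first@@ middle @@second@@ end";

// Lazy "+?" stops at the nearest closing @@.
const lazy = text.match(new RegExp("@@(.+?)@@"));

// Greedy "+" runs on to the last @@ in the string.
const greedy = text.match(new RegExp("@@(.+)@@"));

// With the global flag, every pair can be collected in turn.
const allPairs = Array.from(text.matchAll(new RegExp("@@(.+?)@@", "g")), (m) => m[1]);

console.log(lazy[1]);
console.log(greedy[1]);
console.log(allPairs);
```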

TT

HansWobbe

Aug 24, 2019, 10:38:12 AM
to TiddlyWiki
@Mark S.

I've just had a chance to test this in one of my environments and I have to say "It's amazingly useful!". 

Thank you very much for this and all of your other contributions.

Best regards,
Hans

@TiddlyTweeter

Aug 24, 2019, 11:20:59 AM
to TiddlyWiki
Ciao Mark & Mohammad

Duplicate words

I suggest this ... This matches sequential duplication of words any number of times

<option value="(\b\w+\b)(?:\s+\1)+">Duplicate Words (in Sequence)</option>

It is case sensitive by default, but can be made case-insensitive (in TW, by putting "(?i)" at the start of the pattern).
"\b" is "word boundary". It's an "anchor" of no length.
The first captured group "(\b\w+\b)" is repeated by "\1"; the final "+" repeats the second group "(?:\s+\1)" as many times as needed. The "(?:" prevents it being retained as a numbered capture group.
"\s" will match a line break in a field that has them.

Example match ...

Annotation 2019-08-24 165516.jpg
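
In place of the screenshot, the pattern can be tried in plain JavaScript. This is a sketch with made-up sentences; the second pattern uses JavaScript's "i" flag to ignore case, which is an illustration of the flag rather than of any TiddlyWiki setting.

```javascript
// Sequential duplicate words, as suggested above.
const dupWords = new RegExp("(\\b\\w+\\b)(?:\\s+\\1)+");

// Same pattern with the JavaScript "i" flag, so "The the" also counts.
const dupWordsIgnoreCase = new RegExp("(\\b\\w+\\b)(?:\\s+\\1)+", "i");

console.log("this is is a test".match(dupWords)[0]);
console.log("go go go now".match(dupWords)[0]);
console.log(dupWords.test("The the cat"), dupWordsIgnoreCase.test("The the cat"));
```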


TT


Mohammad

Aug 24, 2019, 11:23:55 AM
to TiddlyWiki
What is the regexp pattern (one pattern) to match any of the numbers below

1234_dp
12.34_dp
1234._dp

1234e5_dp
1.234e5_dp




Note: wp or dp can be any kind of string (only a-zA-Z, no hyphen, no underscore) like _myPrecision
These are numbers with user-defined precision, as used in Fortran!

--Mohammad

@TiddlyTweeter

Aug 24, 2019, 11:32:46 AM
to TiddlyWiki
By "number" do you mean everything before the "_"? So "1.234e5" is a number?

@TiddlyTweeter

Aug 24, 2019, 12:00:43 PM
to TiddlyWiki
By "number" do you mean everything before the "_"? So "1.234e5" is a number?

I see "e" in the number -- is it a hexadecimal number? Or is the "e" for something else?

Eric Shulman

Aug 24, 2019, 12:18:22 PM
to TiddlyWiki
On Saturday, August 24, 2019 at 9:00:43 AM UTC-7, @TiddlyTweeter wrote:
By "number" do you mean everything before the "_"? So "1.234e5" is a number?
I see "e" in the number -- is it a hexadecimal number? Or is the "e" for something else?

That is a regular number, using "scientific notation".

1.234e5 means "1.234 times 10^5", i.e., 1.234 x 100000 = 123400

-e


 

Mark S.

Aug 24, 2019, 12:27:11 PM
to TiddlyWiki


On Saturday, August 24, 2019 at 6:36:54 AM UTC-7, @TiddlyTweeter wrote:
Mark S. wrote:
Here's a probably inefficient date VALIDATOR for dates starting 0000-01-01 following format yyyy-mm-dd  (which I find to be the most generally useful). Ok, I didn't check the rules. I think there's something about a surprise
leap year every 400 years, so there's probably more tweaking to be done to match the Gregorian calendar precisely.

<option value="^(?=\d{4})(((?!\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-8]))))|((?=\d\d(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))\d{4}-(((0[13578]|10|12)-(0[1-9]|[12]\d|30|31))|((04|06|09|11)-(0[1-9]|[12]\d|30))|((02)-(0[1-9]|1\d|2[1-9])))))">Experimental VALIDATE yyyy-mm-dd</option>

 
Lol! Nice one! I'd put that in the REALLY ADVANCED category for regex!
Most users would have no clue how that works!
Anyway I think you got it off the net? It's not bad, but faulty :-).


Actually, I rolled it myself.

@TiddlyTweeter

Aug 24, 2019, 12:32:43 PM
to TiddlyWiki
Mark S. wrote:

TT: Lol! Nice one! I'd put that in the REALLY ADVANCED category for regex!
Most users would have no clue how that works!
Anyway I think you got it off the net? It's not bad, but faulty :-).


Actually, I rolled it myself.

WOW! That's impressive. It took me days!  

TT

Mohammad

Aug 24, 2019, 1:22:51 PM
to TiddlyWiki
Yes, as Eric explained these are scientific notation. I forgot to add they can have positive or negative sign like

+1.23e4_dp
-1.23e4_dp

1.236e+5_dp
-1.23e-5_wp

Mat

Aug 24, 2019, 1:48:59 PM
to TiddlyWiki
@TiddlyTweeter wrote:
  1. from X until the end of the sentence
  2. ...until the end of the paragraph
  3. ...until the end of the text field
The exact regex may differ according to the context (type of field, does it have line-breaks?, what are the regex settings?) ... 

"type of field" - as in multi-line vs single line? I did not realize this can affect a search. And what does "regex setting" mean?

Thank you for the provided code!

<:-)

Mat

Aug 24, 2019, 1:52:49 PM
to TiddlyWiki
@TiddlyTweeter wrote:
([ .,;:!?]|$)

 Note that in character classes meta-characters "." and "?" no longer have any special meaning so do not need to be escaped.
"|" is "alternation" within a "capture group" "(...)"
"$" is "end-of-scope" (depends on regex setting whether that is end-of-field or end-of-line)

Very valuable info that I had no idea about!
(The attached image doesn't seem to work tho.)

<:-)

@TiddlyTweeter

Aug 24, 2019, 2:05:34 PM
to TiddlyWiki
Mat wrote:
(The attached image doesn't seem to work tho.)

Eeek, it shows for me. It's a shame as regex is much easier to understand when you see its effect on actual data.

TT


Mat

Aug 24, 2019, 2:08:59 PM
to TiddlyWiki
@TiddlyTweeter wrote:

<option value="@@(.+?)@@">Match between "@@" pairs</option>

The "+?" is to make the match "lazy" so it won't extend beyond the second @@ to a third pair of @@.
The content between them is passed to a "capturing group".

Thank you - but: 

If the match needs to span lines regex settings may need tweaking.

Indeed, this common case is not found AFAICT:

@@.mystyle
foo bar
@@

<:-)

Mat

Aug 24, 2019, 2:16:54 PM
to TiddlyWiki
On https://digitalfortress.tech/tricks/top-15-commonly-used-regex/ I found the following:

10. HTML Tags

How would this be applied in TW? I.e what to write here:

{{{ [regexp:text[..........]] }}}


Side note: This is probably the first thing the TW docs should answer, i.e "How to reformat regexps so they can be used in TW"

Thank you!

<:-)

Mark S.

Aug 24, 2019, 2:27:34 PM
to TiddlyWiki
My trick is to "flatten" the text before applying the target regular expression

...tiddlers...splitregexp[\n]join[ ]splitregexp<myfilter> ...

It sometimes works ;-)
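
The reason flattening helps is that "." does not match a line break by default. The JavaScript sketch below mimics the idea on Mat's multi-line "@@" example; it is not the TiddlyWiki filter itself, and the "[\s\S]" variant is an alternative offered here as an assumption, not something proposed in the thread.

```javascript
// Mat's multi-line example as a single string.
const text = "@@.mystyle\nfoo bar\n@@";

// "." stops at line breaks, so the single-line pattern finds nothing.
const dotPattern = new RegExp("@@(.+?)@@");

// The flattening idea, sketched in JavaScript: replace line breaks with spaces first.
const flattened = text.split("\n").join(" ");

// Alternative: "[\s\S]" matches any character, including line breaks.
const anyChar = new RegExp("@@([\\s\\S]+?)@@");

console.log(dotPattern.test(text));
console.log(JSON.stringify(flattened.match(dotPattern)[1]));
console.log(JSON.stringify(text.match(anyChar)[1]));
```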

@TiddlyTweeter

Aug 24, 2019, 2:27:43 PM
to TiddlyWiki
Mat:  the first thing the TW docs should answer, i.e "How to reformat regexps so they can be used in TW"

You can't use regex "character classes" directly in the operator as the string contains the "TW reserved characters" "[...]" it needs itself. 

You have to "pass-in" anything regex in square brackets via a variable. See: https://tiddlywiki.com/#regexp%20Operator%20(Examples)

Basically like this ...

<$set name="digit-pattern" value="[0-9]{2}">
<<list-links "[regexp:title<digit-pattern>]">>
</$set>

TT





Mark S.

Aug 24, 2019, 2:30:57 PM
to TiddlyWiki
I believe almost all of these will work out of the box. The exception is the square brackets. The workaround
is to put your regular expression in a variable first. This is mentioned in the docs for regexp :

The filter syntax makes it impossible to directly specify a regular expression that contains square brackets. The solution is to store the expression in a variable. See the examples.

Thanks!

@TiddlyTweeter

Aug 24, 2019, 2:38:26 PM
to TiddlyWiki
Mat

Is your aim here to CHANGE them?

If it is to change them then I think the regex section of Tiddler Commander is better suited to that.

TT

@TiddlyTweeter

Aug 24, 2019, 2:59:03 PM
to TiddlyWiki
Mohammad wrote:
Yes, as Eric explained these are scientific notation. I forgot to add they can have positive or negative sign like

+1.23e4_dp
-1.23e4_dp

1.236e+5_dp
-1.23e-5_wp
 
It is an interesting case. Like with the dates. It can be matched quite simply by PATTERN. But the pattern will match things you might overlook.

For the specific case a "pattern-match" for a field containing a string (and only that) would be ...

^([\-+.0-9e]+_[A-Za-z]+)$

This would likely be all you'd need??

But it could be made more precise if needed. 

Here is a test match (and one problem) ... the green arrow -> indicates the match ...

Annotation 2019-08-24 205231.jpg


It's a fact that regex isn't "determinate" in the same way normal code is. That can lead to much confusion. Testing against data is the best way to ensure a regex is good enough for its purpose.

TT

@TiddlyTweeter

Aug 24, 2019, 3:45:02 PM
to TiddlyWiki
Mat to convert that ...

1. remove the leading and trailing "/" slashes. Those delimit a regex literal in JavaScript, which we don't use in TW operators, so: <\/?[\w\s]*>|<.+[\W]>

2. I see it uses "alternation". Not a bad idea to wrap it in brackets so: (<\/?[\w\s]*>|<.+[\W]>)

3. put it in a variable as it contains [...] brackets. (You know how to do that :-)

4. reference the variable in the filter: [regexp:text<regexvariable>]
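
Steps 1 and 2 can be sketched in plain JavaScript, where the text between the slashes of a regex literal is available as its `source` property. The variable names are made up, and this only shows the pattern text; steps 3 and 4 still have to be done in TiddlyWiki.

```javascript
// The pattern as it appears on the web page, written as a JavaScript literal.
const literal = /<\/?[\w\s]*>|<.+[\W]>/;

// `source` is the text between the slashes: this is what would go
// into the TW variable.
const patternText = literal.source;

// Step 2: wrap the alternation in brackets and rebuild from the plain text.
const rebuilt = new RegExp("(" + patternText + ")");

console.log(patternText);
console.log(rebuilt.test("<b>bold</b>"), rebuilt.test("no tags here"));
```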

Did it work? :-)

TT

Mat

Aug 24, 2019, 6:09:59 PM
to TiddlyWiki
Mark S. wrote:
My trick is to "flatten" the text before applying the target regular expression

...tiddlers...splitregexp[\n]join[ ]splitregexp<myfilter> ...

Clever!

These nuggets should probably be in the official docs. While it "is about regex, not TW" it IS about regex in a TW context which does mean extra demands and quirks.

<:-) 

Mat

Aug 24, 2019, 6:19:48 PM
to TiddlyWiki
@TiddlyTweeter wrote
 
Is your aim here to CHANGE them?

I'm not sure what you mean. If you're referring to the @@... @@ then the idea, or I should say; then *one* idea is to be able to extract and get a list of all applied custom styles. @Mohammad asked for use case so I tried to think back about situations I've stumbled in - so the requests are a bit hypothetical at this point in time. 

I do know I will need these things for when I update the CherryPicker and @ttention plugins. I hope these regexes will help to split out parts of strings in a better way than how this is currently done and also to be able to implement some of the "future ideas" listed in the CherryPicker plugin.

<:-)

Mat

Aug 24, 2019, 6:24:08 PM
to TiddlyWiki
@TiddlyTweeter wrote:
.... (<\/?[\w\s]*>|<.+[\W]>) ...

Thank you for the description. If I have a specific html tag - or a specific $widgettag for that matter! - how would the expression be modified?

<:-)

TonyM

Aug 24, 2019, 11:25:33 PM
to TiddlyWiki
Josiah,

I see here that regex is not so good at ranges; however, it seems that determining the magnitude of a number may be easier. For example, you could test whether the number in a text!!reference or variable is one, two or three digits long; if it passes, the number is from 0 to 999, and this may be enough for a lot of applications. You may ask someone for their age, and you could catch someone entering their year of birth by mistake because 1980 is > 999.

As suggested before the ability to write a quick test on a number that will display a message if outside the range could be helpful.

\define magnitude3() [regex[blah]]
{{{ [<var>subfilter<magnitude3>else[number too big]] }}}

Where blah is a regex that tests if the input is a number of 3 digits maximum 0-999
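
A possible shape for "blah", sketched in plain JavaScript rather than as a TiddlyWiki filter. The helper `checkAge`, its message text, and the 0-255 range pattern are made up for illustration; the range pattern shows that a true range has to be spelled out digit by digit.

```javascript
// One to three digits, i.e. 0-999 by magnitude.
const upTo3Digits = new RegExp("^\\d{1,3}$");

// Hypothetical helper mirroring the "else[...]" idea: return the input
// when it passes, otherwise a message.
function checkAge(input) {
  return upTo3Digits.test(input) ? input : "out of range";
}

// A real range must be written out as alternatives, e.g. 0-255:
const zeroTo255 = new RegExp("^(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)$");

console.log(checkAge("39"), checkAge("1980"));
console.log(zeroTo255.test("255"), zeroTo255.test("256"));
```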

Regards
Tony


Mohammad

Aug 25, 2019, 1:17:33 AM
to TiddlyWiki
Thanks Josiah,
 It works great! The only point that should be mentioned is that it also matches wrong cases, but if the input is a correct number it is not a big deal.

Example

eee_dp
1.23eee45_dp
eee111.34_dp

Note: A number with or without scientific notation starts with a digit or a decimal point, like (1.23e3 or .123e3)
so, one improvement is to prevent match against e123.
the second improvement may be to prevent more than one e.
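
One possible tightening along these lines, sketched in plain JavaScript. The stricter pattern is an assumption about what counts as valid (optional sign, digits with an optional decimal point, at most one exponent, then "_" and letters); it has only been checked against the examples in this thread, not against the Fortran rules.

```javascript
// The simple pattern suggested earlier in the thread.
const loosePattern = new RegExp("^([\\-+.0-9e]+_[A-Za-z]+)$");

// A stricter sketch: optional sign, digits with optional decimal point
// (or a leading point), at most one exponent, then "_" and letters only.
const fortranNumber = new RegExp(
  "^[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?_[A-Za-z]+$"
);

const good = ["1234_dp", "12.34_dp", "1234._dp", "1234e5_dp", "1.236e+5_dp", "-1.23e-5_wp", ".123e3_dp"];
const bad = ["eee_dp", "1.23eee45_dp", "eee111.34_dp"];

console.log(good.every((s) => fortranNumber.test(s)));
console.log(bad.some((s) => fortranNumber.test(s)));
console.log(bad.every((s) => loosePattern.test(s)));
```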

Cheers
Mohammad

@TiddlyTweeter

Aug 25, 2019, 3:44:16 AM
to tiddl...@googlegroups.com
Mark S. wrote:
My trick is to "flatten" the text before applying the target regular expression

...tiddlers...splitregexp[\n]join[ ]splitregexp<myfilter> ...

Mat wrote: 
These nuggets should probably be in the official docs. While it "is about regex, not TW" it IS about regex in a TW context which does mean extra demands and quirks.

I think better documenting the specific way TW regex in OPERATORS works with the "scope" in "single string" fields v. the "multi-line" text field is important.

A lot of the solutions Mark & I gave are "single string" field orientated. When used in the text field they may break.

There needs to be a better documented bridge between general JavaScript regex and the TW operators implementation. Otherwise it could be confusing.

TT

 

@TiddlyTweeter

Aug 25, 2019, 3:50:33 AM
to tiddl...@googlegroups.com
Ciao Mohammad

To get a more precise match I'd like to know where in the number "e" can appear.

Is it always near the end? For instance ...

+1.23e4_dp
-1.23e4_dp
1.236e+5_dp
-1.23e-5_wp

In these "e" is left-offset from "_" either 2 or 3. Is that always the case?

TT

Mat

unread,
Aug 25, 2019, 5:13:45 AM8/25/19
to TiddlyWiki
@TiddlyTweeter wrote:

To get a more precise match I'd like to know where in the number "e" can appear. 

 
Since it means "times 10 to the power of", the "e" is "functionally" at the end, but it might not appear to be since the exponent can be any number, e.g.

1.2e-334556.232564264  

(which means 1.2*10^-334........)

The "e" itself is sometimes instead written "E" (maybe because old calculators can't easily show "e").

The exponent can also contain a varied set of characters. In reality it can be any number of characters because they can be variables with arbitrary names, but the common cases are limited. While my math knowledge is both limited and rusty, I'd say these are necessary to manage:
  • minus character e.g 1.2e-3
  • decimal period e.g 1.2e3.4
  • division character as in 1.2e3/4
I would think one can also write 1.2e.3 meaning 1.2e0.3. Of course, in many parts of Europe we write 1.2e0,3 (comma instead of period).

Beyond this maybe parentheses and constants like pi. There can also be other bases than 10 which, if memory serves (and it might not), is notated with a subindex on the e. This is not totally fringe but definitely "university level".

<:-)

P.S What is that noise? Is that Josiah screaming curses, pulling his hair and smashing his computer? Easy my friend, no need to cover it all. ;-)

Mat

unread,
Aug 25, 2019, 5:21:01 AM8/25/19
to TiddlyWiki
Here's a wikipedia article on scientific notation / E notation


<:-)

@TiddlyTweeter

unread,
Aug 25, 2019, 5:27:03 AM8/25/19
to TiddlyWiki
Ciao TonyM

Thanks. I now  better understand what you are trying to do. 

The example below illustrates (1) "matching magnitude" (that regex can do; it can become, sort-of, a peasant's "pseudo-range" :-); (2) "matching length". 
Ask if you need any clarification.

Because it uses "[...]" character classes they need to go into a variable BEFORE they get put into the operator, see: https://groups.google.com/d/msg/tiddlywiki/VFJS9eB9oV4/G14R6_clAQAJ

Note: the example ASSUMES the matching is of a single-string field.

_Number 000 -> 799_ "^([0-7][0-9][0-9])$" (must be exactly three digits long)
  or, more compact ...
_Number 000 -> 799_ "^([0-7]\d\d)$" (must be exactly three digits long)

"->" indicates a match ...

... These should NOT match
800
27
8
... These should match
->799
->435
->000
->127
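
A quick way to convince yourself the pattern does what it says is to run it over every candidate. The sketch below uses Python's re module as a stand-in for the JavaScript engine in TiddlyWiki (the two behave the same for this pattern); the helper name in_range is just for illustration.

```python
import re

# Three digits, the first limited to 0-7: matches 000 through 799.
THREE_DIGITS_UP_TO_799 = re.compile(r"^([0-7]\d\d)$")

def in_range(s: str) -> bool:
    return THREE_DIGITS_UP_TO_799.search(s) is not None

samples = ["800", "27", "8", "799", "435", "000", "127"]
matches = [s for s in samples if in_range(s)]
```

Note that the pattern tests the shape of the string, not its numeric value, which is why "27" and "8" fail while "027" and "008" would pass.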

TT

@TiddlyTweeter

unread,
Aug 25, 2019, 5:37:58 AM8/25/19
to TiddlyWiki
Very helpful!

Mohammad

unread,
Aug 25, 2019, 5:42:10 AM8/25/19
to TiddlyWiki
Hi Josiah,
 Mat explained it, BUT to keep it practical and simple I would only need the below

  • _dp or _myprecision only appear at the end
  • decimal point can appear anywhere but not after _
    • so a number like .123_dp is valid
    • so a number like ._dp is valid
    • so 1.23e4_dp is valid
  • e or E means exponent and it can appear only after the decimal point and before the _precision suffix
    • so 1.23E4_dp is valid
    • so 123.36589e11_wp is valid
    • in standard form the digit before the decimal point is between 1 and 9 (inclusive), e.g. 1.2356e7_dp
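
One possible reading of these rules as a single pattern is sketched below. It is only a sketch under stated assumptions: the decimal point is treated as mandatory, the exponent is an optionally signed integer, an optional leading sign is allowed, and the precision suffix is taken to be "_" followed by a letter and further word characters. Python's re module stands in for the JavaScript engine; the names are illustrative only.

```python
import re

# Assumptions (my reading, not confirmed in the thread): mandatory decimal
# point, integer exponent with optional sign, suffix like _dp or _wp.
NUMBER_WITH_PRECISION = re.compile(
    r"^[+-]?\d*\.\d*(?:[eE][+-]?\d+)?_[A-Za-z]\w*$"
)

def is_valid(s: str) -> bool:
    return NUMBER_WITH_PRECISION.search(s) is not None

valid = [".123_dp", "._dp", "1.23e4_dp", "1.23E4_dp",
         "123.36589e11_wp", "-1.23e-5_wp"]
invalid = ["eee_dp", "1.23eee45_dp", "eee111.34_dp", "e123_dp", "1.23e4"]
```

Because the exponent part allows only one e or E followed by digits, strings with repeated e or with e before any decimal point are rejected.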

Please note that these are documented for learning regexp in Tiddlywiki, so we need to keep them simple!
Thanks!

@TiddlyTweeter

unread,
Aug 25, 2019, 6:21:54 AM8/25/19
to TiddlyWiki
Mat wrote:

P.S What is that noise? Is that Josiah screaming curses, pulling his hair and smashing his computer?  ;-)

Tx Mat, Actually it's a change to be able to address issues in a way I can actually do with a sense of basic competence. :-). 

 J.
 

@TiddlyTweeter

unread,
Aug 25, 2019, 6:26:48 AM8/25/19
to tiddl...@googlegroups.com
Mohammad

I'll think about it, and about keeping it simple. If I can't keep it simple enough I won't do it.

It is actually a good case because it has "small variations" on e/E in the pattern.

TT

@TiddlyTweeter

unread,
Aug 25, 2019, 8:56:20 AM8/25/19
to TiddlyWiki
TonyM

A bit more. Slightly more advanced. You could do a "pseudo-range" in regex like below.

The regex uses "alternation", "|", to separate decade ranges; it's a bit like an OR construct. (Though in regex it's a pattern that advances from left to right, trying each alternative in turn, till there is a match, or not.)

[2][2-9] - matches 22-29
[3-4][0-9] - matches 30-49 (this could be any number of decades that remain whole)
[5][0-5] - matches 50-55

_Ages 22 -> 55_ regex: "^([2][2-9]|[3-4][0-9]|[5][0-5])$"
  "->" indicates a match ...

... These should NOT match
21
56
300
4
05
... These should match
->22
->29
->30
->37
->40
->45
->55
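
The pseudo-range can be checked exhaustively. This is a minimal sketch using Python's re module as a stand-in for the JavaScript engine TiddlyWiki uses; the variable names are illustrative.

```python
import re

# Three alternatives: 22-29, 30-49 and 50-55.
AGE_22_TO_55 = re.compile(r"^([2][2-9]|[3-4][0-9]|[5][0-5])$")

# Try every integer from 0 to 999 and keep those the pattern accepts.
matched = [n for n in range(0, 1000) if AGE_22_TO_55.search(str(n))]
```

If the accepted list is exactly 22 through 55, the three alternatives cover the range with no gaps and no extras.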

TT

Mohammad

unread,
Aug 25, 2019, 8:59:57 AM8/25/19
to tiddl...@googlegroups.com
For dates like 25th August 2019, 1st January 2020 or 2nd February 2018, is the pattern below correct?

^\d{2}(st|nd|rd|th)\s{1}(January|February|March|April|May|June|July|August|September|October|November|December)\s{1}\d{4}$

This kind of date is used by Tiddlywiki journals.
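
A quick check of the pattern against the three sample dates, as a sketch in Python (standing in for the JavaScript engine). The second pattern, with \d{1,2}, is a suggested variation and not something from the thread.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# The pattern as posted: exactly two digits for the day.
as_posted = re.compile(r"^\d{2}(st|nd|rd|th)\s{1}(" + MONTHS + r")\s{1}\d{4}$")

# Suggested variation: one or two digits for the day.
relaxed = re.compile(r"^\d{1,2}(st|nd|rd|th)\s(" + MONTHS + r")\s\d{4}$")

titles = ["25th August 2019", "1st January 2020", "2nd February 2018"]
posted_hits = [t for t in titles if as_posted.search(t)]
relaxed_hits = [t for t in titles if relaxed.search(t)]
```

With \d{2} only the two-digit day matches, so single-digit days such as 1st and 2nd need \d{1,2}. Neither version checks that the suffix agrees with the day (22th would still pass).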

--Mohammad

TonyM

unread,
Aug 25, 2019, 9:01:51 AM8/25/19
to TiddlyWiki
Thanks Josiah/TT

So this suggests to me I may be able to test if an IP address is valid or in a local domain -

[02][0-255][0-255][0-255]

[192][168][0-255][0-255]

Or variations thereof?

Have you put together some examples of the basic methods in tiddlywiki? One would be the \define of the regex string in a macro to accommodate [ ], and another would be using then/else and emptyMessage to respond to the result of a regex filter, etc.

Regards
Tony

@TiddlyTweeter

unread,
Aug 25, 2019, 9:16:56 AM8/25/19
to TiddlyWiki
TonyM wrote:
[02][0-255][0-255][0-255]

[192][168][0-255][0-255]

Do you mean "02" literally? In regex [02] means "0" OR "2". 
The pattern can be matched, I'm sure.
But you need to clarify what are "literals" & what are ranges.
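
To make the point about character classes concrete, here is a small sketch in Python (a stand-in for the JavaScript engine; the behaviour is the same for these classes). A class always matches exactly one character, so [0-255] is the range 0-2 plus the digit 5, not the numbers 0 to 255.

```python
import re

literal_or = re.compile(r"^[02]$")            # one character: "0" or "2"
looks_like_range = re.compile(r"^[0-255]$")   # one character: 0, 1, 2 or 5

# Which single digits does [0-255] accept?
accepted = [c for c in "0123456789" if looks_like_range.search(c)]
```

So neither class can match the string "02" or "255" on its own; ranges of numbers have to be built from several classes and alternation.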

TT 

Mohammad

unread,
Aug 25, 2019, 11:29:10 AM8/25/19
to TiddlyWiki
The duplicate words pattern does not work!

It seems a title is matched only if the repeated word is the first word of the title. Look at the two tiddler titles below

  1. This is a Tiddler This is Nice
  2. Nice Tiddler is This Tiddler
The pattern will match the first but ignore the second while both have a duplicate word!

--Mohammad

On Friday, August 23, 2019 at 10:51:06 PM UTC+4:30, Mark S. wrote:
Ok, with duplicates and date formats. Note that the date formats only check for the format. You could still create
nonsensical dates that actually match the formats (Jan 55 9999, 1111.15.35). Actual validation of dates would take
real code massaging.

<$vars digonly="^[0-9]*$">
<$vars useme=<<digonly>>>
</$vars>
</$vars>

<$select tiddler="myregexp">
<option value="^[0-9]*$">Only digits</option>
<option value="^[a-z]*$">Only lower case</option>
<option value="^[A-Z]*$">Only upper case</option>
<option value="^[\w-_]*$">Only alphanumeric, _, and -</option>
<option value="^[\w]{3,15}$">Only alphanum len 3-15</option>
<option value="^[A-Z]+.*$">Starts with capital</option>
<option value="^[0-9]+.*$">Starts with digit</option>
<option value="^.+\.[a-zA-Z]{3,4}$">Extensions only</option>
<option value="^.+(\.jpg|\.jpeg)$">Extension jpg jpeg</option>
<option value="^\b(\w{2,})\b.*\b\1\b.*$">Duplicate words</option>
<option value="^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s{1}\d{2}\s\d{4}$">Date like Jan 06 2019</option>
<option value="^\d{4}\.[0-1]\d\.[0-3]\d$">Date like 2019.08.25</option>
</$select>

<$list filter="[regexp{myregexp}sort[]]">

</$list>




Mark S.

unread,
Aug 25, 2019, 1:21:15 PM8/25/19
to TiddlyWiki
Try

<option value="\b(\w{2,})\b.*\b\1\b">Duplicate words</option>

Mark S.

unread,
Aug 25, 2019, 1:55:47 PM8/25/19
to TiddlyWiki
I think he means "02" literally. Usually IP numbers aren't padded, so not sure.

It's the range 0-255 that's problematic. Here's what I have for the range:

<option value="^(\b\d\b|\b\d\d\b|1\d\d|2[0-4]\d|25[0-5])">IP range 0-255</option>

Hmm, I guess with an IP you could add the mandatory delimiter (usually ".") and repeat the group. But you would have to manually repeat the group at the end where the delimiter must not be.

And then there's zero padding. Most of the IP numbers I've seen are not zero-padded, but ...

I think the first thing I would do is see what the internet says.

A search for "regular expression ip address" immediately turns up a page from O'Reilly, with both a simple
version and an accurate version for checking IP. As I expected, they're able to do a repeat on the structure 3 times, but
have to do the last one by hand. They've figured out the 0 padding:

^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
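
Here is a small sketch that exercises that pattern, using Python's re module as a stand-in for the JavaScript engine TiddlyWiki uses. The helper name is_ipv4 and the sample addresses are illustrative only.

```python
import re

# The pattern quoted above: three "octet + dot" groups, then a final octet.
IPV4 = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)

def is_ipv4(s: str) -> bool:
    return IPV4.search(s) is not None

good = ["124.3.0.1", "255.255.255.255", "0.0.0.0", "192.168.001.001"]
bad = ["256.1.1.1", "299.1.1.1", "1.2.3", "1.2.3.4.5", "a.b.c.d"]
```

The [01]?[0-9][0-9]? alternative is what lets zero-padded parts such as 001 through, while 25[0-5] and 2[0-4][0-9] cap the value at 255.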

So ... no need to rebuild the wheel for most common use cases. Hmm, I wonder about IPv6 ?

Ok, sorry for the stream-of-consciousness problem-working.

Thanks!

@TiddlyTweeter

unread,
Aug 25, 2019, 2:04:13 PM8/25/19
to TiddlyWiki
Mohammad 

I'm not sure how much spaced repetition is an issue really ("I had an old an clock")? The commonest issue is simple sequential repeating ("I had an an ...").

But Mark's regex works on test data. Though I'd simplify it to ...

(\b\w{2,}\b)(.*)\1

Example match in test data (match in lines)...

-> ... <- is match
(\b\w{2,}\b)(.*)\1 ... spaced duplicate words to remove

This is a Tiddler ->This<- is Nice
Nice Tiddler is This ->Tiddler<-
Remove this repeat of ->this<- Tiddler repeat of ->repeat<-.

TT

Mohammad

unread,
Aug 25, 2019, 2:44:56 PM8/25/19
to TiddlyWiki
Thanks Mark and Josiah,
 It seems the problem is with ^$.

I removed them and it works. But I am not sure when they are required.

@Josiah,
 I added the consecutive duplicate words case!

Thank you again

--Mohammad

@TiddlyTweeter

unread,
Aug 25, 2019, 2:54:56 PM8/25/19
to TiddlyWiki
Mohammad wrote:
 It seems the problem is with ^$.

There are definitely issues with "scope" in TW that are currently unclear.

Regex will fail, or do unexpected things, if it's not fully clear what context the regex is working in.

This needs clarifying IMO. 

TT

@TiddlyTweeter

unread,
Aug 25, 2019, 3:04:11 PM8/25/19
to TiddlyWiki
Mark S. wrote:
I think he means "02" literally. Usually IP numbers aren't padded, so not sure.

It's the range 0-255 that's problematic. Here's what I have for the range:

<option value="^(\b\d\b|\b\d\d\b|1\d\d|2[0-4]\d|25[0-5])">IP range 0-255</option>

IF that means numbers 000 to 255 it looks doable.

Hmm, I guess with an IP you could add the mandatory delimiter (usually ".") and repeat the group. But you would have to manually repeat the group at the end where the delimiter must not be.

That is quite easy in regex as you can make it just "\.?". Repeat is easy, just put the dot first on repeats.

And then there's zero padding. Most of the IP numbers I've seen are not zero-padded, but ...

That is much more difficult in regex. So long as the system throws the 0's away when not needed it may be okay?

I think the first thing I would do is see what the internet says.

A search for "regular expression ip address" immediately turns up a page from O'Reilly, with both a simple
version and an accurate version for checking IP. As I expected, they're able to do a repeat on the structure 3 times, but
have to do the last one by hand. They've figured out the 0 padding:

^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$

So ... no need to rebuild the wheel for most common use cases. Hmm, I wonder about IPv6 ?


Ok, sorry for the stream-of-consciousness problem-working.

It's interesting & useful!

TT

Mark S.

unread,
Aug 25, 2019, 3:06:08 PM8/25/19
to TiddlyWiki
Most of your examples have an implied scope of the entire tiddler title.

That is, if the tiddler can only have lower case letters, then every single character from start to end (^ to $) has to be lowercase.

But duplicate words can start and end anywhere inside the title, so the scope doesn't have to apply to every single character.
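
The difference the anchors make can be seen with the two titles from earlier in the thread. This is a sketch in Python, standing in for the JavaScript engine; the variable names are illustrative.

```python
import re

# Anchored: the captured word must be the FIRST word of the title.
anchored = re.compile(r"^\b(\w{2,})\b.*\b\1\b.*$")

# Unanchored: the captured word may start anywhere in the title.
unanchored = re.compile(r"\b(\w{2,})\b.*\b\1\b")

titles = ["This is a Tiddler This is Nice", "Nice Tiddler is This Tiddler"]
anchored_hits = [t for t in titles if anchored.search(t)]
unanchored_hits = [t for t in titles if unanchored.search(t)]
```

With ^ in front, the engine can only try "This" or "Nice" as the repeated word, so the second title (where "Tiddler" repeats) is missed.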

So, it was my fault for churning them out en masse, using previous versions as my template ;-)

Thanks!

Mohammad

unread,
Aug 25, 2019, 3:18:06 PM8/25/19
to TiddlyWiki
Thanks Mark
Thanks Josiah,
 I corrected the example in the documentation wiki http://tw-regexp.tiddlyspot.com/

Both of you help to demystify regexp (very scary when you are unfamiliar with it and look at its strange patterns!!) in Tiddlywiki.
I think (as Mat also noted) a better understanding of regexp can help a lot in using Tiddlywiki more efficiently.

I will ask more questions and hope I have your answer, and hopefully I will gradually add them to the documentation wiki.

Cheers
Mohammad

Mohammad

unread,
Aug 25, 2019, 3:31:33 PM8/25/19
to TiddlyWiki
See the latest update of the documentation wiki, in which most of this thread has been documented

rev: 0.5

--Mohammad 

TonyM

unread,
Aug 25, 2019, 11:20:49 PM8/25/19
to TiddlyWiki
Folks,

Sorry for my absence for a while. On the IP address idea: the first character in each number should be [0-2], because only 0, 1 and 2 are valid in the hundreds position.

The idea would be to accept a value such as 124.3.0.1 and determine whether it is valid, i.e. no number between the dots should be anything other than a number from 0-255.

OK, we may not be able to avoid 299 being used, but we could add a test that the whole number not be greater than 255.

Separate tests could check for 10.*.*.* 192.168.*.* 127.*.*.* or if equal to 1.1.1.1 or 0.0.0.0. to determine if they are local or public addresses. 

Such a facility would allow tiddlywiki to become a DNS database and more as it relates to IP Addresses. Especially TiddlyDesktop that could launch pings and NSLookups, trace and more including opening the sites in an iframe.

I have always believed that Tiddlywiki would make a good platform for the following
  • Network and Operations database and dashboard
  • Configuration management database
  • Website and device directory
  • System Change management
Not to mention my quite old idea of building a device wiki that has all the details about a device, including config settings, manuals and diagnostic methods. The device wiki can be stored in a repository, with a copy on a USB stick secured to the device. An OTG USB cable adaptor would allow you to open the device wiki with a mobile device and, where the device has storage, even host the wiki on the device and access it over the network, e.g. via the USB port on your router.

So a macro and regex that makes IP Addresses trivial to handle will be a boon to these applications.

Regards
Tony

TonyM

unread,
Aug 25, 2019, 11:36:55 PM8/25/19
to TiddlyWiki
Mohammad,

This is becoming a very helpful resource. thanks for sharing your work,

I noticed the test for leading caps in a title - I discovered a method on this to capitalise only when needed,

for example try this in a tiddler named tiddlername
<p style="text-transform: capitalize;"><<currentTiddler>></p>

or in 5.1.20 (First to lowercase is a good pattern see https://tiddlywiki.com/#titlecase%20Operator)
{{{ [{!!title}lowercase[]titlecase[]] }}}

<$text text={{{ [{!!title}lowercase[]titlecase[]] }}}/>

For example this can be used to capitalise a fieldname on display when you can't store it capitalised
{{{ [fields[]lowercase[]titlecase[]] }}}

I suppose what I am saying here is capitalisation is more a display feature than a necessary naming standard.

Regards
Tony


@TiddlyTweeter

unread,
Aug 26, 2019, 3:16:51 AM8/26/19
to TiddlyWiki
Mat could you please try this ...

<option value="@@(\s|\S)+?@@">Match between "@@" pairs</option>

and let me know if it works.
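
For anyone wanting to see why the (\s|\S) form helps, here is a sketch in Python, whose "." also skips newlines by default, as it does in JavaScript without the s flag. The sample text is the multi-line @@ block from Mat's post; the [\s\S] variant at the end is an extra suggestion, not something from the thread.

```python
import re

text = "@@.mystyle\nfoo bar\n@@"

# "." stops at a newline, so the lazy dot version cannot span the block.
dot_version = re.search(r"@@(.+?)@@", text)

# "\s|\S" means "whitespace or non-whitespace", i.e. any character at all.
any_char_version = re.search(r"@@(\s|\S)+?@@", text)

# Variation: a character class inside the group captures the whole content.
class_version = re.search(r"@@([\s\S]+?)@@", text)
```

One detail: in the (\s|\S)+? form the group repeats, so the capture holds only the last character matched; the class form keeps everything between the @@ pairs in the capture.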

TT

On Saturday, 24 August 2019 20:08:59 UTC+2, Mat wrote:
@TiddlyTweeter wrote:

<option value="@@(.+?)@@">Match between "@@" pairs</option>

The "+?" is to make the match "lazy" so it won't extend beyond the second @@ to a third pair of @@.
The content between them is passed to a "capturing group".

Thank you - but: 

If the match needs to span lines regex settings may need tweaking.

In deed, this common case is not found AFAICT:

@@.mystyle
foo bar
@@

<:-)

Mohammad

unread,
Aug 26, 2019, 3:28:39 AM8/26/19
to TiddlyWiki
Thanks Tony!
 This way I can use the title directly and the description field can be avoided.

Cheers
Mohammad

@TiddlyTweeter

unread,
Aug 26, 2019, 6:42:17 AM8/26/19
to tiddl...@googlegroups.com
Mohammad,

http://tw-regexp.tiddlyspot.com/ is very good!!

I doubt I have to make my own version now :-). In a way it's better there is ONE resource, not two.

One thing we need to document is matching in the "text" field. 

I think some of our regex will FAIL because it is NOT clear what the default FLAGS for the "text" field are.

It's obvious from the discussion that most users have no idea what regex flags are, or that they matter to what happens.

My impression is that "." in the text field is NOT matching newlines (using the operator), which means many regexes will break if they need to span them.

TT

Mohammad

unread,
Aug 26, 2019, 8:41:50 AM8/26/19
to TiddlyWiki
Hi TT,
 I think the simplest is the title, then tags and other fields, and the most difficult is the text!

--Mohammad

On Monday, August 26, 2019 at 3:12:17 PM UTC+4:30, @TiddlyTweeter wrote:
Mohammad,

http://tw-regexp.tiddlyspot.com/ is very good!!

I doubt I have to make my own version now :-). In a way its better there is ONE resource, not two.

May this include your previous tutorial?

@TiddlyTweeter

unread,
Aug 26, 2019, 9:01:45 AM8/26/19
to TiddlyWiki
The text field is no more difficult, other than it's longer.

The problem is explaining the flags.

TT

@TiddlyTweeter

unread,
Aug 26, 2019, 9:03:58 AM8/26/19
to TiddlyWiki
M: May this include your previous tutorial.

Feel free to use anything I wrote.

Mohammad

unread,
Aug 26, 2019, 11:12:54 AM8/26/19
to TiddlyWiki
Please see this suggestion:

  • tiddler starts with capital letter: ^[A-Z]   instead of ^[A-Z].*$
  • tiddler starts with digits: ^[0-9] instead of ^[0-9].*$
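
A quick comparison of the short and long forms on a few made-up titles, sketched in Python as a stand-in for the JavaScript engine:

```python
import re

# Hypothetical single-line tiddler titles.
titles = ["Alpha", "beta", "9lives", "Zed 42", "x"]

short_capital = [t for t in titles if re.search(r"^[A-Z]", t)]
long_capital = [t for t in titles if re.search(r"^[A-Z]+.*$", t)]

short_digit = [t for t in titles if re.search(r"^[0-9]", t)]
long_digit = [t for t in titles if re.search(r"^[0-9]+.*$", t)]
```

For a yes/no test on a single-line title the two forms select the same tiddlers; the trailing .*$ only matters if you need the match to cover the whole title.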

What do you think?

--Mohammad

Mark S.

unread,
Aug 26, 2019, 11:20:57 AM8/26/19
to TiddlyWiki
Yep. Those are improvements.

Mohammad

unread,
Aug 26, 2019, 11:55:38 AM8/26/19
to TiddlyWiki
Thanks for confirmation!

--Mohammad

@TiddlyTweeter

unread,
Aug 27, 2019, 6:13:19 AM8/27/19
to TiddlyWiki
TonyM wrote:
...  check for 10.*.*.* 192.168.*.* 127.*.*.* or if equal to 1.1.1.1 or 0.0.0.0. to determine if they are local or public addresses. 

I've done this in two posts so it's easier to understand. 

To help Mohammad document let's first deal with how to simply match IP sub-numbers 0 to 255 using regex in decimal & binary.

0-255, Decimal

<option value="\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b">0-255 Decimal, no leading zeros</option>

\b = "word boundary". (In regex, numbers are just word characters.)
"(...)" = a capturing group.
"|" = alternate matches within the group (like "or")
[0-9] = match from 0 to 9
[1-9][0-9] = 10 to 99
1[0-9][0-9] = 100 to 199
2[0-4][0-9] = 200 to 249
25[0-5] = 250 to 255
\b = "word boundary"
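
The alternatives above can be verified by brute force. This sketch uses Python's re module as a stand-in for the JavaScript engine; fullmatch is used so the whole string has to be the number, which plays the role of wrapping the pattern in ^ and $.

```python
import re

OCTET = re.compile(r"\b([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b")

def is_octet(s: str) -> bool:
    # The entire string must match, not just part of it.
    return OCTET.fullmatch(s) is not None

# Try every integer from 0 to 999 and keep those the pattern accepts.
accepted = [n for n in range(0, 1000) if is_octet(str(n))]
```

Leading zeros such as 007 are rejected by this version, as the option label says.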


00000000-11111111, Binary

IP addresses, in binary, use leading zeros. These are very easy to match. 

255 decimal is 11111111 binary, so just repeat 1 or 0 8 times by using {8} next to the class or group.

Using [...] Character Class 
<option value="\b[01]{8}\b">00000000-11111111, Binary Byte</option>

Using "|" Alternation in a Capture Group 
<option value="\b(0|1){8}\b">00000000-11111111, Binary Byte</option>

Interesting fact: A byte (8 bits) can hold values from 0 to 255 in decimal, i.e. 256 combinations, hence the "magic number" 256.
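
A small sketch tying the binary pattern to its decimal value, in Python as a stand-in for the JavaScript engine. The helper name to_decimal is made up for the illustration.

```python
import re

BYTE = re.compile(r"\b[01]{8}\b")

def to_decimal(bits: str) -> int:
    """Convert an 8-bit binary string to decimal, rejecting anything else."""
    if BYTE.fullmatch(bits) is None:
        raise ValueError(f"not an 8-bit binary string: {bits!r}")
    return int(bits, 2)

lowest = to_decimal("00000000")
highest = to_decimal("11111111")
```

Seven digits, nine digits, or any digit other than 0 and 1 are rejected before the conversion is attempted.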

TT

TonyM

unread,
Aug 27, 2019, 7:38:32 AM8/27/19
to TiddlyWiki
Thanks TT

Now to test a fixed number such as 127 I imagine that's just a string?

tony

@TiddlyTweeter

unread,
Aug 27, 2019, 8:33:50 AM8/27/19
to tiddl...@googlegroups.com
TonyM wrote:
...  check for 10.*.*.* 192.168.*.* 127.*.*.* or if equal to 1.1.1.1 or 0.0.0.0. to determine if they are local or public addresses. 

Using the 0-255 regex of last post you can build a more complex regex.

Because I have never worked with IP addresses, I consulted Wikipedia to understand the IPv4 private network address ranges.

IPv4, private network: 192.168.0.0 – 192.168.255.255

(\b192\.168(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b){2})

I'll explain this one. The others should then be readable.

( = open capture group 1
\b = word boundary
192 = literal
\. = escape for literal stop-mark (otherwise "." matches any character)
168 = literal
( = open capture group 2
\. = literal stop-mark
( = open capture group 3
[0-9] etc. = regex we made in last post to match 0-255
) = close capture group 3
\b = word boundary
) = close capture group 2
{2} = repeat pattern in group 2 twice
) = close capture group 1

IPv4, private network: 10.0.0.0 – 10.255.255.255
(\b10(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b){3})

Much like the previous one; notice {3}.

IPv4, private network: 172.16.0.0 – 172.31.255.255

(\b172\.(1[6-9]|2[0-9]|3[01])(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b){2})

This is a bit more complex as it needs 2 ranges. 16-31 and 0-255. Notice {2}.

IPv4 private networks starting: 10, 192.168, 172.16

(\b(192\.168|10|172\.(1[6-9]|2[0-9]|3[01]))(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b){2,3})

This combines the three above into one regex. More advanced, but gives idea you can combine things. Notice {2,3}.

Test for this more complex one
-> ... is match
_Should NOT match_
9.255.255.255
10.0.0.00
147.168.255.255
172.32.255.255
172.14.255.255
192.168.1.256
192.27.255.255

_Should match_
->10.0.0.0
->172.16.89.125
->192.168.1.1
->192.168.1.255
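
For whole-string validation, a stricter variation is sketched below. It is not the regex from this post: it anchors with ^ and $ and gives each prefix its own repeat count, because {2,3} on the shared tail would also accept a three-part string such as 10.0.0. Python's re module stands in for the JavaScript engine, and the names OCTET, dotted and PRIVATE_IPV4 are illustrative only.

```python
import re

# One number 0-255 without leading zeros (non-capturing).
OCTET = r"(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])"

def dotted(n: int) -> str:
    """Regex fragment for n further dot-separated octets."""
    return r"(?:\." + OCTET + "){" + str(n) + "}"

PRIVATE_IPV4 = re.compile(
    "^(?:"
    + "|".join([
        "10" + dotted(3),                                # 10.0.0.0 to 10.255.255.255
        r"192\.168" + dotted(2),                         # 192.168.0.0 to 192.168.255.255
        r"172\.(?:1[6-9]|2[0-9]|3[01])" + dotted(2),     # 172.16.0.0 to 172.31.255.255
    ])
    + ")$"
)

def is_private(s: str) -> bool:
    return PRIVATE_IPV4.search(s) is not None
```

Run against the test lists above, this variation rejects every address in the first list and accepts every address in the second.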

TT

@TiddlyTweeter

unread,
Aug 27, 2019, 8:36:16 AM8/27/19
to TiddlyWiki
TonyM wrote:

Now to test a fixed number such as 127 I imagine that's just a string?


Yes. The examples I just gave will help.

TT
 