How to modify this regexp?


Mat

Mar 7, 2020, 1:31:15 PM
to tiddl...@googlegroups.com
Original post deleted because I...um... misunderstood my own question. 
This also means that the first few posts refer to the old question.
Instead jump down to my post starting with: "More compact"

<:-)

PMario

Mar 7, 2020, 2:55:17 PM
to tiddl...@googlegroups.com
Hi Mat,

How can this regexp be modified to accept any text that has <<foo ...... >>

For me this:

<$set name=re value="<<foo.*?>>">
{{{ [all[tiddlers]prefix[Draft]!is[current]regexp:text<re>] }}}
</$set>

does the trick.

-mario

PMario

Mar 7, 2020, 2:58:38 PM
to TiddlyWiki
Hi,
I modified the code in the first post to use the regexp <<foo.*?>>.
The first version was greedy, so it found <<foo test>> some more text >> instead of <<foo test>>.
-mario
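
A quick way to see the greedy/lazy difference described above is to run both patterns in plain JavaScript (TiddlyWiki regexps are JavaScript regexps, as far as I know). This is only a minimal sketch: the sample string comes from the post above, and the variable names are made up for illustration.

```javascript
// Sample string taken from the post above.
const text = "<<foo test>> some more text >>";

// Greedy: ".*" grabs as much as possible, so the match runs to the LAST ">>".
const greedy = text.match(/<<foo.*>>/)[0];

// Lazy: ".*?" grabs as little as possible, so the match stops at the FIRST ">>".
const lazy = text.match(/<<foo.*?>>/)[0];

console.log(greedy); // <<foo test>> some more text >>
console.log(lazy);   // <<foo test>>
```

Note that "." does not match line breaks in JavaScript by default, so a macro call spread over several lines would need a different pattern.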

PMario

Mar 7, 2020, 3:08:16 PM
to TiddlyWiki
Hi Mat,

The best description about regexp I've ever found is: https://www.regular-expressions.info/javascript.html

They describe it in a way that I actually could understand and modify by myself.

-mario

PMario

Mar 7, 2020, 3:17:25 PM
to TiddlyWiki
Hi,

Are you sure you want prefix[Draft] instead of !prefix[Draft]?

It only detects tiddlers in draft mode atm.

-m

Mat

Mar 7, 2020, 3:33:11 PM
to TiddlyWiki
Much appreciated that you're looking at this!

...and your post makes me realize I am asking the totally wrong question (Doh!!! I got confused from dabbling with this so I forgot what I'm actually in need of)  :-/ 

The actual need is to catch the unclosed <<foo in this text, potentially surrounded by completed <<foo>>

Lorem ipsum

Lorem ipsum <<foo bar>> lorem impsum <<foo lorem ipsum <<foo>> and <<foo bar>> lorem.
...

Sorry for the confusion, and never mind the prefix[Draft] bits; they're not important and I should have removed them.

Again, I appreciate your and anyone's help, as I haven't mastered regexp sufficiently.

<:-)

PMario

Mar 7, 2020, 3:59:53 PM
to TiddlyWiki
On Saturday, March 7, 2020 at 9:33:11 PM UTC+1, Mat wrote:
The actual need is to catch the unclosed <<foo in this text, potentially surrounded by completed <<foo>>

hihi, .. That's a different thing :)
 

Lorem ipsum

Lorem ipsum <<foo bar>> lorem impsum <<foo lorem ipsum <<foo>> and <<foo bar>> lorem.
...


So the end of the line should stop the search, right?
If << comes up again, there is a problem, so it should return the tiddler name, right?
Is there a different "stop" indicator than << ?

-m

Mat

Mar 7, 2020, 4:12:41 PM
to tiddl...@googlegroups.com

Lorem ipsum

Lorem ipsum <<foo bar>> lorem impsum <<foo lorem ipsum <<foo>> and <<foo bar>> lorem.
...


I think it can be phrased as:

Find the first "<<foo "
...that is not followed by ">>" nor "....>>" (where .... signifies any characters)
...OR that IS followed by "<<" or "....<<" (or, for that matter, any other forbidden characters inside a short form macro call)

But the end of the line should not stop the search. The point is to catch the first incomplete "<<foo" macro call, however long the tiddler text is.

<:-)

Mat

Mar 7, 2020, 4:21:28 PM
to TiddlyWiki
More compact:

Find the first "<<foo "
...that is not followed by "....>>"   (where .... signifies any number of characters, including none)
...OR that IS followed by "....<<" or by characters that are forbidden inside a short form macro call.


(This is pretty difficult to get right.)

<:-)

PMario

Mar 7, 2020, 5:03:23 PM
to TiddlyWiki
I think the same problem applies here that Jeremy had in detecting this pattern: if it could be detected, it would be valid syntax.

IMO you can _not_ detect this pattern in 1 run. You would have to find every occurrence with <<foo[^<]*, which finds the <<xxxx part of <<xxxx<<. In a second step you have to check whether xxxx contains >>. If yes, it's OK. If no, there is a problem.

1 regexp filter can't handle this, especially since our filters return tiddler names and not regexp capture groups, which would be needed.

-m
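
The two-step idea above can be sketched in plain JavaScript, outside of TiddlyWiki filters. This is a rough illustration under assumptions, not how TiddlyWiki or EditorMagic does it: findUnclosedFoo is a hypothetical helper name, and a lone "<" or a stray ">>" in ordinary text would confuse it.

```javascript
// Hypothetical helper: returns the index of the first "<<foo" that is not
// closed by ">>" before the next "<", or -1 if every call looks complete.
function findUnclosedFoo(text) {
  // Step 1: find every "<<foo" together with the text up to the next "<".
  const candidate = /<<foo[^<]*/g;
  let m;
  while ((m = candidate.exec(text)) !== null) {
    // Step 2: a complete call must contain ">>" somewhere in that stretch.
    if (!m[0].includes(">>")) {
      return m.index;
    }
  }
  return -1;
}

// Sample line from earlier in the thread.
const sample =
  "Lorem ipsum <<foo bar>> lorem impsum <<foo lorem ipsum <<foo>> and <<foo bar>> lorem.";

const unclosedAt = findUnclosedFoo(sample);
console.log(sample.slice(unclosedAt, unclosedAt + 17)); // <<foo lorem ipsum
```

Returning an index rather than a tiddler title is the point here: a filter can only say which tiddler matched, while the second step needs to know where the match is.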

Mark S.

Mar 7, 2020, 5:14:04 PM
to TiddlyWiki


On Saturday, March 7, 2020 at 2:03:23 PM UTC-8, PMario wrote:

1 regexp filter can't handle this. Especially, since our filters return tiddler names and not regexp capture groups, which would be needed.



PR 2963

Mat

Mar 8, 2020, 4:51:08 AM
to TiddlyWiki
PMario wrote:
IMO you can _not_ detect this pattern in 1 run. You would have to find every occurrence with <<foo[^<]*, which finds the <<xxxx part of <<xxxx<<. In a second step you have to check whether xxxx contains >>. If yes, it's OK. If no, there is a problem.

OK, I don't think a single step is necessary for EditorMagic: only the text of a single tiddler is searched, so it should be pretty fast regardless. With <<foo[^<]* as a start (thank you!) I'll see what I can come up with.

<:-)

Mat

Mar 8, 2020, 5:02:22 AM
to TiddlyWiki
Mark S. wrote:
PR 2963 
 
i.e. "Returns text of input selection that matches regexpr". Yes, that would be needed here.

I would also think #4452 Memory variable filter ops is relevant here to enable doing this in a single step.


(Hm, why do my projects hit the outer limits so often?)

<:-)
 

TiddlyTweeter

Mar 8, 2020, 5:47:33 AM
to TiddlyWiki
Mat wrote:
(Hm, why do my projects hit the outer limits so often?)

I think at least two of your recent ones are related to the current lack of a way for the TW regexp filter operator to return capture groups.
This limits what you can do with regex in filters and requires a kind of coding jujitsu to get around.

TT

TiddlyTweeter

Mar 8, 2020, 6:02:56 AM
to TiddlyWiki
PMario wrote:
The best description about regexp I've ever found is: https://www.regular-expressions.info/javascript.html 

They describe it in a way that I actually could understand and modify by myself. 

I agree. There is a depth of context in that site that is very helpful in understanding regex in TW.

(I personally also use the specialized commercial tools (RegexBuddy, PowerGrep) that Jan Goyvaerts wrote.)

TT

TiddlyTweeter

Mar 8, 2020, 6:23:52 AM
to TiddlyWiki
Ciao Mat

I'd also test this with "<<foo[^<]*?". The final "?" makes the quantifier lazy, so "not <" takes as few characters as it can rather than marching along to the character just before the first occurrence of "<". With a pattern like ".*", leaving the "?" out can make it match up to the final occurrence.

TT
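
It may be worth checking what the trailing "?" does here before relying on it. In JavaScript, a lazy quantifier at the very end of a pattern is satisfied with zero characters, so the sketch below (the sample string and variable names are my own) compares the two forms.

```javascript
// Illustrative input: an unclosed call followed by a complete one.
const text = "<<foo lorem ipsum <<foo>>";

// Lazy and last in the pattern: nothing after it forces it to grow.
const lazyMatch = /<<foo[^<]*?/.exec(text)[0];

// Greedy: runs on until just before the next "<", which [^<] cannot cross.
const greedyMatch = /<<foo[^<]*/.exec(text)[0];

console.log(JSON.stringify(lazyMatch));   // "<<foo"
console.log(JSON.stringify(greedyMatch)); // "<<foo lorem ipsum "
```

So with [^<] the greedy form already stops before the first "<"; the lazy form only starts to matter once something follows it in the pattern.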

TiddlyTweeter

Mar 8, 2020, 8:16:10 AM
to TiddlyWiki
Mat wrote:
(This is pretty difficult to get right.)

It's a very interesting regex issue. Regex often looks counterintuitive. Using negatives can get complicated.

It IS possible to match what is needed in one regex. But you would have to chop off the "<<" suffix, since we can't currently do that by silent non-capturing, and then reinstate it so that the "<<" becomes ">><<".

(<<foo([^>])*?)([<]{2})

That matches this ...

[Attached image: Annotation 2020-03-08 130948.jpg. Legend: the match, and the "<<" suffix to replace.]

TT
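
For readers without the screenshot, here is a minimal JavaScript sketch of what the regex above captures on Mat's sample line. The variable names are illustrative only, and it assumes the pattern behaves the same inside a TiddlyWiki filter as it does in plain JavaScript.

```javascript
// The regex from the post above, unchanged.
const re = /(<<foo([^>])*?)([<]{2})/;

// Sample line from earlier in the thread.
const text =
  "Lorem ipsum <<foo bar>> lorem impsum <<foo lorem ipsum <<foo>> and <<foo bar>> lorem.";

const m = re.exec(text);
const whole = m[0];  // the unclosed call plus the "<<" that follows it
const body = m[1];   // group 1: the unclosed call up to the next "<<"
const suffix = m[3]; // group 3: the "<<" suffix that has to be put back

console.log(JSON.stringify(whole));  // "<<foo lorem ipsum <<"
console.log(JSON.stringify(body));   // "<<foo lorem ipsum "
console.log(JSON.stringify(suffix)); // "<<"
```

The first <<foo bar>> is skipped because [^>] cannot step over its ">" to reach a later "<<".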

Mat

Mar 8, 2020, 9:41:06 AM
to TiddlyWiki
TT - that does seem to work! Great!!!

For anyone curious, I have this:

A tiddler with this arbitrary text

title: tid
text: Welcome to TiddlyWiki <<foo , a unique [[non-linear]] xx <<foo bar>>notebook for.

And this is my testing code (at the moment):

<$set name=re value='(<<foo([^>])*?)([<]{2})'>
<$set name=txt filter='[[tid]get[text]]'>
txt: <<txt>><br><br>
list: <$list filter="[{tid}regexp<re>splitregexp<re>]">

</$list>
<br>
<$set name=pre filter="[{tid}regexp<re>splitregexp<re>first[]]">
pre: <<pre>>
<br><br>
<$set name=target filter="[{tid}regexp<re>splitregexp<re>nth[2]]">
target: <<target>>
<br><br>
<$set name=post filter="[{tid}regexp<re>splitregexp<re>rest[2]]">
post: <<post>>
</$set>
</$set>
</$set>
</$set>
</$set>

To be clear, the application in EditorMagic is to replace the incomplete macro call with a complete one and to keep the surrounding text (the pre and the post) intact. In the above setup, the "target" contains


Other than the added square brackets that mess up the link ([[non-linear]]), I'll have to split the real target, i.e. <<foo, from the rest and prepend the rest to the "post" text. So ideally, the regexp would cover that: IF it says "<<foo blabla <<" THEN it should ONLY capture "<<foo" rather than include the blabla with it. I'm guessing this is too complex for a regex and will have to be done manually.

<:-)
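
This part may be doable in the regex itself with a lookahead, which checks what follows without including it in the match. Below is a minimal JavaScript sketch using the tiddler text from this post. The replacement string "<<foo bar>>" and the variable names are placeholders, and whether splitregexp in a filter copes with the pattern the same way is untested here.

```javascript
// Tiddler text from the post above.
const text =
  "Welcome to TiddlyWiki <<foo , a unique [[non-linear]] xx <<foo bar>>notebook for.";

// Match "<<foo" only when another "<<" turns up before any ">".
// The lookahead (?=...) tests the following text but does not consume it.
const re = /<<foo(?=[^>]*?<<)/;

const m = re.exec(text);
const pre = text.slice(0, m.index);               // text before the target
const target = m[0];                              // just "<<foo"
const post = text.slice(m.index + target.length); // everything after it

// Placeholder replacement: swap the incomplete call for a complete one.
const fixed = pre + "<<foo bar>>" + post;
console.log(fixed);
```

Because the lookahead consumes nothing, the "blabla" stays in post without any manual splitting. The pattern has no word boundary, so it would also fire on something like <<foobar.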


Mat

Mar 8, 2020, 9:47:37 AM
to TiddlyWiki
Meanwhile, I added this request at gh:


which argues for a set of filter ops (or even a single filter op) that stems from the most basic use case I can think of. 

Do go there and express your agreement or disgust over it. ;-)

<:-)

TiddlyTweeter

Mar 8, 2020, 10:03:06 AM
to TiddlyWiki
 Mat wrote:
TT - that does seem to work! Great!!!

It should work fine.

The two cases it will fail in are ...

-- When the problem is nested (I don't think that will happen for the OP).

-- When the item is the LAST in sequence. (That can be addressed, but avoid it if it is not absolutely needed, as it will make the regex very complex.)

TT
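
The "last in sequence" case is easy to reproduce in plain JavaScript. The sketch below first shows the regex from this thread missing an unclosed <<foo that has no later "<<", then tries a lookahead variant that also accepts the end of the text. The variant is my own guess, untested in TiddlyWiki and not TT's solution, and it does nothing for the nested case.

```javascript
// Regex from this thread: it needs a later "<<" to anchor on.
const threadRe = /(<<foo([^>])*?)([<]{2})/;

// Illustrative text where the unclosed call is the last one.
const lastUnclosed = "Lorem <<foo bar>> ipsum <<foo lorem";

const missed = threadRe.test(lastUnclosed);
console.log(missed); // false

// Hypothetical variant: accept either a later "<<" or the end of the text ($).
const variantRe = /<<foo(?=[^>]*?(?:<<|$))/;

const hit = variantRe.exec(lastUnclosed);
console.log(hit.index === lastUnclosed.lastIndexOf("<<foo")); // true

// Text with only complete calls should still produce no match.
const clean = variantRe.test("Lorem <<foo bar>> ipsum");
console.log(clean); // false
```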