Re: Regex \%V question


Christian Brabandt

Nov 8, 2010, 3:34:55 PM
to Vim, vim...@vim.org
Hi Benjamin!

(re-posting to vim-dev, for clarity).

On Fr, 05 Nov 2010, Benjamin R. Haskell wrote:

> I still don't quite understand why my attempted solution to rameo's
> problem didn't work... As a pared down example, why is the entire
> visual range matched in this:
>
> x = outside visual block, V = nonspaces in visual, ' ' = space in visual
>
> /\%V\%(\S\+\s*\)*\%V
>
> xxxxx  VVV VVV VVV  xxxxx - text
>      mmmmmmmmmmmmmmm      - match
> xxxxx  VVV  xxxxx - text
>      mmmmmmm      - match
>
> I don't understand how the leading spaces in the visual range can be
> matched by a pattern that can't match leading spaces.

I think I understand this part. The pattern boils down to a visual
selection item, followed by zero or more repetitions of a sequence of
one or more non-space characters followed by zero or more spaces,
followed by another visual selection item. Because the group may repeat
zero times, the pattern can match as just /\%V\%V, and in fact that is
what it matches.

> Removing the optionality, it's also weird, as the trailing space
> (singular!?) isn't matched:
>
> /\%V\S\+\s*\%V
> xxxxx  VVV VVV VVV  xxxxx - text
>        mmmmmmmmmmmm       - match
> xxxxx  VVV  xxxxx - text
>        mmmm       - match
>
> Can anyone shed some light on this?

This is a bug. The regular expression engine in Vim is quite complex. I
think the attached patch fixes it.

regards,
Christian

re_visual.patch

Benjamin R. Haskell

Nov 8, 2010, 4:27:17 PM
to vim...@googlegroups.com, vim...@googlegroups.com
On Mon, 8 Nov 2010, Christian Brabandt wrote:

> Hi Benjamin!
>
> (re-posting to vim-dev, for clarity).
>
> On Fr, 05 Nov 2010, Benjamin R. Haskell wrote:
>
>> I still don't quite understand why my attempted solution to rameo's
>> problem didn't work... As a pared down example, why is the entire
>> visual range matched in this:
>>
>> x = outside visual block, V = nonspaces in visual, ' ' = space in visual
>>
>> /\%V\%(\S\+\s*\)*\%V
>>
>> xxxxx  VVV VVV VVV  xxxxx - text
>>      mmmmmmmmmmmmmmm      - match
>> xxxxx  VVV  xxxxx - text
>>      mmmmmmm      - match
>>
>> I don't understand how the leading spaces in the visual range can be
>> matched by a pattern that can't match leading spaces.
>
> I think I understand this part. The pattern boils down to a visual
> selection item, followed by zero or more repetitions of a sequence of
> one or more non-space characters followed by zero or more spaces,
> followed by another visual selection item. Because the group may repeat
> zero times, the pattern can match as just /\%V\%V, and in fact that is
> what it matches.

Ah! Thank you. I finally see now. I was thrown by the fact that
hlsearch highlights the character following a zero-width match. (So, the
highlight wasn't indicating a single match of the entire string; it was
indicating three matches: one for each leading space where \%V\%V was
matched, and the last for the runs of non-space+ space* within the
visual range.)
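
The effect described above can be imitated outside Vim. The following is only a rough Python model, not Vim's engine: Python has no \%V, so a hypothetical helper `in_visual` stands in for it, the line and the 15-cell visual area (`TEXT`, `VIS_START`, `VIS_END`) are assumptions based on the example, and the rule "take the longest end that still satisfies the trailing \%V" is a simplification of real backtracking.

```python
import re

# Hypothetical stand-in for the example line in this thread: five x's, a
# 15-cell visual area (two spaces, three words, two spaces), five x's.
TEXT = "xxxxx  VVV VVV VVV  xxxxx"
VIS_START, VIS_END = 5, 20  # assumed visual area: TEXT[5:20]

# The part of the pattern between the two \%V items.
BODY = re.compile(r"(?:\S+\s*)*")


def in_visual(pos):
    """Model of the visual-area item: the cell at pos is inside the area."""
    return VIS_START <= pos < VIS_END


def match_at(start):
    """Return the end of a match starting at start, or None.

    Simplification: the longest end that satisfies the trailing item wins.
    """
    if not in_visual(start):  # leading \%V
        return None
    for end in range(len(TEXT), start - 1, -1):
        if in_visual(end) and BODY.fullmatch(TEXT, start, end):  # trailing \%V
            return end
    return None


def find_matches():
    """Scan left to right, stepping one cell past a zero-width match."""
    matches, pos = [], 0
    while pos < len(TEXT):
        end = match_at(pos)
        if end is None:
            pos += 1
        else:
            matches.append((pos, end))
            pos = end if end > pos else pos + 1
    return matches


def highlight(matches):
    """Mark matched cells; a zero-width match marks the cell it sits on."""
    cells = [" "] * len(TEXT)
    for start, end in matches:
        for i in range(start, max(end, start + 1)):
            cells[i] = "m"
    return "".join(cells)


print(TEXT)
print(highlight(find_matches()))
```

In this model the spaces at the edges of the visual area produce zero-width matches and the words produce one longer match. Marking the cell under each zero-width match, as hlsearch does, yields 15 m's in a row, which looks like a single match of the whole visual area.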


>> Removing the optionality, it's also weird, as the trailing space
>> (singular!?) isn't matched:
>>
>> /\%V\S\+\s*\%V
>> xxxxx  VVV VVV VVV  xxxxx - text
>>        mmmmmmmmmmmm       - match
>> xxxxx  VVV  xxxxx - text
>>        mmmm       - match
>>
>> Can anyone shed some light on this?
>
> This is a bug. The regular expression engine in Vim is quite complex.
> I think the attached patch fixes it.
>

Works for me for this particular case.

--
Best,
Ben

James Vega

Nov 8, 2010, 4:33:08 PM
to vim...@googlegroups.com

What I see with your patch is that the end of the match includes the
first non-whitespace character after the end of the whitespace sequence.
This means every match other than the first starts on the second
non-whitespace character of its \S sequence, and if the last
non-whitespace sequence is only one character long, the whitespace
sequence after it won't be matched.

Using the style from above (with alternating case for 'm' indicating the
distinct matches):

xxxxx  VVV VVV V  xxxx - text
       mmmmmMMMM       - match

Also, I see inconsistent highlighting with 'incsearch' enabled when the
user has typed part of an escape sequence (e.g., '\' or '\%'). The
highlighting for every line below some arbitrary line completely
disappears until the escape sequence is completed. Performing a
":redraw!" or "/<Up>" may also trigger this.

If this happens while typing the search string, complete the escape
sequence, backspace to make it incomplete, complete it, etc. You'll see
the portion of the file that's properly syntax highlighted increase
every time you change the (in)complete status of the escape sequence.

Not sure if it's related, but the patch also introduces this warning:

regexp.c: In function ‘regtry’:
regexp.c:3741: warning: comparison between pointer and integer


--
James
GPG Key: 1024D/61326D40 2003-09-02 James Vega <jame...@jamessan.com>

Bram Moolenaar

Nov 10, 2010, 6:45:52 AM
to Christian Brabandt, Vim, vim...@vim.org

Christian Brabandt wrote:

Thanks for making a patch.

Like you say, the regular expression code is very complex. Therefore we
should test as much as we can. I've had it on my todo list to add many
tests for the regexp engine, but nothing much got done yet. This is
also needed to switch to the faster regexp engine that's available.

Can you at least add a test for this specific issue?


--
BRIDGEKEEPER: What is your favorite colour?
LAUNCELOT: Blue.
BRIDGEKEEPER: Right. Off you go.
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Christian Brabandt

Nov 10, 2010, 7:48:08 AM
to Bram Moolenaar, Vim, vim...@vim.org
On Wed, November 10, 2010 12:45 pm, Bram Moolenaar wrote:

Well, as James pointed out, it's not the correct way to fix it.
I haven't found a proper way to fix it.

>
> Like you say, the regular expression code is very complex. Therefore we
> should test as much as we can. I've had it on my todo list to add many
> tests for the regexp engine, but nothing much got done yet. This is
> also needed to switch to the faster regexp engine that's available.
>
> Can you at least add a test for this specific issue?

Sure, but first this issue needs to be fixed properly.

regards,
Christian

Christian Brabandt

Nov 18, 2010, 6:07:47 PM
to vim...@googlegroups.com
Hi James!

Here is another patch, including a basic test, that should address all
the issues. I couldn't reproduce the highlighting problem, though. This
seems to work ok for me now for the test cases I could imagine.

regards,
Christian

visual.patch

James Vega

Nov 22, 2010, 1:09:53 PM
to vim...@googlegroups.com

This does seem to work better. I'm still seeing the highlighting
corruption with "set hls is". I seem to have found another problem the
patch introduces, though, as well as one that exists without the patch.

Here's the introduced problem:

$ printf "foo\nbar\n" > testfile
$ vim -u NONE -N --cmd 'set hls' testfile
" Set the visual block to contain both lines and place the cursor
" just after 'bar' on the second line
<C-v>j$<Esc>
" Search for the test pattern, note that the cursor doesn't move off
" the r
/\%V\S\+\s*\%V
" Attempt to jump to the next match of the pattern, cursor still
" doesn't move
n

For some reason, with the patch the r is trapping the cursor. If you
move the cursor to the start of the file, and use n to cycle through the
matches, you'll correctly land on the 'f' and 'b' and then jump to the
'r' and get stuck there. If you cycle backwards with N, the cursor
won't get stuck but it does see the 'r' as a valid match.

As for the existing problem that I discovered, use the same test file.
Visually select only the foo line and perform the search. Now visually
select only the bar line. Both foo and bar will be highlighted until a
redraw is forced (via either <C-l> or :redraw!).

James Vega

Nov 22, 2010, 1:36:11 PM
to vim...@googlegroups.com
On Mon, Nov 22, 2010 at 1:09 PM, James Vega <jame...@jamessan.com> wrote:
> On Thu, Nov 18, 2010 at 6:07 PM, Christian Brabandt <cbl...@256bit.org> wrote:
>> Here is another patch, including a basic test, that should address all
>> the issues. I couldn't reproduce the highlighting problem, though. This
>> seems to work ok for me now for the test cases I could imagine.
>
> This does seem to work better. I'm still seeing the highlighting
> corruption with "set hls is".  I seem to have found another problem the
> patch introduces, though, as well as one that exists without the patch.

The patch also seems to break test24 and test71.

../vim -u unix.vim -U NONE --noplugin -s dotest.in test24.in
21,22c21,22
< x aaaaa xx a
< x aaaaa xx a
---
> xx aaaaa xx a
> xx aaaaa xx a

../vim -u unix.vim -U NONE --noplugin -s dotest.in test71.in
4,6c4,6
< OK 1234567890123456789012345678901234567
< OK ine 2 foo bar blah
< OK ine 3 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
---
> OK 01234567890123456789012345678901234567
> OK line 2 foo bar blah
> OK line 3 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Bram Moolenaar

Dec 2, 2010, 11:13:54 AM
to James Vega, vim...@googlegroups.com

James Vega wrote:

Christian, did you do any more work on this patch?

--
Sometimes you can protect millions of dollars in your budget simply by buying
a bag of cookies, dropping it on the budget analyst's desk, and saying
something deeply personal such as "How was your weekend, big guy?"
(Scott Adams - The Dilbert principle)

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\

\\\ an exciting new programming language -- http://www.Zimbu.org ///

Christian Brabandt

Dec 2, 2010, 11:39:17 AM
to vim...@googlegroups.com
On Thu, December 2, 2010 5:13 pm, Bram Moolenaar wrote:
> Christian, did you do any more work on this patch?

(There is another problem. Take a word, say "test", visually select
it and then search for \%Vtest\%V. It won't match, because \%V
needs to match at the same cell as the t. And this doesn't work.

Effectively, \%V needs to match after the last column of the visually
selected region. But then, 'hls' will be off by one cell, which isn't
right either.)
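
The trade-off in the parenthetical above can be sketched with a small Python model. This is an illustration under stated assumptions, not Vim's code: the line `TEXT`, the selection bounds, and the helper names `in_visual` and `search` are all hypothetical, and the trailing \%V is assumed to be tested at the cell just after the last matched character.

```python
# Hypothetical buffer line and selection: the visual area covers exactly "test".
TEXT = "xx test xx"
VIS_START, VIS_END = 3, 7  # assumed visual area: TEXT[3:7]
WORD = "test"


def in_visual(pos, include_cell_after=False):
    """Model of the visual-area item at cell pos.

    include_cell_after=True models the alternative rule from the message
    above, where the item also matches just past the last selected cell.
    """
    limit = VIS_END + 1 if include_cell_after else VIS_END
    return VIS_START <= pos < limit


def search(include_cell_after=False):
    """Model of the search: visual item, the word, visual item."""
    for start in range(len(TEXT)):
        end = start + len(WORD)  # the trailing item is tested at this cell
        if (TEXT.startswith(WORD, start)
                and in_visual(start)
                and in_visual(end, include_cell_after)):
            return (start, end)
    return None


print(search())                                     # None
print(search(include_cell_after=True))              # (3, 7)
print(in_visual(VIS_END, include_cell_after=True))  # True
```

Under the strict rule the model finds nothing, because the trailing check lands on cell 7, the space after "test". Under the relaxed rule it finds the word, but the item then also matches on cell 7 itself, which corresponds to the off-by-one 'hls' highlight mentioned above.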

So no, I didn't work on this any more, because I am too scared of the RE
code ;(

regards,
Christian

Bram Moolenaar

Dec 30, 2010, 8:38:13 AM
to Christian Brabandt, vim...@googlegroups.com

Christian Brabandt wrote:

Any progress? Does anyone want to dive into this?
Or should we abandon this patch?

--
Over the years, I've developed my sense of deja vu so acutely that now
I can remember things that *have* happened before ...

Christian Brabandt

Dec 30, 2010, 8:52:41 AM
to vim...@googlegroups.com
Hi Bram!

On Do, 30 Dez 2010, Bram Moolenaar wrote:

> Any progress? Does anyone want to dive into this?
> Or should we abandon this patch?

The reason I have been posting other patches is that I haven't found a
good way to fix this issue, and I haven't looked into it since then.
I'll look into it again in my spare time, but I am not too optimistic.

regards,
Christian
