search results depend on length

Thilo Six

Oct 29, 2011, 5:22:17 AM
to vim...@vim.org
Hello

Attached is an example file. When searching in it with:
/vi\|vim
or
/vim\|vi

the results are not the same, although I would have expected them to be.
Actually, only the latter gives the results I was asking Vim for.
My Vim version is a bit dated, though: 7.2.445.
Or am I doing something wrong?

--
Regards,
Thilo

4096R/0xC70B1A8F
721B 1BA0 095C 1ABA 3FC6 7C18 89A4 A2A0 C70B 1A8F

fooo

Ivan Krasilnikov

Oct 29, 2011, 9:55:18 AM
to vim...@googlegroups.com
2011/10/29 Thilo Six <T....@gmx.de>:

> Attached is an example file. When searching in it with:
> /vi\|vim
> or
> /vim\|vi
>
> the results are not the same, although I would have expected them to be.

This is the intended behavior. From :help regexp:
"A pattern is one or more branches, separated by "\|". ...
If more than one branch matches, the first one is used."

So /vi\|vim is the same as /vi.
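
The same "first branch wins" rule can be tried outside Vim. The sketch
below uses Python's re module rather than Vim's own engine; it is only an
illustration, on the assumption that ordered alternation behaves the same
way for this simple case. The sample text is made up.

```python
import re

text = "vi vim vimrc"

# Ordered alternation: at each position the first branch that matches wins,
# so in "vi|vim" the second branch never gets a chance to be used.
short_first = re.findall(r"vi|vim", text)
long_first = re.findall(r"vim|vi", text)

print(short_first)  # ['vi', 'vi', 'vi']
print(long_first)   # ['vi', 'vim', 'vim']
```

Putting the longer alternative first is what makes the second search
pick up the whole word.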

Thilo Six

Oct 29, 2011, 10:07:49 AM
to vim...@vim.org
Ivan Krasilnikov wrote the following on 29.10.2011 15:55

Hello Ivan,

-- <snip> --


> This is the intended behavior. From :help regexp:
> "A pattern is one or more branches, separated by "\|". ...
> If more than one branch matches, the first one is used."
>
> So /vi\|vim is the same as /vi.

Thank you for the pointers.
I need to keep a mental note of that for the future.

Thilo Six

Oct 29, 2011, 2:04:04 PM
to vim...@vim.org
Thilo Six wrote the following on 29.10.2011 16:07

hello,

-- <snip> --

>> So /vi\|vim is the same as /vi.
>
> Thank you for the pointers.
> I need to keep a mental note of that for the future.


,----[ :help regexp ]-------------------------

If more than one branch matches, the first one is used.

`---------------------------------------------

I don't speak C, so why is there the limitation mentioned above?
It can't be for performance, because both searches are equally fast (humanly
speaking). But it breaks expectations, as I tell Vim to search for 'vi' OR 'vim'
in both cases.
Maybe if I understand the technical background I will get more used to it.

James McCoy

Oct 29, 2011, 2:25:37 PM
to vim...@googlegroups.com
On Sat, Oct 29, 2011 at 2:04 PM, Thilo Six <T....@gmx.de> wrote:
> Thilo Six wrote the following on 29.10.2011 16:07
>
> hello,
>
> -- <snip> --
>
>>> So /vi\|vim is the same as /vi.
>>
>> Thank you for the pointers.
>> I need to keep a mental note of that for the future.
>
>
> ,----[ :help regexp ]-------------------------
>
> If more than one branch matches, the first one is used.
> `---------------------------------------------
>
> I don't speak C, so why is there the limitation mentioned above?
> It can't be for performance, because both searches are equally fast (humanly
> speaking).

This is a common design choice for regular expressions. The leftmost
alternative that matches is what is used. You'll see the same behavior
with Perl-style regular expressions (PCRE included):

$ printf "vim\n" | perl -ne '/(vi|vim)/ && print "$1\n"'
vi

> But it breaks expectations, as I tell Vim to search for 'vi' OR 'vim'
> in both cases.
> Maybe if I understand the technical background I will get more used to it.

In this specific example, sure, it's not that much more work to use the
longer of the two alternatives as THE match, but this is a trivial case.
You can have much more complicated patterns and multiple alternations.
Then do you really want to have the regex engine try all the
alternatives just to find the one that matches the most text, and is
that really what you want in that specific scenario? It's much easier
to just change the regular expression to do what you want with the
understanding of how regular expressions are used.
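
As a rough illustration of the extra work a "longest alternative wins"
rule implies, here is a sketch in Python (not Vim script). The helper
longest_alternative is hypothetical and only handles a flat list of
branches tried at one position; a real engine would have to do this for
every alternation nested anywhere in the pattern.

```python
import re

def longest_alternative(branches, text, pos=0):
    """Try every branch at `pos` and return the longest match, or None."""
    best = None
    for branch in branches:
        m = re.compile(branch).match(text, pos)
        # Keep the candidate only if it consumes more text than the best so far.
        if m and (best is None or m.end() > best.end()):
            best = m
    return best

longest = longest_alternative(["vi", "vim"], "vim").group(0)
first = re.match(r"vi|vim", "vim").group(0)

print(longest)  # vim  (every branch was tried)
print(first)    # vi   (the engine stopped at the first branch that matched)
```

With ordered alternation the engine can stop as soon as one branch
succeeds, which is why reordering the branches is usually the simpler fix.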

I'd suggest the book "Mastering Regular Expressions". It does a
pretty good job of explaining how various regular expression engines
work and things to keep in mind when crafting regular expressions.

--
James
GPG Key: 1024D/61326D40 2003-09-02 James McCoy <jame...@jamessan.com>

Thilo Six

Oct 29, 2011, 2:40:28 PM
to vim...@vim.org
James McCoy wrote the following on 29.10.2011 20:25

Konichiwa James San,

-- <snip> --


> It's much easier
> to just change the regular expression to do what you want with the
> understanding of how regular expressions are used.

That's why I asked again. Thank you very much for the explanation.

> I'd suggest the book "Mastering Regular Expressions". It does a
> pretty good job of explaining how various regular expression engines
> work and things to keep in mind when crafting regular expressions.

I'll do my best.

Tony Mechelynck

Oct 30, 2011, 7:08:19 PM
to vim...@googlegroups.com, Thilo Six
On 29/10/11 11:22, Thilo Six wrote:
> Hello
>
> Attached is an example file. When searching in it with:
> /vi\|vim
> or
> /vim\|vi
>
> the results are not the same, although I would have expected them to be.
> Actually, only the latter gives the results I was asking Vim for.
> My Vim version is a bit dated, though: 7.2.445.
> Or am I doing something wrong?
>

"vi" always matches wherever "vim" matches, so you have a kind of
degenerate case here.

Maybe what you wanted was

/vi\%[m]

or

/vim\=

See
:help /\=
:help /\%[]
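
For comparison, the optional-"m" idea looks like this in Python's re
syntax, where "?" plays the role of Vim's "\=". This is a sketch, not
Vim's engine; Python has no direct counterpart to "\%[]", but for a
single optional character the result should be the same. The sample text
is made up.

```python
import re

text = "vi vim vimrc"

# "vim?" is "vi" followed by an optional "m". The quantifier is greedy,
# so the "m" is included whenever it is present.
found = re.findall(r"vim?", text)

print(found)  # ['vi', 'vim', 'vim']
```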


Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
207. You're given one phone call in prison and you ask them for a laptop.

Thilo Six

Oct 31, 2011, 1:31:34 PM
to vim...@vim.org
Tony Mechelynck wrote the following on 31.10.2011 00:08

Hello Tony,

-- <snip> --

> "vi" always matches wherever "vim" matches, so you have a kind of
> degenerate case here.
>
> Maybe what you wanted was
>
> /vi\%[m]
>
> or
>
> /vim\=
>
> See
> :help /\=
> :help /\%[]
>
>
> Best regards,
> Tony.

Thanks for the pointer. There are quite a lot of search atoms. Although I
always try to look at 'pattern-atoms', I obviously still have a lot to learn,
which is quite fun sometimes. It is like playing Vim chess, kind of.
But it is a lot less fun if you are in a hurry and have to get something done
by a deadline.

Thank you.

Ernie Rael

Oct 31, 2011, 3:37:15 PM
to vim...@googlegroups.com
On 10/31/2011 10:31 AM, Thilo Six wrote:
> Tony Mechelynck wrote the following on 31.10.2011 00:08
>
> Hello Tony,
>
> -- <snip> --
>
>> "vi" always matches wherever "vim" matches, so you have a kind of
>> degenerate case here.
>>
>> Maybe what you wanted was
>>
>> /vi\%[m]
>>
>> or
>>
>> /vim\=
>>
>> See
>> :help /\=
>> :help /\%[]
>>
>>
>> Best regards,
>> Tony.
> Thanks for the pointer. There are quite a lot of search atoms. Although I
> always try to look at 'pattern-atoms', I obviously still have a lot to learn,
> which is quite fun sometimes. It is like playing Vim chess, kind of.
> But it is a lot less fun if you are in a hurry and have to get something done
> by a deadline.
>
> Thank you.
Off topic, but I've been wondering...

Is there an option to use perl/python syntax for RE? If not, would that
be a welcome patch? Could it be done with vim's scripting? Also an
optional string which indicates which chars need escaping (like ')' or
'?') might be useful.

-ernie

Ingo Karkat

Oct 31, 2011, 3:53:40 PM
to vim...@googlegroups.com
On 31-Oct-2011 20:37, Ernie Rael wrote:

> Off topic, but I've been wondering...
>
> Is there an option to use perl/python syntax for RE? If not, would that be a
> welcome patch? Could it be done with vim's scripting?

Both have unique constructs, cf. :help perl-patterns.

> Also an optional string which indicates which chars need escaping
> (like ')' or '?') might be useful.

Unless I misunderstand, isn't that achieved through changing the magic-ness of a
pattern: \v \V \m \M ?!

-- regards, ingo

Ernie Rael

Oct 31, 2011, 4:11:27 PM
to vim...@googlegroups.com
On 10/31/2011 12:53 PM, Ingo Karkat wrote:
> On 31-Oct-2011 20:37, Ernie Rael wrote:
>
>> Off topic, but I've been wondering...
>>
>> Is there an option to use perl/python syntax for RE? If not, would that be a
>> welcome patch? Could it be done with vim's scripting?
> Both have unique constructs, cf. :help perl-patterns.
True. I am willing to give up the Vim-only features and the Perl-only
features to use a syntax I already use a lot (a syntax which is also the
base of the Java syntax).

>
>> Also an optional string which indicates which chars need escaping
>> (like ')' or '?') might be useful.
> Unless I misunderstand, isn't that achieved through changing the magic-ness of a
> pattern: \v \V \m \M ?!
It's similar, but a list of chars that need escaping is more flexible. I
could specify that '(' needs to be escaped to mean grouping and that '?'
does not need to be escaped to mean optional. Using the list could be
dependent on the "magic-ness of a pattern".
>
> -- regards, ingo
>

Tony Mechelynck

Oct 31, 2011, 4:31:58 PM
to vim...@googlegroups.com, Thilo Six, vim...@vim.org
On 31/10/11 18:31, Thilo Six wrote:
[...]

> Thanks for the pointer. There are quite a lot of search atoms. Although I
> always try to look at 'pattern-atoms', I obviously still have a lot to learn,
> which is quite fun sometimes. It is like playing Vim chess, kind of.
> But it is a lot less fun if you are in a hurry and have to get something done
> by a deadline.
>
> Thank you.

|pattern-atoms| is the encyclopaedic treatment. There is also a cheat
sheet starting at |pattern-overview|, which might be easier to consult.


Best regards,
Tony.
--
A day for firm decisions!!!!! Or is it?

Thilo Six

Oct 31, 2011, 4:55:27 PM
to vim...@vim.org
Tony Mechelynck wrote the following on 31.10.2011 21:31

Hello Tony,

-- <snip> --

> |pattern-atoms| is the encyclopaedic treatment. There is also a cheat
> sheet starting at |pattern-overview|, which might be easier to consult.
>
>
> Best regards,
> Tony.

Thanks again.

Christian Brabandt

Oct 31, 2011, 5:59:38 PM
to vim...@googlegroups.com
On Mo, 31 Okt 2011, Ernie Rael wrote:

> It's similar, but a list of chars that need escaping is more flexible.
> I could specify that '(' needs to be escaped to mean grouping and that
> '?' does not need to be escaped to mean optional. Using the list could
> be dependent on the "magic-ness of a pattern".

Hm, interesting concept. Attached is a simple script to try out.

Use
:let g:re_dont_escape = '()|?'
to specify which chars have a special meaning without being escaped
(escape them only to get the literal character). So in this example,
'(' and ')' don't need to be escaped for grouping, '|' means OR, and
'?' means optional match.

When searching, press <F7> to translate the pattern into a Vim pattern.
It basically only adds/removes the backslashes (so you still need to know
all Vim-specific atoms, like '\@<=', and can't use e.g. Perl look-arounds).

Disclaimer: only very lightly tested.
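
The add/remove-backslash idea can be sketched in a few lines. The code
below is Python, not the attached Vim script, and is not taken from it;
the function name to_vim_pattern and its dont_escape parameter are made
up for illustration, mirroring g:re_dont_escape. It assumes Vim's default
'magic' mode, where the listed characters are special only when escaped.

```python
def to_vim_pattern(pattern, dont_escape="()|?"):
    """Toggle the backslash on every character listed in `dont_escape`."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # "\(" in the input means a literal "(", which Vim writes as "(".
            # Any other escaped pair (e.g. "\@" or "\\") is passed through.
            out.append(nxt if nxt in dont_escape else ch + nxt)
            i += 2
        elif ch in dont_escape:
            # A bare "(" in the input is special, which Vim writes as "\(".
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(to_vim_pattern(r"(vi|vim)?"))  # \(vi\|vim\)\?
print(to_vim_pattern(r"what\?"))     # what?
```

Consuming a backslash together with the character after it keeps an
escaped backslash from being mistaken for the start of another escape.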

regards,
Christian

search_escape.vim

Andy Wokula

Nov 1, 2011, 7:50:15 AM
to vim...@googlegroups.com

Neat! Attached is a mod with the backslash/'ff' bugs fixed, simpler
notation, shorter generated patterns, and preparation for multi-char
items; otherwise no features added.

--
Andy

search_escape.vim

Ernie Rael

Nov 1, 2011, 2:04:59 PM
to vim...@googlegroups.com
Thanks Christian and Andy.

I'm on vacation now, won't be giving this any play until after next
week. But it looks like some pros are working it over.

-ernie

Christian Brabandt

Nov 1, 2011, 6:56:41 PM
to vim...@googlegroups.com
Hi Andy!

Nice. Here is your version, extended so that it does not replace inside
collections ([...]).
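
A sketch of what "not replacing inside [...]" could look like, again in
Python and not taken from the attached script. The function name
to_vim_pattern_skip_collections is hypothetical, and the bracket handling
is deliberately simplified: it assumes every "[" is closed by the next
unescaped "]", and does not treat a leading "]" or "^]" specially.

```python
def to_vim_pattern_skip_collections(pattern, dont_escape="()|?"):
    """Toggle backslashes as before, but leave text inside [...] untouched."""
    out = []
    i = 0
    in_collection = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if not in_collection and nxt in dont_escape:
                out.append(nxt)        # escaped special char becomes literal
            else:
                out.append(ch + nxt)   # keep any other escaped pair as-is
            i += 2
            continue
        if in_collection:
            if ch == "]":
                in_collection = False  # collection ends here
        elif ch == "[":
            in_collection = True       # collection starts here
        elif ch in dont_escape:
            ch = "\\" + ch             # bare special char gets its backslash
        out.append(ch)
        i += 1
    return "".join(out)

print(to_vim_pattern_skip_collections(r"[(|)]*(a|b)"))  # [(|)]*\(a\|b\)
```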


regards,
Christian
--

search_escape.vim