Collection issue: backslash after dash


Andy Wokula

Jun 28, 2011, 7:57:37 AM
to vim...@googlegroups.com
Strange: one can't write a collection with range [X-Y] where Y is the
character ']'.

I thought the following should work, but it doesn't:
/[@-\]]

Problem: the range is '@' to '\', and ']' ends the collection; the next ']'
matches itself.

(It's surprising that '\]' within '[]' does not always mean ']' literally!)

Ok, so the char directly after '-' ends the range?
/[@-]]

No, the collection is '[@-]' followed by ']' which matches itself. The
help says it:
| For '-' you can also make it the first or last character: "[-xyz]",
| "[^-xyz]" or "[xyz-]".

Ok, this works:
/[@-\\]]

but it matches the range '@-\' plus the char ']'.


A range where \] is the first character works:
/[w\]-a]\C

matches ] ^ _ ` a w


Is it a bug that '\' after '-' in a collection is taken literally?

--
Andy
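
[Editorial sketch, not part of the original message.] The observations above follow from ASCII ordering: '\' (0x5C) sits directly before ']' (0x5D). The short Python snippet below only enumerates code points to make the ranges concrete; the helper name char_range is made up for illustration, and the snippet says nothing about how Vim parses a collection.

```python
def char_range(first: str, last: str) -> str:
    """Return every character from first to last inclusive, by code point."""
    return "".join(chr(c) for c in range(ord(first), ord(last) + 1))


# '\' is the character immediately before ']' in ASCII.
assert ord("]") - ord("\\") == 1

# Range that /[@-\]] ends up using, according to the message above.
up_to_backslash = char_range("@", "\\")

# Range the pattern was meant to express.
up_to_bracket = char_range("@", "]")

# Range part of /[w\]-a]\C, which starts at ']' and ends at 'a'.
bracket_to_a = char_range("]", "a")

print(up_to_backslash)
print(up_to_bracket)
print(bracket_to_a)
```

The two '@' ranges differ by exactly one character, ']', which is why /[@-\\]] (range '@-\' plus a separate ']') matches the intended set.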

Xavier Wang

Jun 28, 2011, 8:09:21 AM
to vim...@googlegroups.com

You can try putting ] immediately after [, i.e. []@-?], where ? stands for the character before ] in ASCII.

Bram Moolenaar

Jun 29, 2011, 4:12:54 PM
to Andy Wokula, vim...@googlegroups.com

Andy Wokula wrote:

> Strange: one can't write a collection with range [X-Y] where Y is the
> character ']'.
>
> I thought the following should work, but it doesn't:
> /[@-\]]
>
> Problem: the range is '@' to '\', and ']' ends the collection; the next ']'
> matches itself.
>
> (It's surprising that '\]' within '[]' does not always mean ']' literally!)
>
> Ok, so the char directly after '-' ends the range?
> /[@-]]
>
> No, the collection is '[@-]' followed by ']' which matches itself. The
> help says it:
> | For '-' you can also make it the first or last character: "[-xyz]",
> | "[^-xyz]" or "[xyz-]".
>
> Ok, this works:
> /[@-\\]]
>
> but it matches the range '@-\' plus the char ']'.

Well, that's correct, ] is right after \.

> A range where \] is the first character works:
> /[w\]-a]\C
>
> matches ] ^ _ ` a w
>
>
> Is it a bug that '\' after '-' in a collection is taken literally?

It's indeed strange.

--
hundred-and-one symptoms of being an internet addict:
238. You think faxes are old-fashioned.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Ben Fritz

Jan 3, 2013, 11:07:37 AM
to vim...@googlegroups.com, anw...@yahoo.de
On Thursday, January 3, 2013 2:51:38 AM UTC-6, martinwguy wrote:

> On Tuesday, 28 June 2011 13:57:37 UTC+2, Andy Wokula wrote:
> > Strange: one can't write a collection with range [X-Y] where Y is the
> > character ']'.
> >
> > I thought the following should work, but it doesn't:
> > /[@-\]]
> >
> > Is it a bug that '\' after '-' in a collection is taken literally?
>
> No, that's normal vi behaviour. \ is not special in a character range (it stands for itself) and to include ] you need to specify it as the first character in the range.

I disagree, and consider it a bug. :help /\] says:

- To include a literal ']', '^', '-' or '\' in the collection, put a
backslash before it: "[xyz\]]", "[\^xyz]", "[xy\-z]" and "[xyz\\]".
(Note: POSIX does not support the use of a backslash this way). For
']' you can also make it the first character (following a possible
"^"): "[]xyz]" or "[^]xyz]" {not in Vi}.
For '-' you can also make it the first or last character: "[-xyz]",
"[^-xyz]" or "[xyz-]". For '\' you can also let it be followed by
any character that's not in "^]-\bdertnoUux". "[\xyz]" matches '\',
'x', 'y' and 'z'. It's better to use "\\" though, future expansions
may use other characters after '\'.

This works:

/[[\\\]]

This does not work, even though it should do the same thing if the above help entry were implemented as stated:

/[[-\]]

Using your example, this does work, but I would not expect it to:

/[][-\]

I would expect this to not be treated as a collection at all, because the closing ] has a \ in front.

There is obviously at least a documentation bug here.
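
[Editorial sketch, not part of the original message.] For comparison only: Python's re module is a different engine from Vim's, but it does accept an escaped ']' as the upper bound of a range, which is the behaviour the help text seems to promise. The snippet uses the pattern from the original question; it is not evidence of what Vim does or should do.

```python
import re

# In Python's re, '\]' inside a class is a literal ']' even after '-',
# so this class covers '@' (0x40) through ']' (0x5D).
at_to_bracket = re.compile(r"[@-\]]")

for ch in ["?", "@", "A", "[", "\\", "]", "^"]:
    # Print each probe character and whether the class matches it.
    print(repr(ch), bool(at_to_bracket.fullmatch(ch)))
```

Here '?' and '^' fall just outside the range on either side, while '\' and ']' are both inside it.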

Christian Brabandt

Jan 4, 2013, 8:29:50 AM
to vim...@googlegroups.com
Hi Ben!

On Thu, 03 Jan 2013, Ben Fritz wrote:

> On Thursday, January 3, 2013 2:51:38 AM UTC-6, martinwguy wrote:
> > On Tuesday, 28 June 2011 13:57:37 UTC+2, Andy Wokula wrote:
> > > Strange: one can't write a collection with range [X-Y] where Y is the
> > > character ']'.
> > >
> > > I thought the following should work, but it doesn't:
> > > /[@-\]]
> > >
> > > Is it a bug that '\' after '-' in a collection is taken literally?
> >
> > No, that's normal vi behaviour. \ is not special in a character range (it stands for itself) and to include ] you need to specify it as the first character in the range.

That is how POSIX defines it:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

,----
| A bracket expression is either a matching list expression or a
| non-matching list expression. It consists of one or more expressions:
| collating elements, collating symbols, equivalence classes, character
| classes, or range expressions. The <right-square-bracket> ( ']' ) shall
| lose its special meaning and represent itself in a bracket expression if
| it occurs first in the list (after an initial <circumflex> ( '^' ), if
| any). Otherwise, it shall terminate the bracket expression, unless it
| appears in a collating symbol (such as "[.].]" ) or is the ending
| <right-square-bracket> for a collating symbol, equivalence class, or
| character class. The special characters '.' , '*' , '[' , and '\\' (
| <period>, <asterisk>, <left-square-bracket>, and <backslash>,
| respectively) shall lose their special meaning within a bracket
| expression.
`----

>
> I disagree, and consider it a bug. :help /\] says:
>
> - To include a literal ']', '^', '-' or '\' in the collection, put a
> backslash before it: "[xyz\]]", "[\^xyz]", "[xy\-z]" and "[xyz\\]".
> (Note: POSIX does not support the use of a backslash this way). For
> ']' you can also make it the first character (following a possible
> "^"): "[]xyz]" or "[^]xyz]" {not in Vi}.
> For '-' you can also make it the first or last character: "[-xyz]",
> "[^-xyz]" or "[xyz-]". For '\' you can also let it be followed by
> any character that's not in "^]-\bdertnoUux". "[\xyz]" matches '\',
> 'x', 'y' and 'z'. It's better to use "\\" though, future expansions
> may use other characters after '\'.
>
> This works:
>
> /[[\\\]]

Looks like a Vim extension to BRE (as stated in your quotation from the
help).

>
> This does not work, even though it should do the same thing if the above help entry were implemented as stated:
>
> /[[-\]]

Yes, the backslash doesn't have a special meaning when used within a
range. I'm not sure we should fix this.

>
> Using your example, this does work, but I would not expect it to:
>
> /[][-\]
>
> I would expect this to not be treated as a collection at all, because the closing ] has a \ in front.

Yes, but the standard demands otherwise. However, I think
/[]\-] would be cleaner, and it is what the standard suggests:

,----
| If a bracket expression specifies both '-' and ']' , the ']' shall be
| placed first (after the '^' , if any) and the '-' last within the
| bracket expression.
`----

regards,
Christian
--
Did you get your talent from your mother? -
No, I took it in with my father's milk.
-- Heinz Erhardt
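
[Editorial sketch, not part of the original message.] The POSIX rules quoted above can be modelled with a deliberately simplified parser: ']' is literal when it comes first (after an optional '^'), backslash is an ordinary character, and '-' is literal when it is first or last. The function name parse_posix_bracket is hypothetical, and collating symbols, equivalence classes and character classes ([. .], [= =], [: :]) are ignored, so this is an illustration of the quoted text rather than a conforming implementation.

```python
def parse_posix_bracket(pattern: str):
    """Parse a bracket expression at the start of pattern.

    Returns (negated, members, consumed), where consumed is the number of
    characters up to and including the closing ']'.
    """
    if not pattern.startswith("["):
        raise ValueError("expected '['")
    i = 1
    negated = False
    if i < len(pattern) and pattern[i] == "^":
        negated = True
        i += 1
    members = set()
    first = True  # a ']' in first position is literal
    while i < len(pattern):
        ch = pattern[i]
        if ch == "]" and not first:
            return negated, members, i + 1
        first = False
        # A range is "X-Y" where Y is not the closing ']'.
        # Backslash gets no special treatment: it is just a character.
        if (i + 2 < len(pattern) and pattern[i + 1] == "-"
                and pattern[i + 2] != "]"):
            end = pattern[i + 2]
            if ord(end) < ord(ch):
                raise ValueError("reversed range")
            members.update(chr(c) for c in range(ord(ch), ord(end) + 1))
            i += 3
        else:
            members.add(ch)
            i += 1
    raise ValueError("unterminated bracket expression")


# Patterns taken from this thread (backslashes doubled for Python strings).
print(parse_posix_bracket("[@-\\]]"))
print(parse_posix_bracket("[]@-\\]"))
print(parse_posix_bracket("[][-\\]"))
print(parse_posix_bracket("[]\\-]"))
```

Under these rules "[@-\]]" ends at the first ']' with the range '@' to '\', leaving the second ']' outside the expression, whereas "[]@-\]" covers '@' through ']'.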

Andy Wokula

Jan 4, 2013, 8:34:31 AM
to vim...@googlegroups.com, martinwguy
On 03.01.2013 09:51, martinwguy wrote:
> On Tuesday, 28 June 2011 13:57:37 UTC+2, Andy Wokula wrote:
>> Strange: one can't write a collection with range [X-Y] where Y is the
>> character ']'.
>>
>> I thought the following should work, but it doesn't:
>> /[@-\]]
>>
>> Is it a bug that '\' after '-' in a collection is taken literally?
>
> No, that's normal vi behaviour.

The context is Vim, not Vi:
:set nocp cpo&vim

> \ is not special in a character range (it stands for itself) and to
> include ] you need to specify it as the first character in the range.

Even with set 'cp', `\]' is still special. See:
:h cpo-\

Do you actually use Vi?

> In the example you give
> /[]@-\]
> (knowing that \ is the character previous to ])
(my pattern `[@-\\]]' also made use of it)

So far, it looks as if Vim just forgot to implement a certain case.
There is no apparent reason why `\]' is allowed for X but not for Y in
a [X-Y] collection.

--
Andy

Christian Brabandt

Jan 4, 2013, 8:43:07 AM
to vim...@googlegroups.com
Hi
I think this is implicitly mentioned further down under :h /[]

,----
| - The following translations are accepted when the 'l' flag is not
| included in 'cpoptions' {not in Vi}:
| \e <Esc>
| \t <Tab>
| \r <CR> (NOT end-of-line!)
| \b <BS>
| \n line break, see above |/[\n]|
| \d123 decimal number of character
| \o40 octal number of character up to 0377
| \x20 hexadecimal number of character up to 0xff
| \u20AC hex. number of multibyte character up to 0xffff
| \U1234 hex. number of multibyte character up to 0xffffffff
| NOTE: The other backslash codes mentioned above do not work inside
| []!
`----

However, to fix this, the following patch can be applied:

diff --git a/src/regexp.c b/src/regexp.c
--- a/src/regexp.c
+++ b/src/regexp.c
@@ -2344,7 +2344,12 @@

 		    /* Handle \o40, \x20 and \u20AC style sequences */
 		    if (endc == '\\' && !cpo_lit && !cpo_bsl)
+		    {
 			endc = coll_get_char();
+			/* Skip over backslash */
+			if (endc == '\\')
+			    endc = *regparse++;
+		    }



Best regards
Christian
--
When the farmhand hurries to the edge of the woods, the outhouse was already occupied.

Christian Brabandt

Jan 4, 2013, 10:07:42 AM
to vim...@googlegroups.com
Hi

On Fri, 04 Jan 2013, Christian Brabandt wrote:

>
> However, to fix this, the following patch can be applied:

And finally, here is a better patch, supporting multibyte chars and
including a test.

Best regards
Christian
--
The immediate perception of the primal phenomena puts us into a
kind of fear; we feel our inadequacy; only when enlivened by the
eternal play of empirical experience do they delight us.
-- Goethe, Maximen und Reflektionen, Nr. 817
regexp_collation_bslash.diff

Andy Wokula

Jan 4, 2013, 10:20:20 AM
to vim...@googlegroups.com, Christian Brabandt
On 04.01.2013 14:29, Christian Brabandt wrote:
>> /[[-\]]
>
> Yes, the backslash doesn't have a special meaning when used within a
> range. I'm not sure we should fix this.

Whether to fix this was my original question (June 2011).
The rest is off-topic for this thread.

--
Andy

martinwguy

Jan 4, 2013, 3:42:35 PM
to Andy Wokula, vim...@googlegroups.com
On 4 January 2013 14:34, Andy Wokula <anw...@yahoo.de> wrote:
> On 03.01.2013 09:51, martinwguy wrote:
>>> Is it a bug that '\' after '-' in a collection is taken literally?
>> No, that's normal vi behaviour.
> The context is Vim, not Vi:
> :set nocp cpo&vim

Er, I thought vim was a reimplementation of vi.

>> \ is not special in a character range (it stands for itself) and to
>> include ] you need to specify it as the first character in the range.
>
> Even with set 'cp', `\]' is still special. See:
> :h cpo-\

Mmm, sorry, I don't know what :se cp/nocp is.


> Do you actually use Vi?

Hum, it sounds like you're putting your fists up. Bad sign.
Yes, since 1982 for all my work. I am also the maintainer for another
vi clone, "xvi".
Is that enough for you?

>> In the example you give
>> /[]@-\]
>> (knowing that \ is the character previous to ])
>
> (my pattern `[@-\\]]' also made use of it)
>
> So far, it looks as if Vim just forgot to implement a certain case.
> There is no apparent reason why `\]' is allowed for X but not for Y in
> a [X-Y] collection.

No, you're thinking that vi should do as you would expect according to
your own thinking.
That may be reasonable if we were designing a new editor, but vim is a
vi clone, so needs to implement what vi, and the other dozen vi
clones, do, so as not to break people's scripts.

That said, it is open source, so you are free to take it, make the
change you desire and use your own version.

Or take it up with Bill Joy in the 1970s, but for that you will need a
time machine...

M

Gary Johnson

Jan 4, 2013, 4:22:38 PM
to vim...@googlegroups.com
On 2013-01-04, martinwguy wrote:
> On 4 January 2013 14:34, Andy Wokula wrote:
> > On 03.01.2013 09:51, martinwguy wrote:
> >>> Is it a bug that '\' after '-' in a collection is taken literally?
> >> No, that's normal vi behaviour.
> > The context is Vim, not Vi:
> > :set nocp cpo&vim
>
> Er, I thought vim was a reimplementation of vi.

It is. To a point. See

:help design-compatible
:help vi-differences

> >> \ is not special in a character range (it stands for itself) and to
> >> include ] you need to specify it as the first character in the range.
> >
> > Even with set 'cp', `\]' is still special. See:
> > :h cpo-\
>
> Mmm, sorry, I don't know what :se cp/nocp is.

:help 'cp'

> > Do you actually use Vi?
>
> Hum, it sounds like you're putting your fists up. Bad sign.
> Yes, since 1982 for all my work. I am also the maintainer for another
> vi clone, "xvi".
> Is that enough for you?

There are a couple of ways that question could be read. I think
Andy meant it as, "Do you use vi and not Vim?", and I think you
took it as, "Do you know how to use vi?"

Regards,
Gary

Andy Wokula

Jan 7, 2013, 5:30:12 AM
to vim...@googlegroups.com, martinwguy
Yep, I meant the former.

Vim added the backslash for escaping within a collection, but it does so
inconsistently. This has nothing to do with Vi.

--
Andy

martinwguy

Jan 7, 2013, 3:11:30 PM
to Andy Wokula, vim...@googlegroups.com
> Vim added the backslash for escaping within a collection, but it does so
> inconsistently. This has nothing to do with Vi.

OK, my bad. I didn't know that vim was becoming deliberately
non-backward-compatible with standard vi.

Good luck resolving this issue in whatever way you think best

M