regexp: Windows filename recognition


Ni Va

Oct 9, 2019, 5:42:46 AM
to vim_use
Hi,

Here is the kind of filename (shown in bold) that I would like to match:

  $FOOBBBAR_Foooofbar_f_oobar_(2019-07-29) - Copie.zed.lnk                                                  232.0 KiB     2019

Thank you

aro...@vex.net

Oct 9, 2019, 8:09:25 AM
to vim...@googlegroups.com
> Here is the kind of filename (shown in bold) that I would like to match:
>
> * $FOOBBBAR_Foooofbar_f_oobar_(2019-07-29) - Copie.zed.lnk*
> 232.0 KiB 2019

There's a practically infinite universe of expressions that could be made
to match it. What's distinctive about it?

It might help if you could provide an example of something that you don't
want to match, and point out what discriminates the match vs non-match.

Ni Va

Oct 9, 2019, 8:16:05 AM
to vim_use
I want to match any characters that can appear in a Windows filename, starting at the third character of the line and ending before a long run of spaces.

The filename characters are \w, \s, -, _ and ., repeated many times:

^..filename                                    some other chars.*$

Thank you

Andy Wokula

Oct 9, 2019, 12:19:24 PM
to vim...@googlegroups.com
On 09.10.2019 at 14:16, Ni Va wrote:
> I want to match any characters that can appear in a Windows filename,
> starting at the third character of the line and ending before a long
> run of spaces.
>
> The filename characters are \w, \s, -, _ and ., repeated many times:
>
> ^..filename some other chars.*$

/^..\zs.\+$

List of false positives:
XXX

--
Andy

Ni Va

Oct 10, 2019, 4:44:42 AM
to vim_use
I don't understand why the result still contains the first 2 chars in this example:

+ 20191009_191004_Vim.8.1.2125/                                                                                              4.0 KiB [D] 2019-10-
echo substitute(getline(line('.')),'^..\zs\(.\+\)\(\s\+\d\+\.\d\+\)\@=.*$','\1', "") returns + 20191009_191004_Vim.8.1.2125/

Thank you for your help.
NiVa
examples.txt

Andy Wokula

Oct 10, 2019, 11:19:25 AM
to vim...@googlegroups.com
On 10.10.2019 at 10:44, Ni Va wrote:
> I don't understand why the result still contains the first 2 chars in this example:
>
> + 20191009_191004_Vim.8.1.2125/ 4.0 KiB [D] 2019-10-
> echo substitute(getline(line('.')),'^..\zs\(.\+\)\(\s\+\d\+\.\d\+\)\@=.*$','\1', "") returns + 20191009_191004_Vim.8.1.2125/

The match starts after `\zs'; what comes before it is not part of the match, so it is not substituted.
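
To make the effect of `\zs' concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and do not come from the listing in this thread.

```vim
" '+ ' lies before \zs, so it is outside the match and is kept as-is.
let g:with_zs = substitute('+ notes.txt', '^..\zs\(.\+\)$', '[\1]', '')

" Without \zs the two leading characters are matched too, so they are replaced.
let g:without_zs = substitute('+ notes.txt', '^..\(.\+\)$', '[\1]', '')

" Shows: + [notes.txt]
echo g:with_zs
" Shows: [notes.txt]
echo g:without_zs
```

So if only the captured name should remain, the pattern has to match the whole line, which is why the version further down drops `\zs'.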

I wonder why the greedy \(.\+\) does not include any spaces.
It looks as if \(\s\+\d\+\.\d\+\)\@= is checked before any backtracking takes place.
Whether there is "no backtracking" actually depends on the regexp engine in use ('regexpengine', re=0 or re=2).

:echo substitute(getline('.'), '^..\(.\{-}\)\s\+\d\+\.\d\+.\{,20}$', '\1', '')
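
As a sketch of how that pattern behaves on the two sample lines from this thread, the same call can be wrapped in a function. ExtractName and the g:sample_* variables are hypothetical names, and the number of padding spaces is an assumption, since the exact column widths of the listing are not known.

```vim
" Hypothetical helper: return the filename between the 2-char prefix and the size column.
function! ExtractName(line) abort
  " \{-} is non-greedy, so the group stops at the first run of spaces
  " that is followed by a number such as 4.0 or 232.0.
  return substitute(a:line, '^..\(.\{-}\)\s\+\d\+\.\d\+.\{,20}$', '\1', '')
endfunction

" Sample lines rebuilt from the thread; the padding width (40) is assumed.
let g:sample_dir = '+ 20191009_191004_Vim.8.1.2125/' . repeat(' ', 40) . '4.0 KiB [D] 2019-10-'
let g:sample_lnk = '  $FOOBBBAR_Foooofbar_f_oobar_(2019-07-29) - Copie.zed.lnk' . repeat(' ', 40) . '232.0 KiB     2019'

" Shows: 20191009_191004_Vim.8.1.2125/
echo ExtractName(g:sample_dir)
" Shows: $FOOBBBAR_Foooofbar_f_oobar_(2019-07-29) - Copie.zed.lnk
echo ExtractName(g:sample_lnk)
```

The single spaces inside the second name (around the dash) are not a problem, because none of them is followed by a number of the form digits, dot, digits.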

--
Andy

Ni Va

Oct 11, 2019, 9:27:05 AM
to vim_use


It works well once .*$ is added, so that the whole line is substituted and only the first back-reference is kept.
I needed to add .*$ to ^..\(.\{-}\)\s\+\d\+\.\d\+.\{,20}
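
A minimal sketch of one case where the ending matters, assuming a line with more than 20 characters after the size. The line and the variable names below are made up for illustration. With the bounded `.\{,20}$` ending the pattern cannot reach the end of the line, so substitute() finds no match and returns the line unchanged.

```vim
" Made-up line: 34 characters follow the size '4.0'.
let g:long_line = '+ notes.txt' . repeat(' ', 10) . '4.0 KiB [D] 2019-10-09 12:34:56 extra'

" Bounded tail: no match, so the input comes back unchanged.
let g:bounded = substitute(g:long_line, '^..\(.\{-}\)\s\+\d\+\.\d\+.\{,20}$', '\1', '')

" Open-ended tail: the whole line matches and only the name remains.
let g:open_ended = substitute(g:long_line, '^..\(.\{-}\)\s\+\d\+\.\d\+.*$', '\1', '')

" Shows 1: the bounded variant left the line untouched.
echo g:bounded ==# g:long_line
" Shows: notes.txt
echo g:open_ended
```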


Thank you.
Sans titre.png