Summer of Code: Regexp


Andrei Aiordachioaie

Mar 20, 2008, 6:46:50 AM
to vim_dev
Hello,

I'm a final-year student from Jacobs University Bremen, studying
Computer Science. I am interested in improving the regular expression
code for Vim, as part of the Summer of Code, and maybe even
afterwards. The following are what I could do for the summer. Do you
think it's too little or too much?

I understand that a new regexp engine has been written during SoC last
year, but it is not officially included in vim. I would like to do the
necessary testing and integrate the new engine in the main code. So
far, I was only able to download the archives that the 2 students
submitted. Is there an existing svn repo that includes the vim code
and the new engine?

An idea that sounds really interesting is using approximate regular
expressions to find similar words. There are a number of projects that
have already implemented it, such as Agrep, libTre, or lib-bitap. Do
you think that we could use one of the existing libraries for the
algorithm itself ? Or would we have to reimplement it?
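
For a feel of what approximate matching involves, here is a minimal sketch in Python (not Vim's C). It is neither Agrep's nor TRE's implementation: it uses the classic dynamic-programming formulation of approximate substring search, handles literal patterns only (no regexp operators), and the name `fuzzy_find` is made up for illustration.

```python
def fuzzy_find(pattern, text, max_errors):
    """Find the best approximate occurrence of a literal pattern in text.

    Returns (end_index, errors), where end_index is the 1-based position
    in text at which the match ends, or None if no substring of text is
    within max_errors edits (insert, delete, substitute) of the pattern.
    """
    m = len(pattern)
    # column[i] = fewest edits needed to turn pattern[:i] into some
    # substring of text that ends at the current text position.
    column = list(range(m + 1))
    best = None
    for j, ch in enumerate(text, start=1):
        prev_diag = column[0]   # column[i-1] at the previous text position
        column[0] = 0           # a match may start anywhere in the text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(prev_diag + cost,    # match or substitution
                      column[i] + 1,       # extra character in the text
                      column[i - 1] + 1)   # pattern character missing
            prev_diag = column[i]
            column[i] = new
        # Keep the first position with the fewest errors.
        if column[m] <= max_errors and (best is None or column[m] < best[1]):
            best = (j, column[m])
    return best


print(fuzzy_find("regexp", "the regxp engine", 1))
```

This runs in time proportional to len(pattern) * len(text), which is the cost that bit-parallel approaches such as bitap are designed to reduce.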

Also, regexps in vim look hard to read, because of the many escapes
that have to be used by default. Maybe we should consider enabling the
"very magic" in the global configuration file, when vim gets
installed ? This would make vim regexps friendlier for newbies :-)

What exactly is the T-search algorithm, for searching strings ? I
tried googling for "C't article, August 1997", but I don't think there
was anything relevant. Can someone point me in the right direction?

There are a number of small things or bugs related to regexps that I
could help with, such as
Regexp: matchlist('12a4aaa', '^\(.\{-}\)\(\%5c\@<=a\+\)\(.\+\)\?')
returns ['12a4', 'aaa', '4aaa'], but should be ['12a4', 'aaa', '']
or
Recognize "[a-z]", "[0-9]", etc. and replace them with the faster
"\l" and "\d".
or
allowing ":23,45/pat/flags" to search for "pat" in lines 23 to 45?
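
The second item (replacing simple bracket expressions with character-class atoms) could look roughly like the sketch below. It is written in Python for brevity and works on the pattern text; the real change would live in Vim's C regexp compiler. The table and the function name `simplify_classes` are assumptions for illustration. The sketch skips any backslash-escaped character, ignores multi-character atoms such as `\%[`, and does not consider how options like 'ignorecase' interact with the rewrite.

```python
# Bracket expressions that have an equivalent Vim character-class atom.
CLASS_SHORTCUTS = {
    "[0-9]": r"\d",
    "[a-z]": r"\l",
    "[A-Z]": r"\u",
    "[A-Za-z]": r"\a",
    "[a-zA-Z]": r"\a",
}


def simplify_classes(pattern):
    """Rewrite simple bracket expressions in a Vim pattern to class atoms."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            # Copy an escaped character untouched, so "\[" stays literal.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        for text, atom in CLASS_SHORTCUTS.items():
            if pattern.startswith(text, i):
                out.append(atom)
                i += len(text)
                break
        else:
            # No shortcut starts here: copy one character as-is.
            out.append(pattern[i])
            i += 1
    return "".join(out)


print(simplify_classes(r"[a-z]\+[0-9]*"))
```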

Do you have any other ideas or suggestions?

Cheers,
Andrei

Jürgen Krämer

Mar 20, 2008, 7:01:54 AM
to vim...@googlegroups.com

Hi,

Andrei Aiordachioaie wrote:
>
> What exactly is the T-search algorithm, for searching strings ? I
> tried googling for "C't article, August 1997", but I don't think there
> was anything relevant. Can someone point me in the right direction?

c't is a German computer magazine. The article you have been looking for
is "Blitzfindig -- Texte schnell durchsuchen mit T-Search" and was
printed in c't 8/1997, pages 292ff.

If you understand German you can buy it on their web site: go to
<http://www.heise.de/ct/inhverz/suche> and enter "T-Search" in the
search box.

Regards,
Jürgen

--
Sometimes I think the surest sign that intelligent life exists elsewhere
in the universe is that none of it has tried to contact us. (Calvin)

Bram Moolenaar

Mar 20, 2008, 8:30:04 AM
to Andrei Aiordachioaie, vim_dev

Andrei Aiordachioaie wrote:

> I'm a final-year student from Jacobs University Bremen, studying
> Computer Science. I am interested in improving the regular expression
> code for Vim, as part of the Summer of Code, and maybe even
> afterwards. The following are what I could do for the summer. Do you
> think it's too little or too much?
>
> I understand that a new regexp engine has been written during SoC last
> year, but it is not officially included in vim. I would like to do the
> necessary testing and integrate the new engine in the main code. So
> far, I was only able to download the archives that the 2 students
> submitted. Is there an existing svn repo that includes the vim code
> and the new engine?

Sounds good. Look here for the work done so far:
http://code.google.com/p/vim-soc-regexp/

> An idea that sounds really interesting is using approximate regular
> expressions to find similar words. There are a number of projects that
> have already implemented it, such as Agrep, libTre, or lib-bitap. Do
> you think that we could use one of the existing libraries for the
> algorithm itself ? Or would we have to reimplement it?

Let's do the fast regexp work first. It's easy to underestimate how
much work this stuff is.

> Also, regexps in vim look hard to read, because of the many escapes
> that have to be used by default. Maybe we should consider enabling the
> "very magic" in the global configuration file, when vim gets
> installed ? This would make vim regexps friendlier for newbies :-)

No, because this breaks Vim script portability. Currently there is the
'magic' option, and that is a problem already. Most scripts assume it's
on, thus if you switch it off lots of things will fall down.

> What exactly is the T-search algorithm, for searching strings ? I
> tried googling for "C't article, August 1997", but I don't think there
> was anything relevant. Can someone point me in the right direction?

I don't have a reference at hand...

> There are a number of small things or bugs related to regexps that I
> could help with, such as
> Regexp: matchlist('12a4aaa', '^\(.\{-}\)\(\%5c\@<=a\+\)\(.\+\)\?')
> returns ['12a4', 'aaa', '4aaa'], but should be ['12a4', 'aaa', '']
> or
> Recognize "[a-z]", "[0-9]", etc. and replace them with the faster
> "\l" and "\d".
> or
> allowing ":23,45/pat/flags" to search for "pat" in lines 23 to 45?
>
> Do you have any other ideas or suggestions?

These are also nice, but the main work is to get the fast regexp engine
included. I'm sure this will take up the two months that a student has
available.

--
hundred-and-one symptoms of being an internet addict:
161. You get up before the sun rises to check your e-mail, and you
find yourself in the very same chair long after the sun has set.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Antony Scriven

Mar 20, 2008, 10:46:32 AM
to vim...@googlegroups.com
Hi

On 20/03/2008, Bram Moolenaar <Br...@moolenaar.net> wrote:

> Andrei Aiordachioaie wrote:
>
> [...]


>
>
> > Also, regexps in vim look hard to read, because of the
> > many escapes that have to be used by default. Maybe we
> > should consider enabling the "very magic" in the global
> > configuration file, when vim gets installed ? This
> > would make vim regexps friendlier for newbies :-)
>
> No, because this breaks Vim script portability. Currently
> there is the 'magic' option, and that is a problem
> already. Most scripts assume it's on, thus if you switch
> it off lots of things will fall down.

Isn't there any way to get a script to run in its own
environment, unaffected by a user's settings? --Antony

Charles E Campbell Jr

Mar 20, 2008, 12:20:02 PM
to vim...@googlegroups.com
Antony Scriven wrote:
>
> Isn't there any way to get a script to run in its own
> environment, unaffected by a user's settings? --Antony
>
Not really. Sometimes a script can be independent (i.e., it opens its own
window/buffer/playground), but often it needs to work with the user's
preferences. Think about the matchparen.vim plugin that comes with vim;
folks would rightfully be upset if it changed their preferences to suit
the plugin. Consider some of my plugins: Drawit and Align. Both of
those do things in the user's buffer, not their own buffer. Align's
regular expression use should probably be controlled by the user's magic
option, for example.

Now, it might be helpful if there was an environment stack and some
functions:

PushEnvironment(env)
StdEnvironment(env)
PopEnvironment()

where "env" would allow one to indicate screen position, options, maps,
window layout, visible buffers, registers, current working directory,
alternate buffer, marks, and/or global variables (I'm thinking of a
bitmask sort of thing, or something using Lists). I probably forgot
something! Some of this functionality is present (winsaveview(),
winrestview(), getpos(), setpos(), etc).
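
As a rough illustration of the push/pop idea, here is a minimal sketch in Python. It models options as a plain dictionary. The class and method names are hypothetical and none of this corresponds to an existing Vim function. A real version would also have to cover maps, marks, registers and the other state listed above.

```python
class EnvironmentStack:
    """Save and restore selected settings in last-in, first-out order."""

    def __init__(self, options):
        self.options = options   # the live settings (here: a plain dict)
        self._saved = []         # stack of snapshots

    def push(self, names):
        """Remember the current values of the named options."""
        self._saved.append({name: self.options[name] for name in names})

    def pop(self):
        """Restore the most recently saved values."""
        self.options.update(self._saved.pop())


# The user has switched 'magic' off; a plugin needs it on for a while.
user_options = {"magic": False, "ignorecase": True}
env = EnvironmentStack(user_options)

env.push(["magic"])
user_options["magic"] = True     # plugin runs with the setting it expects
env.pop()                        # the user's preference is back

print(user_options)
```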

However, given that vim.sf.net has a lot of plugins, it's likely that
even if these commands were implemented, many if not most plugins
would not use them.

In the interim: my cecutil.vim package has the ability to save/restore
marks, save/restore window/cursor positioning, and save/restore user
maps. I'm sure there are others with similar solutions (Hari,
undoubtedly, who has *tons* of utilities!). Also, there's my
pluginkiller plugin -- useful for testing plugins against a lot of
options that I've found often have deleterious effects on plugins.

Regards,
Chip Campbell

Antony Scriven

Mar 20, 2008, 1:12:36 PM
to vim...@googlegroups.com
On 20/03/2008, Charles E Campbell Jr <drc...@campbellfamily.biz> wrote:

> Antony Scriven wrote:
> >
> > Isn't there any way to get a script to run in its own
> > environment, unaffected by a user's settings? --Antony
>

> Not really. [...]

Sorry, I expressed myself badly. I meant changing Vim rather
than using current features or writing a script.

> Now, it might be helpful if there was an environment
> stack and some functions:
>
> PushEnvironment(env)
> StdEnvironment(env)
> PopEnvironment()

Right, that's the kind of thing I had in mind, but built
into Vim.

> [...]


>
> However, given that vim.sf.net has a lot of plugins, it's
> likely that even if these commands were implemented,
> many if not most plugins would not use them.

I don't follow your argument. If a useful feature is
implemented, why wouldn't script writers take it up? E.g.
@Spell or lists and dictionaries. I expect there would be
resistance to adopting someone else's VimL library, but
built-in functions are another matter.

> In the interim: my cecutil.vim package has the ability to
> save/restore marks, save/restore window/cursor
> positioning, and save/restore user maps. I'm sure there
> are others with similar solutions (Hari, undoubtedly, who
> has *tons* of utilities!). Also, there's my pluginkiller
> plugin -- useful for testing plugins against a lot of
> options that I've found often have deleterious effects on
> plugins.

I've no doubt that all this is useful, but shouldn't there
be a more graceful built-in system for handling these
problems? Or is the general consensus that the current
situation is adequate? Writing a simple script is easy, but
I find that writing a robust script is perhaps harder than
it could be.

To get back to the original question of creating
a 'verymagic' option: Bram, if there was a way to isolate
scripts from a user's settings, couldn't this option then
become viable? --Antony

Matthew Winn

Mar 21, 2008, 4:43:14 AM
to vim...@vim.org
On Thu, 20 Mar 2008 03:46:50 -0700 (PDT), Andrei Aiordachioaie
<andre...@gmail.com> wrote:

> Also, regexps in vim look hard to read, because of the many escapes
> that have to be used by default. Maybe we should consider enabling the
> "very magic" in the global configuration file, when vim gets
> installed ? This would make vim regexps friendlier for newbies :-)

The point of Vim's (vi's) regular expressions is to make them quicker
to type interactively, which is how most regular expressions are used.
For example, you're more likely to want to search for a literal ( than
to want grouping, so by default ( is a literal character and \( is a
metacharacter. On the other hand * rarely occurs in text, so * is a
metacharacter and \* is a literal asterisk. The idea is that in the
most common situations the bare character does what you want and the
backslashed character has the less common meaning. Speed of typing
takes precedence over consistency.

Contrast this with Perl, where regular expressions are built into the
language. There you write regular expressions once and then refer back
to them every time you edit the code, so consistency of syntax is more
important than speed of typing. In Perl a bare character is literal
if it's a letter or digit and special if it's punctuation, while a
backslashed character is special if it's a letter or digit and literal
if it's punctuation. That works well for Perl, but a strategy that
makes sense for a programming language doesn't necessarily make sense
for an editor.
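
The Perl convention can be tried out with Python's `re` module, which follows the same rule for punctuation and letters. The sketch below is only an illustration of that rule; the helper `first_match` is made up, and the Vim equivalents in the comments assume Vim's default 'magic' setting.

```python
import re


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


text = "abab (ab)+"

# Bare punctuation is special: ( ) groups and + repeats.
# With Vim's defaults the same search is written \(ab\)\+
grouped = first_match(r"(ab)+", text)

# Backslashed punctuation is literal.
# With Vim's defaults the same search is written (ab)+
literal = first_match(r"\(ab\)\+", text)

# Bare letters are literal, backslashed letters are special (\d = digit).
digit = first_match(r"d\d", "add d7")

print(grouped, literal, digit)
```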

--
Matthew Winn

Erik Falor

Mar 21, 2008, 11:14:45 AM
to vim...@googlegroups.com
Thank you for a very good explanation of why Vim's regexps are the way they are.
It will help me remember when I need to escape a character.


--
Erik Falor
Registered Linux User #445632 http://counter.li.org

John Little

Mar 23, 2008, 7:38:37 PM
to vim_dev


Andrei Aiordachioaie <andrei6...@gmail.com> wrote:
> Do you have any other ideas or suggestions?
and later said:
> Also, regexps in vim look hard to read...

Matthew Winn added:
> The point of Vim's (vi's) regular expressions is to make them quicker
> to type interactively, which is how most regular expressions are used.
...
> Contrast this with Perl, where regular expressions are built into the
> language...a strategy that makes sense for a programming language
> doesn't necessarily make sense for an editor.

I really like Perl's x modifier on regexes (which makes whitespace and
comments insignificant in the regex, allowing it to be spread over
several lines with indentation and comments). I wonder if something
similar would work in Vim, being intended for Vim scripts rather than
interactive use.
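
For readers who have not used the x modifier, Python's `re.VERBOSE` flag is its counterpart and shows the effect. This is only an illustration of the feature in another language, not a proposed Vim syntax, and the pattern and group names are invented for the example.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a
# comment, so the pattern can be laid out and documented like code.
DATE = re.compile(r"""
    (?P<year>  \d{4} )  -   # four-digit year
    (?P<month> \d{2} )  -   # two-digit month
    (?P<day>   \d{2} )      # two-digit day
""", re.VERBOSE)

match = DATE.search("sent on 2008-03-23 to vim_dev")
print(match.groupdict())
```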

--
John Little

Dasn

Mar 21, 2008, 6:53:53 PM
to vim...@googlegroups.com

That point is arguable. IIRC, elvis uses Perl-style regexes.
What about grep, sed, awk, python...? They all use broadly similar
regexes, but each has its own reasons for its particular syntax.
So I think the point is that, at the beginning of a project, most free
software developers tend to be a little 'selfish': they write software
to solve their own problems rather than considering how others will use
it. As development continues, the project cannot shed the burden of its
own history, for compatibility reasons.

--
Dasn

Andrei Aiordachioaie

Mar 28, 2008, 6:53:34 AM
to vim_dev


On Mar 20, 1:30 pm, Bram Moolenaar <B...@moolenaar.net> wrote:
>
> Let's do the fast regexp work first. It's easy to underestimate how
> much work this stuff is.

I looked at the updated regexp code that Xiaozhou Liu has maintained,
and it looks a lot closer to being included. The problems I see so far
with the new engine are:
- the three test cases that fail, but of course there may be more bugs
- compatibility with the old engine.

From what I've seen of the test cases, it seems that the NFA
implementation is not greedy, as it should be. I will look more into
it.

So for the project, I want to extend the test-suite to compare the way
regexps are handled in the old vs the new engine. Maybe this uncovers
other bugs. Then, the largest portion of the project would be fixing
the found bugs. And if that takes little time, I could work on the old
regexp engine bugs. Do you have any other ideas? Would this be enough
for a 2.5 months project?

The todo list mentions using regexp search in the gtk find&replace
dialog. That might also deserve some attention, though I imagine it's
pretty straightforward.

Cheers,
Andrei

Ben Schmidt

Mar 28, 2008, 8:05:42 AM
to vim...@googlegroups.com
> The point of Vim's (vi's) regular expressions is to make them quicker
> to type interactively, which is how most regular expressions are used.
> For example, you're more likely to want to search for a literal ( than
> to want grouping, so by default ( is a literal character and \( is a
> metacharacter. On the other hand * rarely occurs in text, so * is a
> metacharacter and \* is a literal asterisk. The idea is that in the
> most common situations the bare character does what you want and the
> backslashed character has the less common meaning. Speed of typing
> takes precedence over consistency.

I'm not sure this stands up to scrutiny. Is searching for text containing literal
plus signs really more common than wanting to search for one or more of a certain
atom? Do people search for dollar signs so rarely that it is better for it to
represent end of line by default? Are pipe characters searched for more often than
alternatives?! Are braces and parentheses so common yet square brackets so rare?

I think a better explanation might be backward compatibility. The story may go
something like this: In the beginning, there was the simple regex, supporting just
., *, ^, $, but once scripts were written with just these, and therefore assuming
other punctuation matched itself, the regex language could only be extended by
escaping the new characters used or by breaking stuff, and people opted for the
escapes.

Newer designs, such as Perl, don't suffer from this restriction. Neither would a
'new' vi. (Personally I would be quite a fan of one: a rewrite that does away with
the inconsistencies and oddities that result from compatibility issues and a
sprawling codebase, with a well-designed, consistent and powerful scripting
language, consistent operator behaviour, etc.) Because an editor is different
from a scripting language, I doubt its regex language would be the same as
Perl's, but I don't think it would be the same as Vim's either.

Smiles,

Ben.

Bram Moolenaar

Mar 29, 2008, 8:44:16 AM
to Andrei Aiordachioaie, vim_dev

Andrei Aiordachioaie wrote:

> On Mar 20, 1:30 pm, Bram Moolenaar <B...@moolenaar.net> wrote:
> >
> > Let's do the fast regexp work first. It's easy to underestimate how
> > much work this stuff is.
>
> I looked at the updated regexp code that Xiaozhou Liu has maintained,
> and it looks a lot closer to being included. The problems I see so far
> with the new engine are:
> - the three test cases that fail, but of course there may be more bugs
> - compatibility with the old engine.
>
> From what I've seen of the test cases, it seems that the NFA
> implementation is not greedy, as it should be. I will look more into
> it.
>
> So for the project, I want to extend the test-suite to compare the way
> regexps are handled in the old vs the new engine. Maybe this uncovers
> other bugs. Then, the largest portion of the project would be fixing
> the found bugs. And if that takes little time, I could work on the old
> regexp engine bugs. Do you have any other ideas? Would this be enough
> for a 2.5 months project?

Another big task is to merge the code, removing things that were
duplicated. My current idea is to first move everything into regexp.c,
then remove the duplicated stuff, then clean it up and perhaps move the
two engines to separate files. This should be done in small steps,
making sure everything still works after each step.

There are currently quite a few variables global to regexp.c, which
makes this difficult. One can't simply make them local, because passing
them around in function calls would decrease performance.

> The todo list mentions using regexp search in the gtk find&replace
> dialog. That might also deserve some attention, though I imagine it's
> pretty straightforward.

I would call that a separate task. The regexp task should better try to
improve the regexp code itself, not the many places where it is used.
The only exception is that the interface should be changed to allow for
two results: just checking if there is a match (can be done much quicker
with DFA) and figuring out exactly what text is matched (including sub
matches). For a Vim script line "if a =~ pattern" we only need the
first. For a ":s" command we need the second.
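
The shape of such a two-level interface can be sketched as follows. This is Python rather than Vim's C, both function names are hypothetical, and Python's `re` is a backtracking engine, so the sketch shows only the split in the interface and not the speed difference a DFA would give for the yes/no case.

```python
import re


def has_match(pattern, text):
    """Cheap question: is there a match at all? (enough for 'a =~ pattern')"""
    return re.search(pattern, text) is not None


def find_submatches(pattern, text):
    """Expensive question: what exactly matched? (needed for ':s')

    Returns [whole match, group 1, group 2, ...] with None for groups
    that did not take part in the match, or None if nothing matched.
    """
    m = re.search(pattern, text)
    if m is None:
        return None
    return [m.group(0), *m.groups()]


print(has_match(r"a+", "baab"))
print(find_submatches(r"(a+)(b)?", "caab"))
```

Callers that only need the boolean never pay for recording sub-match positions, which is what would let an engine pick a faster strategy for them.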

--
hundred-and-one symptoms of being an internet addict:

178. You look for an icon to double-click to open your bedroom window.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///

Xiaozhou Liu

Mar 30, 2008, 1:55:14 AM
to vim...@googlegroups.com
On Sat, Mar 29, 2008 at 8:44 PM, Bram Moolenaar <Br...@moolenaar.net> wrote:
>
>
> Andrei Aiordachioaie wrote:
>
> > On Mar 20, 1:30 pm, Bram Moolenaar <B...@moolenaar.net> wrote:
> > >
> > > Let's do the fast regexp work first. It's easy to underestimate how
> > > much work this stuff is.
> >
> > I looked at the updated regexp code that Xiaozhou Liu has maintained,
> > and it looks a lot closer to being included. The problems I see so far
> > with the new engine are:
> > - the three test cases that fail, but of course there may be more bugs
> > - compatibility with the old engine.
> >
> > From what I've seen of the test cases, it seems that the NFA
> > implementation is not greedy, as it should be. I will look more into
> > it.
> >
> > So for the project, I want to extend the test-suite to compare the way
> > regexps are handled in the old vs the new engine. Maybe this uncovers
> > other bugs. Then, the largest portion of the project would be fixing
> > the found bugs. And if that takes little time, I could work on the old
> > regexp engine bugs. Do you have any other ideas? Would this be enough
> > for a 2.5 months project?
>
> Another big task is to merge the code, removing things that were
> duplicated. My current idea is to first move everything into regexp.c,
> then remove the duplicated stuff,

I have done this. Everything is in regexp.c and there is no duplication.

> then clean it up and perhaps move the
> two engines to separate files. This should be done in small steps,
> making sure everything still works after each step.

I think it is difficult to move the two engines to separate files because
they have a lot to share. It's better to keep them in regexp.c.

Xiaozhou

Bram Moolenaar

Mar 30, 2008, 8:17:33 AM
to Xiaozhou Liu, vim...@googlegroups.com

Xiaozhou Liu wrote:

> On Sat, Mar 29, 2008 at 8:44 PM, Bram Moolenaar <Br...@moolenaar.net> wrote:
> > Andrei Aiordachioaie wrote:
> >
> > > On Mar 20, 1:30 pm, Bram Moolenaar <B...@moolenaar.net> wrote:
> > > >
> > > > Let's do the fast regexp work first. It's easy to underestimate how
> > > > much work this stuff is.
> > >
> > > I looked at the updated regexp code that Xiaozhou Liu has maintained,
> > > and it looks a lot closer to being included. The problems I see so far
> > > with the new engine are:
> > > - the three test cases that fail, but of course there may be more bugs
> > > - compatibility with the old engine.
> > >
> > > > From what I've seen of the test cases, it seems that the NFA
> > > implementation is not greedy, as it should be. I will look more into
> > > it.
> > >
> > > So for the project, I want to extend the test-suite to compare the way
> > > regexps are handled in the old vs the new engine. Maybe this uncovers
> > > other bugs. Then, the largest portion of the project would be fixing
> > > the found bugs. And if that takes little time, I could work on the old
> > > regexp engine bugs. Do you have any other ideas? Would this be enough
> > > for a 2.5 months project?
> >
> > Another big task is to merge the code, removing things that were
> > duplicated. My current idea is to first move everything into regexp.c,
> > then remove the duplicated stuff,
>
> I have done this. Everything is in regexp.c and there is no duplication.

Ah, great. I should look at the code again.

> > then clean it up and perhaps move the
> > two engines to separate files. This should be done in small steps,
> > making sure everything still works after each step.
>
> I think it is difficult to move the two engines to separate files because
> they have a lot to share. It's better to keep them in regexp.c.

It's not a requirement. If moving engines to a separate file makes the
code more complicated or less efficient, then it can stay in one big
file.

Anyway, perhaps you can help mentoring the student who is going to work
on the regexp code? Your experience will help a lot.

--
hundred-and-one symptoms of being an internet addict:

190. You quickly hand over your wallet, leather jacket, and car keys
during a mugging, then proceed to beat the crap out of your
assailant when he asks for your laptop.

Ian Young

Mar 30, 2008, 6:54:43 PM
to vim...@googlegroups.com
Sorry to get back to you so late - here's what I can offer:

As far as I'm aware, the code in the vim71-ian branch of the
repository contains almost all of the stable work done by both myself
and Xiaozhou, so that's the best place to look. There's a bunch of
testing code in that branch as well, but it isn't all documented
(sorry). The tools I've been using are vgrep, regtest, and the
run_tests shell script (found in reg_test/). Xiaozhou also wrote up a
test file for use with 'make test', but I'm not well acquainted with
its contents.

On Fri, Mar 28, 2008 at 5:53 AM, Andrei Aiordachioaie
<andre...@gmail.com> wrote:
>
> From what I've seen of the test cases, it seems that the NFA
> implementation is not greedy, as it should be. I will look more into
> it.

It's greedy in its own way: IIRC, leftmost-first, with the exception
of ordered alternation (see
http://groups.google.com/group/vim_dev/browse_thread/thread/9db490f9c4297c8e
for a discussion of that feature).

> So for the project, I want to extend the test-suite to compare the way
> regexps are handled in the old vs the new engine. Maybe this uncovers
> other bugs. Then, the largest portion of the project would be fixing
> the found bugs. And if that takes little time, I could work on the old
> regexp engine bugs.

The largest batch of test cases is in reg_test/files/basic.dat, which
can be run with "./regtest --engine=nfa reg_test/files/basic.dat".

This file has been modified so all tests succeed with the old vim
matching engine. So the failures there represent the differences
between the old and new engines. The --engine=[nfa,bt] flag on
regtest and vgrep controls which engine is used, so you can compare
easily. There are a few lingering bugs to be ironed out, but it seems
like we're pretty close to a correct engine - more of the work will
probably go into making it faster.
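
The comparison idea (run the same cases through both engines and report only the disagreements) can be sketched like this. It is Python with stand-in engines built on the `re` module, not the actual regtest tool or the basic.dat format. The function names and the deliberately different "candidate" engine are assumptions for illustration.

```python
import re


def compare_engines(cases, engine_a, engine_b):
    """Run every (pattern, text) case through both engines.

    Returns a list of (pattern, text, result_a, result_b) tuples for the
    cases where the two engines disagree.
    """
    differences = []
    for pattern, text in cases:
        result_a = engine_a(pattern, text)
        result_b = engine_b(pattern, text)
        if result_a != result_b:
            differences.append((pattern, text, result_a, result_b))
    return differences


def reference_engine(pattern, text):
    """Stand-in for the old engine: plain search, matched text or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


def candidate_engine(pattern, text):
    """Stand-in for the new engine, with a deliberate bug: ignores case."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(0) if m else None


cases = [("a+", "xaaa"), ("abc", "xABC"), ("b*c", "abbc")]
for diff in compare_engines(cases, reference_engine, candidate_engine):
    print(diff)
```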

> Do you have any other ideas? Would this be enough
> for a 2.5 months project?

Here's what I wrote to another student who enquired about the project:

"The short answer is yes, there's more work to be done by another
student. I've been slowly working on fixing a few lingering problems
in the code we wrote last summer (thus the commits you saw). The code
is very close to running correctly. However, it's not super fast at
this point, so one big project might be optimizing the new code so
that it is more comparable to the speed of the old engine on
non-pathological cases. There are also some more features that would
be great to add (off the top of my head, a couple are multibyte
characters and the \{n,m} construct). And of course, there's a
non-trivial amount of work in just preparing the code for inclusion in
Vim's source. I just haven't found the time this semester to do as
much as I had hoped, so again, yes, I think another summer on this
project would prove fruitful. If you'd like a better idea of where
development left off, I suggest poking through the archives of the
group we used at <http://groups.google.com/group/vim-soc-regexp>. The
last couple commits I've made are not yet documented, so don't worry
too much about those for the moment."

Hope all this helps,
Ian

Ian Young

Mar 30, 2008, 7:03:43 PM
to vim...@googlegroups.com
On Sun, Mar 30, 2008 at 5:54 PM, Ian Young <ian.gr...@gmail.com> wrote:
> Sorry to get back to you so late - here's what I can offer:
>
> As far as I'm aware, the code in the vim71-ian branch of the
> repository contains almost all of the stable work done by both myself
> and Xiaozhou, so that's the best place to look.

Correction: I just discovered that Xiaozhou has made some commits to
the 'vim71' branch (i.e. the trunk) that I have missed - I've been
watching the 'vim71-sbboat' branch for changes and so didn't see
these. It looks like he's done some much-needed code cleanup, but I'm
not sure how the feature set / bugfixes compare to what's in
'vim71-ian'.

Ian

Andrei Aiordachioaie

Mar 31, 2008, 10:04:25 AM
to vim_dev


On Mar 31, 12:54 am, "Ian Young" <ian.greenl...@gmail.com> wrote:
> Sorry to get back to you so late - here's what I can offer:
>
> As far as I'm aware, the code in the vim71-ian branch of the
> repository contains almost all of the stable work done by both myself
> and Xiaozhou, so that's the best place to look. There's a bunch of
> testing code in that branch as well, but it isn't all documented
> (sorry). The tools I've been using are vgrep, regtest, and the
> run_tests shell script (found in reg_test/). Xiaozhou also wrote up a
> test file for use with 'make test', but I'm not well acquainted with
> its contents.
>
> On Fri, Mar 28, 2008 at 5:53 AM, Andrei Aiordachioaie <andrei6...@gmail.com> wrote:
>
> > From what I've seen of the test cases, it seems that the NFA
> > implementation is not greedy, as it should be. I will look more into
> > it.
>
> It's greedy in its own way: IIRC, leftmost-first, with the exception
> of ordered alternation (see http://groups.google.com/group/vim_dev/browse_thread/thread/9db490f9c...

Thanks a lot for your reply. Running regtest with the NFA engine
crashes for me right at the first test. The old engine passes all
tests though. I'll try to find a way to include them in the main
testing suite, along with the new engine.

Cheers,
Andrei