Diff filler broken with non-standard but valid diffexpr input

300 views
Skip to first unread message

Benjamin Fritz

unread,
Jun 9, 2012, 12:04:17 AM6/9/12
to vim_dev
I've been working on a plugin to provide a manual diff alignment feature,
similar to KDiff3 or BeyondCompare. The idea is that a human can sometimes
figure out which lines should be shown side-by-side in a diff than a
computer can, so we should let them.

:help diff-diffexpr simply says:

> The output of "diff" must be a normal "ed" style diff. Do NOT use a context
> diff. This example explains the format that Vim expects: >
>
> 1a2
> > bbb
> 4d4
> < 111
> 7c7
> < GGG
> ---
> > ggg
>
> The "1a2" item appends the line "bbb".
> The "4d4" item deletes the line "111".
> The '7c7" item replaces the line "GGG" with "ggg".

I took this to mean that any valid ed-style diff can be generated by the
diffexpr and used by Vim. I assumed that Vim would basically use the hunks
in the diff to place filler lines and align the text for each buffer. with
this assumption, my design for the manual alignment was to break each file
into chunks, write the chunks with writefile(), diff the chunks, and
combine the output of those diffs into one big diff for diffexpr to
finally return to Vim (correcting line numbers based on the starting line
number of the current chunk). For simple diffs, this seems to work fine.
See the attached diff2_with_alignment.png.

But, a slightly more complicated example is shown in
diff_without_alignment.png. This is the view before using the plugin to
align things; the ed-style diff is attached as orig.patch.

Using my plugin, the diffexpr can generate the aligned.patch output. I can
use GNU patch and this patch to transform either file into the other
without errors. However, Vim displays the diff as shown in
diff_with_alignment.png. Note that there is no diff filler whatsoever, and
the highlighting is wrong! This also happens in gvim -N -u NONE -i NONE,
after sourcing my plugin and running a diff.

I think I've accidentally uncovered a bug in Vim; or at least, an
undocumented dependency on a specific diff *algorithm*, not just any old
ed-style diff format. I spent a few minutes looking at the code in diff.c,
but it looks like I'll need to spend a couple hours at least just to
figure out what's going on in there. Any help? Am I doing something wrong?

Current version of the plugin is attached if you want to play with it and
generate the diffs yourself, otherwise see the attached diffs and
screenshots.
aligned.patch
orig.patch
dif2_without_alignment.png
diff2_with_alignment.png
diff_with_alignment.png
diff_without_alignment.png
manDiffAlign_0.1.zip

Christian Brabandt

unread,
Jun 10, 2012, 8:52:37 AM6/10/12
to vim_dev
Hi Benjamin!
How did you generate the aligned.patch file? No matter what I do, for me
the plugin always generates:

2,7c2,4
< a
< b
< cb
< db
< eb
< f
---
> cc
> dd
> ee

But perhaps, I'm using your plugin wrongly, e.g. how many alignemnt
points do I have to set and where? But even after I manually copied the
aligned.patch file over the resulting file, Vim still gets the alignment
wrong.

BTW: attached is a patch, that prevents the use of using diffexpr for
checking if diff really works (using the "line1" vs. "line2" check):

regards,
Christian
diff_diffexpr.diff

Ben Fritz

unread,
Jun 10, 2012, 11:40:54 PM6/10/12
to vim...@googlegroups.com
On Sunday, June 10, 2012 7:52:37 AM UTC-5, Christian Brabandt wrote:
> How did you generate the aligned.patch file? No matter what I do, for me
> the plugin always generates:
>
> 2,7c2,4
> < a
> < b
> < cb
> < db
> < eb
> < f
> ---
> > cc
> > dd
> > ee
>
> But perhaps, I'm using your plugin wrongly, e.g. how many alignemnt
> points do I have to set and where? But even after I manually copied the
> aligned.patch file over the resulting file, Vim still gets the alignment
> wrong.
>

Oh, sorry.

enter the text shown in the screenshots, place the cursor on line 4 of file_a.txt ("cb"), press <Leader>mda to "mark diff alignment". Go to line 2 of file_b.txt and again type <Leader>mda> to indicate that those two points correspond to each other. Then run :diffupdate.

I think I have logic in there that ignores all the alignment points if there aren't the same number in each buffer but I'm not sure if I got around to that.

> BTW: attached is a patch, that prevents the use of using diffexpr for
> checking if diff really works (using the "line1" vs. "line2" check):
>

I don't think that's a very good idea, what if somebody uses diffexpr to specify the full path to diff? Or to use a separate program entirely, and they don't even have diff installed on their system? Then the test will always fail, right?

But it should be documented by diffexpr how the test actually works, it took me quite by surprise.

Christian Brabandt

unread,
Jun 11, 2012, 2:06:03 PM6/11/12
to vim...@googlegroups.com
Hi Ben!

On So, 10 Jun 2012, Ben Fritz wrote:

> On Sunday, June 10, 2012 7:52:37 AM UTC-5, Christian Brabandt wrote:
> > How did you generate the aligned.patch file? No matter what I do, for me
> > the plugin always generates:
> >
> > 2,7c2,4
> > < a
> > < b
> > < cb
> > < db
> > < eb
> > < f
> > ---
> > > cc
> > > dd
> > > ee
> >
> > But perhaps, I'm using your plugin wrongly, e.g. how many alignemnt
> > points do I have to set and where? But even after I manually copied the
> > aligned.patch file over the resulting file, Vim still gets the alignment
> > wrong.
> >
>
> Oh, sorry.
>
> enter the text shown in the screenshots, place the cursor on line 4 of
> file_a.txt ("cb"), press <Leader>mda to "mark diff alignment". Go to
> line 2 of file_b.txt and again type <Leader>mda> to indicate that
> those two points correspond to each other. Then run :diffupdate.

I did that. I still get the above patch. Ah, I found the problem. You
are parsing the output of sign place, and on my localized version it
prints:

:sign place
--- Zeichen ---
Zeichen f�r ../diff1.txt:
Zeile=2 id=1339437602 Name=manDiffAlign01
Zeichen f�r ../diff2.txt:
Zeile=4 id=1339437599 Name=manDiffAlign01

So it is probably better to look for the first equal sign.

After changing my Vim to an english localization I could reproduce your
issue. I think the problem is, that Vim thinks, the 2 diff regions are
overlapping. This looks like an "off-by 1" error for me, and indeed this
patch fixes it for me:

diff --git a/src/diff.c b/src/diff.c
--- a/src/diff.c
+++ b/src/diff.c
@@ -1318,8 +1318,8 @@
}

if (dp != NULL
- && lnum_orig <= dp->df_lnum[idx_orig] + dp->df_count[idx_orig]
- && lnum_orig + count_orig >= dp->df_lnum[idx_orig])
+ && lnum_orig < dp->df_lnum[idx_orig] + dp->df_count[idx_orig]
+ && lnum_orig + count_orig > dp->df_lnum[idx_orig])
{
/* New block overlaps with existing block(s).
* First find last block that overlaps. */

> > BTW: attached is a patch, that prevents the use of using diffexpr
> > for checking if diff really works (using the "line1" vs. "line2"
> > check):
> >
>
> I don't think that's a very good idea, what if somebody uses diffexpr
> to specify the full path to diff? Or to use a separate program
> entirely, and they don't even have diff installed on their system?
> Then the test will always fail, right?

Good point, indeed.

> But it should be documented by diffexpr how the test actually works,
> it took me quite by surprise.

Yes, this looks unsurprisingly when first using 'diffexpr'.

regards,
Christian

Ben Fritz

unread,
Jun 11, 2012, 3:16:28 PM6/11/12
to vim...@googlegroups.com
On Monday, June 11, 2012 1:06:03 PM UTC-5, Christian Brabandt wrote:
> Hi Ben!
>
> On So, 10 Jun 2012, Ben Fritz wrote:
>
> > On Sunday, June 10, 2012 7:52:37 AM UTC-5, Christian Brabandt wrote:
> > >
> > > But perhaps, I'm using your plugin wrongly, e.g. how many alignemnt
> > > points do I have to set and where? But even after I manually copied the
> > > aligned.patch file over the resulting file, Vim still gets the alignment
> > > wrong.
> > >
> >
> > Oh, sorry.
> >
> > enter the text shown in the screenshots, place the cursor on line 4 of
> > file_a.txt ("cb"), press <Leader>mda to "mark diff alignment". Go to
> > line 2 of file_b.txt and again type <Leader>mda> to indicate that
> > those two points correspond to each other. Then run :diffupdate.
>
> I did that. I still get the above patch. Ah, I found the problem. You
> are parsing the output of sign place, and on my localized version it
> prints:
>
> :sign place
> --- Zeichen ---
> Zeichen f�r ../diff1.txt:
> Zeile=2 id=1339437602 Name=manDiffAlign01
> Zeichen f�r ../diff2.txt:
> Zeile=4 id=1339437599 Name=manDiffAlign01
>
> So it is probably better to look for the first equal sign.
>

Oops! I was trying to be lazy since I wasn't sure exactly what the output
format was going to be, but forgot about different locales. I'll do
something about that before I release the plugin. Thanks.

> After changing my Vim to an english localization I could reproduce your
> issue. I think the problem is, that Vim thinks, the 2 diff regions are
> overlapping.

I suspected as much...but didn't have time yet to puzzle out how that code
was supposed to work. Thanks for looking into it!

> This looks like an "off-by 1" error for me, and indeed this
> patch fixes it for me:
>
> diff --git a/src/diff.c b/src/diff.c
> --- a/src/diff.c
> +++ b/src/diff.c
> @@ -1318,8 +1318,8 @@
> }
>
> if (dp != NULL
> - && lnum_orig <= dp->df_lnum[idx_orig] + dp->df_count[idx_orig]
> - && lnum_orig + count_orig >= dp->df_lnum[idx_orig])
> + && lnum_orig < dp->df_lnum[idx_orig] + dp->df_count[idx_orig]
> + && lnum_orig + count_orig > dp->df_lnum[idx_orig])
> {
> /* New block overlaps with existing block(s).
> * First find last block that overlaps. */
>

I applied the patch and things do behave better now. Thanks! Something is
still off, however. It looks like some leading diff filler is missing. See
attached screenshots:

patched_window2.png shows that the alignment at least works, and the
DiffChange and DiffText highlighting is correct.

But patched_window1.png shows that the second window ought to have some
leading diff filler which is missing, screwing up the alignment. All I did
between these screenshots is CTRL-W_W to switch windows. The jump in window
position when I do this was actually pretty surprising; I thought it was
fixed entirely until I switched windows.

Ben Fritz

unread,
Jun 11, 2012, 3:19:22 PM6/11/12
to vim...@googlegroups.com
On Monday, June 11, 2012 2:16:28 PM UTC-5, Ben Fritz wrote:
> It looks like some leading diff filler is missing. See
> attached screenshots:
>
> patched_window2.png shows that the alignment at least works, and the
> DiffChange and DiffText highlighting is correct.
>
> But patched_window1.png shows that the second window ought to have some
> leading diff filler which is missing, screwing up the alignment. All I did
> between these screenshots is CTRL-W_W to switch windows. The jump in window
> position when I do this was actually pretty surprising; I thought it was
> fixed entirely until I switched windows.

Oops, actually attaching the images would probably help.

patched_window2.png
patched_window1.png

Bram Moolenaar

unread,
Jun 13, 2012, 9:33:08 AM6/13/12
to Christian Brabandt, vim...@googlegroups.com
I'm not sure including this patch will not cause problems. The code
assumes there is always at least one unchanged lines between diff
blocks. When two blocks are touching (no line in between) that means
they should actually be one block.

Changing the assumption is tricky, it might work in some cases but not
always. It may cause problems for commands like "do" and "dp".
We would at least need some good tests to make sure nothing breaks.

--
hundred-and-one symptoms of being an internet addict:
30. Even though you died last week, you've managed to retain OPS on your
favorite IRC channel.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Ben Fritz

unread,
Jun 13, 2012, 11:16:42 AM6/13/12
to vim...@googlegroups.com, Christian Brabandt
On Wednesday, June 13, 2012 8:33:08 AM UTC-5, Bram Moolenaar wrote:
>
> I'm not sure including this patch will not cause problems. The code
> assumes there is always at least one unchanged lines between diff
> blocks. When two blocks are touching (no line in between) that means
> they should actually be one block.
>
> Changing the assumption is tricky, it might work in some cases but not
> always. It may cause problems for commands like "do" and "dp".
> We would at least need some good tests to make sure nothing breaks.
>

OK, so Vim can still combine adjacent blocks. But can it be be updated to use this assumption, yet correctly handle any valid ed-style diff (in terms of adding diff filler and alignment of lines which are part of the same hunk in the diff)?

I certainly have no problem with do/dp getting the entire combination of changes for adjacent hunks, I just want Vim to show me the effect of those hunks properly.

Ben Fritz

unread,
Jun 13, 2012, 11:22:04 AM6/13/12
to vim...@googlegroups.com, Christian Brabandt
On Wednesday, June 13, 2012 8:33:08 AM UTC-5, Bram Moolenaar wrote:
>
> I'm not sure including this patch will not cause problems. The code
> assumes there is always at least one unchanged lines between diff
> blocks. When two blocks are touching (no line in between) that means
> they should actually be one block.
>

Actually the current patch does cause one problem, as I noted earlier in the thread. Somehow the leading diff filler is missing but the changes are aligned anyway, causing one of the windows to jump out of alignment when switching back and forth between them.

> Changing the assumption is tricky, it might work in some cases but not
> always. It may cause problems for commands like "do" and "dp".
> We would at least need some good tests to make sure nothing breaks.
>

What sort of tests would you want to see? Make a diffexpr to return adjacent diffs and test that do, dp, :diffget, and :diffput still work on those blocks and also non-adjacent ones? Repeat with default diff output?

I've never messed with the test suite before (does it work on Windows?) but I'd be willing to look into it if there's a patch to support adjacent-but-separate hunks in the diff output.

Bram Moolenaar

unread,
Jun 13, 2012, 4:53:44 PM6/13/12
to Ben Fritz, vim...@googlegroups.com, Christian Brabandt

Ben Fritz wrote:

> On Wednesday, June 13, 2012 8:33:08 AM UTC-5, Bram Moolenaar wrote:
> >
> > I'm not sure including this patch will not cause problems. The code
> > assumes there is always at least one unchanged lines between diff
> > blocks. When two blocks are touching (no line in between) that means
> > they should actually be one block.
> >
> > Changing the assumption is tricky, it might work in some cases but not
> > always. It may cause problems for commands like "do" and "dp".
> > We would at least need some good tests to make sure nothing breaks.
> >
>
> OK, so Vim can still combine adjacent blocks. But can it be be updated
> to use this assumption, yet correctly handle any valid ed-style diff
> (in terms of adding diff filler and alignment of lines which are part
> of the same hunk in the diff)?

What valid ed-style diff is not handled? Note that a diff with two
adjecent blocks is not valid, diff will never produce it.

> I certainly have no problem with do/dp getting the entire combination
> of changes for adjacent hunks, I just want Vim to show me the effect
> of those hunks properly.
>
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php

--
To define recursion, we must first define recursion.

Ben Fritz

unread,
Jun 13, 2012, 5:37:25 PM6/13/12
to vim...@googlegroups.com, Ben Fritz, Christian Brabandt
On Wednesday, June 13, 2012 3:53:44 PM UTC-5, Bram Moolenaar wrote:

> Ben Fritz wrote:
>
> >
> > OK, so Vim can still combine adjacent blocks. But can it be be updated
> > to use this assumption, yet correctly handle any valid ed-style diff
> > (in terms of adding diff filler and alignment of lines which are part
> > of the same hunk in the diff)?
>
> What valid ed-style diff is not handled? Note that a diff with two
> adjecent blocks is not valid, diff will never produce it.
>

Why is a diff with adjacent blocks not valid? How do you know a posix-compliant diff will never produce it? I don't see anything in man pages online about limitations on adjacent hunks. I Googled for "posix diff format" and looked at:

http://www.unix.com/man-page/posix/1posix/diff/
http://pubs.opengroup.org/onlinepubs/007904875/utilities/diff.html
http://pubs.opengroup.org/onlinepubs/007908799/xcu/diff.html
http://www.gnu.org/software/hello/manual/diff.html#Detailed%20ed

Even if diff itself only finds non-overlapping sections due to the algorithm used ( http://en.wikipedia.org/wiki/Diff#Algorithm ), the *output format* does not have this limitation. Indeed, rather than a single "change" operation, a diff tool could decide to output both a deletion and an addition at the same location, and as far as I can tell this should be considered valid input. Handling this last scenario might very well break what I'm trying to do anyway, but even so, Vim should not completely fail to highlight the second file as shown in the screenshots I posted earlier. And, I think it should be possible to handle such situations and still align lines affected by a "change" action in the diff.

Bram Moolenaar

unread,
Jun 14, 2012, 2:59:54 PM6/14/12
to Ben Fritz, vim...@googlegroups.com, Christian Brabandt
My definition of "valid" is something that diff would produce. Can you
show me two input files for which diff produces an ed style diff with
adjecent blocks?

--
Have you heard about the new Beowulf cluster? It's so fast, it executes
an infinite loop in 6 seconds.

Christian Brabandt

unread,
Jun 14, 2012, 4:36:06 PM6/14/12
to vim...@googlegroups.com
Hi Bram!

On Do, 14 Jun 2012, Bram Moolenaar wrote:

> My definition of "valid" is something that diff would produce. Can you
> show me two input files for which diff produces an ed style diff with
> adjecent blocks?

Seems, the more widely used diffs, always merge adjacent hunks. The only
reference I found to that was the GNU manual stating�:
"In this case, diff normally shifts the hunk's boundaries when this
merges adjacent hunks"

But according to the robustness principle ("be conservative in what you
do, be liberal in what you accept") and since it's still perfectly valid
input and patch(1) happily accepts it, shouldn't Vim also be able to
handle such a diff?

�)
http://www.gnu.org/software/diffutils/manual/html_node/diff-Performance.html

regards,
Christian
--
Ohne Kirche - keine H�lle.
-- Max Frisch

Bram Moolenaar

unread,
Jun 15, 2012, 2:31:30 PM6/15/12
to Christian Brabandt, vim...@googlegroups.com

Christian Brabandt wrote:

> On Do, 14 Jun 2012, Bram Moolenaar wrote:
>
> > My definition of "valid" is something that diff would produce. Can you
> > show me two input files for which diff produces an ed style diff with
> > adjecent blocks?
>
> Seems, the more widely used diffs, always merge adjacent hunks. The only
> reference I found to that was the GNU manual statingą:
> "In this case, diff normally shifts the hunk's boundaries when this
> merges adjacent hunks"
>
> But according to the robustness principle ("be conservative in what you
> do, be liberal in what you accept") and since it's still perfectly valid
> input and patch(1) happily accepts it, shouldn't Vim also be able to
> handle such a diff?
>
> ą)
> http://www.gnu.org/software/diffutils/manual/html_node/diff-Performance.html

The problem is that when diff blocks are allowed to touch each other
then updates for changes become tricky. Not to mention bugs that will
be uncovered. Thus I'm not against it, just careful about adding a
feature for few users that will cause problems for many more users.

--
A day without sunshine is like, well, night.
Reply all
Reply to author
Forward
0 new messages