[bug] use of uninitialized memory in regexp_nfa.c with invalid back reference

28 views
Skip to first unread message

Dominique Pellé

unread,
Sep 26, 2015, 7:21:44 PM9/26/15
to vim_dev
Hi

Valgrind detects use of uninitialized memory when doing:

$ valgrind --track-origins=yes \
vim -u NONE -c '/\(\&\|\1\)\(x\)' 2> vg.log

Vim gives:

Error detected while processing /home/pel/sb/vim/src/undef.vim:
line 1:
E486: Pattern not found: \(\&\|\1\)\(x\)
Press ENTER or type command to continue

I would expect Vim to detect an invalid back reference here
instead of saying "Pattern not found". And valgrind log file
vg.log shows access to uninitialized memory:

==6067== Memcheck, a memory error detector
==6067== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==6067== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==6067== Command: ./vim -u NONE -c /\\(\\&\\|\\1\\)\\(x\\)
==6067==
==6067== Conditional jump or move depends on uninitialised value(s)
==6067== at 0x5010C6: sub_equal (regexp_nfa.c:4027)
==6067== by 0x5029F7: has_state_with_pos.isra.16 (regexp_nfa.c:4109)
==6067== by 0x502C2B: addstate (regexp_nfa.c:4379)
==6067== by 0x502E03: addstate (regexp_nfa.c:4547)
==6067== by 0x502ED3: addstate (regexp_nfa.c:4651)
==6067== by 0x5032B2: addstate_here (regexp_nfa.c:4692)
==6067== by 0x51890E: nfa_regmatch (regexp_nfa.c:6699)
==6067== by 0x51AB3F: nfa_regtry (regexp_nfa.c:6893)
==6067== by 0x51AF62: nfa_regexec_both (regexp_nfa.c:7084)
==6067== by 0x51E9A8: vim_regexec_multi (regexp.c:8276)
==6067== by 0x53184B: searchit (search.c:714)
==6067== by 0x532725: do_search (search.c:1432)
==6067== Uninitialised value was created by a heap allocation
==6067== at 0x4C2AB80: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6067== by 0x4C0B2A: lalloc (misc2.c:921)
==6067== by 0x518391: nfa_regmatch (regexp_nfa.c:5473)
==6067== by 0x51AB3F: nfa_regtry (regexp_nfa.c:6893)
==6067== by 0x51AF62: nfa_regexec_both (regexp_nfa.c:7084)
==6067== by 0x51E9A8: vim_regexec_multi (regexp.c:8276)
==6067== by 0x53184B: searchit (search.c:714)
==6067== by 0x532725: do_search (search.c:1432)
==6067== by 0x45E401: get_address (ex_docmd.c:4508)
==6067== by 0x46613F: do_cmdline (ex_docmd.c:2183)
==6067== by 0x409464: main (main.c:2926)
.... snip more errors after that....

Code in regexp_nfa.c:4027 where error is detected:

3986 static int
3987 sub_equal(sub1, sub2)
3988 regsub_T *sub1;
3989 regsub_T *sub2;
3990 {
....
4026 s2 = -1;
!!4027 if (s1 != s2)
4028 return FALSE;
4029 if (s1 != -1 && sub1->list.multi[i].end_col
4030 !=
sub2->list.multi[i].end_col)
4031 return FALSE;

Adding some printf(), I can see that neither s1 nor s2
are initialized at line 4027.

Valgrind does not complain when using the old regexp engine (re=1)
with the same regexp:

$ valgrind --track-origins=yes \
vim -c 'set re=1' -u NONE -c '/\(\&\|\1\)\(x\)' 2> vg.log

With the old regexp engine, vim properly detects the invalid
regexp saying:

Error detected while processing /home/pel/sb/vim/src/undef.vim:
line 1:
E65: Illegal back reference

The bug was found using afl-fuzz. I don't know how to fix it.

Regards
Dominique

Tony Mechelynck

unread,
Sep 26, 2015, 10:52:01 PM9/26/15
to vim_dev
I would expect them both to be set in the code you omitted just above
line 4027, as follows (lines 4017-4032) — or could s1 somehow be left
undefined if (i < sub1->in_use) and (s1 = sub1->list.multi[i]end_lnum)
(and similarly for s2)?

if (nfa_has_backref)
{
if (i < sub1->in_use)
s1 = sub1->list.multi[i].end_lnum;
else
s1 = -1;
if (i < sub2->in_use)
s2 = sub2->list.multi[i].end_lnum;
else
s2 = -1;
if (s1 != s2)
return FALSE;
if (s1 != -1 && sub1->list.multi[i].end_col
!= sub2->list.multi[i].end_col)
return FALSE;
}

My source repository is at commit
https://github.com/vim/vim/commit/ca63501fbcd1cf9c8aa9ff12c093c95b62a89ed7
(i.e. the runtime files update following patch 7.4.884).


Best regards,
Tony.

Dominique Pellé

unread,
Sep 27, 2015, 2:03:59 AM9/27/15
to vim_dev
0>>
Hi Tony

No. Valgrind forwards uninitialized values in assignments
and has no false positives. If s1 and s2 were detected as
uninitialized, it means that sub1->list.multi[i].end_lnum and
sub2->list.multi[i].end_lnum were also uninitialized.

Valgrind does not complain at the point where uninitialized
values are copied (because that's common and not a bug).
Instead, it warns later when a 'if' is made on an uninitialized
value, the outcome being then undefined behavior.

I forgot to say: bug happens with the latest vim-7.4.884
and older.

Regards
Dominique

Bram Moolenaar

unread,
Sep 27, 2015, 8:25:27 AM9/27/15
to Dominique Pellé, vim_dev
Thanks for reporting.

--
When a fly lands on the ceiling, does it do a half roll or
a half loop?

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Reply all
Reply to author
Forward
0 new messages