Expect core dumps with special regexp


heinrichmartin

Jun 16, 2022, 5:09:34 AM
Hi,
I've chased a strange core dump of Expect down to quite a simple regexp - now I am stuck and looking for advice/ideas/memories:

expect:/tmp$ set exp_library
/usr/lib/expect5.45
expect:/tmp$ set tcl_patchLevel
8.6.4
expect:/tmp$ exp_internal 0
expect:/tmp$ log_user 0
1
expect:/tmp$ spawn ls -l
10133
expect:/tmp$ expect -re {(?x)[\r\n]+?}
expect:/tmp$ expect -re {(?x)[\r\n]+?
>}
alloc: invalid block: 0x25fdfe0: ef ef 0
Aborted (core dumped)

Notes:
* A trailing newline in the expanded syntax makes a difference, but it is not the sole contributor to the core dump. (I have a more complex regexp that also ends in [\r\n]+? and that works ...)
* The regexp alone, i.e. with [regexp], has no issue.
* The core dump happens independently of whether the spawn id is open, i.e. while preparing the pattern, not while matching against the buffer.

expect:/tmp$ regexp {(?x)[\r\n]+?} foo
0
expect:/tmp$ regexp {(?x)[\r\n]+?
>} foo
0
expect:/tmp$ regexp {(?x)[\r\n]+?
>} foo\nbar\n
1

It looks like Expect has an issue generating the glob-gate for that regexp.
no core dump:
expect -re {(?x)^
(foo|bar)
[\r\n]+?
}
core dump:
expect -re {(?x)^
(foo)
[\r\n]+?
}

Does anyone have an idea/hint?

heinrichmartin

Jun 16, 2022, 10:50:09 AM
On Thursday, June 16, 2022 at 11:09:34 AM UTC+2, heinrichmartin wrote:
> It looks like Expect has issue generating the glob-gate for that regexp.
> no core dump:
> expect -re {(?x)^
> (foo|bar)
> [\r\n]+?
> }
> core dump:
> expect -re {(?x)^
> (foo)
> [\r\n]+?
> }
>
> Does anyone have an idea/hint?

Actually, I have made huge progress by just writing it down for c.l.t. ... and I am answering my own question now:

The difference is (obviously) the branch "|". Expect stops generating the glob-gate when a branch is discovered (retglob.c:236:/* branching is too complex */ goto error), so the faulty code path is never reached for a regexp that contains one.
At this point, I have two possible workarounds: (1) do not use expanded syntax for simple regexps, or (2) add a bogus branch, e.g. "(?:a|a)" for "a" or, in this case, "(?:\r|\n)" for "[\r\n]".

Then, I assume the issue is in retglob.c:223:
    if (expanded) {
        /* Expanded syntax, whitespace and comments, ignore. */
        while (MATCHC (' ') ||
               MATCHC (0x9) ||
               MATCHC (0xa)) CHOP (1);
        /* XXX not checking strlen before proceeding */
        if (MATCHC ('#')) {
            CHOPC (0xa);
            if (strlen) CHOP (1);
            continue;
        }
    }

The tight while-loop is safe iff the string is \0-terminated, but the code then proceeds without checking whether anything is left. A check is missing right after the loop: if (0 == strlen) break;
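
A minimal sketch of that length check, for illustration only. This is not the retglob.c code: the cursor type and the skip_expanded name are hypothetical, and it assumes a counted buffer of plain chars rather than Expect's own string type and macros. The point is that every read is guarded by the remaining length, so a pattern ending in whitespace or in a comment is reported as exhausted instead of being scanned past its end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a counted pattern; the buffer does not
 * need to be NUL-terminated. */
typedef struct {
    const char *str;  /* next unread character */
    size_t      len;  /* characters remaining  */
} cursor;

/* Skip whitespace and '#' comments as in expanded syntax.
 * Returns 1 if a pattern character is available at c->str,
 * 0 if the pattern is exhausted. */
static int skip_expanded(cursor *c)
{
    assert(c != NULL && c->str != NULL);
    for (;;) {
        /* Whitespace: test the remaining length before each read. */
        while (c->len > 0 &&
               (*c->str == ' ' || *c->str == 0x9 || *c->str == 0xa)) {
            c->str++;
            c->len--;
        }
        /* The check proposed in the thread: stop when nothing is left. */
        if (c->len == 0) return 0;
        if (*c->str != '#') return 1;
        /* Comment: drop everything up to the end of the line ... */
        while (c->len > 0 && *c->str != 0xa) {
            c->str++;
            c->len--;
        }
        /* ... and the newline itself, if there is one. */
        if (c->len > 0) {
            c->str++;
            c->len--;
        }
    }
}
```

A caller would invoke skip_expanded before examining each pattern character and leave its scanning loop when it returns 0, which corresponds to the "break" suggested above.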

Given that I have a simple workaround, I am going to skip fix+build+verify+deploy and will work with simple syntax instead. If comments are needed, one could assemble and document the regexp in a variable before the actual expect.

HTH
Martin

heinrichmartin

Jun 16, 2022, 11:06:51 AM
On Thursday, June 16, 2022 at 4:50:09 PM UTC+2, heinrichmartin wrote:
> Given that I have a simple workaround, I am going to skip fix+build+verify+deploy and will work with simple syntax instead. If comments are needed, one could assemble and document the regexp in a variable before the actual expect.

https://core.tcl-lang.org/expect/tktview/be7d99dcc2a2f082bcdb43cc02f6d636d093699b

"Note that trailing whitespace is quite common if the regexp with expanded syntax is written like a code block inside braces across several lines, i.e. they end in newline and indentation."