Channel callbacks randomly fail to fire in Vim for win32

45 views
Skip to first unread message

Kay Zheng

unread,
Jan 18, 2017, 4:39:35 AM1/18/17
to v...@vim.org
Hi.

I've been working on a Vim plugin using the channel APIs. When testing
the plugin on Windows (Windows 7 Pro. SP1 32-bit), one of my channel
callbacks failed to fire sometimes.

There was no error or anything. It looked like the message passing
through the channel just went missing. I used Wireshark to capture the
packets, and it turned out the message which was supposed to trigger
the callback did go through the sockets, but Vim just won't call my
callback.

I tried ch_logfile(...). But with the logging enabled, I could not
reproduce the problem anymore. Yuck!

I've been reviewing and testing my code, hoping to find something
relevant, but the code worked fine on my other system (Fedora Linux).

The difficulty here is that the callback failure happened randomly.
The code involved was not trivial and I couldn't find a reliable way
to reproduce the problem. So I turn to the nice Vim people here for
suggestions.

I'm using the latest win32 Vim release from vim.org: 8.0 with patches
1-69 (32-bit GUI version with OLE support).

And the code involved is on Github:
https://github.com/l04m33/vlime/blob/00f0f2505597ee5e6f3ff0a268e303b0cfeb2605/vim/plugin/vlime.vim#L45

The callback function on line 45 is the one failed to fire. I used a
"chaining" mechanism to call those callbacks and save some typing.

Any suggestions would be greatly appreciated.

Regards.

Kay Z.

Kay Zheng

unread,
Jan 23, 2017, 12:05:12 AM1/23/17
to v...@vim.org
I made the callback failure happen on my Linux system too. Turned out
it has something to do with the message length (or the time needed to
receive the whole message?) Longer messages tend to fail more often.

But I couldn't find a fixed length to trigger this problem. In my
set-up, messages of around 4k bytes were very likely to fail, and
messages of around 40k bytes would definitely fail.

I tried setting the 'drop' option of ch_open(...) to 'never', but it
makes no difference regarding this problem.

I'm using a self-compiled Vim from the official source repo on GitHub,
commit c81299684b2b9045e56525d3da3f45e8440fbf0d.

Regards,

Kay Z.

Christian Brabandt

unread,
Jan 23, 2017, 5:41:13 AM1/23/17
to v...@vim.org
Hi Kay!

On Mo, 23 Jan 2017, Kay Zheng wrote:

> I made the callback failure happen on my Linux system too. Turned out
> it has something to do with the message length (or the time needed to
> receive the whole message?) Longer messages tend to fail more often.
>
> But I couldn't find a fixed length to trigger this problem. In my
> set-up, messages of around 4k bytes were very likely to fail, and
> messages of around 40k bytes would definitely fail.
>
> I tried setting the 'drop' option of ch_open(...) to 'never', but it
> makes no difference regarding this problem.
>
> I'm using a self-compiled Vim from the official source repo on GitHub,
> commit c81299684b2b9045e56525d3da3f45e8440fbf0d.

Can you show an easy way to reproduce? Something we could add into a
test?

Best,
Christian
--
Menschen erraten heißet nichts als sich ähnl(icher) Erfahrungen
besinnen. Mit einem ganz neuen Charakter kömmt der größte
Menschenkenner nicht aus.
-- Jean Paul

Kay Zheng

unread,
Jan 24, 2017, 12:34:48 AM1/24/17
to v...@vim.org
Now I have these logs captured using ch_logfile(...):

https://gist.github.com/anonymous/d9b00634422a69b7ea3729ca6ac21b40

The first received message failed but the second succeeded. I verified
both messages, concatenated the fragments manually and parsed them
with some trivial Python code. they're all legit JSON strings, and
they are actually the same except the message IDs.

Having these info, I think the root cause lies in the channel (or
JSON?) implementation inside Vim.

I will keep looking into it, but I would be grateful if there's
someone familiar with the channel implementation that can help me out.

Thanks!

Kay Z.

unread,
Jan 24, 2017, 12:55:24 AM1/24/17
to vim_use, v...@vim.org
Hi Christian,

>
> Can you show an easy way to reproduce? Something we could add into a
> test?
>

I'll keep trying. I put up some logs from ch_logfile(...) earlier in this thread. I hope that's useful to you.

Bram Moolenaar

unread,
Jan 24, 2017, 7:19:20 AM1/24/17
to vim...@googlegroups.com, Kay Zheng, v...@vim.org

Kay Zheng wrote:

> Now I have these logs captured using ch_logfile(...):
>
> https://gist.github.com/anonymous/d9b00634422a69b7ea3729ca6ac21b40
>
> The first received message failed but the second succeeded. I verified
> both messages, concatenated the fragments manually and parsed them
> with some trivial Python code. they're all legit JSON strings, and
> they are actually the same except the message IDs.
>
> Having these info, I think the root cause lies in the channel (or
> JSON?) implementation inside Vim.
>
> I will keep looking into it, but I would be grateful if there's
> someone familiar with the channel implementation that can help me out.

I suppose this will be hard to reproduce in a debugger, because of
timing. In the channel logfile you may not see enough to pinpoint the
problem. You can add some more ch_logs() calls in channel_read, where
it is calling sock_read() or fd_read(). E.g. bylogging the number of
bytes received. With the timestamp and other log messages you may be
able to guess what goes wrong.


--
Veni, Vidi, Video -- I came, I saw, I taped what I saw.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Reply all
Reply to author
Forward
0 new messages