Re: Channel callbacks randomly fail to fire in Vim for win32

90 views
Skip to first unread message

Kay Zheng

unread,
Feb 6, 2017, 8:08:59 AM2/6/17
to v...@vim.org, vim...@vim.org
Hi folks,

I did some research on the channel code, and if I read it right,
there's an issue in channel_fill(...).

AFAIK, a channel in JSON mode works like this:

1. When a network message arrives, it tries to parse a complete JSON object.
2. If there were a complete JSON object, it saves the object and tries
to call the callbacks. Otherwise, it may call channel_fill(...) to
fetch more data.
3. If channel_fill(...) cannot provide enough data, it puts the
partial message back into the channel buffers via channel_save(...),
for the next parsing operation to start over.

The problem is, channel_fill(...) may drop part of the buffered
message according to reader->js_used, thus messing up the partial
message saved in step 3.

I managed to reproduce the problem with these code:

https://gist.github.com/l04m33/d1c910f854d19520101dc1bc332277eb

Run server.py first, then source client.vim in Vim. The second JSON
message will NOT trigger the channel callback.

I came up with a simple patch, by always preserving the "used parts"
of the messages. I'm using commit
544d3bc9f0e494cb712a33b61558b8e8e12b1e0b from the GitHub repo.

diff --git a/src/channel.c b/src/channel.c
index bd31bf3..ea307ee 100644
--- a/src/channel.c
+++ b/src/channel.c
@@ -1846,33 +1846,32 @@ channel_fill(js_read_T *reader)
channel_T *channel = (channel_T *)reader->js_cookie;
ch_part_T part = reader->js_cookie_arg;
char_u *next = channel_get(channel, part);
- int unused;
+ int buffered_len;
int len;
char_u *p;

if (next == NULL)
return FALSE;

- unused = reader->js_end - reader->js_buf - reader->js_used;
- if (unused > 0)
+ buffered_len = reader->js_end - reader->js_buf;
+ if (buffered_len > 0)
{
/* Prepend unused text. */
len = (int)STRLEN(next);
- p = alloc(unused + len + 1);
+ p = alloc(buffered_len + len + 1);
if (p == NULL)
{
vim_free(next);
return FALSE;
}
- mch_memmove(p, reader->js_buf + reader->js_used, unused);
- mch_memmove(p + unused, next, len + 1);
+ mch_memmove(p, reader->js_buf, buffered_len);
+ mch_memmove(p + buffered_len, next, len + 1);
vim_free(next);
next = p;
}

vim_free(reader->js_buf);
reader->js_buf = next;
- reader->js_used = 0;
return TRUE;
}


2017-01-23 13:04 GMT+08:00 Kay Zheng <l04...@gmail.com>:
> I made the callback failure happen on my Linux system too. Turned out
> it has something to do with the message length (or the time needed to
> receive the whole message?) Longer messages tend to fail more often.
>
> But I couldn't find a fixed length to trigger this problem. In my
> set-up, messages of around 4k bytes were very likely to fail, and
> messages of around 40k bytes would definitely fail.
>
> I tried setting the 'drop' option of ch_open(...) to 'never', but it
> makes no difference regarding this problem.
>
> I'm using a self-compiled Vim from the official source repo on GitHub,
> commit c81299684b2b9045e56525d3da3f45e8440fbf0d.
>
> Regards,
>
> Kay Z.
>
> 2017-01-18 17:39 GMT+08:00 Kay Zheng <l04...@gmail.com>:
>> Hi.
>>
>> I've been working on a Vim plugin using the channel APIs. When testing
>> the plugin on Windows (Windows 7 Pro. SP1 32-bit), one of my channel
>> callbacks failed to fire sometimes.
>>
>> There was no error or anything. It looked like the message passing
>> through the channel just went missing. I used Wireshark to capture the
>> packets, and it turned out the message which was supposed to trigger
>> the callback did go through the sockets, but Vim just won't call my
>> callback.
>>
>> I tried ch_logfile(...). But with the logging enabled, I could not
>> reproduce the problem anymore. Yuck!
>>
>> I've been reviewing and testing my code, hoping to find something
>> relevant, but the code worked fine on my other system (Fedora Linux).
>>
>> The difficulty here is that the callback failure happened randomly.
>> The code involved was not trivial and I couldn't find a reliable way
>> to reproduce the problem. So I turn to the nice Vim people here for
>> suggestions.
>>
>> I'm using the latest win32 Vim release from vim.org: 8.0 with patches
>> 1-69 (32-bit GUI version with OLE support).
>>
>> And the code involved is on Github:
>> https://github.com/l04m33/vlime/blob/00f0f2505597ee5e6f3ff0a268e303b0cfeb2605/vim/plugin/vlime.vim#L45
>>
>> The callback function on line 45 is the one failed to fire. I used a
>> "chaining" mechanism to call those callbacks and save some typing.
>>
>> Any suggestions would be greatly appreciated.
>>
>> Regards.
>>
>> Kay Z.

Bram Moolenaar

unread,
Feb 6, 2017, 3:42:16 PM2/6/17
to vim...@googlegroups.com, Kay Zheng, v...@vim.org, vim...@vim.org

Kay Zheng wrote:

> I did some research on the channel code, and if I read it right,
> there's an issue in channel_fill(...).
>
> AFAIK, a channel in JSON mode works like this:
>
> 1. When a network message arrives, it tries to parse a complete JSON object.
> 2. If there were a complete JSON object, it saves the object and tries
> to call the callbacks. Otherwise, it may call channel_fill(...) to
> fetch more data.
> 3. If channel_fill(...) cannot provide enough data, it puts the
> partial message back into the channel buffers via channel_save(...),
> for the next parsing operation to start over.
>
> The problem is, channel_fill(...) may drop part of the buffered
> message according to reader->js_used, thus messing up the partial
> message saved in step 3.
>
> I managed to reproduce the problem with these code:
>
> https://gist.github.com/l04m33/d1c910f854d19520101dc1bc332277eb
>
> Run server.py first, then source client.vim in Vim. The second JSON
> message will NOT trigger the channel callback.
>
> I came up with a simple patch, by always preserving the "used parts"
> of the messages. I'm using commit
> 544d3bc9f0e494cb712a33b61558b8e8e12b1e0b from the GitHub repo.

I'm glad you have been able to figure out this problem.
I thought there was a test for a split message, but apparently it
doesn't catch this situation. I managed to add a test that does.

I also notice that when the message receives in parts, the 100 msec
timeout is not reset. I'll fix that too.

--
"I know that there are people who don't love their fellow man,
and I hate those people!" - Tom Lehrer

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
Reply all
Reply to author
Forward
0 new messages