Vim appears to stall channel based requests

111 views
Skip to first unread message

Paul Jolly

unread,
Jun 14, 2019, 4:12:30 PM6/14/19
to vim...@googlegroups.com
Hi all,

In https://github.com/leitzler/govim (master branch) Pontus is adding
support to govim for signs that correspond to quickfix entries.

This is done by calling sign_place/sign_unplace as required to get the
signs for a window in the correct state.

If there are lots of errors in a file, then there can be a large
number of calls to sign_place/sign_unplace made by govim in rapid
succession.

(Incidentally, it would be great if there was a “batch” method to set
the signs for a buffer, akin to setqflist for signs in a buffer)

However, on what is a relatively slow machine, @leitzler is seeing Vim
stall when handling calls from govim to sign_place:

https://gist.github.com/myitcv/f8579a20ce327d61c550c85df71ea4a2

Notice the ~13 second gap between the call to sign_place:

1.102629 RECV on 0(sock):
'[0,[36,"call","sign_place",0,"govim","govimerr",1,{"lnum":10}]]

and the response:

14.247893 SEND on 0(sock):
'[38,["function","function:GOVIM_internal_SetUserBusy",[1]]]
'
14.248988 on 0: Blocking read JSON for id 38
14.249019 : looking for messages on channels
14.249025 on 0: Getting JSON message 0
14.249028 on 0: Invoking channel callback <SNR>12_define
14.249201 SEND on 0(sock): '[39,["callback",36,["",7]]]

Indeed, the response only gets sent after 13 seconds because of a
cursor movement, which as you can see is also handled by a
channel-based function. That is to say a cursor movement (which
triggers a call to a channel-based plugin function) appears to "fix"
things - a few more function calls are handled, but then things
"stall" again. i.e. not all the sign_place calls have been completed.

I realise this is all rather vague (and indeed it would be somewhat
moot if we had a batch function for setting signs for a buffer), but
is there anything that could be causing Vim to stall like this?

Vim v8.1.1512.

Thanks in advance for any suggestions/pointers,


Paul

Pontus Leitzler

unread,
Jun 23, 2019, 4:13:41 PM6/23/19
to vim_dev
Hi everyone,

I've been trying to create a smaller reproducible that can demo what happens, but it seems to be a complex combination that isn't trivial to slice down.

So I went for another option, debugging vim instead. Since the binary I have can reproduce the behavior consistently I thought that it might give me some new clues.
First a disclaimer, this is the first time I look at vim code, and my C is, at best, very rusty :).

It looks like the signs are being placed, but there are still signs to place, when the select()-call after channel_select_setup(...) in os_unix.c:RealWaitForChar(...) get called with NULL as timeout.
Then it hangs on the select() until a key is pressed.

There are also another occasion when that select() isn't called with 0 as timeout, and that is when it is called with whatever updatetime is set to in my .vimrc. That also makes sign placement stop for as long as updatetime is set to.

Since I'm not that into the vim source I don't know if the channel should be a part of the select but isn't, or if the issue is that the channel already has been read into a buffer that can't be processed since the select-call stops further execution. But I hope that this info might shred some new light to this issue. If you have any patch ideas to vim that can give additional info I'd be glad to try them out.

Best regards, Pontus

Pontus Leitzler

unread,
Jun 23, 2019, 4:50:49 PM6/23/19
to vim_dev
I wrote the previous post yesterday but since it was my first it had to be moderated before showing up.

Since then I did some more investigation and it turns out that the channel is indeed read in channel_read(...), but then the blocking select() prevents the incoming message to be processed.
As soon as the blocking select returns (when I press a key), channel_parse_messages(...) is called and parses the buffer read before the select()-call.

I expect that select() should be called with 0 timeout as long as the channel buffers haven't been processed entirely, is that a correct assumption? Should I write an issue in the github tracker?

Regards, Pontus

Bram Moolenaar

unread,
Jun 23, 2019, 11:08:14 PM6/23/19
to vim...@googlegroups.com, Pontus Leitzler
So messages are ready to process, but Vim instead blocks on getting more
input? Can you find out where RealWaitForChar() is called from?
In inchar_loop() it should call parse_queued_messages() before blocking,
and if more input is needed set the wait time to 10.

--
hundred-and-one symptoms of being an internet addict:
260. Co-workers have to E-mail you about the fire alarm to get
you out of the building.

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Pontus Leitzler

unread,
Jun 24, 2019, 7:59:08 AM6/24/19
to vim_dev
In both cases it's called from the end of WaitForCharOrMouse.
I'm not entirely sure I get all line numbers right, even when I compile vim with -g -O0, so I had to go with the "Fprintf to stderr"-monkey style :).

One further step in the stack trace:
ui_wait_for_chars_or_timer(...), the line "if (wait_func(due_time, interrupted, ignore_input))".
Due_time is -1, interrupted 1 and ignore input 0.

I guess it's the calls to has_pending_job() and channel_any_readahead() that decides if there are more to read? But wouldn't channel_any_readahead() require the message to have been parsed to json before returning TRUE?

An example flow when it works:
Channel_read - 63 bytes
Channel_parse_messages
Channel_parse_json
Channel_parse_json
Channel_parse_json
RealWaitForChar (from mch_breakcheck, with 0 timeout)

Then after a while:
Channel_read - 64 bytes
RealWaitForChar (from mch_breakcheck, with 0 timeout)
RealWaitForChar (from mch_breakcheck, with 0 timeout)
RealWaitForChar (from mch_breakcheck, with 0 timeout)
RealWaitForChar (from mch_breakcheck, with 0 timeout)
RealWaitForChar (from ui_wait_for_chars_or_timer as wait_func.. with -1 timeout)
// The select() call that hangs until keypressed
Channel_parse_messages
Channel_parse_json
Channel_parse_json
Channel_parse_json

Pontus Leitzler

unread,
Jun 24, 2019, 2:23:34 PM6/24/19
to vim_dev

This is a patch that actually fixes the problem, but I have no idea what other side effects it might have. Also note that the line numbers are way off since I've been adding stuff.

@@ -2853,7 +2853,7 @@ channel_has_readahead(channel_T *channel, ch_part_T part)
jsonq_T *head = &channel->ch_part[part].ch_json_head;
jsonq_T *item = head->jq_next;

- return item != NULL;
+ return (item != NULL) || (channel_peek(channel, part) != NULL);
}
return channel_peek(channel, part) != NULL;
}

Marc Weber

unread,
Jun 24, 2019, 6:11:59 PM6/24/19
to vim_dev
https://github.com/MarcWeber/vim-addon-signs

all you do is pass a list, incremental updates get applied.

However if you have > 5,000 signs or so it might make sense to limit to
first X items ..

Eventually could still be done faster eg using Python whatever.

Marc Weber

Bram Moolenaar

unread,
Jun 24, 2019, 10:13:16 PM6/24/19
to vim...@googlegroups.com, Pontus Leitzler
That doesn't look right, because there should not be a complete json message.
If there is, perhaps a call to parse it is missing?
Perhaps a call to channel_parse_json() is needed.

--
hundred-and-one symptoms of being an internet addict:
267. You get an extra phone line so you can get phone calls.

Bram Moolenaar

unread,
Jun 24, 2019, 10:13:16 PM6/24/19
to vim...@googlegroups.com, Pontus Leitzler

> In both cases it's called from the end of WaitForCharOrMouse.
> I'm not entirely sure I get all line numbers right, even when I compile vim with -g -O0, so I had to go with the "Fprintf to stderr"-monkey style :).
>
> One further step in the stack trace:
> ui_wait_for_chars_or_timer(...), the line "if (wait_func(due_time, interrupted, ignore_input))".
> Due_time is -1, interrupted 1 and ignore input 0.
>
> I guess it's the calls to has_pending_job() and
> channel_any_readahead() that decides if there are more to read? But
> wouldn't channel_any_readahead() require the message to have been
> parsed to json before returning TRUE?

If the JSON is incomplete we need to receive more text to be able to
parse it. Thus we might as well block on receiving something.

> An example flow when it works:
> Channel_read - 63 bytes
> Channel_parse_messages
> Channel_parse_json
> Channel_parse_json
> Channel_parse_json
> RealWaitForChar (from mch_breakcheck, with 0 timeout)
>
> Then after a while:
> Channel_read - 64 bytes
> RealWaitForChar (from mch_breakcheck, with 0 timeout)

So what is in those 64 bytes? If you look in the channel log it should
be there. You don't quote from a channel log, did you use ch_logfile()?

> RealWaitForChar (from mch_breakcheck, with 0 timeout)
> RealWaitForChar (from mch_breakcheck, with 0 timeout)
> RealWaitForChar (from mch_breakcheck, with 0 timeout)
> RealWaitForChar (from ui_wait_for_chars_or_timer as wait_func.. with -1 timeout)
> // The select() call that hangs until keypressed
> Channel_parse_messages
> Channel_parse_json
> Channel_parse_json
> Channel_parse_json

--
Two fish in a tank. One says to the other:
"Do you know how to drive this thing?"

Pontus Leitzler

unread,
Jun 26, 2019, 6:19:40 PM6/26/19
to vim_dev
> That doesn't look right, because there should not be a complete json message.
> If there is, perhaps a call to parse it is missing?
> Perhaps a call to channel_parse_json() is needed.

Yes, maybe there is a channel_parse_json() missing somewhere. But as the flow log shows it's also a channel_parse_messages()-call missing.

> If the JSON is incomplete we need to receive more text to be able to parse it. Thus we might as well block on receiving something.

Thing is, I think that the JSON is actually complete, but since channel_parse_messages() isn't called it never gets the chance to parse the JSON content. That's how I read the logs at least.

> So what is in those 64 bytes? If you look in the channel log it should be there. You don't quote from a channel log, did you use ch_logfile()?

That was just pseudo-logs handwritten :) I ran it again and this time include what is actually read from the channel in channel_read().

Since I do get a few signs placed I can see in the log how it looks like when it works, this is one of the signs that gets placed:
RealWaitForChar() is called with 0 delay
channel_read() is called from RealWaitForChar(), it reads 63 bytes into the buffer
channel_save() is called from channel_read() with the buffer that starts with:
[0,[34,"call","sign_place",0,"govim","govimerr",1,{"lnum":6}]] (and 0x0a at the end)
RealWaitForChar() is called with 0 delay
RealWaitForChar() is called with 0 delay
RealWaitForChar() is called with 0 delay
channel_parse_messages() is called from parse_queued_messages()
channel_peek is called (for part 0)
channel_get_json is called
channel_parse_json is called
channel_peek is called (for part 1)
channel_get_json is called
channel_peek is called (for part 0)
channel_get_json is called
channel_parse_json is called
channel_peek is called (for part 1)
channel_get_json is called
channel_get_json is called
channel_parse_json is called
channel_peek is called (for part 2)
channel_get_json is called

And when it doesn't work it never gets to the channel_parse_messages() call:
RealWaitForChar() is called with 0 delay
channel_read() is called from RealWaitForChar(), it reads 64 bytes into the buffer
channel_save() is called from channel_read() with the buffer that starts with:
[0,[41,"call","sign_place",0,"govim","govimerr",1,{"lnum":12}]] (and 0x0a at the end)
RealWaitForChar() is called with 0 delay
RealWaitForChar() is called with 0 delay
RealWaitForChar() is called with 0 delay
channel_peek is called (part 0, I don't know from where)
channel_peek is called (part 0, I don't know from where)
RealWaitForChar() is called with -1 (as wait_func)

Ok, so I see what you referred to with:


> In inchar_loop() it should call parse_queued_messages() before blocking,

I did run again and logged the inchar_loop() calls, there are only two in the entire run and both with -1 as wtime.

So, we end up inside inchar_loop(), before calling wait_func() with a wait_time value set to -1 (that will block).
And to ensure that parse_queued_messages() are called before sleeping, either has_pending_job() or channel_any_readahead() must return TRUE, right?

That would explain why the patch to channel_any_readahead() works, since it will make it return TRUE?
Or should it be another check to ensure that parse_queued_messages() is called before blocking?

Best regards, Pontus

Pontus Leitzler

unread,
Jun 27, 2019, 3:33:16 PM6/27/19
to vim_dev
Just to summarize:
- channel_read() reads 63 bytes and calls channel_save()
- channel_save() saves those to a new node->rq_buffer at the end of channel head

To convert it to JSON someone must call channel_parse_json(), and it can be either from channel_read_json_block() or may_invoke_callback().
My logs shows that it is via may_invoke_callback() until it hangs.

- may_invoke_callback() is called from channel_parse_messages()
- channel_parse_messages() is called from parse_queued_messages()
- parse_queued_messages() is called from inchar_loop()

But, since channel_parse_json() haven't been called yet for the latest message, inchar_loop() will allow a wait_func()-call with wait_time == -1 (that leads to RealWaitForChar() NULL).

To avoid it, inchar_loop() calls channel_any_readahead(), but it will return FALSE since the messages haven't been parsed to JSON yet and it only checks the JSON queue in JSON mode.

So I guess the solution to this is to either let channel_any_readahead() call channel_peek() in JSON mode as well, or add a call to channel_parse_json() somewhere to ensure that the incoming message has been parsed so that channel_any_readahead() returns TRUE in JSON-mode (?).

I'm sorry if my explanation is messy, it's probably a combination of me not being familiar with the code base as well as not being native English :)

Best regards, Pontus

Bram Moolenaar

unread,
Jun 27, 2019, 10:58:09 PM6/27/19
to vim...@googlegroups.com, Pontus Leitzler

Pontus Leitzler wrote:

> Just to summarize:
> - channel_read() reads 63 bytes and calls channel_save()
> - channel_save() saves those to a new node->rq_buffer at the end of channel=
> head
>
> To convert it to JSON someone must call channel_parse_json(), and it can be=
> either from channel_read_json_block() or may_invoke_callback().
> My logs shows that it is via may_invoke_callback() until it hangs.
>
> - may_invoke_callback() is called from channel_parse_messages()
> - channel_parse_messages() is called from parse_queued_messages()
> - parse_queued_messages() is called from inchar_loop()
>
> But, since channel_parse_json() haven't been called yet for the latest mess=
> age, inchar_loop() will allow a wait_func()-call with wait_time =3D=3D -1 (=
> that leads to RealWaitForChar() NULL).
>
> To avoid it, inchar_loop() calls channel_any_readahead(), but it will retur=
> n FALSE since the messages haven't been parsed to JSON yet and it only chec=
> ks the JSON queue in JSON mode.
>
> So I guess the solution to this is to either let channel_any_readahead() ca=
> ll channel_peek() in JSON mode as well, or add a call to channel_parse_json=
> () somewhere to ensure that the incoming message has been parsed so that ch=
> annel_any_readahead() returns TRUE in JSON-mode (?).
>
> I'm sorry if my explanation is messy, it's probably a combination of me not=
> being familiar with the code base as well as not being native English :)

Thanks for the digging and explanation. channel_has_readahead() is
called in a few places, checking if there is something ready to read.
But it doesn't use any received text on a JSON channel if it hasn't been
parsed yet. It looks like calling channel_parse_json() in
channel_has_readahead() would be the right thing to do.

Please try if this patch fixes your problem:

--- prev/src/channel.c 2019-06-24 00:43:31.463691826 +0200
+++ channel.c 2019-06-28 04:41:25.767304149 +0200
@@ -2850,10 +2850,13 @@

if (ch_mode == MODE_JSON || ch_mode == MODE_JS)
{
- jsonq_T *head = &channel->ch_part[part].ch_json_head;
- jsonq_T *item = head->jq_next;
+ jsonq_T *head;

- return item != NULL;
+ // Parse json from readahead before checking the queue.
+ channel_parse_json(channel, part);
+
+ head = &channel->ch_part[part].ch_json_head;
+ return head->jq_next != NULL;
}
return channel_peek(channel, part) != NULL;
}

--
The greatest lies of all time:
(1) The check is in the mail.
(2) We have a really challenging assignment for you.
(3) I love you.
(4) All bugs have been fixed.
(5) This won't hurt a bit.
(6) Honey, I just need to debug this program and be home in 5 minutes.
(7) I have just sent you an e-mail about that.
(8) Of course I'll respect you in the morning.
(9) I'm from the government, and I'm here to help you.

Pontus Leitzler

unread,
Jun 28, 2019, 8:05:21 AM6/28/19
to vim_dev
> Please try if this patch fixes your problem:
>
> --- prev/src/channel.c 2019-06-24 00:43:31.463691826 +0200
> +++ channel.c 2019-06-28 04:41:25.767304149 +0200
> @@ -2850,10 +2850,13 @@
>
> if (ch_mode == MODE_JSON || ch_mode == MODE_JS)
> {
> - jsonq_T *head = &channel->ch_part[part].ch_json_head;
> - jsonq_T *item = head->jq_next;
> + jsonq_T *head;
>
> - return item != NULL;
> + // Parse json from readahead before checking the queue.
> + channel_parse_json(channel, part);
> +
> + head = &channel->ch_part[part].ch_json_head;
> + return head->jq_next != NULL;
> }
> return channel_peek(channel, part) != NULL;
> }
>

Works like a charm!

Bram Moolenaar

unread,
Jun 28, 2019, 4:07:25 PM6/28/19
to vim...@googlegroups.com, Pontus Leitzler
Thanks for checking, I'll include that change then.
Would be nice to have a test, but that may take some effort to create.

--
Facepalm reply #3: "I had a great time in Manhattan" "I thought you were
going to New York?"

Pontus Leitzler

unread,
Jun 28, 2019, 5:05:03 PM6/28/19
to vim_dev
Yeah, I don't really understand what triggers it. I've tried to create a small reproducible but I can't seen to make it trigger this. Could be a timing issue.

One thing I thought about is the 10ms delay that is added if there is more to read from the channel(?). That might introduce some latency to channel-based plugins that send many small messages.

In my case Paul Jolly's proposal about a sign_setlist (https://github.com/vim/vim/issues/4557) would help, but I guess there are other cases where it might get slow.

Paul Jolly

unread,
Jun 29, 2019, 9:20:10 AM6/29/19
to vim...@googlegroups.com
Thank you, Bram and Pontus, for tracking this one down!
> --
> --
> You received this message from the "vim_dev" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups "vim_dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/vim_dev/7ac768c0-6bc8-421b-b5bb-7e260c77271a%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages