Graceful shutdown of a faye-websocket-ruby client

504 views
Skip to first unread message

Matt Wynne

unread,
Apr 23, 2014, 6:18:16 PM4/23/14
to faye-...@googlegroups.com
Hi,

I'm building a websocket client in Ruby using faye-websocket.

I've come across an intermittent bug in my code which seems to be caused by me closing down the client too quickly after the last message has been sent. The close event reaches the server first, and the message never arrives before the server closes the connection.

If I hack a sleep into the transmitting code to give the message some time to arrive before I start to shut down, it all works.

I have two questions.

Firstly - is my assumption correct? Is it possible that a race condition could cause the close message to arrive at the server before the last actual message send arrives?

Secondly - is there a way to ask Faye::Websocket to close down gracefully so that it will take care of ensuring all messages have been transmitted to the server before closing the connection? Or do I need to build in some kind of acknowledgement into my protocol? Or some other alternative I haven't considered?

Thanks in advance!


I can also share the code if it helps.

cheers,
Matt

James Coglan

unread,
Apr 24, 2014, 4:35:48 AM4/24/14
to faye-...@googlegroups.com
On 23 April 2014 23:18, Matt Wynne <matt....@gmail.com> wrote:
Firstly - is my assumption correct? Is it possible that a race condition could cause the close message to arrive at the server before the last actual message send arrives?

Hi Matt,

There shouldn't be a race condition unless you're using the socket from multiple threads. Text messages and close frames are written to the socket in sequence and should be processed sequentially by the server.

I've put a little server/client together to show what should be happening. Here's my config.ru:

require 'faye/websocket'
Faye::WebSocket.load_adapter 'thin'

run -> env do
  ws = Faye::WebSocket.new(env)

  ws.on :message do |message|
    p [:server_message, message.data]
  end

  ws.on :close do |close|
    p [:server_close, close.code, close.reason]
  end
end

And my client.rb:

require 'faye/websocket'
require 'eventmachine'

EM.run {
  ws = Faye::WebSocket::Client.new('ws://localhost:3000/')

  ws.on :open do
    p [:client_open]
    ws.send 'Hello'
    ws.close
  end

  ws.on :close do |close|
    p [:client_close, close.code, close.reason]
    EM.stop
  end
}

If you open `faye-websocket-0.7.2/lib/faye/websocket/api.rb`, you'll see two methods called `write(data)` and `parse(data)`. These deal with all the TCP I/O for the underlying protocol, so if you put some logging in there to print out the bytes being send and received you might find out what's going on.

(I've just realised that you also need to add logging to the `write(data)` method in `faye-websocket-0.7.2/lib/faye/websocket.rb`, where it turns out I'm needlessly overriding the method from the API module. I'll fix that in the next release.)

On the client side, I get the following log pattern:

[:write, "47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a 48 6f 73 74 3a 20 6c
6f 63 61 6c 68 6f 73 74 3a 33 30 30 30 0d 0a 55 70 67 72 61 64 65 3a 20 77 65 62
73 6f 63 6b 65 74 0d 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72 61 64 65
0d 0a 53 65 63 2d 57 65 62 53 6f 63 6b 65 74 2d 4b 65 79 3a 20 51 7a 6c 61 33 43
54 6a 33 45 4b 67 6a 77 62 4d 43 64 78 38 6a 67 3d 3d 0d 0a 53 65 63 2d 57 65 62
53 6f 63 6b 65 74 2d 56 65 72 73 69 6f 6e 3a 20 31 33 0d 0a 0d 0a"]
[:parse, "48 54 54 50 2f 31 2e 31 20 31 30 31 20 53 77 69 74 63 68 69 6e 67 20
50 72 6f 74 6f 63 6f 6c 73 0d 0a 55 70 67 72 61 64 65 3a 20 77 65 62 73 6f 63 6b
65 74 0d 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72 61 64 65 0d 0a 53 65
63 2d 57 65 62 53 6f 63 6b 65 74 2d 41 63 63 65 70 74 3a 20 5a 75 48 58 75 41 49
4e 53 32 7a 64 63 75 5a 68 36 4e 6a 57 57 46 56 38 39 6c 49 3d 0d 0a 0d 0a"]
[:client_open]
[:write, "81 85 65 02 39 f4 2d 67 55 98 0a"]
[:write, "88 82 21 85 16 4c 22 6d"]
[:parse, "88 02 03 e8"]
[:client_close, 1000, ""]

The first :write and :parse are the HTTP request and response, after which the client is open. Then there are two :writes and one :parse before the socket is closed. The first write is a text message frame, the second is a close frame, and the parse is the client processing the server's close frame.

WebSocket frames are structured like this. The first byte tells you the type of a frame, and whether it's 'final' (i.e. the message is not part of a multi-frame sequence). The second byte tells you if it's masked, and the message length. If it's masked there are then 4 random bytes, followed by the message XORed with those bytes. If unmasked, the message is appended as plain UTF-8.

The messages sent by the client are masked, those sent by the server are not. Text frames contain the text you told the socket to send, and close frames contain a two-byte status code.

    81 85 65 02 39 f4 2d 67 55 98 0a
    +- +- \----+----/ \-----+------/
    |  |       |            |
    |  |      mask       message
    |  |
    |  +-- 8=masked, 5=length
    |
    +----- 8=final, 1=text

    88 82 21 85 16 4c 22 6d
    +- +- \----+----/ \-+-/
    |  |       |        |
    |  |      mask   message
    |  |
    |  +-- 8=masked, 2=length
    |
    +----- 8=final, 8=close

    88 02 03 e8
    +- +- \-+-/
    |  |    |
    |  | message
    |  |
    |  +-- 0=unmasked, 2=length
    |
    +----- 8=final, 8=close

On the server side, I'm seeing this:

[:write, "48 54 54 50 2f 31 2e 31 20 31 30 31 20 53 77 69 74 63 68 69 6e 67 20
50 72 6f 74 6f 63 6f 6c 73 0d 0a 55 70 67 72 61 64 65 3a 20 77 65 62 73 6f 63 6b
65 74 0d 0a 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72 61 64 65 0d 0a 53 65
63 2d 57 65 62 53 6f 63 6b 65 74 2d 41 63 63 65 70 74 3a 20 5a 75 48 58 75 41 49
4e 53 32 7a 64 63 75 5a 68 36 4e 6a 57 57 46 56 38 39 6c 49 3d 0d 0a 0d 0a"]
[:parse, "81 85 65 02 39 f4 2d 67 55 98 0a 88 82 21 85 16 4c 22 6d"]
[:server_message, "Hello"]
[:write, "88 02 03 e8"]
[:server_close, 1000, ""]

That is, it receives the frame bytes written by the client in one chunk, and correctly interprets them as the text message "Hello" followed by a close frame, which it sends a response to. The client receives that response and closes the TCP connection.
 
Secondly - is there a way to ask Faye::Websocket to close down gracefully so that it will take care of ensuring all messages have been transmitted to the server before closing the connection? Or do I need to build in some kind of acknowledgement into my protocol? Or some other alternative I haven't considered?

This should not be necessary -- the underlying websocket-driver writes bytes to the TCP socket in the order that it receives instructions, so if it's told to send a text message followed by a closing frame, that's what should happen. If it's not doing that, then it's a bug in websocket-driver. I'd be interested to know what you're seeing in your logs at the TCP level.

James Coglan

unread,
Apr 24, 2014, 4:51:52 AM4/24/14
to faye-...@googlegroups.com
On 24 April 2014 09:35, James Coglan <jco...@gmail.com> wrote:

    81 85 65 02 39 f4 2d 67 55 98 0a
    +- +- \----+----/ \-----+------/
    |  |       |            |
    |  |      mask       message
    |  |
    |  +-- 8=masked, 5=length
    |
    +----- 8=final, 1=text

More precisely, on the second byte, the first bit is the 'masked' flag and the remaining 7 bits are the frame length. So if the byte is >= 0x80 then the frame is masked. There is also a mechanism for dealing with lengths that don't fit in 7 bits. If the given length is 126, then the real length is given by the following 2 bytes. If 127, it's given by the following 8 bytes. The mask and payload follow after those extra length bytes.

James Coglan

unread,
Apr 24, 2014, 7:19:09 PM4/24/14
to faye-...@googlegroups.com
On 24 April 2014 09:35, James Coglan <jco...@gmail.com> wrote:
(I've just realised that you also need to add logging to the `write(data)` method in `faye-websocket-0.7.2/lib/faye/websocket.rb`, where it turns out I'm needlessly overriding the method from the API module. I'll fix that in the next release.)

I've just released faye-websocket 0.7.3, which removes this unnecessary method override. 

Matt Wynne

unread,
Apr 25, 2014, 5:07:55 AM4/25/14
to faye-...@googlegroups.com
Thanks for replying so quickly and thoroughly James.

On 24 Apr 2014, at 09:35, James Coglan <jco...@gmail.com> wrote:

On 23 April 2014 23:18, Matt Wynne <matt....@gmail.com> wrote:
Firstly - is my assumption correct? Is it possible that a race condition could cause the close message to arrive at the server before the last actual message send arrives?

Hi Matt,

There shouldn't be a race condition unless you're using the socket from multiple threads.

Bingo!

I was using separate threads because I need the client to work in the background while other stuff is going on. I just hadn’t properly got my head around how it all hangs together.

I think the main thing that your example below helped me to realise is that the bulk of the work needs to be done in the ‘open’ event handler on the client. I had assumed that if I did that I would block the main EventMachine reactor loop, but I guess the `queue.pop` will be giving control back to the reactor loop regularly enough.

I’ve now adapted the code to look like this: https://github.com/cucumber-ltd/cucumber-pro-ruby/blob/master/lib/cucumber/pro.rb and I can’t get the race condition to reappear.

Any comments on my approach are very welcome!

--
You received this message because you are subscribed to a topic in the Google Groups "Faye users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/faye-users/C9aV8e0ZiKo/unsubscribe.
To unsubscribe from this group and all its topics, send an email to faye-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

cheers,

signature.asc

James Coglan

unread,
Apr 25, 2014, 5:25:03 AM4/25/14
to faye-...@googlegroups.com
On 25 April 2014 10:07, Matt Wynne <matt....@gmail.com> wrote:
I think the main thing that your example below helped me to realise is that the bulk of the work needs to be done in the ‘open’ event handler on the client. I had assumed that if I did that I would block the main EventMachine reactor loop, but I guess the `queue.pop` will be giving control back to the reactor loop regularly enough.

I’ve now adapted the code to look like this: https://github.com/cucumber-ltd/cucumber-pro-ruby/blob/master/lib/cucumber/pro.rb and I can’t get the race condition to reappear.

I would have thought that https://github.com/cucumber-ltd/cucumber-pro-ruby/blob/master/lib/cucumber/pro.rb#L75 would block the event loop, although I'm often surprised at the behaviour of EM stuff. Generally, if you're doing stuff like this where multiple actors need to write to some serial protocol, I'd wrap the socket in a queue or a locking mechanism to make sure things are written in a consistent order. 

Matt Wynne

unread,
Apr 27, 2014, 4:15:11 PM4/27/14
to faye-...@googlegroups.com
On 25 Apr 2014, at 10:25, James Coglan <jco...@gmail.com> wrote:

On 25 April 2014 10:07, Matt Wynne <matt....@gmail.com> wrote:
I think the main thing that your example below helped me to realise is that the bulk of the work needs to be done in the ‘open’ event handler on the client. I had assumed that if I did that I would block the main EventMachine reactor loop, but I guess the `queue.pop` will be giving control back to the reactor loop regularly enough.

I’ve now adapted the code to look like this: https://github.com/cucumber-ltd/cucumber-pro-ruby/blob/master/lib/cucumber/pro.rb and I can’t get the race condition to reappear.

I would have thought that https://github.com/cucumber-ltd/cucumber-pro-ruby/blob/master/lib/cucumber/pro.rb#L75 would block the event loop, although I'm often surprised at the behaviour of EM stuff.

You were right about this - the socket wasn’t able to receive any messages while it was in this sending loop, so it looks like that blocked the EM thread. I’ve changed it to use a recursive EM.next_tick call instead which seems to be working fine.



signature.asc
Reply all
Reply to author
Forward
0 new messages