Passenger with Rails 5's ActionCable not working?


Chris Oliver

03.01.2016, 15:54:10
to Phusion Passenger Discussions
Hey guys! I'm trying to run ActionCable in Rails 5 with Passenger 5.0.22. So far, no luck. 

I'm using this fork of the original Rails actioncable-examples that's been updated to use the Rails 5.0.0.beta1 library instead of the old gem for actioncable. https://github.com/dhaneshnm/actioncable-examples

First off, I tried setting up two server blocks in nginx, using Passenger to load the Rails config.ru and, on a different port, the ActionCable config.ru.

Here's my Nginx config:

server {
  listen 80;
  server_name 192.168.1.30;

  passenger_enabled on;
  rails_env production;
  root /home/deploy/actioncable-example/public;
}

server {
  listen 28080;
  server_name 192.168.1.30;

  passenger_enabled on;
  rails_env production;
  root /home/deploy/actioncable-example/cable/public; # Try to set it up to load the cable/config.ru for ActionCable
}

The problem with this is that I get "Sending 502 response: application did not send a complete response" in /var/log/nginx/error.log each time ActionCable's client-side JS tries to connect to the websocket.

The client side reports "WebSocket connection to 'ws://192.168.1.30/cable' failed: WebSocket is closed before the connection is established.", as expected when receiving the 502.

I've also tried mounting ActionCable at a URL via config/routes.rb (updating the websocket URL in the JS accordingly), to no avail:

  mount ActionCable.server => "/cable"

My understanding was that this approach removes the need to run two separate servers in your nginx config. (Correct?)
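For context, a minimal sketch of what the single-server routing could look like; the :messages resource here is a made-up placeholder, only the mount line is from this thread:

```ruby
# config/routes.rb -- sketch of the single-server setup
Rails.application.routes.draw do
  # Placeholder resource, not from the example app
  resources :messages, only: [:index, :create]

  # Serve ActionCable from the same Passenger/nginx server,
  # so no second server block on another port is needed
  mount ActionCable.server => "/cable"
end
```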

On Twitter, someone from Phusion replied to my question and mentioned they did get this working (https://twitter.com/phusion_nl/status/681826248121782272). I've also tested your faye-websocket demo, and that works correctly. So I'm not entirely sure what needs fixing here; it could be an ActionCable problem or a Passenger problem.

Any ideas what might be going wrong?

Daniel Knoppel

04.01.2016, 10:06:30
to Phusion Passenger Discussions
Hi Chris, thanks for getting on to the support forum.

I tested the dhaneshnm fork with Passenger Standalone (using nginx internally), and both the cable server and the regular app launch (live-comment example) without errors:

:actioncable_fork/cable$ passenger start --port 28080
..
..
App 30562 stderr: I, [2016-01-04T15:56:37.470365 #30562]  INFO -- : Celluloid 0.17.2 is running in BACKPORTED mode. [ http://git.io/vJf3J ]

:actioncable_fork$ passenger start
..
..
Started GET "/" [WebSocket] for 127.0.0.1 at 2016-01-04 15:58:01 +0100

N.B. the live chat in the fork doesn't seem to work without refreshing the page manually, but the same thing happens with puma / rails s, so that doesn't seem Passenger-related.

I'll give it a try with the "single server" setup soon too.

- Daniel

Chris Oliver

04.01.2016, 10:26:10
to Phusion Passenger Discussions
Yeah, everything seems to boot up properly; the websockets are the primary thing that isn't working for me. At one point, after I left the page open long enough, Passenger appeared to freeze. The websockets keep retrying the connection, and it looks like they end up consuming all the available workers because they never return a full response.

I spent a few hours poking around yesterday and couldn't quite pinpoint what's actually going wrong. It seems like the Celluloid thread hangs, causing the Faye websocket threads never to complete. It's hard for me to say whether it's a Passenger issue or actually something wrong in ActionCable / Celluloid / etc. I also wasn't entirely sure whether this had anything to do with Rack hijack.

Everything worked for me with Puma in both configurations (separate-server and single-server setup). The websockets processed fine and the Celluloid thread didn't hang. I can give you logs for both versions to compare if that's helpful.

Cheers,
Chris

Daniel Knoppel

06.01.2016, 06:02:26
to Phusion Passenger Discussions
A small update: I see what you're saying. I had tested things with 5.0.0.alpha (and related dependencies), and that was all working. With 5.0.0.beta1, one of the two Passenger instances gets hanging sessions.

More specifically, the one hosting the /cable part. I also see this in its logs:

[info] 28446#0: *4 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 127.0.0.1, server: _, request: "GET / HTTP/1.1"

I'm not sure why the client side would close the connection, but Passenger supports half-closing connections so if the application side doesn't send any data it will keep the session open/hanging. Maybe there's something in Passenger's initial reply that the client doesn't appreciate. I do see a difference in the response headers for the websocket open.

To be continued.

- Daniel

Chris Oliver

06.01.2016, 14:08:19
to Phusion Passenger Discussions
Interesting that the alpha worked; I hadn't tried that version. Curious what you find out!

Dan Fox

06.01.2016, 21:55:49
to Phusion Passenger Discussions
I've spent several hours reading through the source for Passenger, Celluloid, and ActionCable, and I'm pretty sure the problem is that Celluloid does not work across processes. It uses condition variables to wait for, and signal, the actor threads that process messages in the queue.

  1. When the Passenger worker process receives the request from ActionCable to create a new websocket, ActionCable creates one and configures the socket to handle a few messages from Faye, such as on_open, as seen here: https://github.com/rails/rails/blob/master/actioncable/lib/action_cable/connection/base.rb#L73
  2. send_async attempts to invoke the on_open method on a pool of Celluloid actors.
  3. This causes the Celluloid proxy for the pool to add the method call as a message to the actor pool's mailbox, as seen here: https://github.com/celluloid/celluloid/blob/master/lib/celluloid/proxy/async.rb#L12
  4. The mailbox's shovel method ("<<") eventually signals the mailbox's condition variable (as seen here: https://github.com/celluloid/celluloid/blob/v0.17.2/lib/celluloid/mailbox.rb#L41), which is supposed to wake the thread waiting to process messages. But the signal (triggered in the Passenger worker process) isn't waking the actor thread, which still lives in the Passenger preloader process.
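The cross-process condition-variable problem can be sketched in plain Ruby, with no Celluloid involved: a thread waiting on a ConditionVariable in the parent is never woken by a signal from a forked child, because the child only signals its own copy-on-write copy of the variable (and threads other than the forking one don't even survive the fork).

```ruby
mutex = Mutex.new
cond  = ConditionVariable.new
woken = false

# Stand-in for a Celluloid actor thread blocked on its mailbox.
waiter = Thread.new do
  mutex.synchronize { cond.wait(mutex) }
  woken = true
end
sleep 0.2 # give the waiter time to block

# Stand-in for a Passenger worker forked from the preloader.
pid = fork do
  # Only the forking thread exists in the child; this signal
  # reaches the child's copy of `cond`, not the parent's waiter.
  mutex.synchronize { cond.signal }
end
Process.wait(pid)

sleep 0.2
puts(woken ? "waiter woke up" : "waiter still blocked")
waiter.kill
```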
I should also note that I put

  ActionCable.server.worker_pool

in my config.ru for ActionCable so that the pool is initialized when the Passenger preloader loads the app. If you don't make the preloader start it, every worker attempts to start its own websocket actor pool, which isn't what we want: with the dhaneshnm fork's live-comments example, some clients would be connected to one worker and other clients to another, and the live comments wouldn't be shared between the two sets.
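Concretely, the cable-side rackup file with that eager initialization might look like this; the require path is an assumption based on the example app's cable/ subdirectory layout:

```ruby
# cable/config.ru -- sketch; path assumed from the actioncable-examples layout
require ::File.expand_path('../../config/environment', __FILE__)
Rails.application.eager_load!

# Create the Celluloid worker pool now, in the Passenger preloader,
# rather than lazily in each forked worker
ActionCable.server.worker_pool

run ActionCable.server
```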

Unfortunately, as far as I can tell from my admittedly limited searching, changing Celluloid to support working across multiple processes would require quite a bit of work. Ruby's IO pipes would work, except they have to be integrated with the process forking, since in one process you have to close the reader end and in the other process the writer end.
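As a minimal illustration of the pipe approach, and of the close-the-unused-end bookkeeping it forces onto the forking code:

```ruby
reader, writer = IO.pipe

pid = fork do
  reader.close          # child only writes, so close its reader
  writer.write("wake up")
  writer.close
end

writer.close            # parent only reads; without this, read never sees EOF
msg = reader.read       # blocks until the child closes its writer
reader.close
Process.wait(pid)

puts msg                # => wake up
```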

I'm thinking the easiest fix might actually be to change ActionCable to no longer have the dependency on Celluloid.

Of course, if I'm correct, then Rails 5.0.0.alpha shouldn't have worked either, since as far as I can tell ActionCable depended on Celluloid at that time too; so I could definitely be wrong about the real cause.

Daniel Knoppel

07.01.2016, 05:10:42
to Phusion Passenger Discussions
Dan: those are very interesting insights, thanks!

Based on what you said about the preloader, I tried Passenger with direct spawning rather than smart spawning (without any other changes), and it suddenly started working. I'll continue testing to see what happens with multiple workers.

Btw, the alpha I tested depends on Celluloid 0.16.x, while the beta depends on Celluloid 0.17.2.

- Daniel

Tinco Andringa (Phusion)

07.01.2016, 11:50:51
to Phusion Passenger Discussions
Just a note: Celluloid does not need to work cross-process in this situation. The solution is to have every Celluloid instance be fully separate, as there is no need for actors to communicate with actors in other processes. Direct spawning mode in Passenger enables this separation by not using fork. Communicating with actors in other processes is generally the same as with actors on other machines: you would do that through Redis or, for example, DCell.


Dan Fox

07.01.2016, 13:10:51
to Phusion Passenger Discussions
Switching to direct spawning worked for me too. That must be what ActionCable is using Redis for, then: broadcasting to clients connected to different workers.

Dan Fox

07.01.2016, 13:52:20
to Phusion Passenger Discussions
OK, I tried out a fork someone made that swapped Celluloid for concurrent-ruby, and it worked for me in both direct spawning and smart spawning. The only catch: when I limited the number of workers to 1, connected 5 clients, and tried the live-comment example in the dhaneshnm fork, only 1 of the other 4 clients got the update, and the Passenger output included this:
[ 2016-01-07 12:37:58.8949 19995/7fd347f4b700 age/Cor/Con/InternalUtils.cpp:105 ]: [Client 1-3] Sending 502 response: application did not send a complete response
[ 2016-01-07 12:37:58.8963 19995/7fd347f4b700 age/Cor/Con/InternalUtils.cpp:105 ]: [Client 1-4] Sending 502 response: application did not send a complete response
[ 2016-01-07 12:37:58.9100 19995/7fd347f4b700 age/Cor/Con/InternalUtils.cpp:105 ]: [Client 1-6] Sending 502 response: application did not send a complete response
[ 2016-01-07 12:37:58.9159 19995/7fd347f4b700 age/Cor/Con/InternalUtils.cpp:105 ]: [Client 1-8] Sending 502 response: application did not send a complete response
[ 2016-01-07 12:37:58.9175 19995/7fd347f4b700 age/Cor/Con/InternalUtils.cpp:105 ]: [Client 1-9] Sending 502 response: application did not send a complete response

I also saw that, as I connected each client while Passenger was in smart mode, it launched a new worker for each connected client.

Testing Puma with only a single thread worked with all 5 clients, so perhaps it's how ActionCable uses EventMachine? I don't know.

Daniel Knoppel

07.01.2016, 18:18:48
to Phusion Passenger Discussions
@Dan, this is actually a known phenomenon: Passenger has a default concurrency of 1 for Rails apps, so to allow multiple websocket connections per worker you can specify a higher (or unlimited) concurrency via the max-concurrency override option.
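In the nginx config from earlier in the thread, that override could look like the following; the option name is assumed from Passenger 5.0.22+, so verify it against your Passenger version:

```nginx
server {
  listen 80;
  server_name 192.168.1.30;

  passenger_enabled on;
  rails_env production;
  root /home/deploy/actioncable-example/public;

  # Lift the default per-process concurrency of 1 so one worker can
  # hold many open websocket connections (0 = unlimited).
  # Option name assumed from Passenger 5.0.22+ docs.
  passenger_force_max_concurrent_requests_per_process 0;
}
```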

That achieves the desired result, but I've stumbled upon a null-pointer (i.e. crash) issue that needs some work.

- Daniel

Daniel Knoppel

10.01.2016, 22:09:56
to Phusion Passenger Discussions