Hi, Hongli,
Thank you very much for taking the time to respond to me. I have read through the documentation you provided, and it has been helpful, however I am seeing what I believe to be somewhat of an unusual issue.
Basically, what I did was bound to the actual rack process (from the output of passenger-status), and reproduced the issue. I actually got some good visibility into what was going on. When the error presented itself I receievd the following system calls over and over again:
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)
read(10, 0x37810c0, 16384) = -1 EINTR (Interrupted system call)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)
read(10, 0x37810c0, 16384) = -1 EINTR (Interrupted system call)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)
read(10, 0x37810c0, 16384) = -1 EINTR (Interrupted system call)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)
read(10, 0x37810c0, 16384) = -1 EINTR (Interrupted system call)
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = -1 EINTR (Interrupted system call)
read(10,
Based on the limited amount of research I did on this particular issue, I was experiencing the problem documented in the following URL (I am using RHEL 6.4 and verified that the default system ruby is ruby is ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux].).
http://timetobleed.com/ruby-threading-bugfix-small-fix-goes-a-long-way/I understand that puhsion develop REE
http://www.rubyenterpriseedition.com/download.html, but to be perfectly honest since 1.8.7 has fallen out of support, and since I am using a new version of Redmine, I figured it would be easiest to just not use 1.8.7 and instead use RVM to use a newer version of ruby in conjunction with Nginx. So what I did was build a new RVM as root (i.e. ruby-2.0.0-p247), and reference this in my Nginx configuration, i.e.
passenger_root /usr/local/rvm/gems/ruby-2.0.0-p247/gems/passenger-4.0.24;
passenger_ruby /usr/local/rvm/wrappers/ruby-2.0.0-p247/ruby;
Now, when I do this and look at the passenger-debug logs, it is clear to me, that the new phusion processes are using this particular gem. Unfortunately, the same problem persists (i.e. I let the browser sit for an hour and then a user clicks refresh and weird timeout-like errors occur).
Now, when I spawn passenger (via running Nginx) I confident my RVM ruby is loading, i.e. the following log entriess are in
Pool2/Spawner.h:411 ]: [App 4289 stdin >>] passenger_root: /usr/local/rvm/gems/ruby-2.0.0-p247/gems/passenger-4.0.24
Pool2/Spawner.h:411 ]: [App 4289 stdin >>] start_command: /usr/local/rvm/wrappers/ruby-2.0.0-p247/ruby /usr/local/rvm/gems/ruby-2.0.0-p247/gems/passenger-4.0.24/helper-scripts/rack-loader.rb
When I enumerate my passenger processes with passenger_stats, I see a few pids (because I have passenger_min_instances set). What I will do is trace several processes (i.e. all of the processes enumerated through passenger-stats), for example strace -p 4254 -p 4261 -p 4268 -p 4275 -p 4282 -p 4289 (from the output of the command below).
----------- Application groups -----------
/opt/rails/redmine-2.3.3#default:
App root: /opt/rails/redmine-2.3.3
Requests in queue: 0
* PID: 4254 Sessions: 0 Processed: 11 Uptime: 1h 20m 18s
CPU: 0% Memory : 65M Last used: 50m 32s
* PID: 4261 Sessions: 0 Processed: 0 Uptime: 1h 20m 18s
CPU: 0% Memory : 8M Last used: 1h 20m 1
* PID: 4268 Sessions: 0 Processed: 0 Uptime: 1h 20m 18s
CPU: 0% Memory : 7M Last used: 1h 20m 1
* PID: 4275 Sessions: 0 Processed: 0 Uptime: 1h 20m 18s
CPU: 0% Memory : 7M Last used: 1h 20m 1
* PID: 4282 Sessions: 0 Processed: 0 Uptime: 1h 20m 17s
CPU: 0% Memory : 7M Last used: 1h 20m 1
* PID: 4289 Sessions: 0 Processed: 0 Uptime: 1h 20m 17s
CPU: 0% Memory : 8M Last used: 1h 20m 1
#
What happens is when I do a trace of the processes, I get this (and nothing else), even after continuing to use the app. For some reason it appears that system tracing does not function properly when the app is spawned via RVM.
# strace -p 4254 -p 4261 -p 4268 -p 4275 -p 4282 -p 4289
Process 4254 attached - interrupt to quit
Process 4261 attached - interrupt to quit
Process 4268 attached - interrupt to quit
Process 4275 attached - interrupt to quit
Process 4282 attached - interrupt to quit
Process 4289 attached - interrupt to quit
[pid 4289] select(11, [0 10], NULL, NULL, NULL <unfinished ...>
[pid 4282] select(11, [0 10], NULL, NULL, NULL <unfinished ...>
[pid 4275] select(11, [0 10], NULL, NULL, NULL <unfinished ...>
[pid 4268] select(11, [0 10], NULL, NULL, NULL <unfinished ...>
[pid 4261] select(11, [0 10], NULL, NULL, NULL <unfinished ...>
[pid 4254] select(11, [0 10], NULL, NULL, NULL
As I said, at this point I can use the app and it functions fine, however I never see a single system call from any of these processes. The problem is of course, if I let it sit for a while, I see the exact same error message that I was seeing before. Has anybody seen an inability to trace passenger phusion processes when spawned from RVM? I can trace the processes fine using the system ruby (/usr/bin/ruby). Thank you so much for taking the time to answer my questions and to help me troubleshoot this problem.
In the meantime I am going to try REE, just to see what happens... Thank you again.
Dan Sullivan