Google Groups

Re: H12 errors, blocked dynos and Heroku website claims.


Oren Teich Feb 16, 2011 4:56 PM
Posted in group: Heroku Community
The timeout is only on the routing side.

If you have 2 dynos, and 1 is running "forever", then 50% of your requests to your app will time out.

This is expected behavior on Heroku today.

I strongly encourage you to use the rack-timeout gem to ensure that your dyno terminates the request after 30 seconds so you don't get this odd behavior.
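For illustration, a rack-timeout-style middleware can be sketched with Ruby's stdlib Timeout module. This is only a sketch of the idea, not the rack-timeout gem's actual implementation (which you should use in production):

```ruby
require "timeout"

# Minimal Rack-style timeout middleware: wraps the downstream app and
# aborts any request that runs longer than the limit, freeing the process.
class RequestTimeout
  def initialize(app, seconds: 30)
    @app = app
    @seconds = seconds
  end

  def call(env)
    Timeout.timeout(@seconds) { @app.call(env) }
  rescue Timeout::Error
    [503, { "Content-Type" => "text/plain" }, ["Request timed out"]]
  end
end

# A fast app and a deliberately slow one, as plain Rack-style lambdas.
fast = ->(env) { [200, {}, ["ok"]] }
slow = ->(env) { sleep 2; [200, {}, ["done"]] }

puts RequestTimeout.new(fast, seconds: 1).call({}).first  # 200
puts RequestTimeout.new(slow, seconds: 1).call({}).first  # 503
```

The point is that the timeout fires inside the dyno, so the worker is released after 30 seconds instead of staying blocked while the router has already given up.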


-- 
Oren Teich

On Wednesday, February 16, 2011 at 12:13 PM, Tim W wrote:

Wow.. that does say the mesh will hold a request until an app is
available, but that is not how Heroku is currently operating..

I'll write this up a bit better and send it off to Heroku support...
but here is how to duplicate...

---
Created a new app: http://h12-test2.heroku.com/
A simple Sinatra app that responds to two paths, / and /wait (/wait
sleeps for 300 seconds).

=== h12-test2
Web URL: http://h12-test2.heroku.com/
Git Repo: g...@heroku.com:h12-test2.git
Dynos: 2
Workers: 0
Repo size: 1M
Slug size: 1M
Stack: bamboo-ree-1.8.7
Data size: (empty)
Addons: Expanded Logging, Shared Database 5MB
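The two-path repro app described above can be sketched as a bare Rack-style lambda (a hypothetical reconstruction; the actual h12-test2 source isn't shown):

```ruby
# Hypothetical sketch of the repro app: "/" answers instantly,
# "/wait" holds the dyno well past the router's 30-second limit.
app = lambda do |env|
  case env["PATH_INFO"]
  when "/"
    [200, { "Content-Type" => "text/plain" }, ["hello from h12-test2"]]
  when "/wait"
    sleep 300  # block this dyno for 5 minutes
    [200, { "Content-Type" => "text/plain" }, ["done waiting"]]
  else
    [404, { "Content-Type" => "text/plain" }, ["not found"]]
  end
end

p app.call("PATH_INFO" => "/").first  # 200
```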


2 dynos available...
If I have one tab open with /wait, then according to Heroku I should
not get a timeout on / if I just hit reload and reload in my
browser... but I do...


2011-02-16T12:07:03-08:00 heroku[web.2]: State changed from starting to up
2011-02-16T12:07:04-08:00 heroku[web.1]: State changed from starting to up
2011-02-16T12:09:44-08:00 heroku[router]: Error H12 (Request timeout) - GET h12-test2.heroku.com/wait dyno=web.2 queue=0 wait=0ms service=0ms bytes=0
2011-02-16T12:09:45-08:00 heroku[router]: GET h12-test2.heroku.com/favicon.ico dyno=web.1 queue=0 wait=0ms service=7ms bytes=221
2011-02-16T12:09:45-08:00 app[web.1]: xx.223.127.66, 10.110.34.145 - - [16/Feb/2011 12:09:45] "GET /favicon.ico HTTP/1.1" 404 18 0.0009
2011-02-16T12:09:45-08:00 heroku[nginx]: GET /favicon.ico HTTP/1.1 | xx.223.127.66 | 252 | http | 404
2011-02-16T12:09:50-08:00 heroku[router]: Error H12 (Request timeout) - GET h12-test2.heroku.com/ dyno=web.2 queue=0 wait=0ms service=0ms bytes=0
2011-02-16T12:09:50-08:00 heroku[nginx]: GET / HTTP/1.1 | xx.223.127.66 | 3364 | http | 502
2011-02-16T12:10:01-08:00 heroku[router]: GET h12-test2.heroku.com/ dyno=web.1 queue=0 wait=0ms service=1ms bytes=238
2011-02-16T12:10:01-08:00 app[web.1]: xx.223.127.66, 10.101.29.42 - - [16/Feb/2011 12:10:01] "GET / HTTP/1.0" 200 59 0.0004
2011-02-16T12:10:01-08:00 heroku[nginx]: GET / HTTP/1.1 | xx.223.127.66 | 269 | http | 200
2011-02-16T12:10:20-08:00 heroku[router]: Error H12 (Request timeout) - GET h12-test2.heroku.com/favicon.ico dyno=web.2 queue=0 wait=0ms service=0ms bytes=0
2011-02-16T12:10:20-08:00 heroku[router]: GET h12-test2.heroku.com/favicon.ico dyno=web.1 queue=0 wait=0ms service=8ms bytes=221
2011-02-16T12:10:20-08:00 app[web.1]: xx.223.127.66, 10.110.34.145 - - [16/Feb/2011 12:10:20] "GET /favicon.ico HTTP/1.0" 404 18 0.0005
2011-02-16T12:10:20-08:00 heroku[nginx]: GET /favicon.ico HTTP/1.1 | xx.223.127.66 | 3364 | http | 502
2011-02-16T12:10:21-08:00 heroku[nginx]: GET /favicon.ico HTTP/1.1 | xx.223.127.66 | 252 | http | 404
2011-02-16T12:10:33-08:00 heroku[router]: Error H12 (Request timeout) - GET h12-test2.heroku.com/ dyno=web.2 queue=0 wait=0ms service=0ms bytes=0
2011-02-16T12:10:33-08:00 heroku[nginx]: GET / HTTP/1.1 | xx.223.127.66 | 3364 | http | 502
2011-02-16T12:10:44-08:00 app[web.1]: xx.223.127.66, 10.108.18.220 - - [16/Feb/2011 12:10:44] "GET / HTTP/1.0" 200 59 0.0004

The dyno web.2 is busy and web.1 is open... yet requests get sent to
web.2 and H12 timeout. Why?


-tim


On Feb 16, 2:12 pm, Neil Middleton <neil.middle...@gmail.com> wrote:
Although the symptoms that you are seeing may not indicate it, the two systems are the same. There is a queue backlog, and the dynos pick up the next request from that backlog when they become idle, as described here: http://devcenter.heroku.com/articles/key-concepts-performance (esp. the part about backlog).

I've seen many instances on my applications which indicate this to be an accurate description.

Although I have no real insight into your application (number of dynos, etc.), it would appear that what you are seeing could be a different problem.

Neil Middleton
http://about.me/neilmiddleton

On Wednesday, 16 February 2011 at 18:01, Tim W wrote:
It is not identical to what Heroku is providing.. The Heroku mesh
seems to blindly send a request to a dyno, no matter the current
status of the dyno. The queue is at the dyno level. Passenger holds
back the request until a process is available..

With Passenger you do not end up in the situation noted below, whereas
with Heroku you do..
(Request Y gets served OK with Passenger; with Heroku, request Y gets
the H12 error.)

Quoted from passenger docs (this is what happens if you have that
feature turned off on passenger and what always happens with Heroku):
----------------------------------------------------------------------
The situation looks like this:

Backend process A: [* ] (1 request in queue)
Backend process B: [*** ] (3 requests in queue)
Backend process C: [*** ] (3 requests in queue)
Backend process D: [*** ] (3 requests in queue)
Each process is currently serving short-running requests.

Phusion Passenger will forward the next request to backend process A.
A will now have 2 items in its queue. We’ll mark this new request with
an X:

Backend process A: [*X ] (2 requests in queue)
Backend process B: [*** ] (3 requests in queue)
Backend process C: [*** ] (3 requests in queue)
Backend process D: [*** ] (3 requests in queue)

Assuming that B, C and D still aren’t done with their current request,
the next HTTP request - let’s call this Y - will be forwarded to
backend process A as well, because it has the least number of items in
its queue:

Backend process A: [*XY ] (3 requests in queue)
Backend process B: [*** ] (3 requests in queue)
Backend process C: [*** ] (3 requests in queue)
Backend process D: [*** ] (3 requests in queue)

But if request X happens to be a long-running request that needs 60
seconds to complete, then we'll have a problem. Y won't be processed
for at least 60 seconds. It would have been a better idea if Y were
forwarded to process B, C or D instead, because they only have
short-running requests in their queues.
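The least-busy routing described in that walkthrough can be sketched in a few lines of Ruby (a toy model, assuming queues are plain arrays and "routing" just appends the request to the shortest queue):

```ruby
# Toy model of per-process least-busy routing: each backend has its own
# queue, and a new request goes to whichever queue is currently shortest.
queues = {
  "A" => ["*"],            # 1 request in queue
  "B" => ["*", "*", "*"],  # 3 requests in queue
  "C" => ["*", "*", "*"],
  "D" => ["*", "*", "*"],
}

route = lambda do |req|
  name, queue = queues.min_by { |_, q| q.length }
  queue << req
  name
end

route.call("X")           # A is shortest, so X lands on A
route.call("Y")           # A is still shortest (2 < 3), so Y queues behind X
puts queues["A"].inspect  # ["*", "X", "Y"]
```

If X then runs for 60 seconds, Y sits behind it even though B, C and D drain quickly, which is exactly the failure mode the quote describes.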

On Feb 16, 12:50 pm, Neil Middleton <neil.middle...@gmail.com> wrote:
Is this not identical to what Heroku provides though? Your global queue is your application's dynos, and the routing mesh will send requests to whichever dynos are idle. The wait is the backlog.

The only difference I can see is that Passenger won't, by default, spit back any requests that take longer than 30 seconds.

Neil Middleton
http://about.me/neilmiddleton

On Wednesday, 16 February 2011 at 17:46, Tim W wrote:
Passenger... imho handles this better than Heroku


If global queuing is turned on, then Phusion Passenger will use a global queue that’s shared between all backend processes. If an HTTP request comes in, and all the backend processes are still busy, then Phusion Passenger will wait until at least one backend process is done, and will then forward the request to that process.

The default is on.
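The global-queue behavior quoted above can be sketched with Ruby's stdlib Thread::Queue (a toy model of the idea, not Passenger's actual implementation):

```ruby
# Toy model of a global queue: one shared queue, and each worker pulls
# the next request only when it becomes free.
requests = Queue.new
served   = Queue.new

workers = 2.times.map do
  Thread.new do
    while (req = requests.pop) != :stop
      sleep req[:cost]       # simulate doing the work
      served << req[:name]
    end
  end
end

requests << { name: "X", cost: 0.3 }  # long request occupies one worker
requests << { name: "Y", cost: 0.0 }  # Y is picked up by the idle worker
2.times { requests << :stop }
workers.each(&:join)

order = 2.times.map { served.pop }
puts order.inspect  # ["Y", "X"] -- Y was not stuck behind X
```

Because no request is committed to a busy worker in advance, a long-running X never blocks a short Y, which is the behavior Tim is contrasting with Heroku's routing.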

On Feb 16, 12:36 pm, Neil Middleton <neil.middle...@gmail.com> wrote:
AFAIK Passenger does have a similar concept with running processes (having a default of six running processes, which are comparable to 6 dynos).

The situation you describe should have the same results on Passenger as Heroku. More info on Passenger here: http://www.modrails.com/documentation/Users%20guide%20Apache.html#_re...

Neil Middleton
http://about.me/neilmiddleton

On Wednesday, 16 February 2011 at 17:29, Tim W wrote:
I guess I am just used to using Passenger, which uses a global queue,
making a single long-running request a non-issue.

On Feb 16, 11:57 am, Neil Middleton <neil.middle...@gmail.com> wrote:
It is, but you have a healthy dyno. If the dyno crashes, or hangs somehow, it gets removed.

Neil Middleton
http://about.me/neilmiddleton

On Wednesday, 16 February 2011 at 16:55, Tim W wrote:
Thanks, I will give rack-timeout a try.

So what it seems like is that the routing mesh is not as sophisticated
as Heroku lets on?

On Feb 16, 11:45 am, Neil Middleton <neil.middle...@gmail.com> wrote:
The dyno is still running the long request, successfully. It's only the routing mesh that's returned the timeout error back to the user. Therefore, the dyno is still in your 'grid' and ready for new requests.

I blogged about something very similar a couple of weeks back: http://neilmiddleton.com/avoiding-zombie-dynos-with-heroku

Neil Middleton
http://about.me/neilmiddleton

On Wednesday, 16 February 2011 at 16:42, Tim W wrote:
The Heroku website claims:

http://heroku.com/how/dyno_grid_last#3
"If a dyno is unresponsive for any reason (user bugs, long requests,
or high load), other requests will be routed around it."

In my experience, this does not seem to be the case. We have several
admin features in our app that, when requested with certain params,
can take longer than 30s to run. (I am working on ways to get these
in check and into the background.) When a user trips one of these
long-running requests, Heroku appears to queue additional requests to
this dyno, and those requests time out, even though there are plenty
of other dynos available to handle them.

Is the statement on the Heroku website true or false? It does not
appear that Heroku actively monitors the dynos to see if they are
busy with a long-running request. Is there a better way to handle
this situation?

Thanks..
-tim

--
You received this message because you are subscribed to the Google Groups "Heroku" group.
To post to this group, send email to her...@googlegroups.com.
To unsubscribe from this group, send email to heroku+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/heroku?hl=en.
