API timeouts

Markus "Shorty" Uckelmann

Nov 28, 2016, 11:06:57 AM
to salt-...@googlegroups.com
Hi all,

I'm running into a timeout of a runner started via Salt-API. I set the
timeout in the JSON POST but it doesn't seem to be working. I'm not even
sure which part the timeout is coming from.

Code and API-Return:
https://gist.github.com/bemeyert/0b85b300214b14321550be0caa31c544

Any help would be much appreciated.

Regards,

Shorty

Seth House

Nov 28, 2016, 2:15:52 PM
to salt users list
Hi. Salt's RunnerClient API has an internal timeout [1]. It looks like
you're hitting that. There was a recent PR to catch that error to
return a better HTTP status code [2]. The syntax in your gist looks
exactly right to me. I would expect that to work. I tested that param
via salt-api a couple weeks ago with success on 2016.3 (IIRC) for a
runner that took around six minutes to run.
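
For readers following along, here is a minimal sketch of what such a request might look like, built with Python's standard library. The gist's contents are not reproduced in this thread, so the URL, the token placeholder, the orchestration name (`deploy.app`) and the helper name `build_runner_request` are all assumptions for illustration; the point is only where the `timeout` key sits in the lowstate.

```python
import json
import urllib.request


def build_runner_request(url, token, fun, timeout_s, **kwargs):
    """Build (but do not send) a POST for a synchronous runner call.

    Hypothetical helper: the lowstate layout is an assumption about
    what the original gist posted, not a copy of it.
    """
    lowstate = {"client": "runner", "fun": fun, "timeout": timeout_s}
    lowstate.update(kwargs)
    body = json.dumps([lowstate]).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json",
        "X-Auth-Token": token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder host, token and orchestration name; 30 minutes in seconds.
req = build_runner_request(
    "https://salt.example.com:8000/",
    "<token>",
    "state.orchestrate",
    30 * 60,
    mods="deploy.app",
)
# To actually send it: urllib.request.urlopen(req)
```

The request is only constructed here, so the snippet can be run without a Salt master at hand.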

What Salt version are you running?
Do you actually need the full 30 minutes or is there any difference if
you lower that value a tad?

[1] https://docs.saltstack.com/en/latest/ref/clients/index.html#salt.runner.RunnerClient.cmd_sync
[2] https://github.com/saltstack/salt/pull/37533

Markus "Shorty" Uckelmann

Nov 29, 2016, 6:38:37 AM
to salt-...@googlegroups.com
On 28.11.2016 at 20:15, Seth House wrote:

Hi Seth,

Thanks for your answer.

> Hi. Salt's RunnerClient API has an internal timeout [1]. It looks like
> you're hitting that. There was a recent PR to catch that error to
> return a better HTTP status code [2]. The syntax in your gist looks
> exactly right to me. I would expect that to work. I tested that param
> via salt-api a couple weeks ago with success on 2016.3 (IIRC) for a
> runner that took around six minutes to run.

The docs say that the timeout is set to "none", so there shouldn't be
any timeout?

The HTTP code is 500. See the Gist; I updated it with more information.
I also made a mistake: my first tests were without the timeout. After I did
it right by using the timeout, the log message changed. I updated the Gist
accordingly. Sorry for the mix-up.

The orchestrator job runs fine and takes a little longer than 10
minutes, but I always get the 500/timeout error from the API.

> What Salt version are you running?

2016.3.4

> Do you actually need the full 30 minutes or is there any difference if
> you lower that value a tad?

I also tried it with 15 minutes. But it had the same outcome.

When the orchestrator is run on the CLI via "salt-run" it works just
fine. It is even about a minute faster.


Thanks, Shorty

Seth House

Nov 29, 2016, 12:40:17 PM
to salt users list
On Tue, Nov 29, 2016 at 4:38 AM, Markus "Shorty" Uckelmann
<sho...@koeln.de> wrote:
> The docs say that the timeout is set to "none", so there shouldn't be any
> timeout?

In most places in Salt the default value is set inside the method
rather than in the function signature.
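
As an illustration of that pattern (not Salt's actual code), a signature can advertise `None` while the body substitutes the real default. The function name `run_sync` and the value `DEFAULT_TIMEOUT_S` below are hypothetical:

```python
DEFAULT_TIMEOUT_S = 300  # hypothetical default, for illustration only


def run_sync(low, timeout=None):
    """The signature shows timeout=None, but None means 'use the default'."""
    if timeout is None:
        # The effective default lives here, invisible in the signature
        # and therefore in any docs generated from it.
        timeout = DEFAULT_TIMEOUT_S
    return {"fun": low.get("fun"), "effective_timeout": timeout}


implicit = run_sync({"fun": "state.orchestrate"})
explicit = run_sync({"fun": "state.orchestrate"}, timeout=1800)
```

This is why a documented `timeout=None` does not imply "no timeout".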

> My first tests were without the timeout. After I did it
> right by using the timeout, the log message changed.

Ah, cool. Glad you found that. Ok, so now that you've got the
RunnerClient timeout working you're hitting the default timeout for
the CherryPy HTTP server. Add the `expire_responses` setting to your
config to prevent that server from killing long-running (expired)
responses.

    rest_cherrypy:
      [...]
      expire_responses: False

https://docs.saltstack.com/en/latest/ref/netapi/all/salt.netapi.rest_cherrypy.html

I expect that will get you back up and running.

If you find that keeping an HTTP connection open for ~10 minutes isn't
reliable over time you may want to start the job as a fire-and-forget
using the `runner_async` client and then watch for the returns via the
`/events` endpoint. That endpoint is specifically designed for
long-lived connections.
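
A rough sketch of that fire-and-forget approach, under stated assumptions: the lowstate is posted with `client: runner_async`, the response carries a job ID, and the `/events` stream delivers `data:` lines containing JSON with a `tag` that ends in `/<jid>/ret` when the runner finishes. The helper names, the sample lines and the exact tag layout are illustrative assumptions, so check them against your Salt version.

```python
import json


def build_async_lowstate(fun, **kwargs):
    """Lowstate for starting a runner without waiting for its result."""
    low = {"client": "runner_async", "fun": fun}
    low.update(kwargs)
    return [low]


def iter_events(lines):
    """Yield decoded JSON objects from the 'data:' lines of an event stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip 'retry:', 'tag:' and blank separator lines
        try:
            event = json.loads(line[len("data:"):].strip())
        except ValueError:
            continue  # ignore payloads that are not valid JSON
        if isinstance(event, dict):
            yield event


def wait_for_return(lines, jid):
    """Return the data of the first event whose tag ends in '/<jid>/ret'."""
    suffix = "/{}/ret".format(jid)
    for event in iter_events(lines):
        if str(event.get("tag", "")).endswith(suffix):
            return event.get("data")
    return None


# Made-up stream content standing in for a live /events connection.
sample = [
    "retry: 400",
    "tag: salt/run/20161129123456789012/new",
    'data: {"tag": "salt/run/20161129123456789012/new", "data": {}}',
    "",
    'data: {"tag": "salt/run/20161129123456789012/ret", "data": {"success": true}}',
]
result = wait_for_return(sample, "20161129123456789012")
```

In real use, `lines` would be the decoded lines of a streaming HTTP response from `/events`, opened before the job is started so the return event is not missed.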

Markus "Shorty" Uckelmann

Nov 30, 2016, 3:37:17 AM
to salt-...@googlegroups.com
On 29.11.2016 at 18:40, Seth House wrote:
> On Tue, Nov 29, 2016 at 4:38 AM, Markus "Shorty" Uckelmann
> <sho...@koeln.de> wrote:
>> The docs say that the timeout is set to "none", so there shouldn't be any
>> timeout?
>
> In most places in Salt the default value is set inside the method
> rather than in the function signature.

Ok. I didn't know that. Maybe it could be useful to add the default
timeout to the docs?

>> My first tests were without the timeout. After I did it
>> right by using the timeout, the log message changed.
>
> Ah, cool. Glad you found that. Ok, so now that you've got the
> RunnerClient timeout working you're hitting the default timeout for
> the CherryPy HTTP server. Add the `expire_responses` setting to your
> config to prevent that server from killing long-running (expired)
> responses.
>
>     rest_cherrypy:
>       [...]
>       expire_responses: False
>
> https://docs.saltstack.com/en/latest/ref/netapi/all/salt.netapi.rest_cherrypy.html
>
> I expect that will get you back up and running.

Thanks a lot for taking me by the hand :) I was a bit lost there...

> If you find that keeping an HTTP connection open for ~10 minutes isn't
> reliable over time you may want to start the job as a fire-and-forget
> using the `runner_async` client and then watch for the returns via the
> `/events` endpoint. That endpoint is specifically designed for
> long-lived connections.

Yep. I guess sooner or later we will use the async client. For the
moment the synchronous mode is sufficient for us.


Seth, thanks a lot for your help,

Shorty
