Question about three timeout params for starting app


Makoto

Mar 19, 2015, 2:42:47 AM
to vcap...@cloudfoundry.org

Hello,

I'm trying to understand the following three timeout parameters used for push:
(a) -t option in cf cli
(b) timeout param in manifest.yml
(c) CF_STARTING_TIMEOUT env variable

I have read the following page and run some tests on my CF environment (CF190, cf cli 6.9).
http://docs.cloudfoundry.org/devguide/deploy-apps/large-app-deploy.html


First of all, 'push' consists of the following three phases (according to the doc):
1. upload
2. stage
3. start
As I understand it, all three parameters (a, b, c) apply to phase 3 (starting the application).


My understanding of (a) and (b) is as follows:
- (a) and (b) correspond to the 'health_check_timeout' parameter in CC.
- These timeout values are used in Cloud Controller(?) (which waits for the application to start), because they are passed from cf cli to CC.
CF_TRACE=true cf push:
------
REQUEST: [2015-03-19T04:18:15Z]
POST /v2/apps?async=true HTTP/1.1
  :
{"name":"abc","space_guid":"1471fe18-bd3e-4d4a-8f17-e8622a5840c7","disk_quota":600,"environment_json":{},"health_check_timeout":99}
-----
- In my CF's CC config file, the following values are specified:
  default_health_check_timeout: 60
  maximum_health_check_timeout: 180
  So, in this case, end users can specify a timeout of up to 180 seconds using (a) or (b).
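
A minimal sketch of the limit described above, in Python. The function name is hypothetical and this is not the actual Cloud Controller code; only the two values (60 and 180) come from the CC config quoted above. Rejecting an over-limit value (rather than clamping it) is an assumption.

```python
DEFAULT_HEALTH_CHECK_TIMEOUT = 60   # default_health_check_timeout from the CC config
MAXIMUM_HEALTH_CHECK_TIMEOUT = 180  # maximum_health_check_timeout from the CC config


def effective_health_check_timeout(requested=None):
    """Return the start timeout in seconds for one app.

    Hypothetical helper: falls back to the default when the client sent
    no value, and rejects values above the configured maximum (assumed
    behavior, not taken from the CC source).
    """
    if requested is None:
        return DEFAULT_HEALTH_CHECK_TIMEOUT
    if requested <= 0:
        raise ValueError("health_check_timeout must be positive")
    if requested > MAXIMUM_HEALTH_CHECK_TIMEOUT:
        raise ValueError(
            f"health_check_timeout {requested} exceeds maximum "
            f"{MAXIMUM_HEALTH_CHECK_TIMEOUT}"
        )
    return requested
```

With these values, a request carrying "health_check_timeout":99 (as in the trace above) would be accepted as-is, and a request with no value would fall back to 60.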

 
If I have anything wrong, please correct me.
 
 

What I don't understand is (c), the CF_STARTING_TIMEOUT env variable.
When I specified this env variable, I couldn't see any value set for 'health_check_timeout' in the REST call. Also, this value is specified in minutes (not seconds), so it looks quite different from the other two parameters.


Could anyone please explain how/where this CF_STARTING_TIMEOUT is used and the difference between (a)/(b) and (c)?

Thanks,

Makoto

Ken Krueger

Mar 19, 2015, 6:02:14 PM
to vcap...@cloudfoundry.org
Makoto,

My understanding (unverified by experiment) is this:

1.  the push -t option and the timeout option in the manifest are the same thing.  Any option provided in the CLI trumps the same option specified in the manifest. If specified, this controls how many SECONDS the CLI will wait for the app to start (not upload or stage).  The default is actually tricky; it is stated as 60 seconds but you can quickly illustrate this is not the case (see below)
2.  Upload can take up to 15 MINUTES and is controlled via administration.  
3.  The CF_STAGING_TIMEOUT cf environment variable controls how long the staging process can take.  15 MINUTES default.
4.  The CF_STARTUP_TIMEOUT cf environment variable controls how long the application is allowed to take to start up.  5 MINUTES default (not 60 seconds).
5.  If you specify the push -t option, or specify timeout via the manifest, it overrides CF_STARTUP_TIMEOUT.  You can go up to 180 SECONDS max.

I'm afraid I haven't heard of a CF_STARTING_TIMEOUT.  I think the internal variables you've found in the CC pertain to different behavior not necessarily related to startup.  My personal point of confusion has always been the stated 60 second startup default bit - if you are specifying a value there is no default, and if you are not specifying a value CF_STARTUP_TIMEOUT controls this.

If anyone out there knows differently, please, please, please correct me.

thx,
k

--
You received this message because you are subscribed to the Google Groups "Cloud Foundry Developers" group.
To view this discussion on the web visit https://groups.google.com/a/cloudfoundry.org/d/msgid/vcap-dev/2da8013b-7147-4db8-a0d9-be672cd37bd3%40cloudfoundry.org.

To unsubscribe from this group and stop receiving emails from it, send an email to vcap-dev+u...@cloudfoundry.org.



--
Ken Krueger  
Manager, Global Education Delivery
407 256 9737 Mobile
kenkrueger65 Skype

Education questions?  educ...@pivotal.io

Makoto

Mar 19, 2015, 10:10:58 PM
to vcap...@cloudfoundry.org
Hi Ken,

Thank you for the reply.

Sorry, CF_STARTING_TIMEOUT was my typo. That should be CF_STARTUP_TIMEOUT as you wrote.

About 2 and 3 (upload and staging), I have the same understanding. Also, about 1, I understand that the push -t option and the timeout option in the manifest control how long the CLI (not CC) will wait for the app to start.

I'm still not sure about 4.
> 4. The CF_STARTUP_TIMEOUT cf environment variable controls how long the application is allowed to take to start up.
Can I ask what you mean by 'how long the application is allowed to take to start up'?
Does this mean how long the CLI will wait for the app to start (i.e., the same as the push -t option and the timeout option)?
Or does it mean something different (e.g., this value controls when CC gives up waiting for the app to start and restarts it)?

> My personal point of confusion has always been the stated 60 second startup default bit - if you are specifying a value there is no default, and if you are not specifying a value CF_STARTUP_TIMEOUT controls this.
Yes. If the push -t option, the timeout option in the manifest, and CF_STARTUP_TIMEOUT all control the same thing, the different default values (5 min vs. 60 sec) do not make sense to me. Also, if these three settings have the same meaning, I'm confused about why we can set CF_STARTUP_TIMEOUT to, for example, 5 minutes, when we cannot specify 5 minutes (the maximum is 180 seconds) with cf push -t or the timeout in the manifest. (For example, 'cf push -t 300' returns an error.)

I tried CF_STARTUP_TIMEOUT again and looked at the REST API calls, but I still saw a difference between CF_STARTUP_TIMEOUT and -t/timeout: when using CF_STARTUP_TIMEOUT, the 'health_check_timeout' parameter is not included in the POST /v2/apps API call. So there is at least one difference.
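
To make that difference concrete, here is a minimal Python sketch of a request body shaped like the traced POST /v2/apps call. The helper name is hypothetical and this is not the cf cli code (which is written in Go); it only mirrors what the trace shows: 'health_check_timeout' appears when -t or the manifest timeout is given, and is absent otherwise.

```python
import json


def build_create_app_body(name, space_guid, health_check_timeout=None):
    """Build a JSON body shaped like the traced POST /v2/apps request.

    Hypothetical helper: health_check_timeout is added only when a value
    was supplied via -t or the manifest. CF_STARTUP_TIMEOUT is never
    passed in here, so it cannot influence the request.
    """
    body = {"name": name, "space_guid": space_guid}
    if health_check_timeout is not None:
        body["health_check_timeout"] = health_check_timeout
    return json.dumps(body)
```

For example, build_create_app_body("abc", "some-space-guid", 99) carries the timeout to CC, while build_create_app_body("abc", "some-space-guid") leaves CC to use its own default.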

Any comments would be helpful.

Regards,
Makoto


On Thursday, March 19, 2015 at 17:42:47 UTC+11, Makoto wrote:

Ken Krueger

Mar 20, 2015, 9:51:11 AM
to vcap...@cloudfoundry.org
Hi Makoto,

For the statement "CF_STARTUP_TIMEOUT cf environment variable controls how long the application is allowed to take to start up" - This refers (in my understanding) to actual app startup time, NOT push time, not upload time, not staging time.  From the time the Warden/Garden container gets the droplet, how long does it have before the app is expected to respond favorably to health checking.

You and I both have the same understanding / confusion on the 3 ways to control the app startup timeout.  I expect that since different teams work on CLI vs CC, sometimes the end result policy of an item like this can be a bit awkward.

When you turn CLI debugging on you are seeing the detailed conversation between the CLI and the CC, but this doesn't reveal all of the inner workings of CF - like listening in on a conversation at the edge of a large party.  So I wouldn't expect the CF_STARTUP_TIMEOUT to be visible there.

If anyone else would like to correct my understanding of any of this, please, please do so!

thx,
k


Makoto

Mar 23, 2015, 4:44:56 AM
to vcap...@cloudfoundry.org
Hi Ken, everyone,


Thank you for your reply.

To confirm the behavior in more detail, I looked into the cf cli source code and ran some experiments.


From the investigation, my current understanding is as follows:
  • CF_STARTUP_TIMEOUT applies to cf cli only.
  • The -t option of cf push and the timeout in the manifest apply to both cf cli and CC.
  • Looking at the cf cli side only, all three timeout parameters have the same meaning. The difference is whether they also apply to CC.
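
The summary above can be sketched in Python as a small resolver that returns how long each side waits. This is an illustration only, not the cf cli or CC implementation: the function name is hypothetical, the defaults (5 minutes for the cli, 60 seconds for CC) are the ones discussed in this thread, and letting -t/manifest win over CF_STARTUP_TIMEOUT follows Ken's unverified point 5.

```python
CLI_DEFAULT_WAIT_SECONDS = 5 * 60  # cli-side default discussed in this thread
CC_DEFAULT_WAIT_SECONDS = 60       # default_health_check_timeout in the CC config


def resolve_timeouts(flag_seconds=None, manifest_seconds=None, env_minutes=None):
    """Return (cli_wait_seconds, cc_wait_seconds) for one push.

    flag_seconds     -- value of 'cf push -t' in seconds, or None
    manifest_seconds -- 'timeout' in manifest.yml in seconds, or None
    env_minutes      -- CF_STARTUP_TIMEOUT in minutes, or None
    """
    # A value given on the command line trumps the manifest value.
    app_timeout = flag_seconds if flag_seconds is not None else manifest_seconds

    if app_timeout is not None:
        # -t / manifest apply to both sides (assumed to win over the env var).
        return app_timeout, app_timeout
    if env_minutes is not None:
        # CF_STARTUP_TIMEOUT affects the cli only; CC keeps its default.
        return env_minutes * 60, CC_DEFAULT_WAIT_SECONDS
    return CLI_DEFAULT_WAIT_SECONDS, CC_DEFAULT_WAIT_SECONDS
```

Under these assumptions, resolve_timeouts() gives (300, 60), resolve_timeouts(flag_seconds=10) gives (10, 10), and resolve_timeouts(env_minutes=2) gives (120, 60), which correspond to the three cases tried below.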


I tried three cases using an app whose start command is 'sleep 1000'.

(Case A) If nothing is specified (no -t, manifest timeout, or CF_STARTUP_TIMEOUT),
  • cli waits up to 5 minutes (by default).
  • CC waits up to 60 seconds (by default).
I got the following:

$ cf push app -c 'sleep 1000'

 
:
-----> Uploading droplet (50M)


0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 failing
FAILED
Start unsuccessful


TIP: use 'cf logs app --recent' for more information

Please note that the interval between messages like '0 of 1 instances running, 1 starting' is 5 seconds plus some execution time, according to the source code (start.go). The default of 5 seconds is defined by the DefaultPingerThrottle constant.
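
A minimal Python sketch of such a throttled polling loop, to show why the number of status messages tracks the timeout. It is not a translation of start.go: the function and parameter names are hypothetical, and only the 5-second throttle comes from the source code mentioned above. The clock and sleep functions are injectable so the loop can be tried without real waiting.

```python
import time


def poll_until_started(is_running, timeout_seconds, throttle_seconds=5.0,
                       clock=None, sleep=None):
    """Poll is_running() until it returns True or the timeout elapses.

    Returns the number of polls made on success and raises TimeoutError
    otherwise. clock and sleep default to time.monotonic and time.sleep.
    """
    clock = clock or time.monotonic
    sleep = sleep or time.sleep

    deadline = clock() + timeout_seconds
    polls = 0
    while clock() < deadline:
        polls += 1  # one status message per poll
        if is_running():
            return polls
        sleep(throttle_seconds)  # throttle between status messages
    raise TimeoutError(f"app did not start within {timeout_seconds} seconds")
```

With a 5-second throttle and no per-poll overhead, a 120-second timeout allows at most 24 polls; real runs show fewer messages because each poll also takes some execution time.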

After 60 seconds (after 10 'starting' messages), the application was shown as 'down' once. However, the cli was still waiting.
After another 60 seconds (after 9 more 'starting' messages), the application was shown as 'down' again. However, the cli was still waiting.
I expected the cli to finish after the 5-minute timeout, but in this example it looks like the push was judged a failure and ended before the 5-minute timeout.


(Case B) When only the -t option is specified,
  • cf cli waits up to the specified time
  • CC waits up to the specified time
I got the following:

$ cf push app -t 10 -c 'sleep 1000'
:
-----> Uploading droplet (50M)


0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
FAILED
Start app timeout


TIP: use 'cf logs app --recent' for more information

In this case, I believe the specified timeout ('10') was stored in CCDB. Therefore, when I ran 'cf restart' on the app, the specified timeout ('10') was still used, as shown below:

$ cf restart app

Stopping app app in org apps / space dev as admin...
OK


Starting app app in org apps / space dev as admin...


0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 down
0 of 1 instances running, 1 failing
FAILED
Start unsuccessful


TIP: use 'cf logs app --recent' for more information


(Case C) When only CF_STARTUP_TIMEOUT is specified,
  • cli waits up to the specified timeout
  • CC waits up to 60 seconds (by default)
I got the following:

$ CF_STARTUP_TIMEOUT=2 cf push app -c 'sleep 1000'

:
-----> Uploading droplet (50M)


0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 down
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
0 of 1 instances running, 1 starting
FAILED
Start app timeout


TIP: use 'cf logs app --recent' for more information

After 60 seconds (after 9 'starting' messages), the application was shown as 'down' once. However, the cli was still waiting.
After 120 seconds, the cli failed with a timeout error.



If there is something wrong here, please correct me.


Thanks,
Makoto


On Thursday, March 19, 2015 at 17:42:47 UTC+11, Makoto wrote:

Hello,
