Understanding out of memory errors in crashed apps


Guillaume Berche

Sep 5, 2013, 11:31:42 AM
to vcap-dev
My java app is experiencing frequent non-deterministic crashes. The cf events command reports that it crashed because of out of memory (with either 255 or 137 exit statuses).

The app has a 512 MB memory limit configured, and the JVM has the java-buildpack heuristics applied, with a 385 MB -Xmx max heap option [1], so theoretically it should never reach the limit.

The cf stats command [4] reports a low memory usage, and the app itself (when it manages to start and serve HTTP requests) properly reports the JVM max heap of 385 MB [3].

I'm running my tests against run.pivotal.io so I can't log into the warden containers for deeper debugging.

What could be consuming this memory? Is there OS overhead included in the warden memory accounting?

The warden documentation indicates that when memory exceeds the quota, one of the processes is killed and an OOM event is sent to the DEA. How does warden choose which process to kill? How can CF users know which process was killed?

My app is launched with an (ugly) shell script [1]. I know the last command (jonas) has started since it sometimes serves HTTP traffic. The other commands don't spawn new processes, so I would not expect more than 2 processes to be running: bash + the JVM.

I indeed see a "Killed" trace in stderr but no more details [2]. One explanation could be that the two processes (deployme and jonas) are running concurrently instead of sequentially, but I don't see how that could happen given the bash syntax [6].

It would be useful for a CF user to have a view of the processes within the warden container and their associated memory usage. One could imagine adding a small agent into the buildpack (e.g. listening on the CONSOLE or DEBUG port) to serve that over HTTP. Has anyone done that yet?

This is weird as it is not deterministic: sometimes a new push of the same app does not trigger an OOM and runs fine, and sometimes it crashes.

I'm kind of stuck. The only next ideas I can see are to brutally increase the limit to 1024 MB, or to change the start command to move the deployme step into the buildpack compile phase and keep a simpler start script (sed the conf file to change $PORT && source setenv && jonas start -fg). Any other ideas?
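The simpler start script I have in mind would be roughly as follows (the conf file path and the @HTTP_PORT@ token are placeholders):

#!/bin/bash
# Rough sketch only; .jonas_base would already have been prepared by the buildpack at staging time.
sed -i "s/@HTTP_PORT@/$PORT/" .jonas_base/conf/server.xml   # placeholder conf file and token
source .jonas_base/setenv
exec jonas start -fg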

Thanks in advance,

Guillaume.

[1]:

$ cf files myapp staging_info.yml
Getting file contents... OK

---
detected_buildpack:
start_command: JAVA_HOME=.java JAVA_OPTS="-Dhttp.port=$PORT -XX:MaxPermSize=52428K
  -XX:OnOutOfMemoryError=.buildpack-diagnostics/killjava -Xmx384M -Xss1M" && export
  JAVA_HOME JAVA_OPTS && (if test ! -d .jonas_base/deploy/app.war ; then JONAS_ROOT=.jonas_root
  JONAS_BASE=.jonas_base && export JONAS_ROOT JONAS_BASE && erb .jonas_root/deployme/topology.xml.erb
  > .jonas_root/deployme/topology.xml && $JAVA_HOME/bin/java -jar .jonas_root/deployme/deployme.jar
  -topologyFile=.jonas_root/deployme/topology.xml -domainName=singleDomain -serverName=singleServerName
  && mkdir -p .jonas_base/deploy/app.war && cp -r --dereference * .jonas_base/deploy/app.war/;
  else echo "skipping jonas_base config as already present"; fi) && source .jonas_base/setenv
  && jonas start -fg

[2]

logs/stderr.log=Sep 05, 2013 2:46:50 PM org.ow2.jasmine.deployme.configApply.Configurator setAConfiguration
INFO: Starting configuration set up for server singleDomain.singleServerName in directory /home/vcap/app/.jonas_base
Sep 05, 2013 2:46:50 PM org.ow2.jonas.tools.configurator.impl.NotApplicableHelper notApplicable
WARNING: Parameter GlobalParameters.Jdk is not supported by the target server
Sep 05, 2013 2:46:51 PM org.ow2.jonas.tools.configurator.impl.NotApplicableHelper notApplicable
WARNING: Parameter HTTP.maxSpareThreads is not supported by the target server
Sep 05, 2013 2:46:51 PM org.ow2.jonas.tools.configurator.impl.NotApplicableHelper notApplicable
WARNING: Parameter HTTP.maxThreads is not supported by the target server
Sep 05, 2013 2:46:51 PM org.ow2.jonas.tools.configurator.impl.NotApplicableHelper notApplicable
WARNING: Parameter HTTP.minSpareThreads is not supported by the target server
Sep 05, 2013 2:46:51 PM org.ow2.jasmine.deployme.configApply.Configurator setAConfiguration
INFO: Server singleDomain.singleServerName configured successfully
Killed

[3]
MaxHeapSize 385,875,968 bytes
UsedHeapSize 86,126,752 bytes


[4]
$ cf stats [...]
Getting stats ...

instance   cpu    memory           disk
#0         0.0%   284.9M of 512M   298.8M of 1G

[5]

$ cf events myapp
Getting events for myapp... OK

time                        instance index   description     exit status
2013-09-05 14:47:01 +0000   0                out of memory   Failure (137)

[6]

http://www.gnu.org/software/bash/manual/bashref.html#Conditional-Constructs
expression1 && expression2

True if both expression1 and expression2 are true.

The && and || operators do not evaluate expression2 if the value of expression1 is sufficient to determine the return value of the entire conditional expression.
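For instance, a quick check that the right-hand side of && only starts once the left-hand side has exited successfully (so deployme and jonas should never run concurrently):

$ (sleep 2 && echo "deployme finished") && echo "jonas starts only now"
deployme finished
jonas starts only now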

Daniel Mikusa

Sep 5, 2013, 12:05:25 PM
to vcap...@cloudfoundry.org
On Sep 5, 2013, at 11:31 AM, Guillaume Berche <ber...@gmail.com> wrote:

> My java app is experiencing frequent non determinist crashes. The cf events commands reports that it crashed because of out of memory (with either 255 or 137 exit statuses).
>
> The app has a 512 MB memory limit configured, and the JVM has the java-buildpack heuristics applied with 385 MB -Xmx max heap option [1], so theoretically it should never reach limit.

Don't forget that your Java process uses more memory than just the heap. There's also the PermGen, thread stacks, native code and some other things. All of these must add up to be less than your limit of 512M.
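For example, with round figures (the thread count is just a guess):

HEAP=385; PERMGEN=64; THREADS=50; STACK=1     # MB, illustrative values only
echo $(( HEAP + PERMGEN + THREADS * STACK ))  # 499 MB, before native code, GC and JVM internals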

> The cf stats cmd [4] reports a low memory usage, and the app itself (when it manages to start and take http requests) properly reports the JVM max heap of 385 MB.
>
> I'm running my tests against run.pivotal.io so I can't log into the warden containers for deeper debugging.
>
> What could be consumming this memory? Is there OS overhead into the warden memory consumption ?

My guess is one of the other memory consumers I listed above.

> The warden documentation indicates that when memory exceeds the quota, then one of the processes is killed and an OOM event is sent to the DEA. How does warden chooses which process to kill ? How can CF users know which process was killed ?
>
> My app is launched with an (ugly) shell script [1]. I know the last cmd (jonas) has started since it sometimes serves HTTP traffic. Other commands don't spawn new processes. So I would not expect more than 2 processes to be running: bash + the jvm.

If you want to clean up your shell script, create a template shell script that you include with your build pack. Merge the template with data during the compile script and output the path to your script from your build pack's release script.
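For example, something roughly like this (file names and paths are made up for illustration):

# bin/compile <build-dir> <cache-dir> -- merge the template with staging-time data
BUILD_DIR=$1
cp "$(dirname "$0")/../resources/start.sh.tmpl" "$BUILD_DIR/start.sh"
sed -i "s/@MAX_HEAP@/384M/" "$BUILD_DIR/start.sh"
chmod +x "$BUILD_DIR/start.sh"

# bin/release <build-dir> -- point the start command at the generated script
cat <<EOF
---
default_process_types:
  web: ./start.sh
EOF

That keeps the start_command in staging_info.yml down to a single short line.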

> I indeed see a trace "Killed" in stderr but no more details [2].

What else are you expecting to see here?

> That would be useful to have a way for a CF user to have a view on processes within the warden container and their associated memory usage. One could imagine adding a small agent into the buildpack (e.g. listening on the CONSOLE or DEBUG port) so serve that over http. Anyone done that yet ?

+1 for this feature

>
> This is weird as this is not determinist, sometimes a new push of same app does not trigger OOM and runs fine, and sometimes it crashes.
>
> I'm kind of stuck. The only next ideas I can see are brutally increase limit to 1024 MB or is changing the start script command to move the deployme command into the buildpack exec, and have a simpler start script (sed conf file to change $PORT && source setenv && jonas start -fg). Any other idea ?
>
> Thanks in advance,
>
> Guillaume.
>
> [1]:
>
> $ cf files myapp staging_info.yml
> Getting file contents... OK
>
> ---
> detected_buildpack:
> start_command: JAVA_HOME=.java JAVA_OPTS="-Dhttp.port=$PORT -XX:MaxPermSize=52428K
> -XX:OnOutOfMemoryError=.buildpack-diagnostics/killjava -Xmx384M -Xss1M"

Your -Xss value is huge. Each thread created by your application is going to consume 1M of thread stack, which counts against your total memory limit. In most Java apps you can get away with a quarter of that or less. Try something like 192k or 256k.
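For example, your current options with just a smaller stack:

JAVA_OPTS="-Dhttp.port=$PORT -XX:MaxPermSize=52428K -XX:OnOutOfMemoryError=.buildpack-diagnostics/killjava -Xmx384M -Xss256k"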

Dan

Guillaume Berche

Sep 5, 2013, 3:50:23 PM
to vcap...@cloudfoundry.org
Thanks Daniel for your response.


> The app has a 512 MB memory limit configured, and the JVM has the java-buildpack heuristics applied with 385 MB -Xmx max heap option [1], so theoretically it should never reach limit.

Don't forget that your Java process uses more memory than just the heap.  There's also the PermGen, thread stacks, native code and some other things.  All of these must add up to be less than your limit of 512M.

Well, I'm assuming the heuristics of the java_buildpack (see [7] and [8]) add up those numbers correctly. Besides, this would not explain why the cf stats command [4] reports a low memory usage.

[7] https://github.com/Orange-OpenSource/java-buildpack/blob/master/docs/jre-openjdk.md#default-memory-sizes
[8] https://github.com/Orange-OpenSource/java-buildpack/blob/master/config/openjdk.yml#L21

 
If you wan to clean up your shell script, create a template shell script that you include with your build pack.  Merge the template with data during the compile script and output the path to your script from your build pack's release script.


I'm not sure I understand what you mean; in terms of displaying the start command, that's what [1] shows already. Can you please re-explain your suggestion?
 
> I indeed see a trace "Killed" in stderr but no more details [2].

What else are you expecting to see here?


I'd like/expect the process ID and command line to be printed, e.g. "Killed PID=25 ./java -jar ...."

 
>
> [1]:
>
> $ cf files myapp staging_info.yml
> Getting file contents... OK
>
> ---
> detected_buildpack:
> start_command: JAVA_HOME=.java JAVA_OPTS="-Dhttp.port=$PORT -XX:MaxPermSize=52428K
>   -XX:OnOutOfMemoryError=.buildpack-diagnostics/killjava -Xmx384M -Xss1M"

Your -Xss value is huge.  Each thread created by your application is going to consume 1M of thread stack, which counts against your total memory limit.  In most Java apps you can get away with a quarter of that or less.  Try something like 192k or 256k.


Again, I'm assuming the java_buildpack heuristics are correct, but my app (running the jonas container) might instantiate more threads than tomcat, so that could be an explanation; I'll try reducing that. Weird though that "cf stats" does not show a larger memory usage for the few cases when the app properly starts.

 Thanks,

Guillaume.

Guillaume Berche

Sep 5, 2013, 4:05:34 PM
to vcap...@cloudfoundry.org
Another hypothesis I just thought about:
 
Let's say my start command starts with the following process tree:

pid 25: sh (start command )
   pid 26: sh (.jonas_root/jonas)
      pid 27: java -classpath ....

Then the jvm is slow to bind the TCP port, and falls into the bug "If app does not bind to port fast enough, it gets flagged as flapping" documented at http://cftracker.cfapps.io/runtime

Warden is instructed to kill pid 25 as the app is flagged as not starting properly. However, for some reason, the child process (say pid 27) is not killed and keeps running, eventually binding the TCP port.

The DEA then reruns the start command, which creates a new process tree. We end up with the following:
pid 27: java -classpath ....
pid 28: sh (start command )
   pid 29: sh (.jonas_root/jonas)
      pid 30: java -classpath ....

Then warden notices the memory limit is reached and kills pid 28, which is just starting. As this is not sufficient, it then kills pid 27 or 30.

The HM notices the app is not binding the TCP port, so it asks the DEA for a new start, which runs another start command. And so on...

The observed non-deterministic behavior could be explained by:
1) the randomness that warden uses when selecting processes to kill on OOM;
2) the elapsed time for the HM to flag the app as not starting and ask to kill it.

Is there a way to know the HM sampling rate, and to get some logs on HM activity? I did not find them in the events endpoint.

Does this sound like a realistic explanation? Can you think of a way to verify it on run.pivotal.io? I could leave a flapping app running if someone with operator privileges is able to have a look, and if this may help uncover a weird bug hidden somewhere.

Guillaume.

Balachander Keelapudi

Sep 5, 2013, 5:44:42 PM
to vcap...@cloudfoundry.org
Greetings!
Can I ask how many instances you are running on run.pivotal.io? If you are running just one instance, then "welcome to the cloud era". In Cloud Foundry, you need to run at least 2 instances of an app to avoid downtime from a restart; otherwise, when a cf deploy or resource issue occurs your app will be restarted. That is by design. When apps get moved around across DEAs, we make sure that at least one instance of an app is always running and available to handle requests.

thanks, 
Bala

glyn.no...@gmail.com

Sep 6, 2013, 7:59:39 AM
to vcap...@cloudfoundry.org


On Thursday, 5 September 2013 20:50:23 UTC+1, Guillaume Berche wrote:

Your -Xss value is huge.  Each thread created by your application is going to consume 1M of thread stack, which counts against your total memory limit.  In most Java apps you can get away with a quarter of that or less.  Try something like 192k or 256k.


Again, I'm assuming the java_buildpack heuristics are correct, but my app (running jonas container) might instanciate more threads than tomcat so that could be an explanation, I'll try reducing that. Weird though that "cf stats" does not show a larger memory usages for the few cases when the app properly starts

We chose 1M for the default thread stack size since that's the default applied by OpenJDK, but we wanted to be explicit. You could try forking the buildpack and configuring a stack size in config/openjdk.yml:

...
memory_sizes:
  stack: 256k
memory_heuristics:
  ...

(Note that if the JVM were to hit an OutOfMemoryError when creating a thread, the buildpack's killjava script should be invoked, which would also dump additional diagnostics into app/.buildpack-diagnostics/buildpack.log, including the process tree before and after the kill command. That doesn't seem to be relevant here because warden is handling the excessive memory consumption.)

Alternatively, perhaps the JVM is exhausting native memory. If that's the case, then you could try forking the buildpack and increasing the weighting of native memory in config/openjdk.yml. The default is 10% of the application's memory limit, which we thought was fairly generous.

Daniel Mikusa

Sep 6, 2013, 9:10:15 AM
to vcap...@cloudfoundry.org
On Sep 5, 2013, at 3:50 PM, Guillaume Berche <ber...@gmail.com> wrote:

> Thanks Daniel for your response.
>
> > The app has a 512 MB memory limit configured, and the JVM has the java-buildpack heuristics applied with 385 MB -Xmx max heap option [1], so theoretically it should never reach limit.
>
> Don't forget that your Java process uses more memory than just the heap. There's also the PermGen, thread stacks, native code and some other things. All of these must add up to be less than your limit of 512M.
>
> Well, I'm assuming the heuristics of the java_buildpack (see [7] and [8]) add up those numbers correctly.

The key word here is heuristic. The build pack is going to do its best to calculate these numbers correctly, but it's just making an educated guess. Sometimes you need to manually adjust further.

> Besides, this would not explain why cf stats cmd [4] reports a low memory usage.

All this tells us is what the memory usage was when you ran "cf stats". Because "cf events" is telling us that your app was killed for exceeding its memory limit, we know that at some point after you ran "cf stats" the app increased its memory usage and eventually exceeded 512M.
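If you want to catch it in the act, you could poll it, for example:

$ while true; do date; cf stats myapp; sleep 3; done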

>
> If you wan to clean up your shell script, create a template shell script that you include with your build pack. Merge the template with data during the compile script and output the path to your script from your build pack's release script.
>
>
> Not sure to understand what you mean, in terms of displaying the start command, that's what [1] displays already. Can you please re-explain what your suggestion ?
>
> > I indeed see a trace "Killed" in stderr but no more details [2].
>
> What else are you expecting to see here?
>
> I'ld like/expect the process ID and command line to be printed "Killed PID=25 ./java -jar ...."
>
>
> >
> > [1]:
> >
> > $ cf files myapp staging_info.yml
> > Getting file contents... OK
> >
> > ---
> > detected_buildpack:
> > start_command: JAVA_HOME=.java JAVA_OPTS="-Dhttp.port=$PORT -XX:MaxPermSize=52428K
> > -XX:OnOutOfMemoryError=.buildpack-diagnostics/killjava -Xmx384M -Xss1M"
>
> Your -Xss value is huge. Each thread created by your application is going to consume 1M of thread stack, which counts against your total memory limit. In most Java apps you can get away with a quarter of that or less. Try something like 192k or 256k.
>
>
> Again, I'm assuming the java_buildpack heuristics are correct,

Again, it's just a heuristic, it's not going to be perfect for every situation. You know your application container and the applications you deploy to it best. If you can get away with a smaller thread stack size, then you can lower the value and save yourself some memory.

> but my app (running jonas container) might instanciate more threads than tomcat so that could be an explanation, I'll try reducing that. Weird though that "cf stats" does not show a larger memory usages for the few cases when the app properly starts

Threads can be created at any time. It's entirely possible that your container or application is creating more threads after you run "cf stats", pushing you over the limit.

Dan

Guillaume Berche

Sep 6, 2013, 10:17:22 AM
to vcap...@cloudfoundry.org
Thanks Bala, this specific test was indeed running a single instance. Can you please point me to documentation / ruby specs related to the app restart with a single instance? How is that different with multiple instances? The HM would still try to restart instances that are thought to be down.

I tried the same app push with 3 instances of 512 MB each to illustrate the non-deterministic behavior:

I observe that with multiple instances, I still see OOM failures being reported for distinct instances [10], while others are fine and show low memory usage [9]. Still no JVM-level OOM detected [11].

Guillaume.

[9]

$ cf stats myapp
Getting stats for myapp ... OK

instance   cpu    memory           disk
#0         down
#1         0.0%   291.5M of 512M   297.4M of 1G
#2         0.0%   295.9M of 512M   297.4M of 1G
Fri Sep  6 13:48:57 UTC 2013

[10]


$ cf events myapp
Getting events for myapp... OK

time                        instance index   description     exit status
2013-09-06T13:46:06+00:00   0                out of memory   Failure (137)
2013-09-06T13:48:02+00:00   1                out of memory   Failure (137)

[11]

$ cf files myapp app/.buildpack-diagnostics/
Getting file contents... OK

buildpack.log                               0B
killjava                                  1.8K

Guillaume Berche

Sep 6, 2013, 10:31:52 AM
to vcap-dev
Thanks Daniel for your response.

I understand the JVM currently never releases back to the operating system the major parts of the memory it has allocated (the heap). This seems to be being worked on but not yet merged, see [12].

Also, I've sampled the cf stats measurements every 3s and never seen the memory usage decrease nor go above 295 MB.

The more minor parts of JVM memory usage besides the heap (threads, JNDI ...) are indeed more dynamic and could trigger the memory limit. I'll try with a more conservative heap ratio of 0.50 instead of 0.75 and will report back. Given that I currently observe a heap space usage of 185 MB out of a 385 MB max heap, it seems unlikely to me that it reaches the 512 MB memory limit with threads and the small parts, but we never know...

Guillaume.

The key word here is heuristic.  The build pack is going to do it's best to calculate these numbers correctly, but it's just making an educated guess.  Sometimes you need to manually adjust further.

> Besides, this would not explain why cf stats cmd [4] reports a low memory usage.

All this tells us is what the memory usage was when you ran "cf stats".  Because "cf events" is telling us that your app was killed for exceeding it's memory limit, we know that at some point after you ran "cf stats" the app increased it's memory usage and eventually exceeded 512M.


 

Daniel Mikusa

Sep 6, 2013, 10:50:51 AM
to vcap...@cloudfoundry.org
On Sep 6, 2013, at 10:31 AM, Guillaume Berche <ber...@gmail.com> wrote:

> Thanks Daniel for your reponse.
>
> I understand the JVM currently never releases to the operating system majors parts of the memory it has allocated (the heap). This seems to be being worked on but not yet merged, see [12].
>
> Also, I've sampled the cf stats mesurements every 3s and never seen the memory usage decrease nor go above 295 MB.

I'm not sure why you would see a discrepancy between "cf stats" and what warden sees as your application's memory usage. Perhaps someone more knowledgeable about CF can comment on that.

>
> The more minor parts of JVM memory usage besides heaps (threads, jndi ...) are indeed more dynamic and could trigger the memory limit. I'll try with a more conservative heap ratio of 0.50 instead of 0.75 and will report that back. Given than I currently observe heap space usage of 185 MB out of 385 MB max heap, it seems unlikely to me that it reaches the 512 MB memory limit with threads and small parts, but we never know…

You'd be surprised how the "minor" parts can add up. Especially when you're trying to keep within 512M.

Ex:

+ 64M for PermGen (not sure what you're using for this, but 64m is common)
+ 50M (1M X 50 threads) for Thread Stack (again, not sure how many threads your app server will create, but 50 doesn't seem too crazy)
+ 385M Heap
------------------------
499M

That doesn't leave a whole lot of room for native code, GC and JVM internals to keep within 512M. Plus that assumes you'll never go above 50 threads.

Dan


>
> Guillaume.
>
> [12] http://hiroshiyamauchi.blogspot.fr/2013/06/making-jvm-release-memory.html
>
>
> The key word here is heuristic. The build pack is going to do it's best to calculate these numbers correctly, but it's just making an educated guess. Sometimes you need to manually adjust further.
>
> > Besides, this would not explain why cf stats cmd [4] reports a low memory usage.
>
> All this tells us is what the memory usage was when you ran "cf stats". Because "cf events" is telling us that your app was killed for exceeding it's memory limit, we know that at some point after you ran "cf stats" the app increased it's memory usage and eventually exceeded 512M.
>
>
>
>
> Again, it's just a heuristic, it's not going to be perfect for every situation. You know your application container and the applications you deploy to it best. If you can get away with a smaller thread stack size, then you can lower the value and save yourself some memory.
>
> > but my app (running jonas container) might instanciate more threads than tomcat so that could be an explanation, I'll try reducing that. Weird though that "cf stats" does not show a larger memory usages for the few cases when the app properly starts
>
> Threads can be created at any time. It's entirely possible that your container or application are creating more threads after you run "cf stats", pushing you over the limit.
>
> Dan
>
>

James Bayer

Sep 7, 2013, 4:09:23 AM
to vcap...@cloudfoundry.org
I believe 137 is always out of memory, while 255 can be file descriptors, disk, or I/O buffers not keeping up (because you may be logging too much, for example).

your theory about starting multiple processes could be interesting.

you can see at cftracker.cfapps.io that we just finished this story:
Warden should provide more verbose exit errors

Once that code is used the user behavior will be better:

Push an app that reads a huge file to STDOUT: in cf events you should see "command exceeded maximum output". If you push an app that runs out of memory (e.g. read a huge file into memory) you should see "out of memory". "Disk quota exceeded" and "File descriptors exceeded" don't cause the container to crash. They can cause an error when running a specific command in the app, and if the app catches that error it can cope with it. For example, sinatra catches such an error. If the app does not handle such an error and the app process exits, 'cf events' will show 'app instance exited'. These errors will be visible in the app's stderr.log.
--
Thank you,

James Bayer

Guillaume Berche

Sep 9, 2013, 8:12:03 AM
to vcap-dev
Thanks Dan for the precisions. I tried a run with a heap of 128m (heap: 0.25 in openjdk.yml) on a 512MB app with 3 instances, and I'm still observing the same OOM on instance #0, while the other instances seem fine [13]. I don't favor the hypothesis of the JVM alone exceeding the 512 MB limit:

  + 52M for PermGen
  + 50M (1M X 50 threads) for Thread Stack (not sure how many threads my app server is creating, but 50 doesn't seem too crazy)
  + 128M Heap
------------------------
      230 MB out of 512 MB limit

I'll try next to add some debugging traces in my app to try to understand what's wrong. I'm thinking of sampling the processes within the container using the ps command and posting the output on the network (onto a gist using a ruby script). This would work around the fact that reading files written by the app does not seem reliable. I was also thinking about running an interactive shell prompt within a spawned process bound to the CONSOLE or DEBUG port. If anyone has done that, I'd be interested in reusing it in my buildpack.

Thanks in advance,

Guillaume.

[13]


$ cf files myapp staging_info.yml
Getting file contents... OK

---
start_command: JAVA_HOME=.java JAVA_OPTS="-Dhttp.port=$PORT -Djava.io.tmpdir=$TMPDIR
  -XX:MaxPermSize=52428K -XX:OnOutOfMemoryError=./.buildpack-diagnostics/killjava
  -Xmx128M -Xss1M" && export JAVA_HOME JAVA_OPTS && (if test ! -d .jonas_base/deploy/app.war

  ; then JONAS_ROOT=.jonas_root JONAS_BASE=.jonas_base && export JONAS_ROOT JONAS_BASE
  && erb .jonas_root/deployme/topology.xml.erb > .jonas_root/deployme/topology.xml
  && $JAVA_HOME/bin/java -jar .jonas_root/deployme/deployme.jar -topologyFile=.jonas_root/deployme/topology.xml
  -domainName=singleDomain -serverName=singleServerName && mkdir -p .jonas_base/deploy/app.war
  && cp -r --dereference * .jonas_base/deploy/app.war/; else echo "skipping jonas_base
  config as already present"; fi) && source .jonas_base/setenv && jonas start -fg

$ cf stats myapp
Warning: url is not a valid manifest attribute. Please remove this attribute from your manifest to get rid of this warning
Getting stats for myapp... OK

instance   cpu    memory           disk
#0         down
#1         0.0%   218.5M of 512M   297.4M of 1G
#2         0.0%   207.0M of 512M   297.4M of 1G

$ cf events myapp
Warning: url is not a valid manifest attribute. Please remove this attribute from your manifest to get rid of this warning

Getting events for myapp... OK

time                        instance index   description     exit status
2013-09-09T11:55:31+00:00   0                out of memory   Failure (137)



$ cf files myapp app/.buildpack-diagnostics/
Getting file contents... OK

buildpack.log                               0B
killjava                                  1.8K

Guillaume Berche

Sep 9, 2013, 9:01:58 AM
to vcap-dev
Thanks James for pointing to this story and giving us its description. It would be great if the "out of memory" event could be enriched with the PID(s), the command line and the total heap observed (or better, the cgroups memory stats file [14]). Not sure how to capture traces from the oom killer though, other than disabling the OOM killer and having warden kill the suspended processes itself.

From a CF user perspective, I would much prefer the operating system to reject a malloc system call made by the JVM (resulting in a java OutOfMemoryError), rather than having the OOM killer kill one of my processes at random. However, cgroups don't yet seem to support such a feature, as I understand it [15].
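(The closest cgroup knob I'm aware of is memory.oom_control, which requires root on the DEA host and pauses the cgroup's tasks instead of failing their allocations; the path below is illustrative:)

CGROUP=/sys/fs/cgroup/memory/instance-xyz   # illustrative warden container cgroup path
echo 1 > "$CGROUP/memory.oom_control"       # disables the kernel oom killer for this cgroup
cat "$CGROUP/memory.oom_control"            # reports oom_kill_disable and under_oom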

BTW, are there any plans for publishing story contents beyond their titles? I don't seem to be able to read any of them from http://cftracker.cfapps.io/runtime.

In particular, can you please share details about "If app does not bind to port fast enough, it gets flagged as flapping": what is the default time after which the unbound port triggers an app stop? Can this be observed in events?

> your theory about starting multiple processes could be interesting.

What would you suggest to try to understand and debug this ?

Thanks in advance,

Guillaume.

James Bayer

Sep 9, 2013, 10:28:04 AM
to vcap...@cloudfoundry.org
We have no plans to expose additional story detail in cftracker currently. Engineers often post sensitive stack traces and other information in the story descriptions and comments, so we wanted to strike a balance between sharing visibility into our work and not sharing details that could leak sensitive information by accident. Here are the relevant commits [3].

I suspect it is the multiple processes that are causing your app to go over its memory limit. I'd suggest boosting app memory up much higher to validate that.

If an app has a route mapped, then when starting, it must listen on the assigned port within 60 seconds [1] or it fails that health check.

If you find this is your problem, then you can decide not to map the route at the beginning and use a mechanism like the state_file to indicate when your app is running [2].

Daniel Mikusa

Sep 9, 2013, 4:39:47 PM
to vcap...@cloudfoundry.org
On Sep 9, 2013, at 8:12 AM, Guillaume Berche <ber...@gmail.com> wrote:

> Thanks Dan for the precisions, I a tried a run with a heap of 128m (heap: 0.25 in openjdk.xml) on 512MB app with 3 instances, and I"m still oberving the same OOM on instance #0, while other instances seem fine [13]. I don't favor the hypothesis of the JVM alone exceeding the 512 MB limit:
>
> + 52M for PermGen
> + 50M (1M X 50 threads) for Thread Stack (not sure how many threads my app server is creating, but 50 doesn't seem too crazy)
> + 128M Heap
> ------------------------
> 230 MB out of 512 MB limit

At those levels, I would agree.

> I'll try next to add some debugging traces in my app to try to understand what's wrong. I'm thinking to sample the processes within the container using ps command post them on the network (onto gist using a ruby script). This would workaround the fact that files written by the app does not seem reliable. I was also thinking about running an interactive shell prompt within an spawned process binding the CONSOLE or DEBUG port. If anyone done that I'd be interested in reusing that into my build pack.

I thought about this at one point, but never got around to it. I'll be interested to see what you end up doing here.

Dan

Daniel Mikusa

Sep 9, 2013, 4:42:12 PM
to vcap...@cloudfoundry.org
On Sep 9, 2013, at 10:28 AM, James Bayer <jba...@gopivotal.com> wrote:

> We have no plans to expose additional story detail in cftracker currently. Engineers often post sensitive stack traces and other information in the story descriptions and comments, so we wanted to strike a balance between sharing visibility into our work, without sharing details that could leak sensitive information on accident. here are the relevant commits [3].
>
> I suspect it is the multiple processes that is causing your app to go over memory.

If a build pack starts multiple processes, will "cf stats" show memory usage from all the processes? or just one process in particular?

Dan

James Bayer

Sep 10, 2013, 3:33:53 AM
to vcap...@cloudfoundry.org
If a build pack starts multiple processes, will "cf stats" show memory usage from all the processes?  or just one process in particular?

Guillaume Berche

Sep 10, 2013, 8:06:23 AM
to vcap...@cloudfoundry.org
I'm working on a diagnostic script invoked by the java-buildpack that would post regular output of some command (e.g. "vmstat; ps -ef") to a gist. The app-ops team can then easily debug/troubleshoot by looking at the gist history, updated periodically. If other people are interested in it, see the specs below [21].

For that, I'm trying to use the gist gem [20] in a standalone ruby script that my jonas buildpack [16] would execute in the background.
Since I'm not using the ruby buildpack, how can I have the gist gem installed and staged? getting-started [17] shows that the ruby buildpack installs gems with bundler into ./vendor/bundle. Is there an easy way for the java-buildpack to invoke the same commands as the ruby buildpack, at compile time and at runtime (i.e. RUBYOPT [18])? I'm still a ruby newcomer and learning on that. Otherwise, I was considering embedding the gist source in my script and hoping that its dependencies ('net/https', 'cgi', 'json', 'uri') are installed by default in the 1.9.3 ruby stack [19].

Any better suggestion ?

Thanks in advance for your help,

Guillaume.


[16] https://github.com/Orange-OpenSource/java-buildpack
[17] http://docs.cloudfoundry.com/docs/dotcom/getting-started.html
[18] http://docs.cloudfoundry.com/docs/using/deploying-apps/environment-variable.html#RUBYOPT
[19] http://docs.cloudfoundry.com/docs/running/architecture/stacks.html
[20] https://rubygems.org/gems/gist

[21]
The diagnostic mechanism would be controlled by a mix of ENV vars accessible at runtime by the droplet, and config.yml options accessible at compile time within the buildpack if env vars are not available at that time.

At compile time, the buildpack would create an initial gist and print its url into the staging logs, so that it can then be monitored externally. It would check for the DEBUG_TOGIST_TOKEN env var to use a gist token and create private gists, or would otherwise publish public gists.

At runtime, within the droplet, the diagnostic script would run in the background as the first command.

1. The script would look for the DEBUG_TOGIST option variable being defined, otherwise it would exit immediately.
2. It would then use DEBUG_TOGIST_CMD to execute a shell command (e.g. "vmstat; ps -ef") and update the gist with the output of this command along with stdout/stderr, prefixed with the date and its PID.
3. It would then use DEBUG_TOGIST_WAIT to sleep for n seconds.
4. It would then loop back to step #2 for at most the number of times indicated in DEBUG_TOGIST_MAX.
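In shell terms, the runtime loop would look roughly like this (the gist update is left as a placeholder; the real script would use the gist gem):

#!/bin/bash
# Rough sketch of the runtime loop only; env var names as in the spec above.
[ -z "$DEBUG_TOGIST" ] && exit 0
post_to_gist() { echo "placeholder: update the gist with $(wc -c <<< "$1") bytes"; }
for i in $(seq 1 "${DEBUG_TOGIST_MAX:-100}"); do
  { date; echo "PID=$$"; eval "${DEBUG_TOGIST_CMD:-vmstat; ps -ef}"; } > /tmp/sample.txt 2>&1
  post_to_gist "$(cat /tmp/sample.txt)"
  sleep "${DEBUG_TOGIST_WAIT:-5}"
done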

Guillaume Berche

Sep 10, 2013, 8:31:58 AM
to vcap-dev
I'm probably missing something: reading through the code and documentation, there appears to be a mismatch: the stats_collector multiplies the rss cgroups metric by 1024, whereas the cgroups documentation already describes it as bytes.

https://github.com/cloudfoundry/dea_ng/blob/master/lib/dea/stat_collector.rb#L33
@used_memory_in_bytes = info.memory_stat.rss * 1024


https://www.kernel.org/doc/Documentation/cgroups/memory.txt

rss - # of bytes of anonymous and swap cache memory (includes transparent hugepages).

James Bayer

Sep 10, 2013, 9:59:36 AM
to vcap...@cloudfoundry.org
crazy idea: have you considered using jruby since you already have the jvm in your droplet?

Guillaume Berche

Sep 10, 2013, 10:21:08 AM
to vcap-dev
Well, I'm trying to have a diagnostic script with low memory consumption, so that it is a bad candidate for the oom killer ([22] seems to describe this selection process). Therefore, I'd prefer keeping it outside of the JVM, which I suspect is being killed either by the OOM killer or by the DEA (being flagged as flapping).

[22] http://linux-mm.org/OOM_Killer

James Bayer

Sep 10, 2013, 11:28:23 AM
to vcap...@cloudfoundry.org
My advice: just use the JVM and boost the memory up large for the app instance. You'll get the diagnostics you need, and then you can stop using the JVM monitoring script once you're finished troubleshooting.

Of course it's perfectly valid to package ruby in your droplet too; it's probably just more work.

For the absolute highest fidelity, you could also get a vagrant box with a local dev type of setup, change the warden config to not destroy the container [1], then wsh into the container using the warden shell, run the startup command yourself and see the impact that way [2].




Guillaume Berche

Sep 12, 2013, 4:17:38 PM
to vcap-dev
I completed the diagnostic script [19], which collects process state from within the buildpack, sampled every second or so, and posts it to a gist. According to these measurements, the jvm appears to be killed (by the oom killer?) while the expected 512MB memory limit does not seem to be reached. This forces me to oversize the instances to 1024MB or higher to start without failure. I suspect either something related to swap, or a bug in warden/cgroups. Any analysis and help is welcome. Full details follow.


The diagnostic script collects process state using the command "vmstat;ps -AFH --cols=2000;free".
An external script stops the app when all 3 instances are reported as running, which consequently stops the sampling at that time.

The gist naming format contains the instance index, instance id and start time.

In this run on run.pivotal.io, 3 instances of 512MB are started. Both instances 0 and 1 experience OOMs [20] while "cf stats" still reports a low memory usage. The latest version of each of the following gists shows the last sample for that instance, while their history preserves the older recorded samples:


diagnostics-Index0-Id7a194f92d1796d75400f46eb0e49f98c-Start2013-09-12 17:37:17 +0000

diagnostics-Index1-Id7a28ea1ef09f4a8802c43149aedc217d-Start2013-09-12 17:38:07 +0000
diagnostics-Index2-Id779a9d9408251c182bb0bbde2eeb6c15-Start2013-09-12 17:38:07 +0000

Both instances 0 and 1 are later started a second time as distinct instances (supposedly on HM request):
diagnostics-Index0-Id85d1dd33c9f7452d56d63bcca8f0f01f-Start2013-09-12 17:41:39 +0000
For the first instance 0 (Id 7a194), we see that the jvm (PID 137) gets killed at sample #7, i.e. approx 8 seconds after startup:
https://gist.github.com/gberche-orange/8360aebb72d0421861a4/revisions
In sample #6, just before being killed the jvm was using 233MB rss, and all processes were summing up to 252 MB rss.

The warden limits are set on memory.memsw.limit_in_bytes and memory.limit_in_bytes, which include memory+swap according to [22]. I'm having a hard time interpreting the ps man page [21] to properly measure the swap memory for my processes. Maybe [23] could help. Let me know if you have suggestions. The metrics reported by free and vmstat don't seem to be scoped to the current cgroup, but they do show a steady swap usage.
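For instance, something like the following might approximate per-process swap usage, assuming the kernel exposes the Swap: field in /proc/<pid>/smaps:

for pid in $(ps -o pid= -u vcap); do
  swap_kb=$(awk '/^Swap:/ {s+=$2} END {print s+0}' /proc/$pid/smaps 2>/dev/null)
  echo "pid $pid swap ${swap_kb} kB"
done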

[19] https://github.com/Orange-OpenSource/java-buildpack/blob/master/resources/jonas/diagnostics/diagnostics.rb

[20]

$ date;cf instances myapp; cf events myapp ;cf stats myapp
Thu Sep 12 17:42:05 UTC 2013

Warning: url is not a valid manifest attribute. Please remove this attribute from your manifest to get rid of this warning
Getting instances for myapp... OK


instance #0: running
  started: 2013-09-12 05:42:02 PM

instance #1: running
  started: 2013-09-12 05:38:48 PM

instance #2: running
  started: 2013-09-12 05:38:31 PM

Warning: url is not a valid manifest attribute. Please remove this attribute from your manifest to get rid of this warning
Getting events for myapp... OK

time                        instance index   description     exit status
2013-09-12T17:37:33+00:00   0                out of memory   Failure (137)
2013-09-12T17:38:30+00:00   1                out of memory   Failure (137)

Warning: url is not a valid manifest attribute. Please remove this attribute from your manifest to get rid of this warning
Getting stats for myapp... OK

instance   cpu    memory           disk
#0         0.8%   268.8M of 512M   297.4M of 1G
#1         0.1%   318.3M of 512M   297.5M of 1G
#2         0.1%   302.5M of 512M   297.5M of 1G

[21] ps man page extracts on lucid 64:

Note:
        Only anonymous and swap cache memory is listed as part of 'rss' stat.
        This should not be confused with the true 'resident set size' or the
        amount of physical memory used by the cgroup.
        'rss + file_mapped" will give you resident set size of cgroup.
        (Note: file and shmem may be shared among other cgroups. In that case,
         file_mapped is accounted only when the memory cgroup is owner of page
         cache.)


       rss       RSS    resident set size, the non-swapped physical memory that a
                        task has used (in kiloBytes). (alias rssize, rsz).

      sz        SZ     size in physical pages of the core image of the process.
                        This includes text, data, and stack space. Device mappings
                        are currently excluded; this is subject to change. See vsz
                        and rss.
      size      SZ     approximate amount of swap space that would be required if
                        the process were to dirty all writable pages and then be
                        swapped out. This number is very rough!
                 
       vsz       VSZ    virtual memory size of the process in KiB
                        (1024-byte units). Device mappings are currently excluded;
                        this is subject to change. (alias vsize).

[23] http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/

James Bayer

Sep 13, 2013, 2:06:37 AM
to vcap...@cloudfoundry.org
Thanks for the very detailed write-up Guillaume!

I opened a bug for this. The team is going to be fairly tied up for about a week or so with some other work, so I hope you're able to continue using the overprovisioning work-around for now.

Guillaume Berche

Sep 13, 2013, 6:44:55 AM
to vcap-dev
Thanks James. The overprovisioning workaround is fine with me.

Any possibility of being CC'd on the bug to participate in the related discussions?

Out of interest, I'll try to follow your suggestion about wsh-ing into warden on a vagrant box.

By the way, I wonder if the yeti tests cover some of these instance memory limit use-cases. Are the yeti test outputs/results public somewhere, as they used to be on http://ci.cloudfoundry.org/ ? Would it be feasible to get that running on travis-ci? I wish I could have made it to Platform last week-end to have such discussions, but travel restrictions froze it on my side.

Glyn Normington

Sep 13, 2013, 8:00:39 AM
to vcap...@cloudfoundry.org
Hi Guillaume

This is probably not relevant, but I want to be sure you are aware of it since you are going to such lengths in diagnosing this problem: the new Java buildpack sets up a script which the JVM will invoke when an out of memory error is encountered. The script dumps info into app/.buildpack-diagnostics/buildpack.log before and after it issues a kill request against the JVM.

Regards,
Glyn

James Bayer

Sep 13, 2013, 9:21:59 AM
to vcap...@cloudfoundry.org
Guillaume, it's not possible at this time to put you on the story directly in our tracker, because sharing our tracker would share too much private information, very similar to sharing a private github repository that may have credentials and other environment information mixed in.

Once it is prioritized in the runtime team backlog by Mark Kropf, you should see a story that is currently named:
"warden oom kills app before memory threshold crossed"

Cloud Foundry's mechanism for discussion and public tracking is a GitHub issue, so I opened an issue on Warden that I've asked the team to use when commenting on their findings.

Public CI is under active discussion. travis-ci currently does not support us using our own linux containerization because, I believe, it interferes with how they containerize their own apps, so that CI environment is currently still private, I believe for similar reasons as the tracker I mentioned above. We would like to make it public; it's just a matter of when it becomes the right priority vs other important work we have in-flight that we need to finish.

I hope Glyn's recommendation can be used in your scenario for additional detail.

Guillaume Berche

Sep 13, 2013, 10:00:45 AM
to vcap-dev
Hi Glyn,

Thanks for the reminder about the killjava script. In my specific case, the JVM is not experiencing a java.lang.OutOfMemoryError. Rather, for some yet unknown reason, the kernel oom_killer detects the jvm as exceeding the cgroup memory limit and kills the os process.

Guillaume.

Guillaume Berche

Sep 13, 2013, 10:02:48 AM
to vcap-dev
Thanks a lot James for the github issue and the update on public ci.

Glyn Normington

Sep 13, 2013, 10:12:59 AM
to vcap...@cloudfoundry.org
Understood.