OutOfMemoryError considered fatal?


Aleh Aleshka

May 21, 2013, 7:39:25 PM
to scala-l...@googlegroups.com
I wonder why this kind of code does not handle the exception:
scala.util.Try{ Array.ofDim(1000000000)}

Surely the VM is not in an irreparable state after an OOME.

Thanks, Aleh

Alex Cruise

May 21, 2013, 7:50:59 PM
to scala-l...@googlegroups.com
http://stackoverflow.com/questions/2679330/catching-java-lang-outofmemoryerror is a good place to start with this question, which should probably be on scala-user. :)

-0xe1a


--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

√iktor Ҡlang

May 21, 2013, 8:07:17 PM
to scala-l...@googlegroups.com
Some references:

"An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch." - http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/Error.html

"Thrown to indicate that the Java Virtual Machine is broken or has run out of resources necessary for it to continue operating." - http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/VirtualMachineError.html

"Thrown when the Java Virtual Machine cannot allocate an object because it is out of memory, and no more memory could be made available by the garbage collector." -


Cheers,


--
You received this message because you are subscribed to the Google Groups "scala-language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-languag...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Viktor Klang
Director of Engineering

Twitter: @viktorklang

Aleh Aleshka

May 21, 2013, 8:08:24 PM
to scala-l...@googlegroups.com, Alex Cruise
Hey Alex

I'm talking specifically about scala.util.Try and the decision made in it.
In most cases an OOME doesn't mean the death of the VM, so why is it treated
as fatal?

Aleh
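
For readers following along, the question is about which throwables a wrapper like Try captures and which it lets through. A rough Java sketch of that kind of classification follows. It is not the actual definition of scala.util.control.NonFatal: the class name FatalCheck, the method isFatal and the soeIsFatal switch are hypothetical, and the list of fatal types is a simplified assumption.

```java
public class FatalCheck {
    // Hypothetical helper: decides whether a throwable should be rethrown
    // instead of being captured as a failure value.
    static boolean isFatal(Throwable t, boolean soeIsFatal) {
        if (t instanceof StackOverflowError) {
            // Whether SOE belongs in the fatal set is the point debated in this thread.
            return soeIsFatal;
        }
        return t instanceof VirtualMachineError   // includes OutOfMemoryError
            || t instanceof LinkageError
            || t instanceof InterruptedException;
    }

    public static void main(String[] args) {
        System.out.println(isFatal(new OutOfMemoryError("simulated"), false));
        System.out.println(isFatal(new IllegalArgumentException("bad input"), false));
    }
}
```

With a classification like this, an OutOfMemoryError is rethrown rather than wrapped, which matches the behaviour the first message observed.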

Aleh Aleshka

May 21, 2013, 8:23:31 PM
to scala-l...@googlegroups.com, √iktor Ҡlang
Hey Victor

As you can see from my example those docs are not true (just like for
StackOverflowError)
Talking about OutOfMemoryError specifically, it does not indicate that
"Java Virtual Machine is broken or has run out of resources necessary for
it to continue operating". In a lot of cases OOME just indicates a problem
with a particular piece of code.
What harm would it cause if OOME was treated as NonFatal?

Aleh


√iktor Ҡlang

May 21, 2013, 8:35:15 PM
to Aleh Aleshka, scala-l...@googlegroups.com
Hi Aleh,


On Wed, May 22, 2013 at 2:23 AM, Aleh Aleshka <oleg...@gmail.com> wrote:
> Hey Victor
>
> As you can see from my example those docs are not true (just like for StackOverflowError)

At least SOE is thread-local, and proper cleanup is done as the stack unwinds (releasing monitors etc.)

> Talking about OutOfMemoryError specifically, it does not indicate that "Java Virtual Machine is broken or has run out of resources necessary for it to continue operating". In a lot of cases OOME just indicates a problem with a particular piece of code.

Yes, but that is not safe to assume. One thread may have been hogging tons of memory, and when another thread tries to allocate a smallish object it gets an OOME simply because memory has been exhausted by others.

> What harm would it cause if OOME was treated as NonFatal?

How would you proceed? How do you know that it is safe to proceed?

The general consensus is that one should bail out if one gets an OOME, or even use -XX:OnOutOfMemoryError="kill -9 %p"
 
Cheers,




Sassa Nf

May 22, 2013, 6:37:10 AM
to scala-l...@googlegroups.com
This is the wrong assumption. StackOverflowError and OOME being "repairable" is only the result of great effort from the JVM. No one is guaranteed to recover after that, and several JVMs crash.

You will find occurrences of StackOverflowError leaving locks locked, monitors acquired, and "finally" blocks not invoked, even if the JVM doesn't crash.


Alex

√iktor Ҡlang

May 22, 2013, 7:04:24 AM
to scala-l...@googlegroups.com

Do you have any sources regarding SOE?
I talked to multiple JVM vendors who said SOE was clean.

Cheers, V

Sassa Nf

May 22, 2013, 7:44:47 AM
to scala-l...@googlegroups.com
Personal experience with several JVMs. We wanted to file bugs, but when I spoke to JVM guys, they said even though they do their best to return cleanly, they can't guarantee it and won't fix it, since it is a permitted behaviour by the JVM spec.

The JVM spec permits the VM to do undefined things if an Error is to be thrown. One such strange permission is GC being optional, by the way.

As to SOE, in order to recover from it, one needs to work out the state of the stack in order to figure out where to return control. This is not always possible when an SOE occurs. Because monitor release happens in "finally", which is effectively a catch (Throwable _), monitor release is not guaranteed either.

I'll see if I can find a reproducer.


Alex



√iktor Ҡlang

May 22, 2013, 7:56:36 AM
to scala-l...@googlegroups.com
On Wed, May 22, 2013 at 1:44 PM, Sassa Nf <sass...@gmail.com> wrote:
> Personal experience with several JVMs. We wanted to file bugs, but when I spoke to JVM guys, they said even though they do their best to return cleanly, they can't guarantee it and won't fix it, since it is a permitted behaviour by the JVM spec.

What time frame was this in? (i.e. what Java/JVM version?)

> JVM spec permits the VM to do undefined things if a Error is to be thrown. One of such strange permissions is GC being optional, by the way.
>
> As to SOE, in order to recover from it, one needs to work out the state of the stack in order to figure out where to return control. This is not always possible, when SOE occurs. Because monitor release happens in "finally", which is the catch (Throwable _), that is not guaranteed.

But an SOE can be thrown by _any_ call, so there's no way of protecting oneself from it.
That being said, if SOE indeed is unsafe on Java 6 VMs, we need to make it fatal.

> I'll see if I can find a reproducer.

That'd be great, thanks!

Cheers,

Sassa Nf

May 22, 2013, 8:21:33 AM
to scala-l...@googlegroups.com
public class a implements Runnable {
  static long n;
  static long max;
  String name;

  static volatile Object lock = a.class;

  public a (String name) {
    this.name = name;
  }

  public static void main (String[] a) throws Exception {
    Thread t1 = new Thread (new a ("One"));
    Thread t2 = new Thread (new a ("Two"));
    t1.start ();
    t2.start ();
    t1.join();
    t2.join();
  }

  public void run() {
    for(int i=0; i < 100000; ++i) {
      try {
        System.out.println (name + " entering sub: iteration " + i);
        sub (name, false, i);
      }
      catch (Throwable e) {
        System.out.println (e.getClass().getName() + " in " + name);
      }
    }
  }

  private static void sub (String name, boolean acquired, int i) {
    synchronized( lock )
    {
      try
      {
        if (!acquired) {
          System.out.println ("Acquired by " + name + " n: " + n);
          if ( n != 0 ) i = 1;
        }
        n++;
        if ( n > max ) max = n;
        if ( lock == a.class || n < 5000 ) sub (name, true, i);
      }
      catch( Throwable th )
      {
        System.out.println( "SOE in " + name + " @ level " + max + " caught @ " + n );
      }
      finally
      {
        n--;
      }
    }
  }
}


JVM-1:

One entering sub: iteration 0
Two entering sub: iteration 0
Acquired by One n: 0
SOE in One @ level 3900 caught @ 3882
One entering sub: iteration 1
Acquired by Two n: 0
...

JVM-2:

Two entering sub: iteration 0
One entering sub: iteration 0
Acquired by Two n: 0
SOE in Two @ level 10391 caught @ 10391
Two entering sub: iteration 1
Acquired by Two n: 341
...

What the reproducer does:

- loop in two threads, trying to invoke a method with a synchronized in it
- once sub() is called, it recursively calls sub until the stack overflows
- max keeps track of n; max is the evidence of how deep the stack got
- SOE in <thread name> @ level <how deep it went> caught @ <where the first catch Throwable worked> can tell us if some catch() were skipped

JVM-1 shows us that n:0 on the second acquire - that means all finally worked, but SOE .... output says some catch() were missed

JVM-2 shows us that n:341 on the second acquire - that means some finally didn't work, and SOE ... output says some catch() were missed

Both JVMs are 1.6 and above.

Alex

√iktor Ҡlang

May 22, 2013, 8:25:26 AM
to scala-l...@googlegroups.com
Could you open a ticket here: https://issues.scala-lang.org/secure/Dashboard.jspa
Attach the steps to reproduce etc and I can take a stab at it.

Thanks!

Sassa Nf

May 22, 2013, 8:53:17 AM
to scala-l...@googlegroups.com
I don't know what you want me to write in that ticket.


Replacing one line in the reproducer:

        if ( (i > 50000) && (lock == a.class || n < 5000) ) sub (name, true, i);

means that catch(Throwable _) is never executed on JVM-1. Removing catch(Throwable _) in the loop makes the threads crash eventually.


Alex

√iktor Ҡlang

May 22, 2013, 8:57:15 AM
to scala-l...@googlegroups.com
"StackOverflowError ought to be considered Fatal when it comes to scala.util.NonFatal, here is what happens if it isn't considered fatal <insert code>"

Or am I missing something?

Björn Antonsson

May 22, 2013, 9:05:23 AM
to scala-l...@googlegroups.com
Hi Alex,

I have a few questions regarding your test code.
Won't this code always recurse, since lock is a.class? Is the if vital to the test failing?
 
      }
      catch( Throwable th )
      {
        System.out.println( "SOE in " + name + " @ level " + max + " caught @ " + n );

This println might actually do a stack overflow when trying to print, so that even if you catch it here you might not get an output until another catch on a lower level. Also the max is static, so it will only be valid for the first run, right?
So the output from JVM2 seems very fishy. Did "One" ever get to run there? The synchronized method is implemented by the JVM as a:

try {
  monitorEnter(object);
  // more code here
} finally {
  monitorExit(object);
}

If the JVM was indeed missing finally blocks, then the lock could end up in an unmatched state (still locked), and the other thread would never get to acquire it.

/Björn

-- 
Björn Antonsson
Typesafe – The software stack for applications that scale
twitter: @bantonsson
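
The try/finally shape described above can be made explicit with a java.util.concurrent lock. This is only an illustrative sketch of the pattern, not how any JVM implements monitors; the class and method names (MonitorSketch, critical, lockReleasedAfterFailure) are made up for the example. It shows the normal case, where the finally block runs and the lock is released after a failure. The reports in this thread are about the abnormal case, where a StackOverflowError may prevent that unlock from running.

```java
import java.util.concurrent.locks.ReentrantLock;

public class MonitorSketch {
    static final ReentrantLock LOCK = new ReentrantLock();

    // Explicit equivalent of a synchronized block: acquire, run, always release.
    static void critical(Runnable body) {
        LOCK.lock();
        try {
            body.run();
        } finally {
            LOCK.unlock(); // if this is skipped, every other thread blocks forever
        }
    }

    // Runs a failing body and reports whether the lock was released afterwards.
    static boolean lockReleasedAfterFailure() {
        try {
            critical(() -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException expected) {
            // the failure itself is not interesting here
        }
        return !LOCK.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("lock released: " + lockReleasedAfterFailure());
    }
}
```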

Sassa Nf

May 22, 2013, 9:25:34 AM
to scala-l...@googlegroups.com
Correct on all points.

The "if" you are referring to is not important - it is left from the older reproducer with finer detail, but here it doesn't matter.

Yes, that's how synchronized() is implemented, and JVM-2 does eventually hang. That's how we started to look into that anyway.

Yes, the println in catch may be failing, but a modification of reproducer with if (i > 50000 ...) demonstrates catch(_) never worked.

Alex




Sassa Nf

May 22, 2013, 11:29:02 AM
to scala-l...@googlegroups.com
aha! finally I found a reproducer that works on both JVMs that I tried.

Acquired by One n: 94 -- must be zero
SOE in Two @ level 178 caught @ 178; sum: 1275
Two entering sub: iteration 34783
Acquired by Two n: 44
SOE in One @ level 228 caught @ 228; sum: 1275
One entering sub: iteration 34783
Acquired by One n: 93
SOE in Two @ level 179 caught @ 179; sum: 1275
Two entering sub: iteration 34784
Acquired by Two n: 44

The trick is to make the lock on which synchronization occurs fat.

If you are curious, I can paste the reproducer; but I wanted just to make a point that no JVM is likely to be immune to this problem and we can't say it is a bug in one JVM.


Alex




Simon Ochsenreither

May 22, 2013, 11:55:43 AM
to scala-l...@googlegroups.com
Can you give the name and the versions of the VMs in question?

Aleh Aleshka

May 22, 2013, 3:18:53 PM
to scala-l...@googlegroups.com, Aleh Aleshka
I guess OOME and SOE could leave vm in inconsistent state in some rare cases.
Obviously even an Exception thrown from user code can potentially leave it in this state, and we can't assume that it's safe to proceed.
But should we penalize every use case of scala.util.Try ?
What is the worst thing that could happen if Try (and other users of NonFatal) was catching OOME and SOE?

Thanks, Aleh.



√iktor Ҡlang

May 22, 2013, 3:26:54 PM
to scala-l...@googlegroups.com, Aleh Aleshka
On Wed, May 22, 2013 at 9:18 PM, Aleh Aleshka <oleg...@gmail.com> wrote:
> I guess OOME and SOE could leave vm in inconsistent state in some rare cases.

Define rare.

> Obviously even Exception thrown from user code can potentially leave it in this state and we can't assume that it's safe to proceed..
> But should we penalize every use case of scala.util.Try ?

Penalize in what way? Penalize heap and stack abuse?

> What is the worst thing that could happen if Try (and other users of NonFatal) was catching OOME and SOE?

Data corruption?

Cheers,

Aleh Aleshka

May 24, 2013, 11:08:13 AM
to scala-l...@googlegroups.com, Aleh Aleshka
Hi Victor

I've seen catching regular Exceptions leading to data corruption, but haven't ever seen someone terminating the vm because of SOE or OOME (except the mentioned kill -9 %p trick for hadoop workers)
On the other hand user input in a regular web application may lead to OOME or even SOE while handling e.g. http request and it is unsound to assume that we should abandon processing the request without a regular exception handling or kill the vm.

Aleh

Simon Ochsenreither

May 24, 2013, 11:29:52 AM
to scala-l...@googlegroups.com, Aleh Aleshka

> ... and it is unsound to assume that we should abandon processing the request without a regular exception handling or kill the vm.

Well, you have seen the failing tests and can look into the implementation of HotSpot if you like. I'm also not happy with the current situation, but my unhappiness won't fix the issues.

Lex Spoon

May 24, 2013, 12:00:23 PM
to scala-l...@googlegroups.com
The GWT team went around and around about this for OutOfMemory errors
in its compiler. My first instinct was not to bother handling OOM, but
we tried a few ways anyway, and it's just insanely difficult. We
ultimately gave up. Java is not designed to support manual memory
management. On the contrary, Java is expressly designed for you *not*
to control memory usage.

One of the problems is that your OOM handler cannot itself allocate
any memory. This is because you don't know that the handler is running
on the thread that is the memory hog. How do you write an OOM handler,
though, without allocating memory? That's an unreasonable constraint
for programming on the JVM. You can't add to a list. You can't
allocate an array. You can't do string appends. You can't call into
library routines.

Another consequence of not knowing which thread will get the OOM is
that it might be a thread you don't control. In such a case, the JVM
is just going to exit and you can't do anything about it. As such,
from a management point of view, it's hard to justify putting
developer resources onto the development of sophisticated OOM
handlers. You have no idea if the resulting code will ever be used.

Another platform I notice having OOM problems is Squeak Smalltalk. In
older versions of the system, going back to the early 80s, the
graphics system was built around cooperative multi-tasking. What
Squeak used to do is switch to an OOM dialog when it got low on
memory. Due to the threading model, the OOM dialog got full use of the
remaining memory in the system. That was a clever and effective
solution, but it was undermined in the late 90s when Squeak switched
to a modern event-driven graphics framework. The new framework is an
improvement in many ways, but it means there was no effective way to
lock out threads (and event handlers) that are allocating too much
memory.

Sometimes the best answer is, "let it crash". Out of memory is such a
case. You can't handle it well, and if you try, all you will do is
handle it complicated.

Lex Spoon

Aleh Aleshka

May 24, 2013, 12:30:16 PM
to scala-l...@googlegroups.com, Lex Spoon
Hi Lex

This makes sense for a disposable app like a compiler instance or worker.
But for a long running app you just can't let it crash because of user
submitting a large number in his input.
OOME and SOE are the ways to prevent hard crash. Do you suggest to halt
the vm immediately like if it was a segfault instead of Error? Bypassing
shutdown hooks as well because the vm is possibly broken?

> One of the problems is that your OOM handler cannot itself allocate
> any memory. This is because you don't know that the handler is running
> on the thread that is the memory hog.
If the handler crashes with OOM then fine, we did our best. That is still
better than crashing right away.

Thanks for the info on Squeak. If i understand correctly it is mainly used
for single-user systems so there is probably not much use for protection
against malicious actions.

Thanks, Aleh

Bardur Arantsson

May 24, 2013, 12:34:28 PM
to scala-l...@googlegroups.com
On 05/24/2013 05:08 PM, Aleh Aleshka wrote:

> On the other hand user input in a regular web application may lead to OOME
> or even SOE while handling e.g. http request and it is unsound to assume
> that we should abandon processing the request without a regular exception
> handling or kill the vm.

Are you serious?

If so, that's a security issue. You should have bounded request sizes --
above which requests are rejected before any parsing happens. If your
requests require unbounded memory or processing then that's a problem at
an even deeper level, but I digress...
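
A minimal sketch of the "bounded request size" idea, assuming the request body arrives as an InputStream. The names BoundedBody, readBounded and isRejected and the limit values are hypothetical; a real server would normally enforce this through its own configuration rather than hand-written code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedBody {
    // Reads at most maxBytes from the stream; fails before buffering anything larger.
    static byte[] readBounded(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n;
            if (total > maxBytes) {
                throw new IOException("request body exceeds " + maxBytes + " bytes");
            }
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Convenience wrapper for the example: true if the body is refused.
    static boolean isRejected(byte[] body, int maxBytes) {
        try {
            readBounded(new ByteArrayInputStream(body), maxBytes);
            return false;
        } catch (IOException tooLarge) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(new byte[10], 16));
        System.out.println(isRejected(new byte[32], 16));
    }
}
```

The rejection is an ordinary IOException raised before any parsing, so it goes through regular exception handling and never reaches the point where the heap is exhausted.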



Nils Kilden-Pedersen

May 24, 2013, 12:52:22 PM
to scala-l...@googlegroups.com, Aleh Aleshka
On Fri, May 24, 2013 at 10:08 AM, Aleh Aleshka <oleg...@gmail.com> wrote:
> I've seen catching regular Exceptions leading to data corruption, but haven't ever seen someone terminating the vm because of SOE or OOME

For real? I've never seen anyone try to handle those explicitly, with the intent to "keep on trucking", because it indicates a bug or a poorly configured JVM. The solution is to either fix the bug or configure the JVM properly.

√iktor Ҡlang

May 24, 2013, 3:01:21 PM
to scala-l...@googlegroups.com, Aleh Aleshka
Alright, so to conclude, SOE should be fatal as well?
Please open a ticket.

Cheers,

Aleh Aleshka

May 24, 2013, 10:05:14 PM
to scala-l...@googlegroups.com, √iktor Ҡlang
Hi Victor

How would this benefit users of Try, Future and NonFatal?
Decrease data corruption chances by a factor so small that neither you nor
Twitter engineers considered it to be greater than zero until this week?
Increase chances that Scala users will see hanging Futures and Try
misbehaving when they happen to use foldRight or another method in a
third-party library which they are not able to control?
Or are you suggesting that Try and Future should not use NonFatal?
(all the questions apply to OOM as well, I believe)

Thanks, Aleh


√iktor Ҡlang

May 25, 2013, 5:48:13 AM
to Aleh Aleshka, scala-l...@googlegroups.com
Hi Ale,


On Sat, May 25, 2013 at 4:05 AM, Aleh Aleshka <oleg...@gmail.com> wrote:
> Hi Victor
>
> How would this benefit users of Try, Future and NonFatal?

I think that's pretty clear by now.

> Decrease data corruption chances by a factor so small that nor you nor twitter engineers considered to be greater than zero until this week?

Do you have any references for that statement?

> Increase chances that scala users will see hanging Futures and Try misbehaving when they happen to use foldRight or other method in a third-party library which they not able to control?

As proven earlier, SOE is not safe to ignore, or to pretend like everything is OK, and there is no way of knowing if it is OK.

> Or are you suggesting that Try and Future should not use NonFatal?

What is the argument that they shouldn't?

> (all the questions apply to OOM as well, i believe)

All of my replies apply to OOM as well.

Cheers,

Aleh Aleshka

May 28, 2013, 3:22:22 PM
to scala-l...@googlegroups.com, Aleh Aleshka
On Saturday, May 25, 2013 12:48:13 PM UTC+3, √iktor Klang wrote:
>> How would this benefit users of Try, Future and NonFatal?
>
> I think that's pretty clear by now.

Care to elaborate? Don't you see negative consequences for users?
E.g. how is getting a TimeoutException from Await better than getting the error that caused the computation to fail?
I could probably see your point if you suggested terminating the VM on SOE or OOME, but we don't even have a way to know if there was an SOE/OOME if we just make Try and Future ignore them.

>> Decrease data corruption chances by a factor so small that nor you nor twitter engineers considered to be greater than zero until this week?
>
> Do you have any references for that statement?

Citing you: "I talked to multiple JVM vendors who said SOE was clean."
Citing scaladoc:
" * ''Note'': only non-fatal exceptions are caught by the combinators on `Try` (see [[scala.util.control.NonFatal]]).
* Serious system errors, on the other hand, will be thrown.
* `Try` comes to the Scala standard library after years of use as an integral part of Twitter's stack."

Can we perhaps have a configurable definition of NonFatal for Try? Something like scala.util.control.Exception#handling

Thanks, Aleh

√iktor Ҡlang

May 28, 2013, 3:34:24 PM
to scala-l...@googlegroups.com, Aleh Aleshka
On Tue, May 28, 2013 at 9:22 PM, Aleh Aleshka <oleg...@gmail.com> wrote:
> Care to elaborate? Don't you see negative consequences for users?
> E.g. how is getting TimeoutException from Await is better than getting the error that caused the computation to fail?
> I could probably see your point if you suggested to terminate the vm on SOE or OOME, but we don't even have a way to know if there was SOE/OOME if we just make Try and Future ignore them.

A properly implemented unhandledException handler is all it takes: https://github.com/scala/scala/blob/v2.10.1/src/library/scala/concurrent/impl/Future.scala#L29

(This is what Akka does)

> Citing you: "I talked to multiple JVM vendors who said SOE was clean."
> Citing scaladoc:
> " * ''Note'': only non-fatal exceptions are caught by the combinators on `Try` (see [[scala.util.control.NonFatal]]).
> * Serious system errors, on the other hand, will be thrown.
> * `Try` comes to the Scala standard library after years of use as an integral part of Twitter's stack."

But that was before there was any proof to the contrary. (I have yet to analyze the reproducers, though.)

> Can we perhaps have configurable definition of NonFatal for Try? Something like scala.util.control.Exception#handling

What does that mean?

Cheers,

Aleh Aleshka

May 28, 2013, 3:57:39 PM
to scala-l...@googlegroups.com, √iktor Ҡlang
On Tue, 28 May 2013 22:34:24 +0300, √iktor Ҡlang <viktor...@gmail.com>
wrote:

> A properly implemented unhandledException handler is all it takes:
> https://github.com/scala/scala/blob/v2.10.1/src/library/scala/concurrent/impl/Future.scala#L29
>
> (This is what Akka does)
>

You just said that default unhandledException handler is not implemented
properly.
And suggested that every user should reimplement it and catch Error
themselves.
Why do this instead of handling properly inside of library?
And what do you suggest for Try? Wrap it in try catch?

And as far as i can see akka uses
scala.concurrent.ExecutionContext$#defaultReporter which just prints
stacktrace everywhere, please point to me the place where it reimplements
unhandledException handler.

Not sure what you were trying to say with your link btw.

Thanks, Aleh

√iktor Ҡlang

May 28, 2013, 4:09:54 PM
to Aleh Aleshka, scala-l...@googlegroups.com
On Tue, May 28, 2013 at 9:57 PM, Aleh Aleshka <oleg...@gmail.com> wrote:
> You just said that default unhandledException handler is not implemented properly.
> And suggested that every user should reimplement it and catch Error themselves.
> Why do this instead of handling properly inside of library?
> And what do you suggest for Try? Wrap it in try catch?

WDYM? The idea is that no-one touches the fatal exceptions, so they bubble to the top of the thread and hit the UEH (the thread's uncaught exception handler).

Here is the Akka UEH:
https://github.com/akka/akka/blob/master/akka-actor/src/main/scala/akka/actor/ActorSystem.scala#L466

> And as far as i can see akka uses scala.concurrent.ExecutionContext$#defaultReporter which just prints stacktrace everywhere, please point to me the place where it reimplements unhandledException handler.
>
> Not sure what you were trying to say with your link btw.

Perhaps we're miscommunicating: are you suggesting completing the Future (wrapped) and the Try (wrapped) AND rethrowing in Future.apply and Try.apply? If so, that's definitely something that could be considered. The key being that it needs to travel to the UEH.

Cheers,
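
To make the "bubble up to the UEH" idea concrete, here is a small Java sketch. It is not Akka's or the Scala library's code; the class UehSketch and its methods are hypothetical, and the simulated OutOfMemoryError is thrown by hand rather than caused by real memory pressure. The wrapper captures ordinary exceptions only, so an Error escapes it and reaches the thread's uncaught exception handler.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UehSketch {
    // Hypothetical wrapper that captures ordinary exceptions only.
    static void runCapturingExceptions(Runnable body) {
        try {
            body.run();
        } catch (Exception e) {
            System.err.println("captured as a failure value: " + e);
        }
        // Errors are deliberately not caught, so they propagate to the top of the thread.
    }

    // Runs the body on a worker thread and returns what its handler saw, or null.
    static Throwable errorSeenByHandler(Runnable body) {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> runCapturingExceptions(body), "worker");
        // A production handler would typically log and shut the system down here.
        worker.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable t = errorSeenByHandler(() -> { throw new OutOfMemoryError("simulated"); });
        System.out.println("handler saw: " + t);
    }
}
```

The design choice under discussion is visible here: the decision about what to do after a fatal error sits in one place (the handler) instead of in every piece of code that wraps a computation.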

Aleh Aleshka

May 28, 2013, 5:24:41 PM
to √iktor Ҡlang, scala-l...@googlegroups.com
On Tue, 28 May 2013 23:09:54 +0300, √iktor Ҡlang <viktor...@gmail.com> wrote:

>>> https://github.com/scala/scala/blob/v2.10.1/src/library/scala/concurrent/impl/Future.scala#L29
>>>
>>> (This is what Akka does)
>>>
>>>
>> You just said that default unhandledException handler is not implemented
>> properly.
>> And suggested that every user should reimplement it and catch Error
>> themselves.
>> Why do this instead of handling properly inside of library?
>> And what do you suggest for Try? Wrap it in try catch?
>>
>
> WDYM? The idea is that no-one touches the fatal exceptions so it bubbles
> to
> the top of the thread and hits the UEH.
>
> Here is the Akka UEH:
> https://github.com/akka/akka/blob/master/akka-actor/src/main/scala/akka/actor/ActorSystem.scala#L466
>
>
Thanks, that makes sense.
However i bet there would be surprised users upgrading to 2.10.x and
seeing their actorsystem shutdown on SOE.
I was surprised seeing OOME not being handled by Try. And I don't see any
way to override that behavior.
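
The behavior being described can be reproduced with a small sketch. It assumes only `scala.util.Try` from the standard library; `TryPolicySketch` and `capturedByTry` are hypothetical names, and the `OutOfMemoryError` is constructed by hand rather than provoked by a real allocation.

```scala
import scala.util.Try

object TryPolicySketch {
  // True when Try turns the throwable into a Failure,
  // false when Try lets it propagate to the caller.
  def capturedByTry(t: Throwable): Boolean =
    try Try(throw t).isFailure
    catch { case e: Throwable if e eq t => false }
}
```

With this helper, `capturedByTry(new IllegalStateException("boom"))` is true, while `capturedByTry(new OutOfMemoryError("simulated"))` is false, because `Try` only captures what `scala.util.control.NonFatal` accepts.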

>>
>> And as far as i can see akka uses
>> scala.concurrent.ExecutionContext$#defaultReporter
>> which just prints stacktrace everywhere, please point to me the place
>> where
>> it reimplements unhandledException handler.
>>
>> Not sure what you were trying to say with your link btw.
>>
>
> Perhaps we're miscommunicating, are you suggesting completing future
> (wrapped) and try (wrapped) AND rethrowing in Future.apply and Try.apply?
> If so, that's definitely something that could be considered. The key
> being
> that it needs to travel to the UEH.
>
Not sure i follow. Can you rethrow anything in Future.apply?

> Cheers,
> √
>

Thanks, Aleh

√iktor Ҡlang

unread,
May 28, 2013, 5:32:35 PM5/28/13
to Aleh Aleshka, scala-l...@googlegroups.com
There are quite a few finally-clauses in the Akka sources, and if they are not properly handled on SOE it means that we can't continue.
 
I was surprised seeing OOME not being handled by Try. And I don't see any way to override that behavior.

But the point is that you shouldn't do anything but abort.
 


And as far as i can see akka uses scala.concurrent.ExecutionContext$#defaultReporter

which just prints stacktrace everywhere, please point to me the place where
it reimplements unhandledException handler.

Not sure what you were trying to say with your link btw.


Perhaps we're miscommunicating, are you suggesting completing future
(wrapped) and try (wrapped) AND rethrowing in Future.apply and Try.apply?
If so, that's definitely something that could be considered. The key being
that it needs to travel to the UEH.

Not sure i follow. Can you rethrow anything in Future.apply?

Ah, poor choice of words, I meant ec.reportFailure(...)
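
To make the `reportFailure` idea concrete, here is a rough sketch of an `ExecutionContext` that hands anything escaping a task to a reporter. `ReportingContextSketch` and `reporting` are hypothetical names, and the same-thread execution is an assumption made to keep the example small; it is not how the default context or Akka's dispatchers are implemented.

```scala
import scala.concurrent.ExecutionContext

object ReportingContextSketch {
  // Builds a same-thread ExecutionContext whose reportFailure delegates to
  // `reporter`. Whatever a task throws is passed to reportFailure instead
  // of being silently dropped.
  def reporting(reporter: Throwable => Unit): ExecutionContext =
    new ExecutionContext {
      def execute(task: Runnable): Unit =
        try task.run()
        catch { case t: Throwable => reportFailure(t) }

      def reportFailure(cause: Throwable): Unit = reporter(cause)
    }
}
```

A reporter that forwards to the current thread's uncaught exception handler would give the "travel to the UEH" behavior discussed above; one that merely prints the stack trace would not.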
 

Cheers,



Thanks, Aleh

Scott Carey

unread,
May 29, 2013, 3:04:19 AM5/29/13
to scala-l...@googlegroups.com, Aleh Aleshka
I have had a few cases where catching OOME is necessary. 

Although OOME can never be _guaranteed_ to be the fault of your thread, it often is.  If you try and allocate a 1GB array and get one, it definitely is.  It is generally safe to assume that if you are allocating a large array (~100K or more) and it fails, that it is safe to catch and do some work -- such as attempt a safe shutdown on an application to persist some state to disk.  The OOME is saying either "you're screwed" or "you attempted to allocate something massive".  The response to those two conditions is not the same.

When reading binary serialized data, some formats prefix their strings or arrays with the length of the string or array.  If that data is being read from network or disk, but is corrupted, it can easily trigger an attempt to allocate a very large array.  A strategy to handle the situation by catching OOME and examining the data stream more carefully to decide how to proceed is a perfectly valid strategy, but you must be able to catch OOME.
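
A minimal sketch of that strategy, assuming the risky allocation can be isolated into a single expression. `GuardedAllocSketch` and `guarded` are hypothetical names, and what the caller does with the `Left` (re-examine the stream, skip the record, shut down) is left open.

```scala
object GuardedAllocSketch {
  // Evaluates one potentially huge allocation and turns an OutOfMemoryError
  // raised by that expression into a Left. Every other throwable propagates
  // unchanged.
  def guarded[A](alloc: => A): Either[OutOfMemoryError, A] =
    try Right(alloc)
    catch { case oom: OutOfMemoryError => Left(oom) }
}
```

Usage would look like `GuardedAllocSketch.guarded(new Array[Byte](declaredLength))`, where `declaredLength` is the untrusted length prefix. Keeping the guarded region this narrow is the design choice that matters: an OOME caught here is much more likely to come from the oversized request than from general heap exhaustion, although that is never guaranteed.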

My conclusion is simple:  although SOE and OOME can not be relied upon to be safe or consistent in behavior, there are (very rare) times when one will wish to at least attempt to handle them.

OOME is easier -- if your catch executes, it executes, if it leads to another OOME, you're very likely toast.

√iktor Ҡlang

unread,
May 29, 2013, 4:33:22 AM5/29/13
to scala-l...@googlegroups.com, Aleh Aleshka
Only if your app only has 1 thread, since other threads could get OOMEs simultaneously.
 
  It is generally safe to assume that if you are allocating a large array (~100K or more) and it fails, that it is safe to catch and do some work -- such as attempt a safe shutdown on an application to persist some state to disk. 

So in that case you do your work in your own try-catch.
 
The OOME is saying either "you're screwed" or "you attempted to allocate something massive".  The response to those two conditions is not the same.

It's unfortunate that there is no difference between the two on the JVM level.
 

When reading binary serialized data, some formats prefix their strings or arrays with the length of the string or array.  If that data is being read from network or disk, but is corrupted, it can easily trigger an attempt to allocate a very large array.  A strategy to handle the situation by catching OOME and examining the data stream more carefully to decide how to proceed is a perfectly valid strategy, but you must be able to catch OOME.

You can still do your own try-catches. For Futures I hope things are pretty clear, as the OOME has happened on _another thread_.
 

My conclusion is simple:  although SOE and OOME can not be relied upon to be safe or consistent in behavior, there are (very rare) times when one will wish to at least attempt to handle them.

And for those cases you use your own try-catch.
 

OOME is easier -- if your catch executes, it executes, if it leads to another OOME, you're very likely toast.

Yep, see above.


Cheers,
 


Aleh Aleshka

unread,
May 29, 2013, 8:52:27 AM5/29/13
to scala-l...@googlegroups.com, √iktor Ҡlang
Just to be sure we are on the same page.
The fact that a thread got OOME does not mean that there is less memory in
the system after it.
In other words available memory is not affected by failed allocations.
Actually after OOME was thrown you are likely to get some additional free
memory because it is likely that previously allocated stuff in this thread
is now unreachable.

Using a wrapper around Try and Future might be a solution to overcome
NonFatal deficiencies, but meh...

Aleh


√iktor Ҡlang

unread,
May 29, 2013, 9:15:48 AM5/29/13
to Aleh Aleshka, scala-l...@googlegroups.com
On Wed, May 29, 2013 at 2:52 PM, Aleh Aleshka <oleg...@gmail.com> wrote:
Just to be sure we are on the same page.
The fact that a thread got OOME does not mean that there is less memory in the system after it.

The exact same amount of memory. Free memory on the other hand is unknown, since we cannot assume a single thread of execution.
 
In other words available memory is not affected by failed allocations.

Available or free?
 
Actually after OOME was thrown you are likely to get some additional free memory because it is likely that previously allocated stuff in this thread is now unreachable.

Also assuming a single thread.
 

Using a wrapper around Try and Future might be a solution to overcome NonFatal deficiencies, but meh...

If you have places in your program where _you_ can guarantee that if an OOME happens you can do something about it, it's easy enough for you to deal with it.

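import scala.util.Try

// Rewrap the OOME so that Try, which only captures non-fatal throwables, records it as a Failure.
def unsafe[T](block: => T): Try[T] =
  Try(try block catch { case e: OutOfMemoryError => throw new RuntimeException("I swallowed an OOME", e) })

// new Array[Int](n) allocates n elements; Array[Int](n) would build a one-element array holding n.
val IWantItAll = unsafe(new Array[Int](Int.MaxValue))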

Cheers,
 




Sassa Nf

unread,
May 29, 2013, 10:50:56 AM5/29/13
to scala-l...@googlegroups.com
OOME is thrown in many other cases: in fact, whenever some other resource is unavailable for which there is no dedicated Error.

E.g. try to spawn more threads than /etc/security/limits.conf lets you, or try to contend on more objects than the JVM (or OS) limit on condvars.


Alex




Scott Carey

unread,
May 30, 2013, 3:35:46 AM5/30/13
to scala-l...@googlegroups.com, Aleh Aleshka


On Wednesday, May 29, 2013 1:33:22 AM UTC-7, √iktor Klang wrote:



On Wed, May 29, 2013 at 9:04 AM, Scott Carey <scott...@gmail.com> wrote:



I have had a few cases where catching OOME is necessary. 

Although OOME can never be _guaranteed_ to be the fault of your thread, it often is.  If you try and allocate a 1GB array and get one, it definitely is.

Only if your app only has 1 thread, since other threads could get OOMEs simultaneously.

The situation I have in mind is multi-threaded.  At least Sun/Oracle's JVM will only throw the error on the one thread where the failed allocation occurs.  The app I have in mind where I ran into this was a high throughput multi-threaded application, which was occasionally reading corrupt serialized data and attempting large allocations.  The only OOMEs we saw (even under stress tests with 100's of concurrent threads) were on the threads attempting these big allocations.   What is more interesting is that the JVM wasn't smart enough to see that its max heap was not large enough for the allocation, and in each case it triggered, and completed, a full GC before throwing the error, even though the GC could never have succeeded.
This was a huge performance problem for this app: even when it recovered from the OOME, the full GC killed throughput and caused large outliers in latency.

At least Sun's JVM lives up to the OOME documentation and only throws the OOME after trying rather hard to free up the memory with a full GC. When after that GC there is still no room for the allocation, the OOME is thrown.  If the OOME is for a large allocation, it is extremely unlikely that any other threads doing small allocations will fail as a result of the large allocation.
 
 
  It is generally safe to assume that if you are allocating a large array (~100K or more) and it fails, that it is safe to catch and do some work -- such as attempt a safe shutdown on an application to persist some state to disk. 

So in that case you do your work in your own try-catch.

I am not commenting on what the right approach in Scala is, only providing examples from my experience where the "Never catch an Error" rule does not apply.

Ben Hutchison

unread,
May 30, 2013, 3:54:46 AM5/30/13
to scala-l...@googlegroups.com
If this debate makes one thing clear, it is that no one policy is going
to satisfy everyone. To me, that is a limitation of Try as it stands: its
policy on what's Fatal vs Nonfatal is hard-coded.

Aleh (at least) wants a non-default policy. One of my favorite aspects
of the "Scala Way" is that, generally, it tries where practical to
enable programmers to make these kinds of decisions for themselves. So
whether he's "wrong" or "right", he's got a right to choose.

I don't have a good proposal for how to customize Try's policy on what
is/is-not Fatal, but I think it's reasonable to ask for such a
facility.
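
One possible shape for such a facility, offered only as a sketch: a helper that behaves like `Try.apply` but takes the capture policy as a parameter. `TryWithPolicySketch`, `tryWith` and `nonFatalOrOome` are hypothetical names, not part of the standard library, and the example policy is an assumption rather than a recommendation.

```scala
import scala.util.{Failure, Success, Try}
import scala.util.control.NonFatal

object TryWithPolicySketch {
  // Like Try.apply, except the caller decides which throwables become a
  // Failure. Anything the policy rejects propagates to the caller.
  def tryWith[A](capture: Throwable => Boolean)(body: => A): Try[A] =
    try Success(body)
    catch { case t: Throwable if capture(t) => Failure(t) }

  // Example policy: capture OutOfMemoryError in addition to everything
  // that NonFatal already accepts.
  val nonFatalOrOome: Throwable => Boolean = {
    case _: OutOfMemoryError => true
    case t                   => NonFatal(t)
  }
}
```

With this, `tryWith(nonFatalOrOome)(new Array[Int](n))` would yield a `Failure` on OOME, while errors such as `LinkageError` still propagate. The default stays with `Try`, and the non-default policy is opt-in at the call site.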

-Ben


√iktor Ҡlang

unread,
May 30, 2013, 5:03:25 AM5/30/13
to scala-l...@googlegroups.com, Aleh Aleshka
On Thu, May 30, 2013 at 9:35 AM, Scott Carey <scott...@gmail.com> wrote:


On Wednesday, May 29, 2013 1:33:22 AM UTC-7, √iktor Klang wrote:



On Wed, May 29, 2013 at 9:04 AM, Scott Carey <scott...@gmail.com> wrote:



I have had a few cases where catching OOME is necessary. 

Although OOME can never be _guaranteed_ to be the fault of your thread, it often is.  If you try and allocate a 1GB array and get one, it definitely is.

Only if your app only has 1 thread, since other threads could get OOMEs simultaneously.

The situation I have in mind is multi-threaded.  At least Sun/Oracle's JVM will only throw the error on the one thread where the failed allocation occurs. 
 
The app I have in mind where I ran into this was a high throughput multi-threaded application, which was occasionally reading corrupt serialized data and attempting large allocations.  The only OOMEs we saw (even under stress tests with 100's of concurrent threads) were on the threads attempting these big allocations.   What is more interesting is that the JVM wasn't smart enough to see that its max heap was not large enough for the allocation, and in each case it triggered, and completed, a full GC before throwing the error, even though the GC could never have succeeded.
This was a huge performance problem for this app: even when it recovered from the OOME, the full GC killed throughput and caused large outliers in latency.

At least Sun's JVM lives up to the OOME documentation and only throws the OOME after trying rather hard to free up the memory with a full GC. When after that GC there is still no room for the allocation, the OOME is thrown.  If the OOME is for a large allocation, it is extremely unlikely that any other threads doing small allocations will fail as a result of the large allocation.

So you have a perfect use-case where it makes sense to wrap that dangerous, potentially huge allocation in your own try-catch block, guard against OOME and fall back to doing something else. What I'm arguing is that it's a poor default behavior.
 
 
 
  It is generally safe to assume that if you are allocating a large array (~100K or more) and it fails, that it is safe to catch and do some work -- such as attempt a safe shutdown on an application to persist some state to disk. 

So in that case you do your work in your own try-catch.

I am not commenting on what the right approach in Scala is, only providing examples from my experience where the "Never catch an Error" rule does not apply. 


Cheers,

Sassa Nf

unread,
May 30, 2013, 6:20:04 AM5/30/13
to scala-l...@googlegroups.com
I think you don't have a case for catching OOME, rather a case of a crap serialization design.

Corruption of array length can lead to negative size arrays, or arrays that are larger than on input (2G where 2K was serialized) (even if not OOMEing)
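
The alternative hinted at here, validating the length prefix before allocating anything, might look like the sketch below. It assumes a 4-byte big-endian length prefix read from a `java.nio.ByteBuffer`; `LengthPrefixSketch`, `readBlock` and the `maxLength` cap are hypothetical and not taken from any particular serialization format.

```scala
import java.nio.ByteBuffer

object LengthPrefixSketch {
  // Reads a 4-byte length followed by that many bytes. The declared length
  // is checked for sign, against a caller-chosen cap, and against the bytes
  // actually remaining, all before the array is allocated.
  def readBlock(in: ByteBuffer, maxLength: Int): Either[String, Array[Byte]] =
    if (in.remaining() < 4) Left("truncated length prefix")
    else {
      val declared = in.getInt()
      if (declared < 0) Left(s"negative length: $declared")
      else if (declared > maxLength) Left(s"length $declared exceeds cap $maxLength")
      else if (declared > in.remaining()) Left(s"length $declared exceeds remaining input")
      else {
        val out = new Array[Byte](declared)
        in.get(out)
        Right(out)
      }
    }
}
```

This catches negative and absurdly large lengths without relying on OOME. It does not catch a corrupted length that is still plausible (2K turned into 3K), which needs a checksum or similar.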

Alex




Nils Kilden-Pedersen

unread,
May 30, 2013, 11:53:23 AM5/30/13
to scala-l...@googlegroups.com, Aleh Aleshka
But why only SOE and OOME?

I've worked in the trading industry and there's a clear need to try to get out of positions when encountering unexpected exceptions. But in such a case, it just doesn't make sense to discriminate SOE and OOME. You want to attempt to exit positions regardless of the specific cause.
 

OOME is easier -- if your catch executes, it executes, if it leads to another OOME, you're very likely toast.
