Free the Lisp test!

Seth Tisue

Jul 26, 2011, 7:09:04 PM
to scala-i...@googlegroups.com

Every time the "Lisp test" gets disabled again, I shed a tear:
https://lampsvn.epfl.ch/trac/scala-old/changeset/25060

At Scalathon I got interested in this and started working on it.

https://issues.scala-lang.org/browse/SI-4512 makes reference to
excessive heap usage, but on trunk, if I move it out of disabled, the test
runs just fine with -Xmx64M, which isn't much:

% mv test/disabled/run/lisp.check test/disabled/run/lisp.scala test/files/run

% JAVA_OPTS="-Xms64M -Xmx64M" test/partest --grep lisp
Argument 'lisp' matched 1 test(s)

Testing individual files
testing: [...]/files/run/lisp.scala [ OK ]

All of 1 tests were successful (elapsed time: 00:00:09)

Paul provided me with some actual logs, from Jenkins, of the failures
that prompted the removal of the test:
https://scala-webapps.epfl.ch/jenkins/job/scala-nightly-windows/935/consoleText
https://scala-webapps.epfl.ch/jenkins/job/scala-nightly-auxjvm/jdk=sun7,label=linux/1003/console
and he commented: "Both these failures have lisp.scala TIMOUT-ing, but
notice they also both cite lists-run.scala (the alphabetically next
test, I presume) as the reason. I think it's off-by-one in its
reporting. Of course, this doesn't explain why disabling lisp.scala
brings the build back."

I investigated further and found that partest compiles all the run tests
in the original JVM, but then launches separate JVMs to actually run
each compiled test, one JVM per test. And the Jenkins logs show the
build failing during compilation of a test, not while the test
actually runs. This can be checked by doing:

JAVA_OPTS="-Xmx64M" test/partest --run

and seeing that the resulting failures look like the ones in the log.

So it seems like a good guess that under some circumstances there are
one too many instances of scalac running at the same time in a single
VM, and one of them runs out of heap.

In PartestDefaults.scala we see:

def numActors = propOrElse("partest.actors", "8").toInt
def poolSize = wrapAccessControl(propOrNone("actors.corePoolSize"))

And in nest/DirectRunner.scala we see:

if (PartestDefaults.poolSize.isEmpty) {
  scala.actors.Debug.info("actors.corePoolSize not defined")
  setProp("actors.corePoolSize", "16")
}

test/partest has -Xmx1024M, which is not enormous, so 8 and 16
seem like rather high numbers to me, given that the Scala compiler
uses more heap than Tammy Faye Bakker uses eye makeup (sorry, my pop
culture references are 25 years out of date).
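
A back-of-the-envelope sketch of why those numbers look high, assuming
(purely for illustration) that the heap were shared evenly among
concurrent compiler instances. Real usage is not evenly divided, and the
helper name heap_per_worker is made up here:

```shell
# Hypothetical helper: integer MB of heap per worker if the heap
# were split evenly among them.
heap_per_worker() {
  heap_mb=$1
  workers=$2
  echo $(( heap_mb / workers ))
}

heap_per_worker 1024 8    # 8 is the partest.actors default; prints 128
heap_per_worker 1024 16   # 16 is the actors.corePoolSize default; prints 64
```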

In an attempt to force the failure on my Mac I tried:

JAVA_OPTS="-Dactors.corePoolSize=64 -Dpartest.actors=64 -verbose:gc" \
test/partest --run --verbose

Many "compiling foo.scala" messages were printed in quick succession,
and then eventually it failed with heap errors like the lisp test
failures. (-verbose:gc showed "Full GC"s nearly continuously for a
couple of minutes before the final failure -- a hazard when
testing this sort of thing.)
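
One way to put a number on that GC thrashing, as a sketch: save the
run's output to a file and count the lines that mention "Full GC". The
log file name partest-gc.log and the helper name count_full_gc are
assumptions for illustration:

```shell
# Hypothetical helper: count lines in a captured log that mention "Full GC".
count_full_gc() {
  grep -c "Full GC" "$1"
}

# Example usage, assuming the output was saved first, e.g. with
#   test/partest --run --verbose > partest-gc.log 2>&1
# count_full_gc partest-gc.log
```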

Conclusion: the lisp test is innocent! FREE THE LISP TEST!! It should
be restored, and if it fails in any Jenkins runs, the response should be
to lower the default values for partest.actors and/or
actors.corePoolSize until the problem stops. (I wouldn't think they'd
need to be lowered far, given that at present the failure seems fairly
difficult to trigger.)
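
A sketch of what lowering the defaults could look like from the command
line, using the same two properties. The values 4 and 8 are arbitrary
starting points chosen for illustration, not tested recommendations:

```shell
# Halve both defaults (8 and 16); the exact numbers are guesses to be tuned.
PARTEST_ACTORS=4
CORE_POOL_SIZE=8
JAVA_OPTS="-Dpartest.actors=$PARTEST_ACTORS -Dactors.corePoolSize=$CORE_POOL_SIZE"
export JAVA_OPTS

# Only launch partest when run from a Scala checkout that has the script.
if [ -x test/partest ]; then
  test/partest --run
fi
```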

What do y'all think?

--
Seth Tisue | Northwestern University | http://tisue.net
lead developer, NetLogo: http://ccl.northwestern.edu/netlogo/

Johannes Rudolph

Jul 27, 2011, 1:52:29 AM
to scala-i...@googlegroups.com
On Wed, Jul 27, 2011 at 1:09 AM, Seth Tisue <se...@tisue.net> wrote:
> Conclusion: the lisp test is innocent!  FREE THE LISP TEST!!  It should
> be restored, and if it fails in any jenkins runs, the response should be
> to lower the default values for partest.actors and/or
> actors.corePoolSize until the problem stops.  (I wouldn't think they'd
> need to be lowered far, given that at present the failure seems fairly
> difficult to trigger.)
>
> What do y'all think?

+1


--
Johannes

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net

Jason Zaugg

Jul 27, 2011, 2:06:38 AM
to scala-i...@googlegroups.com
On Wed, Jul 27, 2011 at 1:09 AM, Seth Tisue <se...@tisue.net> wrote:

> Conclusion: the lisp test is innocent!  FREE THE LISP TEST!!  It should
> be restored, and if it fails in any jenkins runs, the response should be
> to lower the default values for partest.actors and/or
> actors.corePoolSize until the problem stops.  (I wouldn't think they'd
> need to be lowered far, given that at present the failure seems fairly
> difficult to trigger.)
>
> What do y'all think?

You could also add -XX:+HeapDumpOnOutOfMemoryError to the JVM options
and archive the heap dumps as build artifact for forensics.
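
A sketch of that suggestion, assuming a HotSpot JVM. The directory name
heap-dumps is made up for illustration, and -XX:HeapDumpPath is optional
(without it the dump is written to the JVM's working directory):

```shell
# Hypothetical directory that the build job would archive as an artifact.
DUMP_DIR=heap-dumps
mkdir -p "$DUMP_DIR"

JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$DUMP_DIR"
export JAVA_OPTS

# Only launch partest when run from a Scala checkout that has the script.
if [ -x test/partest ]; then
  test/partest --run
fi
```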

-jason
