0.3.11 showstopping hang

Cedric Greevey

Nov 11, 2012, 9:38:13 PM
to cl...@googlegroups.com
I don't understand why, after two months, there's no discussion on the clooj list of the fact that clooj 0.3.11 is completely unusable.

If I start it with "C:\Program Files\Java\jdk1.6.0_13\bin\javaw.exe" -server -Xmx1100m -jar D:\clooj\clooj-0.3.11-standalone.jar, everything seems peachy right up until I either a) click any pre-existing source file in the navigation pane or b) submit something (even something as simple and obviously harmless as (+ 2 3)) in the REPL. At that point, it hangs. If I clicked a source file, it at least does me the courtesy of displaying it in the source pane first, though there's no way to scroll, let alone do anything else with the file, since the UI wedges immediately.

The hang does not spontaneously resolve, at least not within a couple of full minutes, which is as long as I waited. The javaw process can be killed, but one or two other java processes that start up become zombie processes that can't be killed from the task manager or ProcessExplorer, even when logged in as administrator.

And this happens consistently -- it wasn't a one-off. Start clooj, do pretty much anything, lockup.

I'm now reverting to 0.2.8. Not that 0.2.8 doesn't have its own problems -- one reason I just tried 0.3.11 was the hope that it resolved a severe bug in 0.2.8 that I'd never seen before but have been encountering lately: 0.2.8 will just spontaneously exit without any error message (not even an hs_err_pidNNNN.log file) while trying to load and work with some large PNGs. That one didn't reproduce predictably until recently -- now I have an 8192x3072 PNG that seems to set it off consistently if I run anything from the REPL that uses it. Indeed, I can make 0.2.8 explode reproducibly just by running this in the REPL without loading any project sources at all:

(import 'java.io.File)
(import 'javax.imageio.ImageIO)
(defn load-image [filename]
  (let [^File f (if (instance? File filename) filename (File. ^String filename))]
    (ImageIO/read f)))
(def imgs (for [x (range 14 19)] (load-image (str "<dir with large pngs>\\<file name prefix>" x ".png"))))
(do (first imgs) nil)

where <dir with large pngs>\<file name prefix>14.png is a 20MB 8192x3002 PNG file. (I'm guessing you could point at any group of sequentially-numbered large images and get the same result, rather than it depending on the details of the images' contents.)

Oddly, I can call

(javax.imageio.ImageIO/read (java.io.File. "<dir with large pngs>\\<file name prefix>14.png"))

and then

(javax.imageio.ImageIO/read (java.io.File. "<dir with large pngs>\\<file name prefix>15.png"))

and so on, without anything untoward happening. That includes OOME: there's obviously enough memory to hold several of them (bound to *1, *2, etc.), so def'ing imgs as above and then realizing the first element should not blow anything up even with an OOME, let alone with the clearly incorrect behavior of exiting without an error message of any kind or an explicit user command to exit.
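One thing that may be worth checking here (an assumption, not something confirmed in this thread): on Clojure versions where range returns a chunked seq, a lazy for over it realizes a whole chunk at once, so (first imgs) may load all five images rather than one. The sketch below counts realizations using fake-load, a hypothetical stand-in for load-image, and unchunk, a hypothetical helper that forces one-at-a-time realization.

```clojure
;; Counter for how many times the (fake) loader runs.
(def calls (atom 0))

(defn fake-load
  "Hypothetical stand-in for load-image: records the call, returns its argument."
  [x]
  (swap! calls inc)
  x)

;; Same shape as the imgs definition: a lazy `for` over a range.
(def chunked-items (for [x (range 14 19)] (fake-load x)))
(first chunked-items)
(def calls-chunked @calls)   ; more than 1 if the whole chunk was realized

(defn unchunk
  "Hypothetical helper: wraps s in a plain lazy seq so consumers see one element at a time."
  [s]
  (lazy-seq
    (when-let [[x & more] (seq s)]
      (cons x (unchunk more)))))

(reset! calls 0)
(def one-at-a-time (for [x (unchunk (range 14 19))] (fake-load x)))
(first one-at-a-time)
(def calls-unchunked @calls) ; only the first element was realized

(println "chunked:" calls-chunked "unchunked:" calls-unchunked)
```

If calls-chunked comes out larger than calls-unchunked, the two REPL experiments are not equivalent: reading the files one by one holds one new image per call, while realizing the first element of imgs may decode all of them in a single step.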

(In the original context, the images are being held not in a simple seq but in a map, and are held there via Soft/WeakReferences so OOME should be impossible as long as only a small number are strongly referenced elsewhere at any given time. In fact, only one is strongly referenced at a time. Despite this, it's provoking the clooj crashes, even though strongly referencing more than one at a time via *1 and *2, for example, isn't doing so. Something about holding them via a clojure collection instead of directly in a Var seems to be making a difference, even though it certainly shouldn't.)
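For readers unfamiliar with the arrangement described above, here is a minimal sketch of a map whose values are held through SoftReferences. It is an illustration under assumptions, not the project's actual code: cache, cached, and the load-fn argument are hypothetical names.

```clojure
(import 'java.lang.ref.SoftReference)

;; Hypothetical cache: key -> SoftReference wrapping the loaded value.
(def cache (atom {}))

(defn cached
  "Returns the value for k, calling (load-fn k) when there is no entry
  or when the garbage collector has already cleared the reference."
  [k load-fn]
  (or (when-let [^SoftReference r (get @cache k)]
        (.get r))                       ; nil if the referent was collected
      (let [v (load-fn k)]
        (swap! cache assoc k (SoftReference. v))
        v)))
```

With this shape, the map itself only keeps soft references, so the only strong reference to an image is whatever the caller holds on to after calling cached.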

Long story short: if a few large (20MB on disk, maybe 100 in memory) images are simultaneously held (directly or indirectly) inside of a Clojure seq or map (but not directly in separate Vars!), clooj 0.2.8 behaves as if something's calling System/exit somewhere; this behavior used to be intermittent but has suddenly and inexplicably become consistently reproducible, without my having changed anything. Meanwhile, clooj 0.3.11 hangs as soon as any source file is opened or any expression is evaluated in the REPL.
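The "maybe 100 in memory" figure can be sanity-checked with a little arithmetic. This is a rough sketch that assumes 4 bytes per decoded pixel; the actual figure depends on the BufferedImage type ImageIO chooses, and decoded-bytes and mib are hypothetical helper names.

```clojure
(defn decoded-bytes
  "Approximate size of the decoded pixel data, ignoring object overhead."
  [width height bytes-per-pixel]
  (* (long width) (long height) (long bytes-per-pixel)))

(defn mib
  "Converts a byte count to mebibytes."
  [n]
  (/ n 1048576.0))

;; One 8192x3072 image at an assumed 4 bytes per pixel.
(println "per image (MiB):" (mib (decoded-bytes 8192 3072 4)))

;; Maximum heap of the JVM this is evaluated in, for comparison.
(println "max heap (MiB):" (mib (.maxMemory (Runtime/getRuntime))))
```

Evaluating the second expression in the same REPL that loads the images shows how many such images that particular JVM could hold at once.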

Unfortunately, my current project is at a standstill until I have a version of clooj that actually works, which sadly doesn't seem to be the case for 0.2.8 anymore and doesn't seem to be the case for 0.3.11 either.

Lee Spector

Nov 11, 2012, 10:24:40 PM
to cl...@googlegroups.com

FWIW I've been using 0.3.11, as have my students. Most of us run it under Mac OS X, so maybe there's a difference there. OTOH I *have* run into several problems, including cases of hang on launch before seeing the project pane (must force quit and then relaunch) and also problems selecting projects in the Project > Open dialog. I've just been too busy to replicate them carefully and report. And they haven't been quite show stoppers, although some (especially the project opening thing) are pretty close. So I do think there are some problematic rough edges at the moment.

-Lee
--
Lee Spector, Professor of Computer Science
Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspe...@hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Cedric Greevey

Nov 11, 2012, 11:10:39 PM
to cl...@googlegroups.com
On Sun, Nov 11, 2012 at 10:24 PM, Lee Spector <lspe...@hampshire.edu> wrote:

FWIW I've been using 0.3.11, as have my students. Most of us run it under Mac OS X, so maybe there's a difference there. OTOH I *have* run into several problems, including cases of hang on launch before seeing the project pane (must force quit and then relaunch) and also problems selecting projects in the Project > Open dialog. I've just been too busy to replicate them carefully and report. And they haven't been quite show stoppers, although some (especially the project opening thing) are pretty close. So I do think there are some problematic rough edges at the moment.

???

How's that possible? Is there even any platform-specific code *in* clooj?

Regardless, 0.3.11 doesn't work at all on Windoze Vista Home Premium with Java 1.6.0_13, and probably with many other combinations of Windows and Java version. The hang is 100% reproducible, along with the zombie java processes that outlive killing the javaw process and that silently fail to actually disappear when killed from ProcessExplorer by Administrator. (No error message, but the java process stubbornly persists in the task list.)

Lee Spector

Nov 11, 2012, 11:14:13 PM
to cl...@googlegroups.com

I get zombie processes sometimes too, but I can kill them from the Mac OS Activity Monitor.

I don't know if there's any platform specific code in clooj, but I assume that differences in the JVMs and/or JVM/OS interaction could be responsible for the differences that we're seeing.

-Lee

Cedric Greevey

Nov 11, 2012, 11:20:41 PM
to cl...@googlegroups.com
Hang = application bug, unless you're suggesting there are not merely platform differences but JVM bugs involved here.

Lee Spector

Nov 12, 2012, 12:32:30 AM
to cl...@googlegroups.com

I don't mean to be suggesting anything specific. I don't get the hang that you report with 0.3.11. So I suppose that the difference must stem from other things in our setups, e.g. our OSes and JVMs.

-Lee