The attached code hangs quite a bit with the IBM 1.3 JDK on Linux when JVMPI is used. It works fine on the Solaris 1.3 JDK. With the JIT enabled it hangs much less often. I have never got it to run to completion (should take around 30 seconds) with java -Djava.compiler=NONE and have got it to complete once using java_g -Djava.compiler=NONE (I must have tried at least 20 times).
Here is how I am running the class:
    /usr/java/IBMJava2-13/bin/java -Djava.compiler=NONE -Xrunhprof:cpu=samples,depth=6 gcing2
One significant point is the runhprof option. Without this it works. As I am sure you are aware, runhprof invokes some jvmpi profiling that dumps information to a file (by default). I notice that this profiling does not work properly when the jit is enabled. The jit seems to effectively cause some of the profiling not to be done so another question is: is it a documented limitation that jvmpi does not work properly with the jit enabled?
I have been unsuccessful in producing a javacore file for the hang - javacore hangs and CTRL-C and CTRL-Z won't get me out - something to do with gcing maybe (obviously whoever put javacore into the 1.3 jdk didn't test it properly! ;-). I have also tried gdb but even with java_g it is giving me no information on the threads (I'm using gdb v5.0).
I have attached a javacore file from another app running on the same system so you can see my setup.
cheers, Allan.
-- Allan Boyd
al...@volantis.com Volantis Systems www.volantis.com tel: +44 (0) 1344 631828
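[gcing2.java < 1K]

public class gcing2 extends Thread {
    static int iters = 100;
    static int sCount = 1000;
    static int threadCount = 10;

    public void run() {
        int count = 0;
        long start, end, total = 0;
        while (count < iters) {
            start = System.currentTimeMillis();
            // Allocate a batch of short-lived strings, then force a collection.
            for (int i = 0; i < sCount; i++) {
                String s = new String(this.toString());
            }
            System.gc();
            end = System.currentTimeMillis();
            total += end - start;
            count++;
            System.out.println(getName() + ": loop=" + count + ", looptime=" + (end - start) + ", totaltime=" + total);
        }
    }

    // The rest of the attachment is cut off in the archive. This main() is an
    // assumed reconstruction that simply starts threadCount worker threads.
    public static void main(String[] args) {
        for (int i = 0; i < threadCount; i++) {
            new gcing2().start();
        }
    }
}
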
SIGQUIT received at 40029e39 in /lib/libpthread.so.0.
J2RE 1.3.0 IBM build cx130-20001124
/usr/java/IBMJava2-13/jre/bin/exe/java -Djava.compiler=NONE TestHprof

System Properties
-----------------
Java Home Dir: /usr/java/IBMJava2-13/jre
Java DLL Dir: /usr/java/IBMJava2-13/jre/bin
Sys Classpath: /usr/java/IBMJava2-13/jre/lib/rt.jar:/usr/java/IBMJava2-13/jre/lib/i18n.jar:/usr/java/IBMJava2-13/jre/classes
User Args:
    -Djava.class.path=.:/opt/oracle/product/8.1.5/jdbc/lib/classes111.zip:/opt/oracle/product/8.1.5/jdbc/lib/nls_chaset11.zip:/opt/Jakarta/Tomcat/lib/servlet.jar:/opt/Jakarta/Tomcat/lib/xerces.jar:/opt/rex/rex.jar:/home/aboyd/work/voyager:/home/aboyd/work/voyager/Test:/usr/java/hat/bin/hat.zip
    -Djava.compiler=NONE

Current Thread Details
----------------------
"Finalizer" (TID:0x40328708, sys_thread_t:0x80d2c00, state:CW, native ID:0xc04) prio=8
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:114)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:129)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:168)

----- Native Stack -----
killpg at 0x4007c320 in libc.so.6
pthread_setconcurrency at 0x400294b9 in libpthread.so.0
pthread_cond_wait at 0x40025a59 in libpthread.so.0
condvarWait at 0x402da82a in libhpi.so
sysMonitorWait at 0x402dc277 in libhpi.so
lkMonitorWait at 0x40226ed4 in libjvm.so
JVM_MonitorWait at 0x401ef41e in libjvm.so
mmipSysInvokeJni at 0x4026e314 in libjvm.so
mmisInvokeJniMethodHelper at 0x4026defd in libjvm.so
mmipInvokeJniMethod at 0x4026e853 in libjvm.so
ivq_doinvoke_I__ at 0x40247b71 in libjvm.so
ivq_doinvoke_I__ at 0x40247b71 in libjvm.so
EJivq_doinvoke_V__ at 0x40242d29 in libjvm.so
??
-------------------------------------------------------------------------

Operating Environment
---------------------
Host : bluemarlin.uk.volantis.com.
OS Level : 2.2.16-22.#2 Mon Oct 16 13:15:51 BST 2000
glibc Version : 2.1.92
No. of Procs : 1
Memory Info:
          total:      used:      free:    shared:   buffers:    cached:
Mem:  329981952  325705728    4276224   89255936   86495232  102457344
Swap: 271392768    6148096  265244672
MemTotal:    322248 kB
MemFree:       4176 kB
MemShared:    87164 kB
Buffers:      84468 kB
Cached:      100056 kB
BigTotal:         0 kB
BigFree:          0 kB
SwapTotal:   265032 kB
SwapFree:    259028 kB

User Limits (in bytes except for NOFILE and NPROC)
RLIMIT_FSIZE  : infinity
RLIMIT_DATA   : infinity
RLIMIT_STACK  : 2093056
RLIMIT_CORE   : 1024000000
RLIMIT_NOFILE : 1024
RLIMIT_NPROC  : 2048

"Finalizer" (TID:0x40328708, sys_thread_t:0x80d2c00, state:CW, native ID:0xc04) prio=8
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:114)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:129)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:168)

----- Native Stack -----
killpg at 0x4007c320 in libc.so.6
pthread_setconcurrency at 0x400294b9 in libpthread.so.0
pthread_cond_wait at 0x40025a59 in libpthread.so.0
condvarWait at 0x402da82a in libhpi.so
sysMonitorWait at 0x402dc277 in libhpi.so
lkMonitorWait at 0x40226ed4 in libjvm.so
JVM_MonitorWait at 0x401ef41e in libjvm.so
mmipSysInvokeJni at 0x4026e314 in libjvm.so
mmisInvokeJniMethodHelper at 0x4026defd in libjvm.so
mmipInvokeJniMethod at 0x4026e853 in libjvm.so
ivq_doinvoke_I__ at 0x40247b71 in libjvm.so
ivq_doinvoke_I__ at 0x40247b71 in libjvm.so
EJivq_doinvoke_V__ at 0x40242d29 in libjvm.so
??
-------------------------------------------------------------------------

"Reference Handler" (TID:0x40328750, sys_thread_t:0x80cf0b0, state:S, native ID:0x803) prio=10
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:421)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)

----- Native Stack -----
sysMonitorWait at 0x402dc277 in libhpi.so
lkMonitorWait at 0x40226ed4 in libjvm.so
JVM_MonitorWait at 0x401ef41e in libjvm.so
mmipSysInvokeJni at 0x4026e314 in libjvm.so
mmisInvokeJniMethodHelper at 0x4026defd in libjvm.so
mmipInvokeJniMethod at 0x4026e853 in libjvm.so
ivoq_doinvoke_V__ at 0x402480c1 in libjvm.so
EJivq_doinvoke_V__ at 0x40242d29 in libjvm.so
??
-------------------------------------------------------------------------

----- Native Stack -----
pthread_setconcurrency at 0x400294b9 in libpthread.so.0
pthread_cond_wait at 0x40025a59 in libpthread.so.0
condvarWait at 0x402da82a in libhpi.so
intrUnlock at 0x402dafa3 in libhpi.so
sysSignalWait at 0x402db125 in libhpi.so
xmSetProtectionDomain at 0x40278c26 in libjvm.so
xmExecuteThread at 0x4027a64a in libjvm.so
double2ll at 0x4027500e in libjvm.so
sysThreadRegs at 0x402df135 in libhpi.so
pthread_detach at 0x4002760e in libpthread.so.0
__clone at 0x4012ddaa in libc.so.6
-------------------------------------------------------------------------

"main" (TID:0x403287e0, sys_thread_t:0x804fb80, state:S, native ID:0x400) prio=5
    at java.lang.System.arraycopy(Native Method)
    at java.lang.String.getChars(String.java:551)
    at java.lang.StringBuffer.append(StringBuffer.java:408)
    at TestHprof.addToCat(TestHprof.java:15)
    at TestHprof.makeString(TestHprof.java:10)
    at TestHprof.main(TestHprof.java:54)

Monitor pool info:
    Initial monitor count: 32
    Minimum number of free monitors before expansion: 5
    Pool will next be expanded by: 16
    Current total number of monitors: 32
C

> The attached code hangs quite a bit with the 1.3 jdk when jvmpi is > used. It works fine on the Solaris 1.3 jdk. With the jit enabled it > hangs much less often. I have never got it to run to completion > (should take around 30 seconds) with java -Dcompiler=NONE and have got > it to complete once using java_g -Dcompiler=NONE (I must have tried at > least 20 times).
> Here is how I am running the class: > /usr/java/IBMJava2-13/bin/java -Djava.compiler=NONE > -Xrunhprof:cpu=samples,depth=6 gcing2
> One significant point is the runhprof option. Without this it works. > As I am sure you are aware, runhprof invokes some jvmpi profiling that > dumps information to a file (by default). I notice that this profiling > does not work properly when the jit is enabled. The jit seems to > effectively cause some of the profiling not to be done so another > question is: is it a documented limitation that jvmpi does not work > properly with the jit enabled?
I have not seen any such document, so I guess it's supposed to work.
> I have been unsuccessful in producing a javacore file for the hang - > javacore hangs and CTRL-C and CTRL-Z won't get me out - something to > do with gcing maybe (obviously whoever put javacore into the 1.3 jdk > didn't test it properly! ;-).
Probably the same programmer who, just before he left to join Volantis, reset the SEGV signal handler to give a javacore, ignoring the fact that the MMI uses SEGVs to detect NULL objects, causing all our test cases to fail. ;-)
> I have also tried gdb but even with java_g it is giving me no > information on the threads (I'm using gdb v5.0).
All the threads end up suspended. It could be that jvmpi puts some extra stress on the threading and signalling processing which has been causing us problems. A lot of development work has been going on in this area, so I'll run your testcase on the JVM when this work is completed.
Thanks for looking into that. I will look forward to the updated threading model.
I received an e-mail from one of the JProbe engineers (you may have heard of these guys since we had to do stuff in the JDKs so that their product would work). JProbe uses jvmpi. He said that the jit causes performance analysis problems because it does method inlining (hmm, maybe this can be disabled). FYI: JProbe tell me that they cannot support their product on Solaris because HotSpot breaks jvmpi even more than the IBM jit - at least you can switch the jit off!
As for SIGSEGV: I explained what the problem was to Tom before I left but I ran out of time. Anyway I knew you would be up to the task of fixing it - well done! ;-)
> Neil
-- Allan Boyd Volantis Systems www.volantis.com tel: +44 (0) 1344 631828
> Thanks for looking into that. I will look forward to the updated > threading model.
> I received an e-mail from one of the JProbe engineers (you may have > heard of these guys since we had to do stuff in the JDKs so that their > product would work). JProbe uses jvmpi. He said that the jit causes > performance analysis problems because it does method inlining (hmm, > maybe this can be disabled). FYI: JProbe tell me that they cannot > support their product on Solaris because HotSpot breaks jvmpi even > more than the IBM jit - at least you can switch the jit off!
> As for SIGSEGV: I explained what the problem was to Tom before I left > but I ran out of time. Anyway I knew you would be up to the task of > fixing it - well done! ;-)
Use
    export JITC_COMPILEOPT=NINLINING
to disable method inlining.
Hey Allan, Congrats on the new job... wish you'd told me you were leaving! Some poor chap who also has the name "Allan Boyd" and also works in the IBMGB zone must be wondering "who is this yankee lunatic, and what is this javadoc defect he's yammering about? maybe if I bin the notes he'll go away." So who was *supposed* to inherit all your old defects? Best of luck -=Chris
-- (note: I don't speak for IBM, they don't speak for me; it's better that way. )
> All the threads end up suspended. It could be that jvmpi puts some > extra stress on the threading and > signalling processing which has been causing us problems. A lot of > development work has been going > on in this area, so I'll run your testcase on the JVM when this work is > completed.
Neil, it sounds like you boys are busy banging on some Thread related problems in the Linux JDK1.3 code. Does that mean I should not submit my test cases for two fully repeatable bugs with the 11/24 build relating to threads? These bugs are:
1) An issue that calling interrupt() on a thread object does NOT interrupt the thread. The test case works great in all other JVMs, except for IBM JVM JDK1.3.
2) The issue I sent you a while back relating to repeated XMLRPC calls using multiple threads.. The entire IBM JVM deadlocks, and all threads (even threads doing nothing related to XMLRPC) freeze dead.
Should I not bother submitting these test cases until I can run them myself with the new Threads code? When is that code going to be released, and can/should I grab a pre-release version to see if the new code resolves the two issues?
Unfortunately with these two bugs in the IBM JVM we can't really use the ibm jvm for any production stuff, since it would crash every couple hours. The IBM JVM seems to run about twice as fast as Sun's JDK1.3 with hotspot. IBM's done some really super work here.
Joe Kislo wrote: > > All the threads end up suspended. It could be that jvmpi puts some > > extra stress on the threading and > > signalling processing which has been causing us problems. A lot of > > development work has been going > > on in this area, so I'll run your testcase on the JVM when this work is > > completed.
> Neil, it sounds like you boys are busy banging on some Thread related > problems in the Linux JDK1.3 code. Does that mean I should not submit > my test cases for two fully repeatable bugs with the 11/24 build > relating to threads? These bugs are:
> 1) An issue that calling interrupt() on a thread object does NOT > interrupt the thread. The test case works great in all other JVMs, > except for IBM JVM JDK1.3.
> 2) The issue I sent you a while back relating to repeated XMLRPC calls > using multiple threads.. The entire IBM JVM deadlocks, and all threads > (even threads doing nothing related to XMLRPC) freeze dead.
> Should I not bother submitting these test cases until I can run them > myself with the new Threads code? When is that code going to be > released, and can/should I grab a pre-release version to see if the new > code resolves the two issues?
> Unfortunately with these two bugs in the IBM JVM we can't really use > the ibm jvm for any production stuff, since it would crash every couple > hours. The IBM JVM seems to run about twice as fast as Sun's JDK1.3 > with hotspot. IBM's done some really super work here.
> Thanks, > -Joe
Joe,
All test cases are gratefully received. I have used your XMLRPC testcase to verify that the threading has improved in the latest internal delivery. It also highlighted a problem when trace is switched on which is being worked on.
I'll let you have a pre-release if I can, but there are practical difficulties: we aim to have a short turnaround once a product has entered testing, and these threading changes have been integrated into the Service code line at the last minute. There won't be much time to deliver a pre-release, get your feedback and make any changes. The target refresh cycles of 6-8 weeks also make it difficult to delay a product without impacting other delivery dates. I'll see what I can do though, because it is a good time to raise problems while Development are still interested in the problem :-).
Thanks for your kind comments.
Please don't everybody ask for a pre-release - the real one will be along soon!
> All test cases are gratefully received. I have used your XMLRPC testcase > to verify that the threading has improved in the latest internal delivery. > It also highlighted a problem when trace is switched on which is being > worked on.
Okay, I'm attaching a second test case. This one has to do with the interrupt() signal not being received by threads. I've put a writeup at the beginning of the java file. It works on other JVMs. It doesn't matter if the JIT is on or off.
> I'll let you have a pre-release if I can, but there are practical difficulties; > we aim to have a short turnaround once a product has entered testing. > and these threading changes have been integrated into the Service code > line at the last minute. There won't be much time to deliver a pre-release, > get your feedback and make any changes. The target refresh cycles of > 6-8 weeks also make it difficult to delay a product without impacting > other deliver dates. I'll see what I can do though, because it is a good time > to raise problems while Development are still interested in the problem :-).
Okay, send me an email if the pre-release works out, otherwise I'll hit it next time around. I've got a suite of tests I bang a JVM against before I allow it to make it to one of our production boxes. Most of the stuff is internal, and (like the XMLRPC thing) a pain in the butt to turn into a unit test.
import java.util.Enumeration;
import java.util.Vector;

/**
 * Test case for IBM JDK1.3 Thread signaling failure under Linux.
 *
 * Threads don't seem to be signaled properly when interrupt() is called on them. As a result, you can interrupt
 * a thread, but it will fail to actually interrupt. The symptom of this is when your application attempts to
 * terminate, there will be unterminated threads, and the JVM will not quit.
 *
 * I noticed that if all of the threads are in the wait state (and not on the ready_queue), they appear to be
 * signaled properly.
 *
 * If you run:
 *     java Worker 0
 *
 * You will see that the IBM JVM does not terminate at the end of the test. You will also see that one or more of
 * the threads did not print "Terminating!" to the screen. These are the threads which did not die, even though
 * they were interrupted. You will notice that the threads which do not terminate *DID* print "Starting!" to the
 * screen, meaning they *ARE* inside the exception handler for the interrupt signal.
 * -- You might need to run this test a couple times before it will happen.
 *
 * If you run:
 *     java Worker 1
 *
 * You will see that the IBM JVM -does- function properly. This shows that the IBM JVM Thread signaling code
 * isn't a total loss :)
 * If you examine the code you will see why this mode does not crash the IBM_JVM (because all the threads should
 * have safely made it to the wait(); line).
 *
 * If you run this test on any other JVM, everything works dandy.
 *
 * Email me if you need any help, ki...@athenium.com
 *
 * @author <a href="mailto:ki...@athenium.com">Joe Kislo</a>
 * @version
 */
public class Worker extends Thread {
    // The class header and field declarations did not survive in the archived
    // post; they are reconstructed here from how the members are used below.
    private String workerID;
    // volatile (an assumption) so the busy-wait in main() sees the update.
    private volatile boolean ready;

    public Worker(String workerID) {
        this.workerID = workerID;
    }

    public boolean getReady() {
        return ready;
    }

    public String getWorkerID() {
        return workerID;
    }

    // Wake the worker out of wait() without interrupting it.
    public void poke() {
        synchronized (this) {
            notifyAll();
        }
    }

    public void run() {
        try {
            System.out.println(workerID + ": Starting!");

            while (true) {
                synchronized (this) {
                    ready = true;
                    wait();
                }
                System.out.println(workerID + ": Ouch!");
            }
        } catch (InterruptedException ie) {
            System.out.println(workerID + ": Terminating!");
        }
    }

    static public void main(String[] str) {
        if (str.length == 0) {
            System.out.println("Usage: java Worker [0|1]");
            System.out.println("");
            System.out.println("State 0 busts IBM_JVM");
            System.out.println("State 1 proves IBM_JVM Threads signal properly when they are not in the ready_queue");
            System.exit(-1);
        }
        boolean bust_jvm = str[0].equals("0");
        int numThreads = 20;
        Vector theThreads = new Vector();

        System.out.println("Making threads!");
        for (int i = 0; i < numThreads; i++) {
            Worker w = new Worker(Integer.toString(i));
            theThreads.addElement(w);
            w.start();
        }

        System.out.println("Poking threads!");
        for (Enumeration e = theThreads.elements(); e.hasMoreElements();) {
            Worker w = (Worker) e.nextElement();
            if (!bust_jvm) {
                // Mode 1: spin until the worker has reached its wait().
                while (!w.getReady()) {}
            }
            w.poke();
        }

        System.out.println("Interrupting threads!");
        for (Enumeration e = theThreads.elements(); e.hasMoreElements();) {
            ((Worker) e.nextElement()).interrupt();
        }
        System.out.println("I should terminate now");
    }
}

1. You should use while(!isInterrupted()) in your while loop. The 'try' block catches only interrupts which are fired while the program is in the 'wait' call. I succeeded (but just once) to run a "non-busting" "java Worker 1" which actually busted because of this problem.

2. When you start your program with "java Worker 0", you assume that all threads are running, before an interrupt signal occurs. This does not need to be the case. With many threads started, it WILL not be the case. With "java Worker 1" you wait till all threads are started and the program functions properly.
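
For reference, the loop shape suggested in point 1 might look like the sketch below. This is only an illustration, not code from the thread: the class name PollingWorker and the helper demo() are made up here, and it is written against a current JDK rather than the 1.3 JVM under discussion.

```java
// Hypothetical sketch: a worker that checks its interrupt status on every
// pass and also handles an interrupt delivered while it is inside wait().
class PollingWorker extends Thread {
    private volatile boolean terminated = false;

    boolean hasTerminated() {
        return terminated;
    }

    synchronized void poke() {
        notifyAll();
    }

    @Override
    public void run() {
        try {
            // Re-check the flag each time round, so an interrupt that
            // arrives outside wait() still ends the loop.
            while (!isInterrupted()) {
                synchronized (this) {
                    wait();
                }
            }
        } catch (InterruptedException ie) {
            // Interrupted before or during wait(): fall through and exit.
        }
        terminated = true;
    }

    // Starts one worker, interrupts it straight away, and reports whether
    // it finished within the given timeout.
    static boolean demo(long timeoutMillis) {
        PollingWorker w = new PollingWorker();
        w.start();
        w.interrupt();
        try {
            w.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return w.hasTerminated();
    }

    public static void main(String[] args) {
        System.out.println("terminated=" + demo(5000));
    }
}
```

On a JVM that delivers interrupts correctly the worker should exit whichever side of wait() the interrupt lands on.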
> 1. You should use while(!isInterrupted()) in your while loop. The 'try' > block catches only interrupts which are fired while the program is in the > 'wait' call. I succeeded (but just once) to run a "non-busting" "java Worker > 1" which actually busted because of this problem. > 2. When you start your program with "java Worker 0", you assume that all > threads are running, before an interrupt signal occurs. This does not need > to be the case. With many threads started, it WILL not be the case. With > "java Worker 1" you wait till all threads are started and the program > functions properly.
Hmm, I don't think there's a bug in the example. I will admit it's bad code, but it still illustrates my point. When you run Worker 0, you will get:
    20 threads say "Starting!",
    19 threads say "Terminating!"
It's that thread that didn't terminate which is the problem. So let's figure out what it's doing. It said "Starting!", and has NOT said "Ouch!", which means it is either BEFORE or INSIDE the wait(). Let's take the case where it is INSIDE the wait(). If it is inside the wait, it should throw the InterruptedException when interrupted, and say "Terminating!". It does not. I think we agree here.
What I think we don't agree on is the case where the thread is NOT in the wait() yet. So let's say it printed "Starting!" then yielded. It gets interrupted. Then it calls wait(). Should wait() at that point, since the thread is already interrupted, immediately throw the InterruptedException, or wait until the thread is interrupted again? I don't have the JLS in front of me, so I'll have to do this by simply seeing what happens in a JVM. So if wait() only throws an InterruptedException if the thread is interrupted during the wait(), then:
    Thread.currentThread().interrupt();
    wait();
Should ALWAYS HALT indefinitely. (Assuming nothing else interrupts it).
However, attached is a quick piece of java code, which shows that if you interrupt a thread, then wait(), wait throws the InterruptedException, even though the thread was interrupted outside the wait().
So that ultimately closes the second possibility in the Worker example. If the thread was interrupted BEFORE wait() was executed, it *still* should have thrown an InterruptedException and terminated. Yet it did not. Perhaps there is a race condition in your wait(): maybe it's checking the current interrupted state, then pulling itself off the ready queue, when that action actually needs to be atomic.
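
The "check, then block" atomicity concern has an application-level analogue that may be useful as a cross-check. The sketch below is an assumption-laden illustration, not anything posted in the thread: StoppableWorker, requestStop() and demo() are hypothetical names. It keeps a stop flag guarded by the same monitor the worker waits on, so the flag test and the wait() cannot be separated by a stop request.

```java
// Hypothetical sketch: the stop flag is only read and written while holding
// the worker's own monitor, so "test the flag" and "start waiting" form one
// atomic step as far as requestStop() is concerned.
class StoppableWorker extends Thread {
    private boolean stopRequested = false; // guarded by this

    synchronized void requestStop() {
        stopRequested = true;
        notifyAll();
    }

    @Override
    public void run() {
        synchronized (this) {
            while (!stopRequested) {
                try {
                    wait(); // releases the monitor while waiting
                } catch (InterruptedException ie) {
                    return; // an interrupt also ends the worker
                }
            }
        }
    }

    // Starts one worker, asks it to stop immediately, and reports whether
    // it exited within the timeout.
    static boolean demo(long timeoutMillis) {
        StoppableWorker w = new StoppableWorker();
        w.start();
        w.requestStop();
        try {
            w.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !w.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped=" + demo(5000));
    }
}
```

This does not fix a JVM-level interrupt bug; it only avoids depending on interrupt delivery as the sole shutdown signal.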
Lemme know if you have any other questions... My little worker example illustrates a problem I have with almost all my thread pooled applications. Except the weird thing is, usually all the workers in my pool *ARE* in the wait()... Yet they still fail to terminate. And, of course, all this works just fine and dandy on any other JVM.
And as for the threads not being started yet, since they have all printed "Starting!", I know they are started. Yes there is a possibility that they might not have started, and in a real application there would need to be a test.. But since each thread prints to the screen when it starts, we know they're all started before the interrupt signal comes.
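
Rather than inferring startup from the "Starting!" output, a test could make it explicit. The sketch below is one way to do that under stated assumptions: it uses java.util.concurrent.CountDownLatch and lambdas, which postdate the 1.3 JDK discussed here, and the names StartupBarrierDemo and runAndCount() are invented for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: wait until every worker has entered run() before
// interrupting any of them, then count how many actually terminated.
class StartupBarrierDemo {
    static int runAndCount(int n) {
        final CountDownLatch started = new CountDownLatch(n);
        final Object lock = new Object();
        Thread[] workers = new Thread[n];

        for (int i = 0; i < n; i++) {
            workers[i] = new Thread(() -> {
                started.countDown(); // this worker is now running
                try {
                    synchronized (lock) {
                        while (true) {
                            lock.wait();
                        }
                    }
                } catch (InterruptedException ie) {
                    // interrupted: leave run()
                }
            });
            workers[i].start();
        }

        int terminated = 0;
        try {
            // Block until all n workers have signalled that they started.
            if (!started.await(10, TimeUnit.SECONDS)) {
                return -1;
            }
            for (Thread t : workers) {
                t.interrupt();
            }
            for (Thread t : workers) {
                t.join(5000);
                if (!t.isAlive()) {
                    terminated++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return terminated;
    }

    public static void main(String[] args) {
        System.out.println("terminated " + runAndCount(20) + " of 20");
    }
}
```

If the count comes back lower than the number of workers, at least one interrupt was lost even though every thread was known to be running.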
> public class Tester {
>     public Tester() {
>         // Interrupt ourselves first, then call wait().
>         Thread.currentThread().interrupt();
>         synchronized(this) {
>             try {
>                 wait();
>             } catch (InterruptedException ie) {
>                 System.out.println("Wait terminated due to interrupt");
>             }
>         }
>     }
>     static final public void main(String[] s) {
>         new Tester();
>     }
> }
Correct, it is a definite BUG. I'll give it one more try and then put my foot in my mouth.
I tested your Worker example with only one Worker and the program occasionally hangs. With some extra debug messages, I think I found the reason why: the interrupted status flag does not seem to be volatile. The debug messages show that the thread is waiting for a lock in the "synchronized(this)" statement, it then gets an interrupt (isInterrupted() returns true for the interrupting thread), but then when the lock is acquired, isInterrupted() returns FALSE.
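
The observation above can be turned into a small standalone probe. The sketch below is a guess at such a probe, not the debug code actually used: the name InterruptFlagProbe is hypothetical, and it relies on Thread.getState() and lambdas, which need a much newer JDK than 1.3. It interrupts a thread while that thread is blocked entering a synchronized block, then records what isInterrupted() reports once the lock is acquired.

```java
// Hypothetical probe: does the interrupt status set while a thread is blocked
// on monitor entry survive until the thread actually gets the lock?
class InterruptFlagProbe {
    static boolean flagSurvivesMonitorEntry() {
        final Object lock = new Object();
        final boolean[] seen = new boolean[1];

        Thread t = new Thread(() -> {
            synchronized (lock) {
                // Read the flag only after the lock has been acquired.
                seen[0] = Thread.currentThread().isInterrupted();
            }
        });

        try {
            synchronized (lock) {
                t.start();
                // Hold the lock until t is blocked trying to enter the monitor.
                while (t.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(10);
                }
                t.interrupt();
            }
            t.join(); // join() makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("flag survived: " + flagSurvivesMonitorEntry());
    }
}
```

A false result would match the behaviour described above, where the status is set while the thread is blocked but reads as clear once the lock is held.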