NullPointerException in org.xadisk.filesystem.workers.GatheringDiskWriter.writeBuffersToTransactionLog(GatheringDiskWriter.java:137)


my.ra...@gmail.com

Feb 19, 2014, 1:17:52 AM
to xad...@googlegroups.com
Today I received a NullPointerException from org.xadisk.filesystem.workers.GatheringDiskWriter.writeBuffersToTransactionLog(GatheringDiskWriter.java:137).

The complete exception log is below:

16:24:29,161 ERROR [org.jboss.threads.executor] (default-threads - 5) Task execution failed for task WorkWrapper@27b51b05[workManger=org.jboss.as.connector.services.workmanager.NamedWorkManager@249c017c[id=default name=default specCompliant=true shortRunningExecutor=org.jboss.as.threads.ManagedQueueExecutorService@7e2586aa longRunningExecutor=org.jboss.as.threads.ManagedQueueExecutorService@51ac4399 xaTerminator=org.jboss.jca.core.tx.jbossts.XATerminatorImpl@48a1bb3 validatedWork=[org.xadisk.filesystem.workers.FileSystemEventDelegator, org.hornetq.ra.inflow.HornetQActivation$SetupActivation, org.xadisk.filesystem.workers.TransactionTimeoutDetector, org.xadisk.filesystem.workers.GatheringDiskWriter, org.xadisk.filesystem.workers.ObjectPoolReliever, org.xadisk.filesystem.workers.CrashRecoveryWorker] callbackSecurity=null resourceAdapter=org.xadisk.connector.XADiskResourceAdapter@5c621950 shutdown=false activeWorkWrappers=[WorkWrapper@2aa0f4af, WorkWrapper@6cd22fd7, WorkWrapper@425f3a3c, WorkWrapper@17ae0bc8]] work=org.xadisk.filesystem.workers.GatheringDiskWriter@7208719 workListener=org.xadisk.filesystem.workers.observers.CriticalWorkersListener@68857df6 workContexts=null exception=javax.resource.spi.work.WorkCompletedException: The XADisk instance has encoutered a critial issue and is no more available. Such a condition is very rare. If you think you have setup everything right for XADisk to work, please consider discussing in XADisk forums, or raising a bug with details]: org.xadisk.filesystem.exceptions.XASystemNoMoreAvailableException: The XADisk instance has encoutered a critial issue and is no more available. Such a condition is very rare. If you think you have setup everything right for XADisk to work, please consider discussing in XADisk forums, or raising a bug with details
at org.xadisk.filesystem.NativeXAFileSystem.notifySystemFailure(NativeXAFileSystem.java:528) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.observers.CriticalWorkersListener.workCompleted(CriticalWorkersListener.java:31) [xadisk-SNAPSHOT.jar:]
at org.jboss.jca.core.workmanager.WorkWrapper.run(WorkWrapper.java:236) [ironjacamar-core-impl-1.0.19.Final-redhat-2.jar:1.0.19.Final-redhat-2]
at org.jboss.threads.SimpleDirectExecutor.execute(SimpleDirectExecutor.java:33)
at org.jboss.threads.QueueExecutor.runTask(QueueExecutor.java:806)
at org.jboss.threads.QueueExecutor.access$100(QueueExecutor.java:45)
at org.jboss.threads.QueueExecutor$Worker.run(QueueExecutor.java:826)
at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_40]
at org.jboss.threads.JBossThread.run(JBossThread.java:122)
Caused by: javax.resource.spi.work.WorkCompletedException: The XADisk instance has encoutered a critial issue and is no more available. Such a condition is very rare. If you think you have setup everything right for XADisk to work, please consider discussing in XADisk forums, or raising a bug with details
at org.jboss.jca.core.workmanager.WorkWrapper.run(WorkWrapper.java:224) [ironjacamar-core-impl-1.0.19.Final-redhat-2.jar:1.0.19.Final-redhat-2]
... 6 more
Caused by: org.xadisk.filesystem.exceptions.XASystemNoMoreAvailableException: The XADisk instance has encoutered a critial issue and is no more available. Such a condition is very rare. If you think you have setup everything right for XADisk to work, please consider discussing in XADisk forums, or raising a bug with details
at org.xadisk.filesystem.NativeXAFileSystem.notifySystemFailure(NativeXAFileSystem.java:528) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.GatheringDiskWriter.processEvent(GatheringDiskWriter.java:105) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.EventWorker.run(EventWorker.java:50) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.GatheringDiskWriter.run(GatheringDiskWriter.java:386) [xadisk-SNAPSHOT.jar:]
at org.jboss.jca.core.workmanager.WorkWrapper.run(WorkWrapper.java:218) [ironjacamar-core-impl-1.0.19.Final-redhat-2.jar:1.0.19.Final-redhat-2]
... 6 more
Caused by: java.lang.NullPointerException
at org.xadisk.filesystem.workers.GatheringDiskWriter.writeBuffersToTransactionLog(GatheringDiskWriter.java:137) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.GatheringDiskWriter.processEvent(GatheringDiskWriter.java:103) [xadisk-SNAPSHOT.jar:]
... 9 more




Please help me fix this issue.

Ravi

Nitin Verma

Feb 19, 2014, 3:09:31 AM
to xad...@googlegroups.com, my.ra...@gmail.com
Hello Ravi,

Do you have any idea of the circumstances in which this happens? Anything specific, for example a certain kind of test with the xadisk API, or high concurrency? Also, how frequently do you see this issue?

If you can share more details, it will help me in the diagnosis.

Thanks,
Nitin

my.ra...@gmail.com

May 7, 2014, 11:15:49 AM
to xad...@googlegroups.com, my.ra...@gmail.com
I don't know the exact situation in which this NPE is thrown, but we see it many times; it stops the XADisk instance and the whole system stops working.

We need to find a solution for this NPE.

Ravi

Nitin Verma

Jun 15, 2014, 5:15:47 AM
to xad...@googlegroups.com, my.ra...@gmail.com
Hi Ravi,

This may be an xadisk bug, but without being able to reproduce it, or knowing any pattern in which it occurs, it does not seem possible to diagnose. Can you try to narrow down the xadisk API usages or the specific conditions under which this happens? For example, you could set up a fresh xadisk instance, write a simple application that uses xadisk in a manner similar to the actual application, and maybe put it under load (many application threads) to see if that helps. Basically, look hard for anything uncommon in the application (load, multi-threading, API usage pattern), as this bug is not already known. Please let me know if I can be of any help.
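
As a rough illustration of the kind of load test suggested above, here is a minimal sketch of a multi-threaded harness. The class LoadHarness and its Operation hook are hypothetical names made up for this example; they are not part of xadisk. The body of the operation is where an application would put its own "begin transaction, write file, commit" sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadHarness {

    /** Hypothetical hook: one unit of work, e.g. "write one file in one transaction". */
    public interface Operation {
        void run(int threadId, int iteration) throws Exception;
    }

    /** Runs op from `threads` threads, `iterations` times each; returns the failure count. */
    public static int run(int threads, int iterations, Operation op) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int threadId = t;
            tasks.add(() -> {
                for (int i = 0; i < iterations; i++) {
                    try {
                        op.run(threadId, i);
                    } catch (Exception e) {
                        // Count and report the failure, then keep going so that
                        // rare errors have a chance to show up more than once.
                        failures.incrementAndGet();
                        System.err.println("thread " + threadId + ", iteration " + i + ": " + e);
                    }
                }
                return null;
            });
        }
        try {
            pool.invokeAll(tasks); // blocks until every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Placeholder operation; replace with the real file-writing code under test.
        int failed = run(8, 100, (threadId, i) -> calls.incrementAndGet());
        System.out.println(calls.get() + " operations, " + failed + " failures");
    }
}
```

Varying the thread count and the size of each write is one way to look for a pattern; the harness itself makes no assumption about what triggers the failure.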

Good luck.

Thanks,
Nitin

my.ra...@gmail.com

Jul 2, 2014, 5:51:12 AM
to xad...@googlegroups.com, my.ra...@gmail.com
I am trying to gather details to reproduce this in a testing environment and to prepare sample apps, but the case seems complex and I don't know what is causing the issue. Meanwhile, I got some more exceptions related to this problem; see xadisk-NPE-1.txt, xadisk-NPE-2.txt and xadisk-NPE-3.txt attached to this post.

I also encountered another issue: the transaction logs keep growing, and at one point the transaction logs folder reached 39G in production. I took the fix attached in https://groups.google.com/forum/#!topic/xadisk/oJBK47CiyJM and am trying to see how it works.

Also, one of my servers reached a high number of open file descriptors (20000+), and because of this I received many xadisk-related errors.

Please review the attached log files and help fix it.

Thanks,
Ravi
xadisk-NPE-3.txt
xadisk-NPE-2.txt
xadisk-NPE-1.txt

Nitin Verma

Jul 27, 2014, 6:43:46 AM
to xad...@googlegroups.com, my.ra...@gmail.com
Hi Ravi,

Unfortunately, static inspection of the code has not yet revealed the reason behind the two kinds of NPE, one from xadisk-NPE-2.txt and the other from xadisk-NPE-3.txt (xadisk-NPE-3.txt contains the same trace as the first post on this thread). xadisk-NPE-2.txt reports "BindException: Cannot assign requested address", which seems to be due to stress, judging from this forum thread: https://community.oracle.com/message/8603479.

Please do share if you discover any hints regarding any of these.

Thanks,
Nitin

Ravi Soni

Sep 12, 2014, 6:19:34 AM
to xad...@googlegroups.com
Today I hit this issue one more time, and the logs look interesting and give more details. They show when the issue was first reported and how xadisk fails afterwards.

I am saving a great many files per second using xadisk in a highly multithreaded application. I believe that xadisk writes all data to its log file and, after the transaction commits, moves it to the actual file; this process modifies the xadisk log file from multiple threads, which could be the reason for this issue. I am not completely sure and am still investigating.

Thanks,
Ravi
Error_XADISK.txt

Nitin Verma

Sep 21, 2014, 6:13:34 AM
to xad...@googlegroups.com
Hi Ravi,

I analyzed the code and found that one of the problems reported above, the NPE in GatheringDiskWriter.submitBuffer, can indeed occur. I have logged the bug and checked in a fix. Please refer to https://java.net/jira/browse/XADISK-168.

Good luck...

Thanks,
Nitin

Nitin Verma

Sep 27, 2014, 12:16:07 PM
to xad...@googlegroups.com
I tried to relate the NPE discussed in issue https://java.net/jira/browse/XADISK-168 to the other exceptions reported in this thread by analyzing the code. Some findings:

-The NPE seen in GatheringDiskWriter (GDW)'s submitBuffer is caught by the application code, which was writing to a file using the write method of XAFileOutputStream (XAFOS). This write method submits the buffer to GDW when the buffer is full.

-From the logs, it appears that the same application code, after receiving the NPE, continues to write further data or retries the failed write operation. Ravi needs to confirm this.

-Under normal conditions, a new Buffer/byteBuffer is allocated after the existing buffer has been submitted to GDW. But because the submission failed, the logic in XAFOS that allocates a new buffer was not triggered, and so the subsequent write operations kept using the older "buffer/byteBuffer". Note that although buffer.getBuffer() would return null (as mentioned in the jira issue), XAFOS still holds references to the old "buffer" and "byteBuffer".

-So, in effect, the application code keeps calling write while XAFOS is (incorrectly) holding on to the older buffer/byteBuffer. These write operations keep submitting the same buffer again and again, and each submission fails inside GDW.submitBuffer because buffer.getBuffer() is null. This is why we keep encountering the NPE during XAFOS.write, as seen in the logs.

-This also implies that there would be multiple copies of the same buffer instance in the transaction's buffers list in the GDW. GDW could have written only the first of these (setting its byteBuffer to null via makeOnDisk); while trying to write the second one in writeBuffersToTransactionLog, it would fail with an NPE. This NPE is what was reported in the logs:


Caused by: java.lang.NullPointerException
    at org.xadisk.filesystem.workers.GatheringDiskWriter.writeBuffersToTransactionLog(GatheringDiskWriter.java:137) [xadisk-SNAPSHOT.jar:]
    at org.xadisk.filesystem.workers.GatheringDiskWriter.processEvent(GatheringDiskWriter.java:103) [xadisk-SNAPSHOT.jar:]
    ... 9 more

-One hypothesis for the IndexOutOfBoundsException in the logs is concurrent use of the same byteBuffer: by the XAFOS during its close operation, and by the GDW as it writes this buffer (the first occurrence among the repeated entries in the transaction's buffers) using transactionLogChannel.write.
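
The duplicate-entry failure described in the findings above can be modelled in a few lines. This is only a simplified sketch under stated assumptions: the Buffer class below is a hypothetical stand-in that borrows the names getBuffer, makeOnDisk and writeBuffersToTransactionLog from the discussion, and it is not the actual XADisk source. It shows only why writing the same buffer instance twice ends in an NPE once the first write has nulled its byteBuffer.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DuplicateBufferDemo {

    /** Hypothetical stand-in for a log buffer; not the real XADisk class. */
    public static final class Buffer {
        private ByteBuffer byteBuffer = ByteBuffer.allocate(16);

        public ByteBuffer getBuffer() {
            return byteBuffer;
        }

        /** Models makeOnDisk: after the write, the in-memory bytes are released. */
        public void makeOnDisk() {
            byteBuffer = null;
        }
    }

    /** Models the writer walking a transaction's buffer list; returns bytes "written". */
    public static int writeBuffersToTransactionLog(List<Buffer> buffers) {
        int written = 0;
        for (Buffer b : buffers) {
            // Throws NullPointerException if this instance was already written,
            // because getBuffer() then returns null.
            written += b.getBuffer().remaining();
            b.makeOnDisk();
        }
        return written;
    }

    /** Returns true if a list holding the same instance twice fails with an NPE. */
    public static boolean duplicateEntryCausesNpe() {
        Buffer stale = new Buffer();
        List<Buffer> buffers = new ArrayList<>();
        buffers.add(stale); // first submission
        buffers.add(stale); // same instance submitted again after a failed write
        try {
            writeBuffersToTransactionLog(buffers);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on duplicate entry: " + duplicateEntryCausesNpe());
    }
}
```

In this model a list of distinct buffers is written without error; only the repeated instance trips the writer, which matches the reasoning above.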

I have written this post mostly as notes for myself, but Ravi, please let me know if you would like to go through it and have questions or feedback.

Thanks,
Nitin

Ravi Soni

Nov 12, 2014, 6:07:37 AM
to xad...@googlegroups.com
Hi,

After this fix, we have been monitoring xadisk performance and watching for this issue. I can now say that the fix makes xadisk stable; we are no longer getting that error and it is working well.

Your explanation of this issue is correct and accurate. xadisk is now more stable, and we have not seen any instance failure after running for a long time (at least 2 months) with millions of files saved in the application using xadisk. It looks very stable now.

Thanks for fixing this issue.

Ravi

Nitin Verma

Nov 14, 2014, 1:45:02 PM
to xad...@googlegroups.com
Hi Ravi,

Thanks for your response.

So you applied the fix for https://java.net/jira/browse/XADISK-168, and had a successful run for around 1.5 months? (Just to avoid any confusion: you said at least 2 months with this fix, but the fix is not yet 2 months old.)

Regards,
Nitin

Ravi Soni

Nov 14, 2014, 11:32:51 PM
to xad...@googlegroups.com
Hi Nitin,

Yes, it has been running since you provided the fix for this issue. I just wrote an approximate time :) and did not recall the exact date of your fix.

Yes, it is running well and is very stable; we have not observed this issue any more since applying your fix.

Ravi

monog...@gmail.com

May 21, 2019, 10:24:15 AM
to XADisk
Hello everyone,

We are facing the GDW issue in our application. Because java.net is no longer available, we cannot retrieve more information about the fix for XADISK-168 from this URL: https://java.net/jira/browse/XADISK-168.
Is there a way to make this fix available again?

Thank you in advance !

Ravi Soni

May 22, 2019, 2:10:10 PM
to XADisk

Hi,

This is a very old topic, and it is hard to recall the exact patch that solved this issue. I can share my working XADisk build; it has been tested for 5 years under a daily load of 500k+ files processed.

Please refer to my blog site; I have uploaded the build there.

Please let me know if you need more info on this.

Ravi

monog...@gmail.com

May 23, 2019, 11:44:03 AM
to XADisk
Hello Ravi,

With your well-tested build, we should be able to work out how to solve this issue.

Thank you for your help !
Have a nice day !

Ravi Soni

Jun 19, 2019, 3:59:26 PM
to XADisk
Any update on this issue?

Ravi

Nitin Verma

Jun 20, 2019, 1:34:13 AM
to XADisk
Hello,

Prior to the closing of the java.net repository, I had taken a dump of the svn repository, jira bugs, etc. I checked there for jira bug #168. The fix was made in svn revision #571, which modified only one file, GatheringDiskWriter.java.

I am attaching both versions of this file here, 570 and 571, so that they can be compared and the patch applied to any earlier release and tested.

Let me know if you have any questions.

Thanks,
Nitin

GatheringDiskWriter.java.570
GatheringDiskWriter.java.571