I tried to relate the NPE discussed in issue
https://java.net/jira/browse/XADISK-168 to the other exceptions reported in this thread by analyzing the code. Some findings:
-The NPE thrown inside GatheringDiskWriter (GDW)'s submitBuffer is caught by the application code, which was writing to a file through XAFileOutputStream's (XAFOS) write method (write submits the buffer to GDW when the buffer is full).
-From the logs, it appears that the same application code, after receiving the NPE, continues to write further data or retries the failed write operation. Ravi needs to confirm this.
-In normal conditions, a new Buffer/byteBuffer is allocated after the existing buffer has been submitted to GDW. But because the submission failed, the logic in XAFOS that allocates the new buffer was never triggered, and hence the subsequent write operations kept using the old "buffer"/"byteBuffer". Note that although buffer.getBuffer() would return null (as mentioned in the jira issue), XAFOS still holds references to the old "buffer" and "byteBuffer".
-So, in effect, the application code keeps calling write while XAFOS is (incorrectly) holding on to the old buffer/byteBuffer. Each of these writes submits the same buffer again and fails inside GDW.submitBuffer because buffer.getBuffer() is null. This is why we keep encountering the NPE during XAFOS.write, as seen in the logs.
-This also implies that the transaction's buffers list in the GDW would contain multiple references to the same buffer instance. GDW could have written only the first of these (setting its byteBuffer to null via makeOnDisk); on trying to write the second one in writeBuffersToTransactionLog, it would fail with an NPE. This NPE is what was reported in the logs:
Caused by: java.lang.NullPointerException
at org.xadisk.filesystem.workers.GatheringDiskWriter.writeBuffersToTransactionLog(GatheringDiskWriter.java:137) [xadisk-SNAPSHOT.jar:]
at org.xadisk.filesystem.workers.GatheringDiskWriter.processEvent(GatheringDiskWriter.java:103) [xadisk-SNAPSHOT.jar:]
... 9 more
-One hypothesis for the IndexOutOfBoundsException in the logs is concurrent use of the same byteBuffer: XAFOS using it during its close operation while the GDW is possibly executing its code to write this buffer (the first of the repeated entries in the transaction's buffers list) using transactionLogChannel.write.
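To make the stale-buffer sequence above easier to follow, here is a minimal, self-contained sketch. It is not XADisk code: Buf, Writer and Stream are hypothetical stand-ins for Buffer, GDW and XAFOS, the two-byte buffer size is arbitrary, and the first failure is injected by hand (calling makeOnDisk directly) because the root cause from XADISK-168 is outside the sketch. It also assumes, as the analysis above implies, that the buffer is added to the list before its payload is dereferenced.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class StaleBufferSketch {

    /** Hypothetical stand-in for XADisk's Buffer; only the nullable payload is modelled. */
    static final class Buf {
        private ByteBuffer bytes = ByteBuffer.allocate(2);
        ByteBuffer getBuffer() { return bytes; }
        void makeOnDisk() { bytes = null; } // payload is dropped
    }

    /** Hypothetical stand-in for GatheringDiskWriter. */
    static final class Writer {
        final List<Buf> queued = new ArrayList<>();
        void submitBuffer(Buf b) {
            queued.add(b);             // assumption: queued before the payload is touched
            b.getBuffer().remaining(); // NullPointerException if the payload is null
        }
    }

    /** Hypothetical stand-in for XAFileOutputStream. */
    static final class Stream {
        private final Writer writer;
        Buf buffer = new Buf();
        ByteBuffer byteBuffer = buffer.getBuffer();

        Stream(Writer writer) { this.writer = writer; }

        void write(int b) {
            if (!byteBuffer.hasRemaining()) {
                writer.submitBuffer(buffer);     // if this throws ...
                buffer = new Buf();              // ... the reallocation below is skipped
                byteBuffer = buffer.getBuffer();
            }
            byteBuffer.put((byte) b);
        }
    }

    /** Returns {NPEs seen by the caller, entries queued in the writer, distinct Buf instances queued}. */
    static int[] run(int retries) {
        Writer writer = new Writer();
        Stream stream = new Stream(writer);
        stream.write(1);
        stream.write(2);            // the buffer is now full
        stream.buffer.makeOnDisk(); // injected fault: payload is null, stream still holds the old references
        int npes = 0;
        for (int i = 0; i < retries; i++) {
            try {
                stream.write(3);    // the application keeps writing after each failure
            } catch (NullPointerException e) {
                npes++;
            }
        }
        int distinct = (int) writer.queued.stream().distinct().count();
        return new int[] { npes, writer.queued.size(), distinct };
    }

    public static void main(String[] args) {
        int[] r = run(3);
        System.out.println("NPEs=" + r[0] + " queued=" + r[1] + " distinct=" + r[2]);
    }
}
```

With three retries, every write fails with an NPE and the writer's list ends up holding three references to one and the same Buf instance, which is the duplicated-entry situation described above. In this sketch, catching the failure inside write and still replacing buffer/byteBuffer would stop the resubmission loop; whether that is the right fix for XAFOS itself is a separate question.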
I have written this post mostly as notes for myself, but Ravi, please let me know if you want to go through it yourself and have questions/feedback.
Thanks,
Nitin