Chronicle Queue v4 writes failing due to seq mismatch

42 views
Skip to first unread message

Kyle Kavanagh

unread,
Nov 16, 2016, 4:38:25 PM11/16/16
to Chronicle
Trying to work through an interesting error with Chronicle Queue v4.5.15 and find myself stumbling around with little success.  We have a singe thread writing data to chronicle:

this.chronicle = ChronicleQueueBuilder.single(myDir)

       .buffered(false)

       .rollCycle(RollCycles.HOURLY)

       .wireType(WireType.FIELDLESS_BINARY)

       .blockSize(4096*4096)

       .build();


...
//In a loop,

ExcerptAppender appender = this.chronicle.acquireAppender();

appender.writeBytes(message.payload);


where message.payload is an instance of BytesMarshallable

@Override

 public void writeMarshallable(BytesOut out) {

   

    out.writeByte(VERSION);

   out.writeLong(this.receivedTime);


    //this.payload is a non-empty ByteBuffer

    this.payload.rewind();

    out.writeUnsignedShort(this.payload.remaining());

    out.writeUtf8(this.channelId);


    while (this.payload.remaining() > 0) {

     out.writeSome(this.payload);

   }


    out.writeLong(this.receivedNanoTime);


  }



During a long running test, we're seeing:


15:22:22.595 [QueuedChannelMessageHandler] ERROR com.<>.QueuedDirectChannel - Exception while processing message 50


java.lang.AssertionError: ~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127767 seq1: 100 seq2: 196


        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender.checkWritePositionHeaderNumber(SingleChronicleQueueExcerpts.java:438) ~[classes/:?]


        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender.writingDocument(SingleChronicleQueueExcerpts.java:352) ~[classes/:?]


        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender.writeBytes(SingleChronicleQueueExcerpts.java:159) ~[classes/:?]


        at com.<>.QueuedDirectChannel.writeMessageToChron(CamlQueuedDirectChannel.java:198) [classes/:?]


        at com.<>.QueuedDirectChannel.access$5(QueuedDirectChannel.java:191) [classes/:?]


        at com.<>.QueuedDirectChannel$Processor.run(QueuedDirectChannel.java:283) [classes/:?]


        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]



I added println statements to SingleChronicleQueueExcerpts.checkWritePositionHeaderNumber to add some context:

~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127146 seq1: 99 seq2: 99

~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127146 seq1: 99 seq2: 99

~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127767 seq1: 100 seq2: 100

~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127767 seq1: 100 seq2: 100

~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 128393 seq1: 101 seq2: 101

[ERR]~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127767 seq1: 100 seq2: 196


15:22:22.595 [QueuedChannelMessageHandler] ERROR com.<>.QueuedDirectChannel - Exception while processing message 50


java.lang.AssertionError: ~~~~~~~~~~~~~~ thread: QueuedChannelMessageHandler  pos1: 127767 seq1: 100 seq2: 196



Since I think checkWritePositionHeaderNumber is called twice for each append (once on writingDocument and again on documentContext.close), the duplicate printlns above the error are expected.  What should I make of seq1 and seq2, and what are they things that could cause them to get out of sync?

Peter Lawrey

unread,
Nov 16, 2016, 5:53:09 PM11/16/16
to java-ch...@googlegroups.com

Hello,
   Could you create a unit test which recreates this issue so we can look into it?

Regards, Peter.


--
You received this message because you are subscribed to the Google Groups "Chronicle" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-chronicle+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kyle Kavanagh

unread,
Nov 20, 2016, 4:39:36 PM11/20/16
to Chronicle
See attached.  In creating this pared-down test, I actually think I discovered my error.  I was using a tailer DocumentContext in a try-with-resources block, but was closing the context manually inside the block.  This would mean that close() would be called twice on the context.  What made this tricky to track down is that the error/stacktrace only appears on the writer thread, which lead me to think that it was an issue with the writer side of the queue.  I realized that it was the tailer when I wasn't able to recreate the issue at first and observed that it only seems to be a problem when another thread is actively tailing the queue.   A fix for this would just be to make close() a noop if already closed.

You may have to run the test a few times, since its non-deterministic (but has reliably failed for me often).  Stacktrace:
java.lang.AssertionError: ~~~~~~~~~~~~~~ thread: main  pos1: 74395 seq1: 13 seq2: 21
at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender.checkWritePositionHeaderNumber(SingleChronicleQueueExcerpts.java:438)
at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender$StoreAppenderContext.close(SingleChronicleQueueExcerpts.java:834)
at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueExcerpts$StoreAppender.writeBytes(SingleChronicleQueueExcerpts.java:164)
at com.chronicle.test.MultiThreadTest.chronTestWithRealtimeTailer(MultiThreadTest.java:133)

To unsubscribe from this group and stop receiving emails from it, send an email to java-chronicl...@googlegroups.com.
MultiThreadTest.java

Rob Austin

unread,
Nov 20, 2016, 5:13:33 PM11/20/16
to java-ch...@googlegroups.com
>A fix for this would just be to make close() a noop if already closed.

Thanks. - We will implement your suggestion. 

Rob

Sent from my iPhone
<MultiThreadTest.java>
Reply all
Reply to author
Forward
0 new messages