Transaction Logs


Greg Young

Mar 3, 2016, 5:50:03 AM
to mechanica...@googlegroups.com
Has anyone done any recent research into append/read-heavy transaction
logs? Currently I am comparing (and benchmarking):

page cache/mmap
direct-io
libaio
write/flush (probably the worst performance-wise, but works everywhere; sketched below)
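
A minimal sketch of the write/flush variant, assuming a FileChannel-backed append-only log (illustrative only, not my actual benchmark code):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative append-only log using plain write + flush (fsync).
// Every commit pays a full force() call, which is what makes this the
// slowest but most portable of the options above.
public final class WriteFlushLog implements AutoCloseable {
    private final FileChannel channel;

    public WriteFlushLog(Path file) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    public void append(ByteBuffer record) throws IOException {
        while (record.hasRemaining()) {
            channel.write(record);   // buffered by the page cache
        }
        channel.force(false);        // flush data (not metadata) to disk
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}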

Obviously many of the relevant effects are not readily visible in
simple tests (they involve interactions with other processes, behaviour
in varying memory scenarios, etc.).

Has anyone had experience with any of these that they care to share?

Cheers,

Greg

--
Studying for the Turing test

Chris Vest

Mar 3, 2016, 8:35:29 AM
to mechanica...@googlegroups.com
We currently use write/flush in Neo4j. We haven’t tried the other approaches. We coalesce flushes of concurrent commits to reduce transaction log IO and sync overhead.
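
Roughly, the coalescing looks something like the sketch below (illustrative only, not our actual code): a committer that arrives while another thread is forcing the channel blocks on the lock, and by the time it gets in that force has usually already covered its writes, so one fsync serves many commits.

import java.io.IOException;
import java.nio.channels.FileChannel;

// Illustrative flush coalescing ("group commit"): committers share one
// force() call instead of each issuing their own.
public final class CoalescingFlusher {
    private final FileChannel channel;
    private final Object lock = new Object();
    private long lastFlushedPosition;

    public CoalescingFlusher(FileChannel channel) {
        this.channel = channel;
    }

    // Block until everything up to 'position' (the caller's last written byte)
    // is durable on disk.
    public void awaitDurable(long position) throws IOException {
        synchronized (lock) {
            if (position <= lastFlushedPosition) {
                return;               // covered by a flush another committer issued
            }
            channel.force(false);     // flushes everything written so far
            lastFlushedPosition = Math.max(lastFlushedPosition, position);
        }
    }
}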

Cheers,
Chris


Georges Gomes

Mar 3, 2016, 8:59:05 AM
to mechanica...@googlegroups.com

+1, same here. It works well and is fast.

My guess is that mmap would be faster, but it requires some additional file management that write+flush just doesn't need.
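
For example, something like this sketch (illustrative only): the segment has to be pre-sized and mapped up front, and a real log would also have to roll to a new mapping when this one fills up.

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative mmap-based append: writes are just memory copies into the
// mapping, with an explicit force() (msync) when durability is needed.
public final class MmapLogSegment implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    public MmapLogSegment(Path file, long segmentSize) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE);
        this.buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
    }

    // Returns false when the segment is full and the caller must roll over.
    public boolean append(byte[] record) {
        if (buffer.remaining() < record.length) {
            return false;
        }
        buffer.put(record);   // memory copy; the OS writes it back later
        return true;
    }

    public void flush() {
        buffer.force();       // msync: make what we wrote durable
    }

    @Override
    public void close() throws IOException {
        flush();
        channel.close();
    }
}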

My 2 cents.
GG

Greg Young

Mar 3, 2016, 9:04:42 AM
to mechanica...@googlegroups.com
The main issue with write/flush is that it tends to be "spiky" compared to
mmap + selective flush or direct I/O (largely due to I/O queue
problems).

Most systems using mmap aren't actually durable (they flush every n seconds),
in which case it offers lots of benefits (hey, it's just a write to
memory... until someone pulls the power!).
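
i.e. the usual pattern is roughly the sketch below; everything written since the last force() is exactly what you lose when the power goes:

import java.nio.MappedByteBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of the "flush every n seconds" pattern: writes are plain memory
// copies into the mapping, and a background task makes them durable
// periodically.
final class PeriodicFlusher {
    static ScheduledExecutorService start(MappedByteBuffer log, long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> log.force(),
                intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}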

Actually, I was dealing with a client who was telling me that both Neo and
ES were "broken" under heavy load: they showed sawtooth throughput
patterns in Azure. TL;DR: I went through and figured out the sawtooth
patterns were caused by IOPS limiting. The things that were "smooth" weren't actually
saving to disk :) I will have to send that exchange over to Jim for
yet another beer certificate.

Cheers,

Greg

Dan Eloff

Mar 3, 2016, 9:15:09 AM
to mechanica...@googlegroups.com
With an eye to the future, the mmap approach may be more compatible with NVRAM. If you copy the data into the mmap'd buffer with non-temporal stores, then the only thing you would need to do is replace msync with PCOMMIT, and you've got insanely low-latency durable writes.

However, using mmap is problematic. If there's a disk error you'll get a signal, probably SIGBUS, which on the JVM crashes the whole process: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6415680

Using write/flush is simple and you can handle the errors more gracefully.
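
For example (a trivial sketch, assuming a FileChannel-backed log): a failed flush surfaces as an IOException you can catch and turn into a failed commit, instead of a signal that kills the process.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: with write/flush a disk error surfaces as an IOException at the
// call site, so the commit can fail cleanly rather than crashing the process.
final class SafeAppend {
    static boolean tryAppend(FileChannel log, ByteBuffer record) {
        try {
            while (record.hasRemaining()) {
                log.write(record);
            }
            log.force(false);
            return true;            // durably committed
        } catch (IOException e) {
            // Mark the log as failed, retry, or surface the error to the caller.
            return false;
        }
    }
}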


Francesco Nigro

Feb 11, 2017, 1:31:08 AM
to mechanical-sympathy
Pretty late considering the age of this post, and not very Java-ish, but worth knowing:

https://groups.google.com/forum/m/#!topic/pmem/2LJFFpoc8gA

And

https://github.com/pmem/pcj

Cheers,
Francesco
