High performance async binary logging?


Gregg Donovan

Jan 16, 2014, 1:56:19 PM
to mechanica...@googlegroups.com
We -- Etsy.com -- have a request logging system that we've begun to outgrow due to increased log volume and increased size of the messages we need to log. The messages are Thrift-encoded search requests from 1-100K in size. We use the logs to feed our homegrown traffic replay tools for offline performance tests and to reproduce issues found in production (GC, perf, etc.).

For our next gen request logging system we would like the following:

Low contention -- Minimal lock contention despite log writes from many (10s to 100s) application threads.
Async writes -- File I/O should be outside of the actual request path.
Binary logging -- Our current system base64 encodes the payload as a String and logs it via log4j 1.2. ;-) This creates more garbage than we would like and makes the data written twice as large. It was good enough for a while, but now we need something more efficient (the current pattern is sketched just after this list)...
Compression -- Either per-message or per block of messages would be great. Incurring large periodic CPU penalties from logrotate + gzip, as we do now, is less than ideal.
Rotation -- Ideally based on file size, but time-based would also do.
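For context, the pattern we're replacing looks roughly like this (illustrative only -- class and helper names are made up; assume Thrift's TSerializer and commons-codec for the base64 step):

import org.apache.commons.codec.binary.Base64;
import org.apache.log4j.Logger;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;

public class RequestLogger {
    private static final Logger LOG = Logger.getLogger("request-log");

    // TSerializer is not thread-safe, so in practice this is per-thread.
    private final TSerializer serializer = new TSerializer();

    public void log(SearchRequest request) throws TException { // SearchRequest = a Thrift-generated struct
        byte[] payload = serializer.serialize(request);        // Thrift object -> bytes
        LOG.info(Base64.encodeBase64String(payload));          // bytes -> base64 String -> log4j 1.2
    }
}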

Does anyone know of a good binary logger that fits these requirements? From the discussion on the list it seems like the Disruptor and the JavaChronicle fulfill parts of these requirements, but the use-case seems common enough that I suspect there may be an existing library I'm not familiar with that might do the job.

Thanks for the help!

Gregg Donovan
Senior Software Engineer
Etsy.com, Inc.

p.s. I've long enjoyed lurking on this list and have been pointing folks to it in a GC talk I gave last year. 

Martin Thompson

Jan 16, 2014, 2:14:13 PM
to mechanica...@googlegroups.com
Hi Gregg,

I totally agree that such a logging library would be very useful. 

I've developed one that is lock-free, garbage-free, and copy-free in the main logging path. It is also specifically designed to deal with burst traffic by amortizing the expensive costs to keep latency constant. It encodes binary messages and can be memory-mapped into many Java or C processes on the same machine to have a true timeline across apps. It also supports centralised log aggregation in a server farm where worker nodes have limited or no local storage. I've managed to saturate every SSD I've tested it on.
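To give a flavour of the approach: the core of it is just claiming space and appending length-prefixed binary records into a memory-mapped file. A minimal single-process sketch (nothing like the real library -- no lock-free space claiming, rotation, or region-boundary handling) would be:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Minimal sketch: length-prefixed binary records appended to a memory-mapped file.
// A real implementation claims space atomically (e.g. CAS on a tail counter) instead
// of synchronizing, rolls to a new file when full, and handles mapping boundaries.
public class MappedBinaryLog implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;

    public MappedBinaryLog(String path, long capacityBytes) throws IOException {
        channel = new RandomAccessFile(path, "rw").getChannel();
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacityBytes);
    }

    public synchronized void append(byte[] payload) { // the real thing is lock-free
        buffer.putInt(payload.length); // length prefix
        buffer.put(payload);           // record body; the OS writes dirty pages back asynchronously
    }

    @Override
    public void close() throws IOException {
        buffer.force(); // flush outstanding dirty pages
        channel.close();
    }
}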

The client I've developed this for ultimately would like it open-sourced. We are going through the process of setting up the means of doing that. Watch this space but do not hold your breath :-)

Martin...

Sam Barker

Jan 16, 2014, 2:22:14 PM
to mechanica...@googlegroups.com

While nothing like Martin's clever (or should that be simplest workable?) logger, I have been looking at https://kafka.apache.org. I haven't tested it in the wild, but it does sound like it's worth a look.
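For example, shipping the raw Thrift payload bytes to a topic with the 0.8 Java producer would look roughly like this (an untested sketch -- broker list and topic name are invented):

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class KafkaRequestLog {
    private final Producer<byte[], byte[]> producer;

    public KafkaRequestLog(String brokerList) {
        Properties props = new Properties();
        props.put("metadata.broker.list", brokerList);                     // e.g. "broker1:9092,broker2:9092"
        props.put("serializer.class", "kafka.serializer.DefaultEncoder");  // pass byte[] through untouched
        props.put("producer.type", "async");                               // batch sends off the request thread
        props.put("compression.codec", "snappy");                          // per-batch compression on the wire
        producer = new Producer<byte[], byte[]>(new ProducerConfig(props));
    }

    public void log(byte[] thriftPayload) {
        producer.send(new KeyedMessage<byte[], byte[]>("search-requests", thriftPayload));
    }

    public void close() {
        producer.close();
    }
}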

Sam


Peter Lawrey

Jan 16, 2014, 2:35:51 PM
to mechanica...@googlegroups.com

I have a lock-free, GC-free logger/Chronicle (concurrent writer across processes). It is in the sandbox area at the moment but should be in the next release. The TCP replication is not finished.


Gregg Donovan

Jan 16, 2014, 5:10:18 PM
to mechanica...@googlegroups.com
Thank you for the suggestions! I'll keep an eye out for the next version of Chronicle and Martin's future logger. Both sound worth waiting for, so we'll attempt to duct-tape our current system together in the meantime. :-)

Much appreciated,

Gregg

ymo

Jan 17, 2014, 4:15:28 PM
to mechanica...@googlegroups.com
Can you elaborate as to why simply logging to a Cassandra cluster is not an option? It already has features like:
1) write-heavy optimizations
2) data-center-aware replication
3) multi-level compaction
4) append-only writes, which would be nice on SSDs
etc ...

Why not go for that now?
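Something like this with the DataStax Java driver would get asynchronous, append-style writes with very little code (sketch only -- keyspace, table, and column names are invented):

import java.nio.ByteBuffer;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Assumed schema:
// CREATE TABLE request_log (day text, id timeuuid, payload blob, PRIMARY KEY (day, id));
public class CassandraRequestLog {
    private final Session session;
    private final PreparedStatement insert;

    public CassandraRequestLog(String contactPoint) {
        Cluster cluster = Cluster.builder().addContactPoint(contactPoint).build();
        session = cluster.connect("logs");
        insert = session.prepare(
                "INSERT INTO request_log (day, id, payload) VALUES (?, now(), ?)");
    }

    public void log(String day, byte[] thriftPayload) {
        // executeAsync keeps the write off the request thread
        session.executeAsync(insert.bind(day, ByteBuffer.wrap(thriftPayload)));
    }
}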


Konstantin Shaposhnikov

Jan 17, 2014, 7:33:37 PM
to mechanica...@googlegroups.com
Kafka (http://kafka.apache.org/) might be a good option.

I should say that I haven't used it myself though.



Remko Popma

Jan 18, 2014, 10:27:18 AM
to mechanica...@googlegroups.com



You can do this with log4j-2.0. The current version (log4j-2.0-beta-9) gives you 3 of the 5 out of the box: Async Loggers use the Disruptor under the hood, which covers low contention and async writes, and RollingRandomAccessFileAppender (the fastest appender) can do rotation. This appender can also optionally do compression in a separate thread, but this happens on roll-over only, not continuously. (Still, I would guess that this is more CPU-efficient than using logrotate + gzip in a separate process...)

The binary logging can be achieved with a bit of custom code.

First, your objects would need to provide some way to turn themselves into bytes. I doubt that you want to use Java standard serialization :-), so how about this as an interface to log4j that your objects would implement:
public interface BinaryMessage extends org.apache.logging.log4j.message.Message {
    int getLength(); // return byte count
    void writeInto(ByteBuffer buffer); // write length bytes from the current buffer position
}

Log4j2 Loggers already have methods that take a Message argument (instead of a String), so in your application code you could just write:
Logger logger = LogManager.getLogger(MyClass.class); // usually cached in static field
logger.info(myObject); // where myObject implements BinaryMessage
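For example, a message that wraps an already-serialized Thrift payload could be as simple as this (hypothetical class; your names will differ):

import java.nio.ByteBuffer;

public class ThriftRequestMessage implements BinaryMessage {
    private final byte[] payload; // already Thrift-encoded upstream

    public ThriftRequestMessage(byte[] payload) {
        this.payload = payload;
    }

    @Override
    public int getLength() {
        return payload.length;
    }

    @Override
    public void writeInto(ByteBuffer buffer) {
        buffer.put(payload);
    }

    // The remaining org.apache.logging.log4j.message.Message methods are only
    // used by text-based layouts, so they can stay trivial.
    @Override
    public String getFormattedMessage() {
        return "binary[" + payload.length + " bytes]";
    }

    @Override
    public String getFormat() {
        return getFormattedMessage();
    }

    @Override
    public Object[] getParameters() {
        return null;
    }

    @Override
    public Throwable getThrowable() {
        return null;
    }
}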

Second, in the log4j2 framework, the Layout is responsible for converting a LogEvent into bytes, so you need a BinaryLayout that recognizes BinaryMessages. The layout could optionally output the timestamp (long) and log level (byte), followed by the message size (int) and the message bytes (byte[]).
Using the log4j2 plugin model, an implementation could look something like below.

// Imports shown for the final log4j-2.x package layout; beta versions may differ slightly.
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginAttribute;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.layout.AbstractLayout;
import org.apache.logging.log4j.core.util.Booleans;

@Plugin(name = "BinaryLayout", category = "Core", elementType = "layout", printObject = true)
public final class BinaryLayout extends AbstractLayout<LogEvent> {
    private static enum TimestampStrategy {
        WRITE {
            public void write(LogEvent event, ByteBuffer buffer) {
                buffer.putLong(event.getMillis());
            }
        },
        DONT {
            public void write(LogEvent event, ByteBuffer buffer) {
            }
        };
        abstract void write(LogEvent event, ByteBuffer buffer);
    }
    private static enum LevelStrategy {
        WRITE {
            public void write(LogEvent event, ByteBuffer buffer) {
                buffer.put((byte) event.getLevel().intLevel());
            }
        },
        DONT {
            public void write(LogEvent event, ByteBuffer buffer) {
            }
        };
        abstract void write(LogEvent event, ByteBuffer buffer);
    }
    private final TimestampStrategy timestamp;
    private final LevelStrategy level;
    private final int prefixLength;

    public BinaryLayout(boolean timestamp, boolean level) {
        this.timestamp = timestamp ? TimestampStrategy.WRITE : TimestampStrategy.DONT;
        this.level = level ? LevelStrategy.WRITE : LevelStrategy.DONT;
        final int stampBytes = timestamp ? 8 : 0;
        final int levelBytes = level ? 1 : 0;
        prefixLength = 4 + stampBytes + levelBytes;
    }

    @Override
    public byte[] toByteArray(LogEvent event) {
        final BinaryMessage msg = (BinaryMessage) event.getMessage();
        final byte[] result = new byte[msg.getLength() + prefixLength];
        final ByteBuffer buffer = ByteBuffer.wrap(result);
        timestamp.write(event, buffer);
        level.write(event, buffer);
        buffer.putInt(msg.getLength());
        msg.writeInto(buffer);
        return result;
    }
    @Override
    public LogEvent toSerializable(LogEvent event) {
        return event;
    }
    @Override
    public String getContentType() {
        return "application/octet-stream";
    }
    @Override
    public Map<String, String> getContentFormat() {
        return new HashMap<String, String>();
    }
    @PluginFactory
    public static BinaryLayout createLayout(
            @PluginAttribute("writeTimestamp") final String writeTimestamp,
            @PluginAttribute("writeLevel") final String writeLevel) {
        final boolean timestamp = Booleans.parseBoolean(writeTimestamp, false);
        final boolean level = Booleans.parseBoolean(writeLevel, false);
        return new BinaryLayout(timestamp, level);
    }
}


Unfortunately the log4j API does not allow us to be completely garbage-free: you still need to allocate the byte[] array and ByteBuffer for each event...
(The LogEvents themselves are reused Disruptor ring-buffer slots if you make all loggers AsyncLoggers.)
Still, this should be a lot better (garbage-wise) than base64-encoding to a String and then turning that String into bytes again...

Here is the log4j2.xml configuration that would put it all together (modify the "packages" attribute at the top to match your package):
(Don't forget to set system property -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector to make all loggers AsyncLoggers.)

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="package.of.binary.layout.class">
  <Appenders>
    <RollingRandomAccessFile name="rraf" fileName="binary.log"
                 filePattern="logs/$${date:yyyy-MM}/app-%d{MM-dd-yyyy}-%i.log.gz"
                 append="false" immediateFlush="false">
      <BinaryLayout writeTimestamp="true" writeLevel="true" />
      <Policies>
        <TimeBasedTriggeringPolicy />
        <SizeBasedTriggeringPolicy size="250 MB"/>
      </Policies>
    </RollingRandomAccessFile>
  </Appenders>
  
  <Loggers>
    <Root level="trace" includeLocation="false">
      <AppenderRef ref="rraf"/>
    </Root>
  </Loggers>
</Configuration>

Thoughts?

Gregg Donovan

Jan 21, 2014, 10:08:56 AM
to mechanica...@googlegroups.com
Remko,

That would be a tremendous addition for us -- thanks! Despite not being garbage free, I'm sure every log4j2 invocation will create orders of magnitude less garbage than our actual request processing. :) I'll open a ticket over at log4j2 to discuss it further. I'm not familiar with the codebase, but I'll be happy to pitch in if I can...


Re: Kafka, Cassandra -- Both look very promising, but for this application we'd prefer not to incur any more systems complexity than we need to. These logs are helpful diagnostic tools, but they don't contain audit- or database-level data, so they don't need to be replicated, distributed, fault-tolerant, etc.

