Logging level in Overlord

625 views
Skip to first unread message

Zach Cox

unread,
Jul 29, 2015, 5:46:34 PM7/29/15
to Druid User
Hi - we are having trouble getting the Overlord to submit a batch index job to a remote job tracker. The index task logs show the pretty standard usual vague unhelpful logs from hadoop client:

15/07/29 21:36:48 ERROR security.UserGroupInformation: PriviledgedActionException as:druid (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
...
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)

Looking at the source of that Cluster.initialize method, there are some seemingly helpful debug logging messages just before that exception. However, there are no debug level logs in the index task logs.

I even changed the config/_common/log4j2.xml on the Overlord to this:

<Configuration status="WARN">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="debug">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>

That seems like it should have everything log at debug level, but INFO is the lowest level in the index task logs, there are no DEBUG lines.

We also set this option when running Overlord:

-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Does anyone know anything else we can do to really actually get debug-level logging in the overlord index task logs?

Thanks,
Zach

charles.allen

unread,
Jul 29, 2015, 5:55:48 PM7/29/15
to Druid User, zco...@gmail.com
Making your default logging level to be debug is going to yield a bad day overall. You probably want to set a debugging logger only for the class of interest (see https://logging.apache.org/log4j/2.x/manual/configuration.html#Additivity )

FYI, you should be able to change log4j2 xml settings dynamically if you open up a jmxremote port on the instance and connect as per

jconsole -pluginpath log4j-api-2.2.jar:log4j-core-2.2.jar:log4j-jmx-gui-2.2.jar jmxremote_connect_string_here

The above command line you should be able to see the ACTUAL log4j2.xml file the jvm is using.

Zach Cox

unread,
Jul 30, 2015, 11:40:54 AM7/30/15
to charles.allen, Druid User
Hi Charles - we were able to get jconsole to connect to the Overlord via jmx with that log4j2 gui plugin, and were able to change the logging level for the Overlord to DEBUG. However, this does not seem to affect the logging level for the index task jvm process, which is where we really need the debug logs to troubleshoot the Hadoop issue. That just remains at INFO level and we are unable to change it.

Do you know how we can configure the logging levels for the index task process? We have druid.indexer.runner.type=local if that makes a difference.

Thanks,
Zach

Zach Cox

unread,
Jul 30, 2015, 5:18:47 PM7/30/15
to charles.allen, Druid User
I found this thread related to customizing the peon logging:


I created a config/_common/log4j.xml file, that path is on the peon classpath and I set the -Dlog4j.configuration peon jvm option. The index task logs now only show logs for guice and aether. There are absolutely no Druid or Hadoop lines in the index task logs.

Is there a better way to customize the peon/index task logging? All I want is DEBUG for org.apache.hadoop.

Thanks,
Zach

charles.allen

unread,
Jul 30, 2015, 5:37:26 PM7/30/15
to Druid User, charle...@metamarkets.com, zco...@gmail.com
I tried a few options, like adding the jmx to the java options on the forking runner, but it is very hard to interrupt the process fast enough before it fails.

Doing something similar to the other issue where the config is specified on the command line worked for me:

<Configuration status="WARN">
 
<Appenders>
   
<Console name="Console" target="SYSTEM_OUT">
     
<PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
   
</Console>
 
</Appenders>
 
<Loggers>

   
<Root level="info">
     
<AppenderRef ref="Console"/>
   
</Root>
   
<Logger name="org.apache.hadoop.mapreduce.Cluster" level="debug" additivity="false">
     
<AppenderRef ref="Console"/>
   
</Logger>
 
</Loggers>
</Configuration>


specifying

-Dlog4j.configurationFile=/Users/charlesallen/src/druid/common/src/main/resources/log4j2.debug.xml
via the overlord config as:


-Ddruid.indexer.runner.javaOpts=" -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005  -Duser.timezone=UTC -Dfile.encoding=UTF-8 -server -Dcom.sun.management.jmxremote.port=12345 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dlog4j.configurationFile=/some/place/log4j2.debug.xml"


There are some GC related stuff and yourkit stuff that I left out of the options above.


Doing this stalls the running of the peon's fork until you connect a java debugging console to 5005. You can hook into the log (check the overlord log to see where the peon log is going) and watch it before attaching the debug process.


This allows me to see debug level logging from org.apache.mapreduce.hadoop.Cluster


Does that happen to get the logging in your case?

charles.allen

unread,
Jul 30, 2015, 5:50:40 PM7/30/15
to Druid User, charle...@metamarkets.com, zco...@gmail.com

Zach Cox

unread,
Jul 30, 2015, 5:54:04 PM7/30/15
to charles.allen, Druid User
I was just about to ask if there was any simpler way to just specify a log4j2.xml for the peon task to use, without going through jmx. I guess that https://github.com/druid-io/druid/issues/1213 says that currently there is not?

charles.allen

unread,
Jul 30, 2015, 5:56:17 PM7/30/15
to Druid User, charle...@metamarkets.com, zco...@gmail.com
You can override druid.indexer.runner.javaOpts at the overlord or middle manager, but if you don't want to tinker with such things, no not really.

Zach Cox

unread,
Jul 31, 2015, 2:41:55 PM7/31/15
to charles.allen, Druid User
Hi Charles - sorry, I'm still stuck on this peon debug logging issue. Should the following work? Or am I misunderstanding any current limitations?

Overlord config includes:

druid.indexer.runner.javaOpts=-server -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Xmx256m -Dcom.sun.management.jmxremote.port=18081 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Dlog4j.configurationFile=/opt/druid/config/overlord/log4j.xml

/opt/druid/config/overlord/log4j.xml contents:

<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="JHA %d{ISO8601} %p [%t] %c - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <Logger name="org.apache.hadoop" level="debug" />
    <Root level="info">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>

I included the literal "JHA" in logging pattern just to see if this log4j.xml is actually being used.

When I run Overlord with above config, submit a batch index task to Overlord, and then look at the task logs via the Overlord console (at http://my.overlord.com:8285/console.html) I do not see expected logs.

Most of the lines are formatted like this:

15/07/31 17:22:42 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_45-internal

Clearly this does not match the pattern in log4j.xml.

However, there are 7 lines (out of 1225 total) that are actually formatted as per log4j.xml and they all seem related to Guice/Jersey:

JHA 2015-07-31T17:22:43,640 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Registering com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider as a provider class
JHA 2015-07-31T17:22:43,643 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Registering io.druid.server.StatusResource as a root resource class
JHA 2015-07-31T17:22:43,646 INFO [main] com.sun.jersey.server.impl.application.WebApplicationImpl - Initiating Jersey application, version 'Jersey: 1.17.1 02/28/2013 12:47 PM'
JHA 2015-07-31T17:22:43,729 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Binding com.fasterxml.jackson.jaxrs.json.JacksonJsonProvider to GuiceManagedComponentProvider with the scope "Singleton"
JHA 2015-07-31T17:22:44,364 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Binding io.druid.server.QueryResource to GuiceInstantiatedComponentProvider
JHA 2015-07-31T17:22:44,375 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Binding io.druid.segment.realtime.firehose.ChatHandlerResource to GuiceInstantiatedComponentProvider
JHA 2015-07-31T17:22:44,377 INFO [main] com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory - Binding io.druid.server.StatusResource to GuiceManagedComponentProvider with the scope "Undefined"

In addition, using jconsole on port 18081 shows that /opt/druid/config/overlord/log4j.xml is being used by the peon index task for log4j.

So this leads me to believe that there is some other logging framework, or some other log4j config, being used here to format all but those 7 lines.

Does this seem normal for peon logging?

Thanks,
Zach



Zach Cox

unread,
Jul 31, 2015, 4:04:08 PM7/31/15
to charles.allen, Druid User
I figured out why there are 2 different styles of logging in the index task logs: all of these jars are being included in the peon's classpath:

log4j-api-2.2.jar
log4j-core-2.2.jar
log4j-1.2-api-2.2.jar
log4j-1.2.17.jar

When I removed log4j-1.2.17.jar then the index task logs looked correct - I see that "JHA" prefix on every line, and I also see DEBUG logs for org.apache.hadoop. I'm not sure what is pulling in log4j-1.2.17.jar, but it should not be needed since log4j2 is being used.

charles.allen

unread,
Jul 31, 2015, 6:45:33 PM7/31/15
to Druid User, charle...@metamarkets.com, zco...@gmail.com
Hmm, yeah that jar is easy to sneak in there. I've gone through a couple of rounds of "Who is pulling in this dependency!?" regarding log4j.

Glad the log is working!

 As a note, setting -Dcom.sun.management.jmxremote.port=18081 on the peons may or may not work because multiple peons are probably going to spawn, and if the second one cannot bind to the port, I do not know what the results are expected to be. Best to leave it off unless you're debugging stuff.

Fangjin Yang

unread,
Aug 2, 2015, 11:36:30 AM8/2/15
to Druid User, charle...@metamarkets.com, zco...@gmail.com
Hey guys, is it possible to contribute some of these findings to the existing docs on logging?

They can be found here:
Reply all
Reply to author
Forward
0 new messages