Problem shipping apache logs with Secure Forward Plugin in/out

767 views
Skip to first unread message

HSinclair

unread,
Jul 31, 2014, 5:51:14 PM7/31/14
to flu...@googlegroups.com
Hi all,

I have a setup which seems simple enough, however, the apache logs are not shipping from the apache (sending) server to the log receiving server.

Here are my configs:
http://pastebin.com/ErL5QukB
Could you take a look and see if I've bungled the config?

Everything on the servers is working normally:
  • the agents start without error
  • no network issues; the port is open
  • logs report connected to the receiving server
  • the log position files are getting written but they do not get updated (they are owned by user td-agent)
  • the apache logs themselves are voluminous and are getting written to constantly
  • fluent-info.xxxxxx files are being written on the receiving server and they seem to contain the log content from the sending server's td-agent.log file

Anyone have any ideas why my logs aren't shipping?

Thanks !

Kiyoto Tamura

unread,
Jul 31, 2014, 6:29:40 PM7/31/14
to flu...@googlegroups.com
Harold-

Thanks for trying Fluentd.

Let's test a couple of things:

In the sender config:

Replace

<match apache.*>
   type secure_forward
   shared_key somekey
   self_hostname sslserver.mydomain.com
    <server>
      port 2514
    </server>
</match>

with

<match apache.*>
   type copy
   <store>
     type stdout
   </store>
   <store>
      type secure_forward
      shared_key somekey
      self_hostname sslserver.mydomain.com
      <server>
         host log-std01.mydomain.com
         port 2514
       </server>
   </store>
</match>

and restart the sender td-agent. The goal here is to make sure Apache logs are indeed getting to out_secure_forward. You should see a bunch of actual events in /var/log/td-agent.log

If you are seeing a bunch of events in /var/log/td-agent.log with the above change, then the actual SSL connection seems suspect. Again, on the sender server, edit /etc/default/td-agent like this:

DAEMON_ARGS="-vv"

and restart td-agent. The goal here is to turn on verbose logging. After that, if you can share the messages in /var/log/td-agent/td-agent.log (on the sender machine), that would be hugely helpful.

Thanks!

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Check out Fluentd, the open source data collector for high-volume data streams

HSinclair

unread,
Aug 1, 2014, 10:56:28 AM8/1/14
to flu...@googlegroups.com
Thanks Kiyoto, this was very helpful.

The problem isn't solved but two things are happening now in the logs (now that we have verbose output):

The sender fluentd is not "matching" Apache log entries as they get written to the apache log file - for each apache log line, this is appearing in td-agent.log:

2014-08-01 10:41:11 -0400 [warn]: pattern not match: "[Fri Aug 01 10:46:11 2014]  <entire apache log line> "

2014-08-01 10:41:26 -0400 [warn]: temporarily failed to flush the buffer. next_retry=2014-08-01 10:41:26 -0400 error_class="RuntimeError" error="no one nodes with valid ssl session" instance=70181576798760

So it looks like at least I have some SSL error. I could configure SSL with a real, valid certificate, but would rather use self-signed if it's easier and less configuration.

-Harold



Masahiro Nakagawa

unread,
Aug 1, 2014, 12:20:13 PM8/1/14
to flu...@googlegroups.com
Hi

> 2014-08-01 10:41:11 -0400 [warn]: pattern not match: "[Fri Aug 01 10:46:11 2014]  <entire apache log line> "

"format apache2" is for access log format, not error log.
How about 'Apache Error Log' in this article? > http://docs.fluentd.org/articles/common-log-formats


Masahiro



Kiyoto Tamura

unread,
Aug 1, 2014, 1:29:57 PM8/1/14
to flu...@googlegroups.com
Hi Harold-

Thank you for the information.

w/r/t "pattern not match", pleae follow Masa's advice

w/r/t "no one nodes with valid ssl session", that's most likely due to a failure to establish a connection with the remote server.

Do you see one or both of the following lines in your log?

"...received invalid helo message..."

"...connection refused to <HOSTNAME>..."

If so, can you paste these lines here?

Thanks!

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 1, 2014, 2:43:45 PM8/1/14
to flu...@googlegroups.com
Kiyoto, and Masa,

Masa's fix solved the "match" problem - thanks for that.

Here is the log from my sending server(s) - it looks like we have no problem connecting:

2014-08-01 14:37:56 -0400 [info]: adding source type="tail"
2014-08-01 14:37:56 -0400 [info]: adding source type="tail"
2014-08-01 14:37:56 -0400 [info]: adding match pattern="apache.*" type="copy"
2014-08-01 14:37:56 -0400 [info]: adding match pattern="**" type="secure_forward"
2014-08-01 14:37:56 -0400 [info]: following tail of /var/log/httpd/ssl-access_log
2014-08-01 14:37:56 -0400 [info]: following tail of /var/log/httpd/ssl-error_log
2014-08-01 14:37:57 -0400 [info]: connection established to log-std01.mydomain.com
2014-08-01 14:37:57 -0400 [info]: connection established to log-std02.mydomain.com
2014-08-01 14:37:57 -0400 [info]: connection established to log-std02.mydomain.com
2014-08-01 14:37:57 -0400 [info]: connection established to log-std01.mydomain.com

and the connection stays up.. Thanks for any further ideas

-Harold

Kiyoto Tamura

unread,
Aug 1, 2014, 8:34:28 PM8/1/14
to flu...@googlegroups.com
Harold-

Thanks. Hmm...a couple of more things:

In the sender's td-agent log:


Do you see something like "Failed to send messages to <receiver hostname> parging"?

On the receiver

Can you try modifying /etc/default/td-agent as before to enable verbose logging? This is to decide if anything is reaching the receiver instance.

Sorry for all the trouble, and thanks for your patience =)

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 2, 2014, 5:37:50 PM8/2/14
to flu...@googlegroups.com
Kiyoto,

>Do you see something like "Failed to send messages to <receiver hostname> parging"?

You got it this time, it is happening, here is the selected output from a new startup:

2014-08-02 17:18:41 -0400 [warn]: Failed to send messages to log-std02.mydomain.com, parging. error_class=Errno::EPIPE error=#<Errno::EPIPE: Broken pipe>
...

2014-08-02 17:18:45 -0400 [warn]: temporarily failed to flush the buffer. next_retry=2014-08-02 17:18:48 -0400 error_class="RuntimeError" error="no one nodes with valid ssl session" instance=70282918880020
...

2014-08-02 17:18:46 -0400 [info]: dead connection found: log-std02.mydomain.com, reconnecting...
2014-08-02 17:18:49 -0400 [warn]: retry succeeded. instance=7028291888002


The apache logs are not arriving on the receiver but fluent-warn and fluent-info logs are being created.

Any idea what the cause (and fix) might be?
Thanks, Harold 

Masahiro Nakagawa

unread,
Aug 4, 2014, 2:47:35 PM8/4/14
to flu...@googlegroups.com
Harold,

Could you paste the your entire logs of client and server td-agent?
It is helpful to trace your td-agent actions.

Kiyoto,

Could you test Harold's configuration on EC2 or VM instances?
I want to know Harold's configuration is wrong or not.
If it works, we will check OpenSSL or network environment.


Masahiro


--

Kiyoto Tamura

unread,
Aug 4, 2014, 3:50:22 PM8/4/14
to flu...@googlegroups.com
Hey Masa,

I did check Harold's settings in a simlified fashion, and it worked. Here are my configs, and it was with vanilla installations of Ubuntu Precise for the sender and Lucid for the receiver (I don't think it should matter)

Sender's config

<source>
type forward
</source>

<match apache.*>
  type secure_forward
  shared_key somekey
  self_hostname 162.243.152.208
  <server>
    host 104.131.219.114
    port 2514
  </server>
</match>

Receiver's config

<source>
  type secure_forward
  shared_key somekey
  self_hostname log-std01.mydomain.com
  cert_auto_generate yes
  port 2514
</source>
<match **>
  type stdout
</match>

HSinclair

unread,
Aug 5, 2014, 10:11:53 AM8/5/14
to flu...@googlegroups.com
Hi,

Sorry for the delay

I will post my complete logs shortly.

The config test Kiyoto just ran is a bit different than mine - Sender is using "type tail" (and not "type forward") - surely that must make a difference? And on the receiver using "type stdout" skips writing to a file which might be part of my problem.

Thanks
Harold

HSinclair

unread,
Aug 5, 2014, 12:34:02 PM8/5/14
to flu...@googlegroups.com
Here are the logs from a fresh startup of both Sender and Receiver:
http://pastebin.com/mPWw1JSB

Here are the Sender and Receiver configurations:
http://pastebin.com/HTp2cVvX

I have a suspicion that the problem is around the apache format for format apache2 and the custom errorlog regex

Another question - to avoid format problems, can I ship logs using no format? Or is there a regex that would allow all, never causing a "match error"?

Thanks again
Harold

Kiyoto Tamura

unread,
Aug 5, 2014, 1:44:30 PM8/5/14
to flu...@googlegroups.com
Harold-

Sorry I didn't explain my config in greater detail. Yes, it is slightly different in that it is taking data from a different source (in_forward as opposed to in_tail) and outputting to stdout as opposed to files.

However, both configs are identical when it comes to sender-receiver transport. The goal of my setup was to exclude the possibility of SSL connection failing even in the simplest case.

Now that my config worked while yours hasn't, I am strongly suspecting it is either

1) some network settings issues not found in a vanilla Ubuntu environment (which was my test environment)

or

2) due to the volume/throughput you are experiencing with Apache log input

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kiyoto Tamura

unread,
Aug 5, 2014, 1:50:11 PM8/5/14
to flu...@googlegroups.com
Hi Harold,

Thanks for the logs. It does seem that your Apache log format is slightly different than the standard Apache Combined log format. In particular, it has a number between the date and HTTP method string like this:

10.51.216.56 - - [05/Aug/2014:12:06:15 -0400] 0 "GET /index.php?...

First to ensure that this is the issue, can you replace "format apache2" with "format none"? This ensures that in_tail ships logs as is. After that, we can deal with building the regular expression for your Apache logs =)

Thanks!

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 5, 2014, 2:48:23 PM8/5/14
to flu...@googlegroups.com
Kiyoto,

Re the testing, understood.

Ok, thanks for the help - shipping access log now works with "format none" !!   So I see the format here was different from and not matching "apache2". Makes perfect sense now.

So it seems that the entire problem may have been with the format matching, and nothing to do with secure-in/out or SSL as previously thought.


Using "format none" for the error log however, doesn't seem to work.

Yes, I checked out Fluentular...   so this is all now leading to another question - I will have Apache and log4j logs from various projects - all with different formats meaningful to the various development teams - it would be ideal for us to /not/ have to match any format, and just accept everything "as-is". I will start a different thread since that is a different topic, but any advice here is welcome for the error log.

Many thanks, Harold








Kiyoto Tamura

unread,
Aug 5, 2014, 2:54:57 PM8/5/14
to flu...@googlegroups.com
Hi Harold-

Glad to have made progress =)


>Using "format none" for the error log however, doesn't seem to work.

Is it possible that you are assigning the same tag "apache.access" for both error logs and access logs? That's what I am seeing in the original config that you showed us:

<source>
type tail
path /var/log/httpd/ssl-access_log
pos_file /var/log/td-agent/tmp/apache.access.pos
tag apache.access
format apache2
</source>
 
<source>
type tail
path /var/log/httpd/ssl-error_log
pos_file /var/log/td-agent/tmp/apache.error.pos
tag apache.access
format apache2
</source>

The bold tag should be "apache.error"
Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 5, 2014, 3:40:04 PM8/5/14
to flu...@googlegroups.com

Kiyoto,


> Is it possible that you are assigning the same tag "apache.access" for both error logs and access logs?
http://pastebin.com/HTp2cVvX is my latest config - no tag mismatches there.

But, mysteriously, the error log started shipping just fine! I guess it needed a minute to warm up:)

So, it looks like logs are shipping, using no formats - it looks like my main issue is solved. Thanks to all for the help.

Note - while using the fluent-plugin-secure-forward - I notice that if you change the configuration on one side and restart it - the SSL session has an issue with:

[warn]: temporarily failed to flush the buffer. next_retry=2014-08-05 15 :33:46 -0400 error_class="RuntimeError" error="no one nodes with valid ssl session" instance=69916014863360

and both sides require a restart to get going again.

Thanks
Harold




Kiyoto Tamura

unread,
Aug 5, 2014, 4:03:28 PM8/5/14
to flu...@googlegroups.com
Harold,

Glad to head that. Now onto parsing the logs. I believe your apache.error events were parsed fine with this regular expression, correct?

format /^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\] \[pid (?<pid>[^\]]*)\] \[client (?<client>[^\]]*)\] (?<message>.*)$/


As for apache.access, looking at your events here: http://pastebin.com/mPWw1JSB, I believe the following regular expression should work:

format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] (?<some_field>[^ ]*) "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" "(?<url_string>[^"]*)" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
I am not 100% sure what the number after the timestamp is, so it's currently called "some_field", but I believe you know what it is.


>[warn]: temporarily failed to flush the buffer. next_retry=2014-08-05 15 :33:46 -0400 error_class="RuntimeError" error="no one nodes with valid ssl session" instance=69916014863360
>
>and both sides require a restart to get going again.
The warning is expected: if the receiver is restarted, then the SSL connection is interrupted so data transfer needs to be retried (this is where Fluentd's memory- or file-based buffering comes in handy. I believe reconnection should be handled automatically though.
Are you also saying that if the sender is restarted, the receiver also needs to be restarted? That sounds weird to me.

Kiyoto
 


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 6, 2014, 6:01:49 PM8/6/14
to flu...@googlegroups.com

In my scenario I'm noticing that while the apache logs are being shipped, every apache log line is also ending up inside the td-agent.log file.
Obviously that's unwanted, do you think it's the "type copy" and "type stdout" on the sender causing this?

Thanks!

Kiyoto Tamura

unread,
Aug 6, 2014, 6:09:38 PM8/6/14
to flu...@googlegroups.com
Yes. type stdout logs to stdout, which, in td-agent's default config, is piped out to td-agent.log.

removing the "type copy..." (or just removing <store> type stdout </store>) should get the desired effect.

Kiyoto


--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

HSinclair

unread,
Aug 7, 2014, 8:51:13 AM8/7/14
to flu...@googlegroups.com
Kiyoto,

Removing <store> type stdout </store> works. Everything this working well now... thanks for the help again!

Harold

HSinclair

unread,
Aug 7, 2014, 1:18:31 PM8/7/14
to flu...@googlegroups.com

Kiyoto,

> The warning is expected: if the receiver is restarted, then the SSL connection is interrupted so data transfer needs to be retried (this is where Fluentd's memory- or file-based buffering comes in handy. I believe reconnection should be handled automatically though. Are you also saying that if the sender is restarted, the receiver also needs to be restarted? That sounds weird to me.

After watching this for a while, it's as you say - the connections are  re-established automatically.

Thanks! Harold

Reply all
Reply to author
Forward
0 new messages