Custom log format


Wooster

Jan 10, 2012, 11:33:19 AM
to Urchin Software from Google - help forum - Urchin 6
I've just installed Urchin 7.100 and am having trouble extracting any
data from my log files. They are Tomcat access logs; a sample is
below:

108.xx.yy.zzz - - [06/Jan/2012:02:14:33 +0000] "GET /ukpmc/search/?scope=fulltext&page=1&query=prevention+of+left+ventrical+growth HTTP/1.1" 200 160465 "http://ukpmc.ac.uk/search/?page=1&query=prevention+of+left+ventrical+growth" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727; XF_mmhpset)" "User-UUID=A22BFBACBC0AAC7486F92E290B4A6E27; JSESSIONID=UZqeEeeYpjFeOJIFio61.80"

So the fields are as follows:
client_ip, date:time, request, status, bytes, referrer, user-agent,
session_id.
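
As a sanity check outside Urchin, the field breakdown above can be tested with a short script. This is only a sketch: the regular expression and the group names are illustrative assumptions and have nothing to do with Urchin's own parser. It treats the record as an NCSA combined-style line with one extra quoted field (the cookie string) at the end.

```python
import re

# Combined-style line plus one trailing quoted field (the cookie string).
# Group names are illustrative only; they are not Urchin field names.
LINE_RE = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ '
    r'\[(?P<datetime>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" '
    r'"(?P<user_agent>[^"]*)" '
    r'"(?P<session_id>[^"]*)"'
)

# The sample record from this post, rejoined into a single line.
sample = (
    '108.xx.yy.zzz - - [06/Jan/2012:02:14:33 +0000] '
    '"GET /ukpmc/search/?scope=fulltext&page=1'
    '&query=prevention+of+left+ventrical+growth HTTP/1.1" '
    '200 160465 '
    '"http://ukpmc.ac.uk/search/?page=1&query=prevention+of+left+ventrical+growth" '
    '"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; '
    '.NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; '
    '.NET CLR 2.0.50727; XF_mmhpset)" '
    '"User-UUID=A22BFBACBC0AAC7486F92E290B4A6E27; JSESSIONID=UZqeEeeYpjFeOJIFio61.80"'
)


def parse_line(line):
    """Return a dict of the eight fields, or None if the line does not match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None


fields = parse_line(sample)
```

If `parse_line` returns a dict for every line in the file, the logs are regular and the problem is more likely in the format definition than in the data.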

I have created the following custom log format file:

PrimaryPositions: "4,0,16,6,10,11,15,13,14"
SecondaryPositions: -
PrimaryKey: -
SecondaryKey: -
PrimaryContent: HIT
SecondaryContent: -
CommentKey: #
FieldSeparator1: \s
FieldSeparator2: \-
QuotesEscapeSep: YES
BracketsEscapeSep: YES
MergeSuccessiveSep: NO
CleanWhiteSpace: NO
StatusRequired: YES
CustomDateFormat: "%d/%b/%y:%H:%M:%S %z"
CustomTimeFormat: "%H:%M:%S"
TimeZoneOffset: 0

But I still fail to get any 'hits':
data lines: 413841 (100%)
data hits: 0
data proc: 1.00 B in 00:00:03 (0.000 MB/sec)
data range: 0 - 0

Some notes:
- I've tried with and without the zero field after ip; I thought it
might need this to cope with the "- -" sequence.
- I've tried FieldSeparator2 with and without quotes, "\-" or \-
- I haven't tried to break down the request field (6) any further, but
maybe should?
- in the CustomDateFormat I haven't included the + that goes with the
timezone, but maybe should?
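
On the date format: the sample timestamp has a four-digit year (2012), while the CustomDateFormat above uses %y. Whether Urchin's directives follow strptime conventions is an assumption that nothing in this thread confirms, but under those conventions %y means a two-digit year and the sign is part of %z. A minimal Python sketch of the difference (Python's `datetime.strptime` is used here only as a stand-in, and `%b` assumes an English locale):

```python
from datetime import datetime

# Timestamp taken from the sample log record, without the brackets.
stamp = "06/Jan/2012:02:14:33 +0000"


def try_format(fmt):
    """Return the parsed datetime, or None if the format does not fit."""
    try:
        return datetime.strptime(stamp, fmt)
    except ValueError:
        return None


# %y expects a two-digit year ("12"), so "2012" does not fit.
two_digit_year = try_format("%d/%b/%y:%H:%M:%S %z")

# %Y expects a four-digit year; %z consumes the sign and the offset.
four_digit_year = try_format("%d/%b/%Y:%H:%M:%S %z")
```

Here the %y variant fails to parse and the %Y variant succeeds, so trying %Y in the custom format may be worth a test run.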

Any help would be much appreciated!
thanks, Andrew

Jeff Sturm

Jan 11, 2012, 2:06:11 PM
to urchin...@googlegroups.com
Andrew,

Couldn't you use "auto" format? Your sample is a pretty ordinary-looking log record--I'd be surprised if the auto log format didn't detect it properly.

I haven't yet created a custom log format within Urchin, nor have I found a need to create one.

-Jeff


salv esp

Feb 3, 2012, 10:58:11 AM
to Urchin Software from Google - help forum - Urchin 6
Hi all,
I am facing the same problem: I can't extract any useful information
from my log files.
I tried setting the "auto" format and then "ncsa", but neither worked,
so I created my own custom.lf log format.
When I look at the debug history file that Urchin generates, I can see
that the parsing of the fields fails completely.
The format our administrators use for our web server is similar to
Andrew's. Below is an example:
kkk.xx.yy.zzz - - - [06/Jan/2012:02:14:33 +0000] "GET /simple/url/query?param1=val1&param2=val2 HTTP/1.1" 200 160465 "http://referer.org" "Mozilla/4.0 (compatible; MSIE 8.0;)" -
So we have:
IP - - - date request status-code bytes-sent referer user-agent
It is very simple, but I can't figure out how to get Urchin to parse it.
This is the configuration that I specified in the custom.lf file I
created:
PrimaryPositions: "4,19,\-,\-,3,6,10,11,15,13,\-"
Can you give me some help?
Regards
Salvatore

santos soler

Feb 8, 2012, 9:28:30 PM
to urchin...@googlegroups.com

I had this issue with Glassfish and its uncommon log pattern, which is missing some key separators (").

 

To fix this you can change the AccessLogFormat to common or combined.

  • common - %h %l %u %t "%r" %s %b
  • combined - %h %l %u %t "%r" %s %b "%{Referer}i" "%{User-Agent}i"

Some information on this is here: http://tomcat.apache.org/tomcat-5.5-doc/catalina/docs/api/org/apache/catalina/valves/AccessLogValve.html

 

To fix the logs I already had, I copied them all onto a Linux machine and ran a sed script over the directory.

for i in *; do sed -e 's/"\([0-9\.]*\)"/\1/g' -e 's/"\([^"]*-0500\)"/[\1]/g' "$i" > "new.$i"; done

 

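This fixes up the separators: it strips the quotes around numeric fields such as the IP address and replaces the quotes around the timestamp with square brackets.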

 

You may need to change the -0500 to your own timezone offset.
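
For anyone who wants to see what the two sed expressions do to a single line, here is a rough Python equivalent. It is a sketch only: the function name `fix_line` and the `tz_offset` parameter are made up for illustration, and the sample input is a hypothetical line, not one taken from a real Glassfish log.

```python
import re


def fix_line(line, tz_offset="-0500"):
    """Apply the same two substitutions as the sed command above."""
    # 1. Drop the quotes around purely numeric/dotted fields (e.g. the IP).
    line = re.sub(r'"([0-9.]*)"', r'\1', line)
    # 2. Swap the quotes around the timestamp for square brackets.
    #    The timestamp is recognised by its trailing timezone offset.
    line = re.sub(r'"([^"]*' + re.escape(tz_offset) + r')"', r'[\1]', line)
    return line


# Hypothetical input line, made up for illustration.
sample = ('"192.0.2.10" "-" "06/Jan/2012:02:14:33 -0500" '
          '"GET /index.html HTTP/1.1" 200 1234')
fixed = fix_line(sample)
```

Passing a different `tz_offset` (for example `"+0000"`) corresponds to editing the -0500 in the sed command.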

 

hope this helps :)


Santos Soler

System Administrator

IFAS Information Technology

Server Administration

University of Florida




salv esp

Feb 9, 2012, 1:26:06 PM
to Urchin Software from Google - help forum - Urchin 6
Hi Santos,
thank you very much for your support and suggestions. My division's
goal was to configure Urchin to process the log files without changing
the Apache log format, because that would have required a restart of
httpd and a certain amount of downtime for all enterprise applications.
We then decided to develop a small bash script to convert the logs to
the combined format; we used a combination of grep, awk and sed (as
you did).
I am a bit disappointed; I think that configuring a custom log format
in Urchin should be much easier.
Thank you again,
Regards,
Salvatore
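
The conversion script itself was not posted. As a rough illustration of the same idea, here is a Python sketch (not the grep/awk/sed script described above) that rewrites the sample line from the earlier post into combined format. It assumes the layout shown in that sample, and the choice to drop the third placeholder field and the trailing field is a guess, since the thread does not say what those fields hold.

```python
import re

# Assumed source layout, taken from the sample earlier in the thread:
#   IP f1 f2 f3 [date] "request" status bytes "referer" "user-agent" extra
SRC_RE = re.compile(
    r'^(?P<ip>\S+) (?P<f1>\S+) (?P<f2>\S+) \S+ '
    r'(?P<date>\[[^\]]+\]) (?P<request>"[^"]*") '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'(?P<referer>"[^"]*") (?P<agent>"[^"]*")'
)

COMBINED = '{ip} {f1} {f2} {date} {request} {status} {size} {referer} {agent}'


def to_combined(line):
    """Rewrite one line into combined format, or return None if it does not match."""
    m = SRC_RE.match(line)
    if m is None:
        return None
    # The third placeholder field and anything after the user-agent are dropped.
    return COMBINED.format(**m.groupdict())


sample = ('kkk.xx.yy.zzz - - - [06/Jan/2012:02:14:33 +0000] '
          '"GET /simple/url/query?param1=val1&param2=val2 HTTP/1.1" 200 160465 '
          '"http://referer.org" "Mozilla/4.0 (compatible; MSIE 8.0;)" -')
converted = to_combined(sample)
```

Lines that do not match return `None`, which makes it easy to count and inspect anything the pattern misses before feeding the output to Urchin.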