generating proxy(squid) statistics and reporting....


khmadhu

Mar 18, 2009, 2:24:04 AM
to ossec-list
Hi,


I have installed the OSSEC server on Linux. I have a log file which
is generated by a proxy server (Squid); it's around 400 MB. I want to
generate statistics from that log only: TCP_HIT, TCP_REFRESH, etc.,
downloads, and top sites, shown as graphs.

Is there a way I can do this with OSSEC?



Andy Tripp

Mar 18, 2009, 9:16:19 AM
to ossec...@googlegroups.com
No, try Cacti.



Edvin Seferovic

Mar 18, 2009, 9:50:00 AM
to ossec...@googlegroups.com
Cacti can access the SNMP info from a squid server. SARG can generate pretty
statistics from the log file.

Regards,
E:S

Daniel Callan

Mar 20, 2009, 2:29:06 AM
to ossec...@googlegroups.com
Hi khmadhu

khmadhu wrote:
> I have installed the ossec server on linux. I have a log file which
> is generated by a proxy server. its arround 400 MB .i want to generate
> the statistics of that log only, like TCP_HIT,TCP_REFRESH etc..
> downloads,top sites,in graph..
>

I routinely generate exactly the kind of reporting you describe (on our
Squid logs) using Calamaris - http://cord.de/tools/squid/calamaris/

It is basically a Perl program that you pipe logs to on STDIN, and it
outputs reports in HTML or plain text (total HITS, total MB
transferred, top X domains, etc). Which analyses you get depends on
the parameters you run it with.

I am actually throwing gigabytes of logs at it for quarterly reports and
such, so I have written a few shell scripts around it that use the "-o"
and "-i" flags: "-o" writes a digested stats file, and "-i" imports that
file into the scan of the next month's log file. That way I "snowball"
my way through the massive log files instead of making calamaris
process one huge concatenated file. Just make sure you set all of the
thresholds to unlimited ("-1") when you do this, so that you don't lose
any data before the final run (then you can set the threshold to just
the top 10/50/100/whatever).

eg: "snowballing the daily logs together"

# PROCESS 1ST DAY
cat /tempdump/access.log_Jan-01 |calamaris -d -1 -s -t -1 -O -c -v \
-o ./data/caldata_Jan-01 > ./reports/report_Jan-01 2>> ./calamaris.err
# THEN IMPORT 1ST AND PROCESS 2ND
cat /tempdump/access.log_Jan-02 |calamaris -d -1 -s -t -1 -O -c -v \
-i ./data/caldata_Jan-01 -o ./data/caldata_Jan-01_02 \
> ./reports/report_Jan-01_02 2>> ./calamaris.err
# THEN IMPORT 1ST+2ND AND PROCESS 3RD
cat /tempdump/access.log_Jan-03 |calamaris -d -1 -s -t -1 -O -c -v \
-i ./data/caldata_Jan-01_02 -o ./data/caldata_Jan-01_03 \
> ./reports/report_Jan-01_03 2>> ./calamaris.err
# RINSE AND REPEAT FOR WHOLE MONTH
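
The repetition above could be scripted as a loop. This is only a minimal sketch, not the scripts I actually run: the function name "snowball", the CALAMARIS variable and the numbered caldata_N / report_N file names are made up for illustration, and the flags are simply the ones from the commands above.

```shell
#!/bin/sh
# Sketch: fold a series of Squid logs into one cumulative Calamaris digest.
# Assumption: the logs are passed in chronological order.

CALAMARIS=${CALAMARIS:-calamaris}   # hypothetical override, handy for dry runs

# snowball DATADIR LOG...
# For each log, import the previous digest (-i) if there is one, write a new
# cumulative digest (-o), and keep the intermediate report alongside it.
# Prints the path of the final digest.
snowball() {
    datadir=$1; shift
    prev=""
    n=0
    for log in "$@"; do
        n=$((n + 1))
        out="$datadir/caldata_$n"
        if [ -n "$prev" ]; then
            "$CALAMARIS" -d -1 -s -t -1 -O -c -v -i "$prev" -o "$out" \
                < "$log" > "$datadir/report_$n" 2>> "$datadir/calamaris.err"
        else
            # First log: nothing to import yet.
            "$CALAMARIS" -d -1 -s -t -1 -O -c -v -o "$out" \
                < "$log" > "$datadir/report_$n" 2>> "$datadir/calamaris.err"
        fi || return 1
        prev=$out
    done
    printf '%s\n' "$prev"
}
```

Something like "snowball ./data /tempdump/access.log_Jan-*" would then walk a whole month, provided the shell expands the file names in date order (which holds for zero-padded day numbers within one month).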

Eventually you end up with a "caldata_Jan-01_31" stats digest file (with
all stats thresholds set to unlimited). From that file you can generate
any report you want just by importing it with "-i" into a calamaris run.

eg: "top 100 domains and top 10 TLDs for Jan"

calamaris -d 100 -t 10 -z -i data/caldata_Jan-01_31 > Top100_Jan.txt


You can run this same report for Q1 by importing the 3 monthly stats
digest files for Jan, Feb and Mar (separated by ":" delimiters).

eg: "top 100 domains and top 10 TLDs for 1st Quarter"

calamaris -d 100 -t 10 -z -i \
data/caldata_Jan-01_31:data/caldata_Feb-01_28:data/caldata_Mar-01_31 \
> Top100_Q1.txt


Of course, all of this import/output stuff is only necessary if you need
to analyse very large quantities of logs. If you can send everything in
one single STDIN pipe then it can all be done in one command (although I
recommend at least doing one big digest output and importing it into
your actual report command - that will speed up subsequent reports by
saving you redoing all the log parsing)
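
For the smaller case, the "one big digest, then report from it" approach might look roughly like this. Again a sketch under assumptions: the function name "digest_then_report" and the CALAMARIS variable are invented for the example, and the flags are the same ones used in the commands above.

```shell
#!/bin/sh
# Sketch: parse all the logs once into a digest, then build the report from
# the digest so that later reports do not have to re-parse the logs.

CALAMARIS=${CALAMARIS:-calamaris}   # hypothetical override, handy for dry runs

# digest_then_report DIGEST REPORT LOG...
digest_then_report() {
    digest=$1; report=$2; shift 2
    # Step 1: one pass over the logs, thresholds unlimited, digest saved with -o.
    # The full report from this pass is discarded.
    cat "$@" | "$CALAMARIS" -d -1 -s -t -1 -O -c -v -o "$digest" \
        > /dev/null || return 1
    # Step 2: top 100 domains and top 10 TLDs, read from the digest only.
    "$CALAMARIS" -d 100 -t 10 -z -i "$digest" > "$report"
}
```

Any further report is then just another run of the second command with different thresholds.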

Hope this helps :)

Cheers,
-Dan
