DSpace statistics/logs file size control


Fitchett, Deborah

Mar 1, 2016, 9:14:12 PM
to dspac...@googlegroups.com

Kia ora koutou,

 

Our solr statistics files are large and growing, and similarly our dspace/log directory. We were looking at compressing the dspace.log files to save space but understand that:

a) The stat-monthly cronjob needs the dspace log files for the latest month, but

b) The stat-general cronjob needs all dspace.log files present or the general/total stats that you get when you first hit the statistics page will be inaccurate.

 

It doesn’t seem very efficient for us to have to store all the dspace.log files forever just to keep accurate statistics. Is this just how it is or are we missing something?

 

Thanks very much!

 

Nāku noa, nā

 

Deborah Fitchett

Senior Advisor, Digital Access

Library, Teaching and Learning

 

p +64 3 423 0358

e deborah....@lincoln.ac.nz | w ltl.lincoln.ac.nz

 

Lincoln University, Te Whare Wānaka o Aoraki

New Zealand's specialist land-based university

 




Andrea Schweer

Mar 1, 2016, 10:04:01 PM
to Fitchett, Deborah, dspac...@googlegroups.com
Hi Deborah,


On 02/03/16 15:14, Fitchett, Deborah wrote:
> Our solr statistics files are large and growing, and similarly our dspace/log directory. We were looking at compressing the dspace.log files to save space but understand that:
>
> a) The stat-monthly cronjob needs the dspace log files for the latest month, but


That's correct.


> b) The stat-general cronjob needs all dspace.log files present or the general/total stats that you get when you first hit the statistics page will be inaccurate.


No, that isn't how it works. Once the month is over and you have the monthly .dat file, you can compress/delete/move the dspace.log files. (I believe JSPUI uses the html files created by stat-report-general / stat-report-monthly; XMLUI doesn't use these html files and consequently doesn't need you to run the stat-report-XYZ commands.)
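For illustration, here is a minimal Python sketch of compressing the dspace.log files from months that are already over. It assumes the daily files are named dspace.log.YYYY-MM-DD (check your own log directory), and the function names are made up for this example. It does not check that the monthly .dat file exists, so confirm that yourself before running anything like it.

```python
import gzip
import re
import shutil
from datetime import date
from pathlib import Path

# Assumed naming scheme: dspace.log.YYYY-MM-DD (adjust if your files differ).
LOG_NAME = re.compile(r"^dspace\.log\.(\d{4})-(\d{2})-(\d{2})$")


def logs_before_month(log_dir, today):
    """Return dspace.log files dated before the first day of today's month."""
    cutoff = today.replace(day=1)
    selected = []
    for path in sorted(Path(log_dir).iterdir()):
        match = LOG_NAME.match(path.name)
        if not match:
            continue  # skips solr logs, already-compressed files, etc.
        year, month, day = (int(part) for part in match.groups())
        if date(year, month, day) < cutoff:
            selected.append(path)
    return selected


def compress(path):
    """Gzip one file next to the original, then remove the original."""
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return target
```

To use it, loop over logs_before_month(your_log_dir, date.today()) and call compress() on each path, where your_log_dir is wherever your [dspace]/log directory lives. The current month's files are left alone because stat-monthly still needs them.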

Also, just to be clear -- all of this only applies to what I call the "Admin statistics" (what you see at http://demo.dspace.org/xmlui/statistics when logged in as an administrator).

The solr-based usage statistics don't read any log files at all. If by "solr statistics files" you mean the solr.log.[date] files in [dspace]/log, those are just "what's going on" type log files telling you what solr queries are being made -- they aren't read by any DSpace processes. You can delete these and/or reduce the solr log level in log4j-solr.properties to WARN or the like without losing any data. (I thought one of the more recent DSpace versions had some bug fixes around the verbosity of the Solr logging, but that fix only covers catalina.out; see https://github.com/DSpace/DSpace/commit/aaf608ceae59cbbabe0f52495a8c3834e0d942a5)
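If you would rather prune the solr.log.[date] files on a schedule than delete them by hand, something like the following Python sketch could work. It goes by file modification time rather than the date in the file name, so it makes no assumption about the exact date format. The function names and the retention period are hypothetical; pick whatever suits your site.

```python
import time
from pathlib import Path


def stale_solr_logs(log_dir, max_age_days, now=None):
    """Return solr.log.* files last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        path
        for path in Path(log_dir).glob("solr.log.*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def delete_files(paths):
    """Delete the given files and return how many were removed."""
    for path in paths:
        path.unlink()
    return len(paths)
```

Calling stale_solr_logs() on its own first gives you a list to look over before anything is removed. Only files matching solr.log.* are considered, so the dspace.log files are never touched by this.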

I hope this helps!

cheers,
Andrea

-- 
Dr Andrea Schweer
Lead Software Developer, ITS Information Systems
The University of Waikato, Hamilton, New Zealand
+64-7-837 9120

Fitchett, Deborah

Mar 2, 2016, 2:56:03 PM
to Andrea Schweer, dspac...@googlegroups.com

Oh awesome, that will let us save a lot of space indeed! Thanks very much!

 

Deborah
