Can I create the monthly stat report without calling "stat-report-monthly"?

Terry Brady

Dec 19, 2016, 5:10:03 PM
to DSpace Technical Support, DSpace Community
The DSpace Wiki indicates that the "stat-report" commands are deprecated.

https://wiki.duraspace.org/display/DSDOC6x/Command+Line+Operations#CommandLineOperations-Legacystatistics

Looking at demo.dspace.org, I see the following pages are available

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
425-298-5498 (Seattle, WA)

Alan Orth

Dec 20, 2016, 4:36:54 AM
to Terry Brady, DSpace Technical Support, DSpace Community
Hi,

We still use these legacy stats as well in DSpace 5.5, which is annoying because we need to keep all dspace.log.* files around for the entire month. Anyway, this is the cron job I run every night:

/dspace/bin/dspace stat-general && \
/dspace/bin/dspace stat-monthly && \
/dspace/bin/dspace stat-report-general && \
/dspace/bin/dspace stat-report-monthly
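
The same chain can also be wrapped in a small script, so that cron calls one file and a failed step is reported by name. This is only a minimal sketch: the `DSPACE_CMD` variable and the `run_legacy_stats` function are names invented for illustration, and the default path is the one used in the cron job above. Like the `&&` chain, it stops at the first command that fails.

```shell
#!/bin/sh
# Hypothetical wrapper around the four legacy statistics commands.
# DSPACE_CMD and run_legacy_stats are illustrative names, not part of DSpace.
DSPACE_CMD="${DSPACE_CMD:-/dspace/bin/dspace}"

run_legacy_stats() {
    # Same order as the cron job: log analysis first, then report generation.
    for step in stat-general stat-monthly stat-report-general stat-report-monthly; do
        if ! "$DSPACE_CMD" "$step"; then
            # Mirror the && chain: do not run later steps after a failure.
            echo "legacy stats: '$step' failed, skipping remaining steps" >&2
            return 1
        fi
    done
}
```

A nightly cron entry would then run a script that defines and calls `run_legacy_stats`; the schedule and the script location are up to the local installation.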

Hope that helps.

--
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dspace-tech...@googlegroups.com.
To post to this group, send email to dspac...@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.
--

Terry Brady

Jan 11, 2017, 5:55:09 PM
to DSpace Technical Support, DSpace Community
I am re-sending this question hoping to get some additional feedback.  Alan, thank you for your earlier response.

Is there a current recommendation on the use of the "legacy statistics" reports?  I see that these reports continue to be produced on demo.dspace.org.

How trustworthy is the data generated from these reports?  Does the community recommend that these reports continue to be run?

When I attempt to reconcile the data in this report with my Solr statistics, I see significant differences.

A couple of fields, such as OAI requests and user logins, appear in the legacy report but are not captured in Solr statistics.

Terry

Bram Luyten

Jan 12, 2017, 7:25:43 AM
to Terry Brady, DSpace Technical Support, DSpace Community
The code for these reports can be found here if I'm not mistaken:

I was looking for a trace of robot detection/filtering but couldn't find any.

Our (Atmire) point of view on these legacy stats is that they haven't been touched/developed for a long while and shouldn't be used anymore.

Even if there is some bot filtering in there, it is definitely not on par with what we currently have in Solr, including the ability to retroactively mark usage as bot traffic when new IPs or user agents are detected.

However, this is still an interesting discussion. I would definitely be in favor of adding OAI requests and user logins as usage events that we start tracking in the Solr logs. I will create JIRA issues for those.

Bram


Bram Luyten
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Esperantolaan 4, Heverlee 3001, Belgium
atmire.com


Terry Brady

Jan 12, 2017, 12:26:12 PM
to DSpace Technical Support, DSpace Community
Bram,

Thanks for the feedback on this.  If the data in these reports should not be used anymore, I wonder if we should suppress the inclusion of these reports by default and require an explicit action to continue to display them.  

Terry

Tom Desair

Jan 13, 2017, 3:52:05 AM
to Terry Brady, DSpace Technical Support, DSpace Community
My feeling is that with the development of DSpace 7, we need to refactor and improve the way DSpace logs and processes stats/events:
  • Add more event types like OAI requests and user logins. We could even take this further and provide a complete audit trail (logging edits, deletes, updates, moves, and so on for all DSpace objects). This would allow an admin to see everything that happened to an item.
  • Once we have all that information, we can remove the legacy stats from the code base and build similar screens that use this new information.
  • I also think that this event information should be logged in a table in the database. Events should then be processed asynchronously (sending data to Google Analytics, indexing a statistics view record in Solr with extra item metadata, notifying any other interested third party such as IRUS, and so on). This would improve the user experience (page load times) and also solve problems like https://jira.duraspace.org/browse/DS-2904
  • This would also allow you to "reindex" stats and make taking a backup of your statistics a lot easier, since they would be included in the regular database backups. Solr was never built to be a "persistent data store", as mentioned here: https://groups.google.com/forum/#!msg/dspace-tech/tMxMSif5U-Q/mC7SuBBDFwAJ. Solr cores can easily become corrupt after unexpected server shutdowns.

What do you guys think? Should we create a Jira ticket for this and discuss this in a developer meeting?

Tom Desair

250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Esperantolaan 4, Heverlee 3001, Belgium

Terry Brady

Jan 13, 2017, 1:27:21 PM
to DSpace Technical Support, DSpace Community
Andrea, Tom, Anthony, thanks for the comments.

I created a ticket to disable legacy statistics: https://jira.duraspace.org/browse/DS-3454.

Per the DSpace Roadmap, usage statistics consolidation is targeted for DSpace 7 Priority 1: https://wiki.duraspace.org/display/DSPACE/RoadMap#RoadMap-CandidateFeaturesforDSpace7.0-Priority1

I added this as a discussion item for the next Developer meeting: https://wiki.duraspace.org/display/DSPACE/DevMtg+2017-01-18

On Fri, Jan 13, 2017 at 6:23 AM, Anthony Petryk <Anthony...@uottawa.ca> wrote:

Hi Tom,

We are definitely interested in a more robust statistics/reporting system for DSpace.

Anthony
