Make data count


Don Richards

Nov 6, 2023, 4:57:52 PM
to Dataverse Dev
I'm in need of some assistance. I've tried to set up Make Data Count on an existing Dataverse server, following the official documentation. As far as I can tell the configuration is in place, but the data doesn't seem to be aggregating as expected. Within the "/usr/local/payara5/glassfish/domains/domain1/logs" directory, I can see various log files such as counter_YYYY_MM_DD.log, export_YYYY_MM_DDThh-mm-ss.log, oaiSetsUpdate_YYYY_MM_DDThh-mm-ss.log, server.log, and server.log_YYYY_MM_DDThh-mm-ss. All of these files are owned by "counter:dataverse". That logs directory is set to `drwxr-x---.`. /usr/local/counter-processor-0.1.04 is owned by "counter:counter", but the /usr/local/counter-processor-0.1.04/tmp directory is owned by "dataverse:counter". The /usr/local/counter-processor-0.1.04/tmp/make-data-count-report.json file is empty. An example of my counter-processor-config.yaml file is here.

For more context on how I set up the environment for Make Data Count, you can find a script here, and my daily/weekly shell scripts are here. I've asked in the chat to see if I can work this out, but I'm kinda at a loss. I'm hoping to find a few answers:
  1. Are these the correct file permissions for the logs?
  2. Is this the correct directory to point to?
  3. How should I be testing this?
  4. What else should I be checking?
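
For anyone comparing notes on questions 1 and 2, here is a minimal sketch (not part of Dataverse or counter-processor) that prints mode, owner and group for the paths mentioned above, so the output can be pasted into a reply. The paths are the ones from this post and are assumptions about any other installation; `describe` is a hypothetical helper name.

```python
import grp
import pwd
import stat
from pathlib import Path


def describe(path):
    """Return 'mode owner:group path' for one file or directory."""
    st = Path(path).stat()
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:  # uid without a passwd entry: fall back to the number
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:  # gid without a group entry: fall back to the number
        group = str(st.st_gid)
    return f"{stat.filemode(st.st_mode)} {owner}:{group} {path}"


if __name__ == "__main__":
    # Paths as given in this post; adjust them for your own installation.
    for p in (
        "/usr/local/payara5/glassfish/domains/domain1/logs",
        "/usr/local/counter-processor-0.1.04",
        "/usr/local/counter-processor-0.1.04/tmp",
        "/usr/local/counter-processor-0.1.04/tmp/make-data-count-report.json",
    ):
        try:
            print(describe(p))
        except OSError as exc:  # missing path or no permission to stat it
            print(f"cannot stat {p}: {exc}")
```

Running it once as each of the users involved (here "counter" and "dataverse") shows which account cannot even stat a given path.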

James Myers

Nov 8, 2023, 10:11:23 AM
to datave...@googlegroups.com

Don,

 

I can’t spot an error in what you’ve sent, but I can try to give some hints w.r.t. debugging:

  • The counter_daily.sh script has to be able to touch/write 0 byte files for any days in the month where there was no activity. Counter_processor itself should only need read access to those log files. (FWIW – I’ve usually put those logs in an mdc subdir of the normal logs directory to keep them separate, but I don’t see any reason using the main logs dir wouldn’t work.)
  • The counter_daily.sh script writes info to the tmp/counter_daily.log file. I’d expect info in there that would show you how far through things you’re getting.
  • Once you’re set up, you would normally just run counter_daily.sh via a cron job once per day, but for testing you can run it manually. All of counter-processor’s state is in the state subdir, so if you empty that you’ll be starting over. (Alternatively, edit statefile.json to remove a month and remove the corresponding sqlite3 file.) I’m not sure what would be written to the make-data-count-report.json file if you are trying to run for a month that counter-processor thinks it has already completely done.
  • The counter_daily.sh script runs counter-processor to process the logs and, if upload_to_hub is true, to send that info to DataCite. Once you have content in the make-data-count-report.json file, there are DataCite api calls you can use to see that your info was indeed sent. The next step in the script is to also send that report to Dataverse. Those go into the datasetmetrics table (each daily run of the cron job overwrites the entries for the current month), so you should be able to see changes there for successful runs.
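
To test the first point in isolation, the zero-byte-file step can be sketched roughly as below. This is not the code from counter_daily.sh, just an illustration of the idea; `touch_missing_logs` is a hypothetical name, and the default file-name format follows the counter_YYYY_MM_DD.log pattern quoted earlier in the thread, so it should be changed to match whatever is actually on disk and in the counter-processor config.

```python
import calendar
from datetime import date
from pathlib import Path


def touch_missing_logs(log_dir, year, month, last_day=None,
                       name_format="counter_{:%Y_%m_%d}.log"):
    """Create zero-byte log files for days that have none; return the new paths."""
    if last_day is None:
        # Default to the full month; pass last_day to stop at "yesterday".
        last_day = calendar.monthrange(year, month)[1]
    created = []
    for day in range(1, last_day + 1):
        path = Path(log_dir) / name_format.format(date(year, month, day))
        if not path.exists():
            path.touch()  # raises PermissionError if this user cannot write here
            created.append(path)
    return created
```

Run as the same user that runs the daily script, a call such as `touch_missing_logs("/usr/local/payara5/glassfish/domains/domain1/logs", 2023, 11, last_day=5)` either returns the files it created or fails with a PermissionError, which would point to the directory permissions as the problem.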

 

Hope that helps. If not, the contents of the counter_daily.log file might help me/others spot something.
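
A small sketch for gathering that information, assuming the tmp/ paths mentioned in this thread (tmp/counter_daily.log and tmp/make-data-count-report.json under the counter-processor directory); `report_status` and `tail` are hypothetical helper names, and nothing here inspects the report's contents beyond checking that it parses as JSON.

```python
import json
from pathlib import Path


def report_status(path):
    """Classify a report file as 'missing', 'empty', 'invalid json' or 'ok'."""
    p = Path(path)
    if not p.exists():
        return "missing"
    text = p.read_text(encoding="utf-8")
    if not text.strip():
        return "empty"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid json"
    return "ok"


def tail(path, lines=40):
    """Return the last `lines` lines of a text file as a list of strings."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return fh.read().splitlines()[-lines:]
```

For example, `print(report_status("/usr/local/counter-processor-0.1.04/tmp/make-data-count-report.json"))` followed by `print("\n".join(tail("/usr/local/counter-processor-0.1.04/tmp/counter_daily.log")))` gives something short enough to paste into a reply.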

 

-- Jim

