Dataverse migration download statistics


Kaitlin Newson

Jan 23, 2017, 9:28:56 AM
to Dataverse Users Community
Hi everyone,

Apologies if this has been addressed elsewhere - I did some searches and haven't found anything.

Since we migrated from 3.6, users have been notifying us that the file download statistics appear to be inflated. I believe we've previously discussed this with the Dataverse team, and that it was expected behaviour. My question is: is there any way to get accurate numbers for metrics, or is this impossible post-migration? And is it possible to 'reset' the download numbers so that they accurately reflect our metrics going forward?

Thanks!

danny...@g.harvard.edu

Jan 23, 2017, 2:00:13 PM
to Dataverse Users Community
Hi Kaitlin - in the most recent (4.6) release we've made a change so that the application now properly increases the download count when a file is downloaded via the API, which previously did not happen in 4.x (https://github.com/IQSS/dataverse/issues/3331). However, this would lead to the opposite of metrics inflation (deflation?).

Can you please open a GitHub ticket or provide some more specifics here so that we can investigate in more detail?

I'll also check with the team about whether or not a "reset" of the download numbers is possible.

- Danny

Kaitlin Newson

Jan 27, 2017, 9:30:56 AM
to Dataverse Users Community
Hi Danny,

Here is a sample of some of the download numbers in 3.6 vs. 4 post-migration that were reported to us. It appears that they may have all doubled, and it's unlikely that they all received that many downloads between those time periods, with the exception of the Truth and Reconciliation dataset. Please let me know if there's any more specific information that would help to investigate this issue.

danny...@g.harvard.edu

Feb 2, 2017, 5:32:03 PM
to Dataverse Users Community
Hey Kaitlin - thanks for the examples and the offer to provide more information. 

Is it possible for you to check these values against what's in the database? There's a recently noted issue about a discrepancy between DB counts and guestbook counts - https://github.com/IQSS/dataverse/issues/3582. I'd be interested to see if this is part of what's happening here. 

Thanks,

Danny

Kaitlin Newson

Feb 9, 2017, 11:58:06 AM
to Dataverse Users Community
Hi Danny,

We tried running the query from the GitHub issue:

    select count(id) from guestbookresponse where guestbook_id=3;

but we are getting counts of 0. Would the guestbook_id be the same as the dataset id?

Tim DiLauro

Feb 9, 2017, 12:49:42 PM
to dataverse...@googlegroups.com
Hi Kaitlin,

The same guestbook can be used for multiple datasets. The dataset and guestbook ids would only be the same by happenstance. To get a count of downloads for a particular dataset, you can use the following, replacing "<dataset-id>" near the end of the query with the desired id:

    select count(id) from guestbookresponse where guestbook_id in (select guestbook_id from dataset where id = <dataset-id>);
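
To see how that query behaves, here is a minimal sketch that runs it against a toy in-memory SQLite database. The two tables contain only the columns the query mentions, and the ids and rows are made up for illustration; a real Dataverse database runs on PostgreSQL and has many more columns, so treat the helper names (`build_demo_db`, `count_guestbook_responses`) and the data as assumptions, not as the actual schema.

```python
import sqlite3

# Toy schema (assumption): only the columns referenced by the query above.
SCHEMA = """
create table dataset (id integer primary key, guestbook_id integer);
create table guestbookresponse (id integer primary key, guestbook_id integer);
"""

# Same query as above, with the dataset id bound as a parameter.
QUERY = """
select count(id) from guestbookresponse
where guestbook_id in (select guestbook_id from dataset where id = ?);
"""


def build_demo_db():
    """Create an in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Datasets 10 and 11 share guestbook 3; dataset 12 has no guestbook.
    conn.executemany(
        "insert into dataset values (?, ?)",
        [(10, 3), (11, 3), (12, None)],
    )
    # Four responses recorded against guestbook 3.
    conn.executemany(
        "insert into guestbookresponse values (?, ?)",
        [(1, 3), (2, 3), (3, 3), (4, 3)],
    )
    return conn


def count_guestbook_responses(conn, dataset_id):
    """Count responses for the guestbook attached to the given dataset."""
    return conn.execute(QUERY, (dataset_id,)).fetchone()[0]


if __name__ == "__main__":
    conn = build_demo_db()
    for dataset_id in (10, 11, 12):
        print(dataset_id, count_guestbook_responses(conn, dataset_id))
```

In this toy data, datasets 10 and 11 both report 4, because the query counts every response for the shared guestbook rather than per dataset, and dataset 12 reports 0 because it has no guestbook_id. Whether the same holds for a given installation depends on its actual data.
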

Cheers,
~Tim

Kaitlin Newson

Feb 9, 2017, 1:54:40 PM
to Dataverse Users Community
It looks like the ids we are looking for don't have guestbook ids in the database. In that case, how can we check the download counts? Where would they be pulled from in the db? Both of these datasets were migrated from 3.6, in case that makes any difference.

Thanks for all the help so far!