Retrieving MDC counts via API not working


Lars Vilhuber

Mar 19, 2024, 8:32:56 PM
to Dataverse Users Community
Dear all,

I'm a user (not an admin). I want to download the Make Data Count (MDC) metrics for a dataset. I followed the instructions at https://guides.dataverse.org/en/latest/api/native-api.html#dataset-metrics-api and, for good measure, https://guides.dataverse.org/en/6.1/api/native-api.html.

It's not working.

Problem 1: The instructions state that "you can optionally pass" (my emphasis) country and month parameters. I just want totals. When I do that:

export SERVER_URL=https://dataverse.harvard.edu
export PERSISTENT_ID=doi:10.7910/DVN/DZBQB2

curl "$SERVER_URL/api/datasets/:persistentId/makeDataCount/viewsTotal?persistentId=$PERSISTENT_ID"

I get

{"status":"OK","data":{"message":"No metrics available for dataset 8939107 for null for country code null."}}

The dataset page itself says "19".

If I add the (optional!) parameter "country=us",

curl "$SERVER_URL/api/datasets/:persistentId/makeDataCount/viewsTotal?persistentId=$PERSISTENT_ID&country=us"

I get "... for country code us" but the same "No metrics".

I tried it for one other Dataverse installation as well (one that has complained in this forum about inexact numbers, but not "null" numbers):

export SERVER_URL=https://data.aussda.at/
export PERSISTENT_ID=doi:10.11587/VJ01D5


I also echoed the entire URL string first to confirm that it looks plausible (no hidden characters).

What am I doing wrong? This all seems exactly like the Guide says to query this.
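
For anyone scripting the same query, here is a minimal sketch of the calls above. The helper names `mdc_url` and `has_no_metrics` are made up for illustration and are not part of Dataverse. The check assumes that the "No metrics available" message always arrives in the same JSON shape as the response quoted above.

```shell
# Build the Make Data Count URL for a dataset.
# Arguments: server URL, persistent ID, metric name (e.g. viewsTotal),
# and an optional country code.
mdc_url() {
  local server="${1%/}"   # drop one trailing slash, if present
  local pid="$2"
  local metric="$3"
  local country="${4:-}"
  local url="$server/api/datasets/:persistentId/makeDataCount/$metric?persistentId=$pid"
  if [ -n "$country" ]; then
    url="$url&country=$country"
  fi
  printf '%s\n' "$url"
}

# Succeed (exit status 0) when a response body contains the
# "No metrics available" message seen above, fail otherwise.
has_no_metrics() {
  case "$1" in
    *'"message":"No metrics available'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (performs a network call, so it is left commented out):
#   body=$(curl -s "$(mdc_url "$SERVER_URL" "$PERSISTENT_ID" viewsTotal)")
#   if has_no_metrics "$body"; then
#     echo "No MDC metrics returned for $PERSISTENT_ID" >&2
#   fi
```

Stripping the trailing slash keeps a `SERVER_URL` such as `https://data.aussda.at/` from producing a double slash in the request path.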

Sebastian Karcher

Mar 19, 2024, 8:41:01 PM
to dataverse...@googlegroups.com
You're not doing anything wrong, but neither Harvard nor AUSSDA looks like it has MDC running (you're just seeing the native metrics on the dataset page).
The call above works on QDR, e.g.


Sebastian

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/76bb2eb0-3e0d-4c08-b751-76fac3fa2691n%40googlegroups.com.


--
Sebastian Karcher, PhD
www.sebastiankarcher.com

Lars Vilhuber

Mar 20, 2024, 7:43:06 AM
to dataverse...@googlegroups.com
Thanks! Two questions then:

A) shouldn't the API report back that it's giving you nonsense? Some text like "this site is not running MDC, move on, nothing to see here..." 

B) what other API endpoint allows me to obtain the native metrics? 

(OK, and Harvard should really be running MDC...) 

Lars 


James Myers

Mar 20, 2024, 8:48:02 AM
to dataverse...@googlegroups.com

Re B) – did you find https://guides.dataverse.org/en/latest/api/metrics.html ? That’s the overall list of metrics endpoints.

Philip Durbin

Mar 20, 2024, 3:18:31 PM
to dataverse...@googlegroups.com
For A), yes, something like "Make Data Count hasn't been set up, please try the Metrics API" sounds fine. If you are so inclined, please feel free to create an issue: https://github.com/IQSS/dataverse/issues

Also, at Harvard Dataverse, we do plan to set up Make Data Count: https://github.com/IQSS/dataverse.harvard.edu/issues/3




Lars Vilhuber

Mar 20, 2024, 5:18:45 PM
to Dataverse Users Community
OK, I think I've got it.

To get a list of file-specific download stats, I would need to

1. Get the list of a dataverse collection's contents (which lists datasets): $SERVER/api/dataverses/$dvALIAS/contents
2. For each dataset ID, get a list of the files: "$SERVER/api/datasets/$ID/versions/$VERSION/files"
3. For the dataverse collection, get the file download stats: $SERVER/api/info/metrics/filedownloads?parentAlias=$dvALIAS
4. Match the IDs (file IDs?) from (3) to the list in (2) by "id", and I have the names of files and their associated file download stats.

I didn't see anything easier/more straightforward than that, right?
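
The four steps above can be sketched roughly as follows. This is only an outline, not a tested workflow. The function names are invented for illustration. The join assumes the file list from step 2 has already been flattened to plain `id,filename` rows (for example with a JSON tool such as jq) and that the metrics endpoint in step 3 returns unquoted CSV whose first column is the file id and whose last column is the count. Both assumptions should be checked against real output.

```shell
# Steps 1-3: build the three URLs from the list above.
contents_url()  { printf '%s/api/dataverses/%s/contents\n' "${1%/}" "$2"; }
files_url()     { printf '%s/api/datasets/%s/versions/%s/files\n' "${1%/}" "$2" "$3"; }
downloads_url() { printf '%s/api/info/metrics/filedownloads?parentAlias=%s\n' "${1%/}" "$2"; }

# Step 4: join "id,filename" rows (first file) with metrics rows (second file)
# on the file id. Assumes comma-free, unquoted fields in both inputs.
join_downloads() {
  awk -F, '
    NR == FNR { name[$1] = $2; next }          # first file: remember filename per id
    FNR == 1 && $1 == "id" { next }            # skip a header row in the metrics CSV
    ($1 in name) { print $1 "," name[$1] "," $NF }   # id,filename,count
  ' "$1" "$2"
}

# Usage (network calls, so left commented out; $dvALIAS as in the steps above):
#   curl -s "$(downloads_url "$SERVER" "$dvALIAS")" > downloads.csv
#   join_downloads files.csv downloads.csv
```

File ids that appear in only one of the two inputs are dropped from the output, so files without a matching metrics row will not be listed.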

Lars

Philip Durbin

Mar 21, 2024, 12:38:10 PM
to dataverse...@googlegroups.com
Sounds like that should work. I'd be remiss, however, if I didn't point out the "Downloads per DataFile (top 100)" plot at https://data.qdr.syr.edu/metrics_5ef2ae2be4b/ (screenshot attached) in case you can reuse any of that code, which you can find at https://github.com/gdcc/dv-metrics

There are also various reporting tools contributed by the community that you might want to check out: https://guides.dataverse.org/en/6.1/admin/reporting-tools-and-queries.html



[Attachment: Screenshot 2024-03-21 at 12.35.56 PM.png]

Lars Vilhuber

Mar 21, 2024, 4:39:40 PM
to dataverse...@googlegroups.com
Thanks. Most of those seem to require admin-level access to an installation, which I don't have. Once the Harvard DV implements MDC, it'll be easier, I assume. For now, it's handed off to an RA, and I'll share that code (for parsing a single collection) on that Google Doc (which has at least one dead link).

Lars



--
Lars Vilhuber
Private: la...@vilhuber.com
            la...@cloutier-vilhuber.net

James Myers

Mar 21, 2024, 5:38:29 PM
to dataverse...@googlegroups.com

FWIW: MDC does not track per-file counts at all, only per-dataset.

James Myers

Mar 21, 2024, 5:50:44 PM
to dataverse...@googlegroups.com

Sorry, clicked send too soon: MDC does not track per-file counts at all, only per-dataset, and the API currently only aggregates to the level of a subcollection. We do display the per-dataset counts in the UI, so we'll need an API to get per-dataset counts for the new single-page front end, but I don't think you can get per-dataset MDC info now from Dataverse. We do report stats to DataCite, so nominally you can get the reports from there; see https://support.datacite.org/docs/consuming.

 

-- Jim
