Dataverse dataset references in LaTeX

Philipp at UiT

Jan 31, 2020, 11:51:19 AM
to Dataverse Users Community
A researcher has sent us the following question about using dataset references provided by Dataverse:

"On the dataset page, there is a button called "Cite Dataset". When choosing BibTeX, a reference is produced of the type @data. But LaTeX is not able to process this reference. It seems that LaTeX does not support @data. Should I try to replace it with @misc, or should I just use the DOI number?"

Does anyone have any thoughts on this?

Best, Philipp

Sebastian Karcher

Jan 31, 2020, 10:28:34 PM
to dataverse...@googlegroups.com
This is actually an immensely complex and somewhat fraught issue, so bear with me for more than you ever wanted to know about it:
"Standard" bibtex, in its purest form, goes back to 1985 and is quite unsatisfactory for lots of contemporary item types -- @misc ends up being used for websites, data, software, etc., and there are no standard fields for things like URLs and date of access. Because of that, lots of bibtex styles actually use item types and fields not defined in standard bibtex. We frequently see, e.g., fields like doi and url. For fields that's usually fine -- styles that don't have them defined just ignore those fields. But for item types, this is a problem because bibtex can fail when you use an item type that's not defined in the style you're using.

There are several approaches here:
1. Just support bibtex with a pared-down, standardized @misc format. This means that citations "work", but it is quite unsatisfactory because these aren't data citations: you effectively end up citing datasets as webpages.
2. Support both bibtex and biblatex. That's the route that, e.g., Zotero has taken. Biblatex is actively maintained and updated and does have item types for data and software. Unfortunately, it is still much less commonly used than bibtex, given the wide availability of (very hard to code) bibtex citation styles.
3. Go all in with biblatex. That's the route that Zenodo very recently decided to take, and they now use @dataset and @software: https://github.com/zenodo/zenodo/issues/1428 but note the not-very-old comment on that same issue that reported lots of user complaints about those item types: https://github.com/zenodo/zenodo/issues/1428#issuecomment-398729076 -- it's unclear from the discussion on the issue whether they ever actually came to grips with the fact that their solution is going to cause a lot of things/styles to break. But as Katrin Leinweber makes clear in that thread, this is also a political move, trying to get people to adopt styles/systems that actually support proper data (and software) citation, which is in line with dataverse's mission.
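
For illustration, a biblatex entry using the @dataset type from options 2 and 3 might look like the sketch below. The citation key, author, title, version, and publisher are hypothetical, and the DOI is a placeholder, not a real dataset. It assumes a document that loads biblatex with a version that defines @dataset; a classic bibtex style that does not define this type is likely to warn about the entry or skip it.

```bibtex
% Hypothetical entry: key, author, title, version, and publisher are made up,
% and the DOI is a placeholder.
@dataset{doe2020example,
  author    = {Doe, Jane},
  title     = {Example Survey Data},
  year      = {2020},
  version   = {1.0},
  publisher = {Example Dataverse},
  doi       = {10.1234/5678}
}
```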

(The @data item type that dataverse currently uses used to be the best option, based on its usage in the biblatex-apa style; APA was one of the first major style guides to have an explicit data citation format. But now that @dataset is in biblatex, which is very recent (it's included in version 3.14, which came out in December 2019), it should definitely be replaced.)

I think for a data repository to go with option 1 is tantamount to capitulation, and I don't think dataverse should do it (if it did, we'd likely revert it in our fork). I think options 2 and 3 both have merit. Option 2 has slightly higher implementation costs, but I'd expect these to be minimal as things go. Option 3 is the more forceful political move, but it will break things for end users.

As for what to tell your researcher now -- it kind of depends on the citation style, but I'd suggest using @misc and then making sure the DOI is in the printed citation (e.g. using howpublished = {\url{https://doi.org/10.1234/5678}}).
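
A minimal sketch of such a @misc entry, assuming a standard bibtex style and a document that loads the url or hyperref package so that \url is defined. The citation key, author, title, and note are hypothetical, and the DOI is the placeholder used above.

```bibtex
% Hypothetical entry rewritten from @data to @misc; only fields that
% standard bibtex styles define for @misc are used.
@misc{doe2020example,
  author       = {Doe, Jane},
  title        = {Example Survey Data},
  year         = {2020},
  howpublished = {\url{https://doi.org/10.1234/5678}},
  note         = {Version 1.0}
}
```

Whether howpublished and note end up in the printed reference still depends on the style in use, so it is worth checking the typeset bibliography for the DOI.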

Sorry for the lengthy answer, I hope it's useful.
Sebastian

--
Sebastian Karcher, PhD
www.sebastiankarcher.com

Philipp at UiT

Feb 1, 2020, 3:18:46 AM
to Dataverse Users Community
Thanks, Sebastian, this was very useful! Although I - as someone who is not (yet) using LaTeX - might not have understood all the details, I'll pass this on to my RDM support colleagues and to the researcher in question.

Best, Philipp

Danny Brooke

Feb 3, 2020, 6:10:31 PM
to Dataverse Users Community
Thanks, Philipp, for bringing this up, and Sebastian for the detailed response. Is there any GitHub issue for a feature request that should be created as a result of this conversation? I think we'd probably lean towards option 2, where we'd add biblatex as an additional citation type and support both for some period of time.

- Danny

Philipp at UiT

Feb 5, 2020, 11:09:20 AM
to Dataverse Users Community
I'm not aware of any GitHub issues about this. May I suggest that Sebastian create one, as he knows the details of how it would make sense to fix this?
Best, Philipp

Sebastian Karcher

Feb 5, 2020, 11:28:39 AM
to Dataverse Users Community