OA classification - hybrid with best_oa_location.license is NULL?


Bianca Kramer

Jan 23, 2022, 6:02:07 PM
to Unpaywall discussion
Hi Richard, Heather, Jason,

I came across something surprising (well, to me) in the OA classification behind the Unpaywall oa_status: there appear to be some 10K records with oa_status 'hybrid' but without a license (best_oa_location.license = null).

Two examples, one with and one without license detected, both from the same journal and the same year:
https://api.unpaywall.org/v2/10.1242/jcs.246025?email=bianca...@gmail.com
(oa_status hybrid, license: cc-by)
(oa_status hybrid, license: null)
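
For anyone who wants to check their own data for this pattern, here is a minimal sketch of the test described above. It assumes each record is a dict shaped like an Unpaywall API response, with `oa_status` and `best_oa_location.license` fields. The function name and the second DOI are made up for illustration and are not real records.

```python
def is_hybrid_without_license(record):
    """True if oa_status is 'hybrid' but best_oa_location carries no license."""
    # best_oa_location can be null for closed records, so fall back to {}.
    best = record.get("best_oa_location") or {}
    return record.get("oa_status") == "hybrid" and best.get("license") is None


# Illustrative records only; the second DOI is a hypothetical placeholder.
records = [
    {"doi": "10.1242/jcs.246025", "oa_status": "hybrid",
     "best_oa_location": {"license": "cc-by"}},
    {"doi": "10.0000/hypothetical-no-license", "oa_status": "hybrid",
     "best_oa_location": {"license": None}},
]

flagged = [r["doi"] for r in records if is_hybrid_without_license(r)]
print(flagged)
```

Running the same filter over a full snapshot should give a count comparable to the roughly 10K mentioned above, depending on the snapshot date.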

The large majority of the articles in question are from two publishers: American Mathematical Society (n=5.3 K) and The Company of Biologists (n=3.4 K).

I might well be overlooking something obvious (and have been staring at the data for a bit too long...), but if you could shed some light on this, that would be much appreciated!

kind regards, 
Bianca
 


ric...@ourresearch.org

Jan 25, 2022, 10:38:22 AM
to Unpaywall discussion
Hi Bianca,

This is surprising to me too! Part of this is just a bug - we don't want any paper without a license to be Hybrid. We can fix that, but the way it happened here is interesting. Using 10.1242/jcs.241513 as an example:
For this DOI we have a simple decision: if https://www.biologists.com/user-licence-1-1/ is open, it applies to the published version and this is a Hybrid DOI. If it isn't, it's Bronze.

But what if the license is open and only applies to the accepted manuscript? Is that Hybrid? Or is a publisher-hosted, open-licensed manuscript Green? Or something else, since "green" = "repository" for so many of us? What if the license isn't open and only the manuscript is available? Is that Bronze? More generally, should publisher-hosted manuscripts be bronze/hybrid, green, or something new?

I think the most important thing is for us to get the OA locations right (in this case, not merge OA locations that are different versions of the paper) so you can make your own decision. oa_status is trying to characterize both the OA location and the publishing model, and I think this is one place where those goals conflict.

Richard

ric...@ourresearch.org

Jan 26, 2022, 12:22:59 PM
to Unpaywall discussion
For anyone following: the original record that prompted the question is here: https://gist.github.com/richard-orr/a1189b84fdb2695c58d5c5de4d944c19

We're now showing accepted manuscripts and published papers on the same publisher page as separate oa_locations in all cases. There are still things I'd like to clean up in this example (mainly, accepting the publisher-specific license from crossref as open while ignoring it on the publisher page) but this should resolve the main problem, which was a Hybrid article with no hint of a license in any of the oa_locations.

Richard

Bianca Kramer

Jan 26, 2022, 5:13:25 PM
to Unpaywall discussion
Thanks a lot Richard for your detailed explanations! It took a while to wrap my head around :) 

I do see how publisher-hosted accepted manuscripts are complicating things... (I'm also otherwise not a fan of that development, but that's beside the point here). I like your approach now to show accepted manuscripts and published versions as separate oa_locations, even if they are both hosted by the publisher. 

As for your other questions/considerations, my personal take: I lean towards considering the publisher version on the publisher platform as gold/hybrid/bronze (depending on the license of the publisher version at that location), and other versions and/or other locations as green - including accepted manuscripts on publisher platforms, submitted manuscripts on preprint servers and any version in disciplinary or institutional repositories. But... I also fully recognize this is a subjective take and opposite choices can also be argued. It's complicated!
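
To make that lens concrete, here is a rough sketch of how it could be applied to a single oa_location. It is only one possible reading, not how Unpaywall computes oa_status. It assumes the `host_type`, `version` and `license` fields of an oa_location; the helper names, the "license starts with cc means open" rule, and using a `journal_is_oa` flag to separate gold from hybrid are all assumptions made for the example.

```python
def is_open_license(license_value):
    # Assumption: only Creative Commons style identifiers (e.g. "cc-by") count as open.
    return isinstance(license_value, str) and license_value.startswith("cc")


def classify_location(location, journal_is_oa=False):
    """Classify one oa_location using the version-first lens described above."""
    on_publisher = location.get("host_type") == "publisher"
    published = location.get("version") == "publishedVersion"
    if not (on_publisher and published):
        # Any other version or host counts as green, including
        # accepted manuscripts hosted on the publisher platform.
        return "green"
    if journal_is_oa:
        # Assumption: a published version in a fully OA journal is gold.
        return "gold"
    return "hybrid" if is_open_license(location.get("license")) else "bronze"


print(classify_location(
    {"host_type": "publisher", "version": "acceptedVersion", "license": "cc-by"}
))
```

With this lens, a publisher-hosted accepted manuscript comes out as green regardless of its license, which is exactly the point where opposite choices can be argued.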

In general, having all information on all oa_locations available (including version at that location and the license of that version at that location) enables applying different lenses -> good!

What I'm still parsing is the potential of having oa_status not just determined by the characteristics of best_oa_location (incl. version, host type and license), but potentially also by a characteristic of one of the other oa_locations (e.g. license). In that sense, I guess to me the difference between versions is more important than the similarity of host type. But again, I fully admit that's a subjective interpretation.
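
One way to see where this matters in practice is to check, per record, whether any license information sits on best_oa_location or only on one of the other oa_locations. A small sketch, assuming records carry `best_oa_location` and an `oa_locations` list as in the Unpaywall API response; the function name and return labels are invented for illustration.

```python
def license_source(record):
    """Report where a license is found: on the best location, elsewhere, or nowhere."""
    best = record.get("best_oa_location") or {}
    if best.get("license"):
        return "best_oa_location"
    # The best location has no license; look at the remaining oa_locations.
    for location in record.get("oa_locations") or []:
        if location.get("license"):
            return "other_oa_location"
    return None


# Illustrative record: license only on a non-best, accepted-manuscript location.
example = {
    "best_oa_location": {"version": "publishedVersion", "license": None},
    "oa_locations": [
        {"version": "publishedVersion", "license": None},
        {"version": "acceptedVersion", "license": "cc-by"},
    ],
}
print(license_source(example))
```

Records that come back as "other_oa_location" are the ones where the choice between "oa_status follows best_oa_location" and "oa_status can borrow from other locations" actually changes the outcome.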

Thanks again for the brain teaser :-) 

kind regards, Bianca 