Unpaywall data update


Scott Chamberlain

Oct 6, 2022, 12:53:39 PM
to Unsub Announcements

Hi!

Unsub uses Unpaywall browser extension data as one of the inputs for calculating your Unsub forecast. See this Unsub docs page for more information.

We updated the data yesterday (Oct 5, 2022). What does this mean for you? Before we get to that, let’s look at the title-by-title Cost Per Use histogram that appears in each scenario.

Every Unsub dashboard histogram of journal titles has three categories:
- Cost Per Use Less Than $100 (or £100) (hereafter <100)
- Cost Per Use Greater Than $100 (or £100) (hereafter >100)
- No Paywalled Usage (hereafter No Paywalled)

You can see the three categories below:

[Screenshot: Screen Shot 2022-09-28 at 10.01.06 AM.png]
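
For anyone who wants to reason about these three categories outside the dashboard, here is a minimal Python sketch of the bucketing. It is an illustration, not Unsub's actual code: the function name `cpu_category`, its parameters, the formula (cost divided by paywalled usage), and the choice to put a Cost Per Use of exactly 100 into the >100 group are all assumptions.

```python
def cpu_category(cost: float, paywalled_usage: float, threshold: float = 100.0) -> str:
    """Bucket one journal title into one of the three histogram categories.

    Hypothetical helper: the inputs and the formula are assumptions,
    not taken from the Unsub codebase.
    """
    # Assumption: a title with zero (or missing) paywalled usage has no
    # meaningful Cost Per Use, so it goes in the "No Paywalled" group.
    if paywalled_usage <= 0:
        return "No Paywalled"

    cpu = cost / paywalled_usage

    # Assumption: the post does not say where exactly 100 falls;
    # this sketch puts it in ">100".
    return "<100" if cpu < threshold else ">100"


# Made-up example values, only to show the three outcomes.
print(cpu_category(cost=500.0, paywalled_usage=10.0))
print(cpu_category(cost=5000.0, paywalled_usage=10.0))
print(cpu_category(cost=5000.0, paywalled_usage=0.0))
```

The threshold is a parameter, so the same function works whether your scenario is in $ or £.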

Back to what this means for you. We took a random set of 100 Unsub scenarios (i.e., dashboards) and compared their results under the old version of the Unpaywall browser extension data vs. the new data. These 100 scenarios included ~2,700 different journal titles. Here's the breakdown:
  • Total number of titles in a scenario: 0.1% increase with the new data
  • Number of titles in each of the three CPU groups:
    • <100: 1.4% increase
    • >100: 12.5% increase
    • No Paywalled: 23% decrease
  • Overall scenario cost: 2% decrease
  • Overall scenario access: 1% increase

Keep in mind that these are averages. Your scenarios may change more or less along each of these dimensions.
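
If you want to run the same kind of before/after comparison on your own scenarios, one simple way to get an average change is to compute the percent change per scenario and then take the mean. The sketch below assumes that approach; the post does not say exactly how the averages above were computed, and the function names and numbers here are hypothetical.

```python
from statistics import mean


def percent_change(old: float, new: float) -> float:
    """Percent change from old to new; negative means a decrease."""
    return (new - old) / old * 100.0


def mean_percent_change(pairs) -> float:
    """Average the per-scenario percent changes.

    pairs: iterable of (old_value, new_value), one pair per scenario.
    Assumption: a plain unweighted mean across scenarios.
    """
    return mean(percent_change(old, new) for old, new in pairs)


# Made-up counts of "No Paywalled" titles in three scenarios
# (old data, new data). These are not real Unsub numbers.
no_paywalled_counts = [(100, 80), (50, 40), (200, 150)]
print(round(mean_percent_change(no_paywalled_counts), 1))
```

An unweighted mean treats a small scenario the same as a large one, which is one reason your own scenarios can move more or less than the averages.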

Among the journal titles that changed categories between the old and new data, the 5 that switched most often across the 100 random scenarios were (all Springer Nature titles):
- Physical and Engineering Sciences in Medicine
- Research on Child and Adolescent Psychopathology
- Mind & Society
- Asian Journal of Business Ethics
- SN Computer Science

Those 5 titles all switched from No Paywalled to <100. In fact, the 14 titles that switched categories most often all moved from No Paywalled to <100, and the top 33 all moved from No Paywalled to either <100 or >100. The same title did not always change the same way in every scenario; that variation comes from the user-specific data uploaded (e.g., COUNTER reports).
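
A tally like the one above can be sketched in a few lines of Python. This is an illustration under assumptions, not the script we ran: it assumes each scenario is available as two dictionaries mapping a journal title to its category (one for the old data, one for the new), and the name `count_switches` and the sample titles are hypothetical.

```python
from collections import Counter


def count_switches(scenarios):
    """Count category switches across scenarios.

    scenarios: iterable of (old, new) pairs, where old and new are dicts
    mapping journal title -> category ("<100", ">100", "No Paywalled").
    Returns (switches per title, switches per (old, new) category pair).
    """
    by_title = Counter()
    by_transition = Counter()
    for old, new in scenarios:
        # Only compare titles present in both versions of the scenario.
        for title in old.keys() & new.keys():
            if old[title] != new[title]:
                by_title[title] += 1
                by_transition[(old[title], new[title])] += 1
    return by_title, by_transition


# Two made-up scenarios with placeholder titles.
scenarios = [
    ({"Journal A": "No Paywalled", "Journal B": "<100"},
     {"Journal A": "<100", "Journal B": "<100"}),
    ({"Journal A": "No Paywalled", "Journal B": "No Paywalled"},
     {"Journal A": "<100", "Journal B": ">100"}),
]
by_title, by_transition = count_switches(scenarios)
print(by_title.most_common(5))
print(by_transition.most_common())
```

Counting per title and per transition separately makes it easy to see both which titles moved most often and which direction they moved in.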

What attribute is most associated with the titles that changed categories? Mostly their start year: the changes are largely among titles that started publishing in 2019 or later. Although this update refreshes the data for all journal titles, the biggest changes are expected for titles that started publishing in the past few years.

Associated GitHub release notes: https://github.com/ourresearch/jump-api/releases/tag/v3.2

As always, reach out if you have any questions at sup...@unsub.org or on the Unsub Discuss list to have a public discussion.

Cheers, Scott
