Best query to target journals/repositories

Tom Walls

Dec 9, 2022, 3:58:54 PM

to Unpaywall discussion

Hi Unpaywall Team

First, I want to say this is a great product, with an impressive amount of data collated together!

I have a question about how best to query the Unpaywall dataset to return all papers that reside in a set of journals/pre-print repositories.

The locations I am interested in at the moment are bioRxiv, medRxiv and ChemRxiv, and I may want to extend this list in future.

One query I have been using is something like this:

"oa_locations.repository_institution:(*biorxiv* OR *medrxiv* OR *chemrxiv*)"

Another query I have been using is like so:

"query": "(oa_locations.repository_institution:(*biorxiv* OR *medrxiv* OR *chemrxiv*) OR oa_locations.url_for_pdf:(*biorxiv* OR *medrxiv* or *chemrxiv*) OR oa_locations.url_for_landing_page:(*biorxiv* OR *medrxiv* or *chemrxiv*))"

This second query returns three times as many papers as the first query.

I am hoping someone can explain the difference between these queries and why the number of results differs so much, and help me understand what the best / recommended query would be to get all papers from particular journals/repositories.
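
To make the comparison concrete, here is a minimal sketch of how the two queries could be checked against records held locally. It assumes each record follows the Unpaywall schema, with an `oa_locations` list whose entries carry `repository_institution`, `url_for_pdf` and `url_for_landing_page`. The helper name `matched_fields` and the two sample records are hypothetical and made up purely for illustration; this is not how the search backend evaluates the query, only a way to see which field is responsible for each match.

```python
import re

# Repository names of interest, taken from the queries above.
NAMES = ("biorxiv", "medrxiv", "chemrxiv")
PATTERN = re.compile("|".join(NAMES), re.IGNORECASE)

# The three oa_locations fields searched by the second query.
FIELDS = ("repository_institution", "url_for_pdf", "url_for_landing_page")


def matched_fields(record):
    """Return the set of oa_locations fields that mention a target repository."""
    hits = set()
    for location in record.get("oa_locations") or []:
        for field in FIELDS:
            value = location.get(field)
            if value and PATTERN.search(value):
                hits.add(field)
    return hits


# Hypothetical records, invented for illustration only.
records = [
    {"doi": "10.0000/example.1", "oa_locations": [
        {"repository_institution": "Cold Spring Harbor Laboratory - bioRxiv",
         "url_for_pdf": "https://www.biorxiv.org/content/example.1.full.pdf",
         "url_for_landing_page": "https://www.biorxiv.org/content/example.1"}]},
    {"doi": "10.0000/example.2", "oa_locations": [
        {"repository_institution": None,
         "url_for_pdf": None,
         "url_for_landing_page": "https://www.medrxiv.org/content/example.2"}]},
]

# First query: only repository_institution is searched.
first_query = [r["doi"] for r in records
               if "repository_institution" in matched_fields(r)]

# Second query: any of the three fields may match.
second_query = [r["doi"] for r in records if matched_fields(r)]

print(first_query)
print(second_query)
```

In this made-up sample, the second record is found only through its landing-page URL, so it appears in `second_query` but not in `first_query`. Counting how often each field is the sole match in real data would show how much of the difference comes from records with an empty or differently worded `repository_institution`.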

Thanks in advance

Tom

Casey Meyer

Dec 12, 2022, 10:38:19 AM

to Unpaywall discussion

Hi Tom,

Was this question meant for OpenAlex? I believe we answered it there.

Thanks,
Casey

Tom Walls

Dec 12, 2022, 11:28:42 AM

to Unpaywall discussion

Hi Casey

It's essentially the same question, but for the Unpaywall (UPW) dataset rather than the OpenAlex dataset. I wasn't sure whether I should keep the questions separate in the Google Groups for each dataset. I have now asked this question in the OpenAlex group as well, following our conversation on both, so we can keep the conversation going over there rather than have two threads on two different forums.

Thanks

Tom

Casey Meyer

Dec 13, 2022, 3:19:22 PM

to Tom Walls, Unpaywall discussion

OK, I'll keep the conversation going on the OpenAlex group. Thanks!

Casey

--

Casey Meyer

Developer - OpenAlex, Unpaywall

OurResearch: We build tools to make scholarly research more open, connected, and reusable—for everyone.
