how do you search in raw affiliation string? a bug maybe?

Ivan Sterligov

Sep 9, 2022, 7:16:47 AM
to OpenAlex users
Hello everyone,

Since ROR IDs and OpenAlex's internal organization IDs still have very low coverage, the best way to get at least some papers for the majority of organisations seems to be searching the raw affiliation string.

But when I use the recommended method, I get way too many results. This is the canonical example from the docs:

Get works with the words Department of Political Science, University of Amsterdam somewhere in at least one author's raw_affiliation_string:


It currently returns 61.129.340 results

OK, the docs recommend parentheses to narrow the search:


It returns 88.843.403 results
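
For readers following along, here is a minimal sketch of how a query like this might be assembled. The exact URLs from the docs are not reproduced in this thread, so the filter key `raw_affiliation_string.search` is an assumption based on the field name quoted above, and the helper name `affiliation_search_url` is hypothetical. The sketch only builds the URL and does not call the API.

```python
from urllib.parse import urlencode

# Public OpenAlex endpoint for works.
OPENALEX_WORKS = "https://api.openalex.org/works"

# Assumption: filter key inferred from the field name mentioned in this thread.
FILTER_KEY = "raw_affiliation_string.search"


def affiliation_search_url(phrase: str) -> str:
    """Build a /works URL that searches authors' raw affiliation strings."""
    # Precaution: commas separate multiple filters in OpenAlex filter syntax,
    # so they are dropped from the phrase and whitespace is collapsed.
    cleaned = " ".join(phrase.replace(",", " ").split())
    query = urlencode({"filter": f"{FILTER_KEY}:{cleaned}"})
    return f"{OPENALEX_WORKS}?{query}"


if __name__ == "__main__":
    print(affiliation_search_url(
        "Department of Political Science, University of Amsterdam"
    ))
```

Comparing the `meta.count` field of the JSON response for a loose and a narrowed phrase is one way to check whether narrowing actually reduces the result set.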

Is there something strange going on or am I missing something?


Best regards,
Ivan





Casey Meyer

Sep 12, 2022, 12:55:07 PM
to Ivan Sterligov, OpenAlex users
Hi Ivan,

Great catch! Yes, this was a bug, and it is now fixed.

We made some changes to raw affiliation string searches to try and catch situations such as "france" in "paris, france#TAB#". But that change ended up loosening the query too much. It's now similar to how the other searches work, while still catching the situation I mentioned.

Can you take a look and see if it is working how you expect it to?

Thanks,
Casey


--
Casey Meyer
Developer - OpenAlex, Unpaywall
OurResearch
We build tools to make scholarly research more open, connected, and reusable, for everyone.

Ivan Sterligov

Sep 13, 2022, 7:48:13 AM
to Casey Meyer, OpenAlex users
Hi Casey,

Thanks a lot, now it seems to work fine. I'd like to add that the OR pipe and NOT exclamation mark work in search too (maybe this should be mentioned in the docs).
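
A small sketch of what I mean, in case it helps anyone: it composes filter values using the `|` (OR) and `!` (NOT) operators. The filter key `raw_affiliation_string.search` is assumed from the field name discussed in this thread, and the helper names and example terms are purely illustrative. It only builds the filter strings and does not query the API.

```python
# Assumption: filter key inferred from the field name discussed in this thread.
FILTER_KEY = "raw_affiliation_string.search"


def any_of(*terms: str) -> str:
    """Join alternative search terms with the OR pipe."""
    return "|".join(terms)


def negate(term: str) -> str:
    """Prefix a search term with the NOT exclamation mark."""
    return f"!{term}"


def search_filter(value: str, key: str = FILTER_KEY) -> str:
    """Combine a filter key and a value into a key:value filter string."""
    return f"{key}:{value}"


if __name__ == "__main__":
    # Hypothetical example terms, chosen only for illustration.
    print(search_filter(any_of("University of Amsterdam",
                               "Universiteit van Amsterdam")))
    print(search_filter(negate("Amsterdam")))
```

The resulting string goes into the `filter` query parameter of a `/works` request.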

Also, thanks for commenting on the OpenAlex/Unpaywall OA status. I understand that the lag between Unpaywall and OpenAlex is insignificant.

As for the broader topic of affiliation search in OpenAlex, it would be great to hear or read about the future of affiliation metadata in OpenAlex.

OpenAlex leads the open affiliation metadata race, with vanilla CrossRef and Semantic Scholar trailing far behind, so I guess many people would be interested in future developments. Currently, based on estimates from a small non-random sample of 50k papers, it has ~87% affiliation coverage for 2021 compared to WoS/Scopus, which is great but insufficient for practical research evaluation.

So any info on the future of:

- correcting/merging org profiles
- ROR adoption/linking (currently very poor for the majority of orgs)
- mining missing affiliations from OA papers (currently there are papers in, say, Scientific Reports, that are missing affiliations in CrossRef and OpenAlex, but not in WoS/Scopus)

would be greatly appreciated; the same is true for author profiles. Perhaps a roadmap on the official website would be the best option :) Much to ask, I know :)

Best regards,
Ivan
--
All the best,

Ivan Sterligov

Casey Meyer

Sep 19, 2022, 10:33:59 AM
to Ivan Sterligov, Richard Orr, OpenAlex users
Hi Ivan,

You're welcome! Great point on the | and ! operators working within filter searches. I had to check that myself, but you're right, it works. :) I'll review the docs to ensure it's mentioned.

As to your other questions, I'll answer the two that I know and refer to @Richard Orr for the other two:
  • Mining affiliations: Glad to hear how we're doing with that! We're working on this right now, and I would expect coverage to improve even more over the next couple of months. Scientific Reports is a great example, because we should have very good affiliation coverage for that journal, so I'm looking into it. It would be great if you could run your numbers again in a couple of months, after we've done some more work. I would like to reach out to you later when we're ready. We always like an independent review to see how we are doing.
  • We've considered a roadmap on the website but aren't ready to do that just yet. Agreed, it's a great idea, and hopefully we can get one done and published later on.
Thanks,
Casey

