Issue with affiliation assignments and name disambiguation


Jason Augustyn

Sep 6, 2024, 5:53:58 PM
to OpenAlex Community
This document in OpenAlex has 340 authors. Many of them have 53 affiliations listed in the institutions array:


This disagrees with the actual document. For example, the lead author, Chekanov, has 53 affiliations according to OpenAlex, but in reality has only one:


I am finding a very large number of errors in author name disambiguation and affiliation matching across the entire data set. This is in addition to the systematic errors in publication counts and derived values (e.g., h-indexes) that I have previously reported to the OpenAlex team. For example, in the most recent snapshot there is an author record with 434,622 attributed works:


Is the team aware of these sorts of problems, and if so, what are the plans to improve the data quality? My team's primary use case for OpenAlex is author- and institution-level analytics, so these sorts of issues are major concerns.
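
For anyone who wants to reproduce the affiliation check described above, here is a minimal sketch. It assumes a work record has already been loaded as a Python dict with the usual `authorships` list, where each entry carries an `author` object and an `institutions` array. The function names and the `max_affiliations` threshold are my own illustrative choices, not anything OpenAlex defines.

```python
from typing import Any


def affiliation_counts(work: dict[str, Any]) -> dict[str, int]:
    """Map each author ID in a work record to its number of listed institutions."""
    counts: dict[str, int] = {}
    for authorship in work.get("authorships") or []:
        author = authorship.get("author") or {}
        # Fall back to the display name if the ID is missing.
        key = author.get("id") or author.get("display_name") or "<unknown>"
        counts[key] = len(authorship.get("institutions") or [])
    return counts


def suspicious_authors(work: dict[str, Any], max_affiliations: int = 5) -> dict[str, int]:
    """Return authors whose institution count exceeds a chosen threshold.

    The threshold is an arbitrary assumption; tune it for your field.
    """
    return {
        author_id: n
        for author_id, n in affiliation_counts(work).items()
        if n > max_affiliations
    }
```

Running `suspicious_authors` over every work in a snapshot partition should give a rough list of records where the institutions array looks inflated, which can then be checked by hand against the published paper.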

Jason Augustyn

Sep 13, 2024, 8:05:00 AM
to OpenAlex Community
Additional context: There are 44,816 author IDs that appear in works but have no corresponding record in the authors collection. Checking a sample of those against the API, it appears they have been deprecated, yet they have not been removed from the works data in the snapshot.
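
A rough sketch of how the dangling IDs can be found, in case it helps with reproduction. It assumes the works and authors data are available as JSON Lines (one record per line), that each author record has an `id` field, and that each work has an `authorships` list whose entries hold an `author.id`. The helper names are hypothetical, and this is not necessarily how I produced the number above.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Parse JSON Lines input, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def dangling_author_ids(work_lines: Iterable[str], author_lines: Iterable[str]) -> set[str]:
    """Return author IDs referenced by works but absent from the authors data."""
    known = {rec["id"] for rec in iter_records(author_lines) if rec.get("id")}
    referenced: set[str] = set()
    for work in iter_records(work_lines):
        for authorship in work.get("authorships") or []:
            author_id = (authorship.get("author") or {}).get("id")
            if author_id:
                referenced.add(author_id)
    return referenced - known
```

Both arguments can be any iterable of text lines, so a file opened with `gzip.open(path, "rt")` works directly. For the full snapshot the two ID sets may be too large for memory on a small machine, in which case the same set difference could be done in a database instead.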