One (awesome) thing I’ve noticed in OpenAlex data is that you can sometimes see that a given work is available from multiple locations. For example, this one is available both as a published article in Contemporary Mathematics and as a preprint from arXiv (look at the locations property):
https://api.openalex.org/works/W1685360256
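For anyone who wants to check this without reading the raw JSON, here is a minimal Python sketch that lists where a work is available. The helper names (fetch_work, summarize_locations) are my own, and I am assuming each entry in locations carries a source object with a display_name plus a landing_page_url, which is what I see in the response above; treat it as a rough illustration rather than an official client.

```python
import json
import urllib.request

API_BASE = "https://api.openalex.org/works/"


def fetch_work(work_id: str) -> dict:
    """Fetch one work record from the OpenAlex API (needs network access)."""
    with urllib.request.urlopen(API_BASE + work_id) as response:
        return json.load(response)


def summarize_locations(work: dict) -> list:
    """Return (source name, landing page URL) pairs for each location.

    Assumption: each location has a `source` object (possibly null) with a
    `display_name`, and a `landing_page_url` field.
    """
    summary = []
    for location in work.get("locations") or []:
        source = location.get("source") or {}  # source can be null
        name = source.get("display_name") or "(unknown source)"
        url = location.get("landing_page_url") or ""
        summary.append((name, url))
    return summary
```

Calling summarize_locations(fetch_work("W1685360256")) and printing each pair should show one line per location for the work linked above.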
I took a look at the documentation, albeit somewhat quickly, to try to learn more about how this matching works, and I ran across this description [1]:
—
We get information about scholarly works as records. A record can take several forms. It may be an item of Crossref metadata; an entry from a repository like arXiv, Pubmed, or an institutional repository; or publicly available information on the internet. A record contains information about a work, so our first task whenever we get a new record is to determine if the work already exists in our system. If we are able to link it to an existing work—using a DOI or other metadata matching technique—then we use the information in the record to enrich that work.
—
I was wondering if there is any more information to learn about how this linking process works, or if I could get a pointer to the relevant code on GitHub where this is happening?
Thanks for any information you can provide, and for a super resource!
//Ed
[1] https://help.openalex.org/hc/en-us/articles/24347019383191-Where-do-works-in-OpenAlex-come-from