Upcoming changes to Work types and Source types


Jason Portenoy

Jul 14, 2023, 2:38:19 PM
to OpenAlex users
Hello all,

This is just a heads up that we are working on some changes to our taxonomy of types for our Works and Sources. Up until now, we have mostly relied on the work types we receive from Crossref (journal-article, book-chapter, dataset, etc.). We will soon be replacing these types with our own taxonomy: a similar one, but one that better matches the OpenAlex dataset. The legacy types from Crossref will remain available in a type_crossref attribute on Works. We will also make some changes to our Source types to improve those assignments.

You should see those changes arrive in the next few weeks. This means they will also be part of the next snapshot. Please let us know if you have any thoughts!

As always, thank you for your support of OpenAlex.

Cheers,
Jason Portenoy

Jason Portenoy

Jul 20, 2023, 10:18:01 AM
to OpenAlex users
The new work-type taxonomy is here! The new types are now in the Work.type attribute, and the old types are still available in the Work.type_crossref attribute. The major changes are here, with a summary below. (Also note that the linked code is likely to change somewhat in the near future as improvements are made.)

- The old types "journal-article," "proceedings-article," and "posted-content" have all been merged into one type: "article".
- Several types, such as "journal," "journal-issue," and "proceedings" have been merged into a new type: "paratext," along with all works that were previously identified as paratext. (These are not the articles, but rather works representing the container objects themselves, i.e., the journal, the conference proceedings, etc.)
- Works that are errata (corrections) are type: "erratum". Coverage is low on this but will improve.
- "Book," "reference-book," and "monograph" types have been merged into type: "book".
- Most other types were left unchanged.
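
For readers who want to reproduce the merges summarized above on their own data, here is a minimal sketch in Python. It is not OpenAlex's code: the function name `new_type_from_crossref` is hypothetical, and the mapping covers only the legacy types named in this message (the post says "such as", so the real list is likely longer). Erratum and paratext detection are not modeled, since those do not follow from the Crossref type alone.

```python
# Legacy Crossref type -> new OpenAlex type, limited to the types named
# in the summary above. Assumption: anything not listed is unchanged.
LEGACY_TO_NEW = {
    "journal-article": "article",
    "proceedings-article": "article",
    "posted-content": "article",
    "journal": "paratext",
    "journal-issue": "paratext",
    "proceedings": "paratext",
    "book": "book",
    "reference-book": "book",
    "monograph": "book",
}


def new_type_from_crossref(type_crossref: str) -> str:
    """Return the merged type for a legacy Crossref type (hypothetical helper)."""
    # Fall back to the legacy value, mirroring "most other types were left unchanged".
    return LEGACY_TO_NEW.get(type_crossref, type_crossref)
```

For example, `new_type_from_crossref("posted-content")` gives `"article"`, while a type that is not in the table, such as `"dataset"`, is returned as-is.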

We made the decision to merge all of the article types into one "article" type because we feel it more accurately represents the current and future state of things, where distinctions between journal articles, conference papers, and preprints are not always clearly defined, especially between fields. For many analyses, these are the types of works that are of interest, and so many of the others can be ignored (paratext, datasets, etc.).

The changes above are currently being applied, and the data should be fully aligned with the new types within a few days. These changes will also be in the next snapshot.

Thank you for your continued support of OpenAlex, and be on the lookout for a few more work-type and source-type changes coming soon.

Bianca Kramer

Jul 20, 2023, 5:39:32 PM
to Jason Portenoy, OpenAlex users
Thanks Jason for the detailed explanation (and documentation). 

I am happy that the legacy types from Crossref will remain available, as for many current (if perhaps not future) use cases, the distinction between preprints, journal articles and conference proceedings is relevant, issues with their definition across disciplines notwithstanding. In addition, since at least posted content has a separate metadata scheme in Crossref, it's also useful from that perspective to be able to keep them separate (e.g. when comparing metadata coverage). 

In general, while it's always possible to group categories together depending on a specific use case, once categories are merged in the source data like this, it's much harder to separate them out again. I know it will still be possible to map back to Crossref to obtain their type classification (if they would be deprecated from OpenAlex in future), and for some, but not all of the non-DOIs it will be possible to infer the type by source. And of course, I realize you have a wide user base to consider, for many of whom this might well be a useful change. 

And as always, the heads-up, explanation and documentation are much appreciated! 

kind regards, Bianca  




Jason Portenoy

Jul 20, 2023, 5:51:18 PM
to OpenAlex users
These are excellent points, Bianca; I'm glad you raised them. We've added some information that will hopefully clarify our rationale here: https://docs.openalex.org/api-entities/works/work-object#type

We definitely agree that in many cases, having finer-grained categories that the end user can merge themselves is preferable. However, when it came to the work type classifications we were using, we felt that the distinctions were not meaningfully serving the data. With the different article types in particular, we consider those distinctions to be about where the work is published or hosted, not attributes of the works themselves. So, for instance, we consider preprints to be works of type "article" which have "submittedVersion" as their `primary_location.version` (as opposed to "publishedVersion" or "acceptedVersion"). Journal articles and conference papers are differentiated by their `primary_location.source.type`. (Please see the docs link above for more information.)
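
To illustrate the rule described above, here is a rough Python sketch that recovers the finer-grained article kinds from a Work record already loaded as a dict. The function name `article_kind` and its return labels are hypothetical, and the source type values `"journal"` and `"conference"` are assumptions about the values found in `primary_location.source.type`; check the docs for the current list.

```python
from typing import Any, Optional


def article_kind(work: dict[str, Any]) -> Optional[str]:
    """Classify an OpenAlex Work dict as preprint / journal article / conference paper.

    Returns None for works whose type is not "article".
    """
    if work.get("type") != "article":
        return None

    # primary_location and its source may be missing or null in a record.
    location = work.get("primary_location") or {}
    if location.get("version") == "submittedVersion":
        return "preprint"

    source = location.get("source") or {}
    source_type = source.get("type")
    if source_type == "journal":  # assumed source type value
        return "journal article"
    if source_type == "conference":  # assumed source type value
        return "conference paper"
    return "article (unspecified)"
```

Note that the version check runs first, so in this sketch a submitted version hosted by a journal source would still be labeled a preprint; whether that ordering suits a given analysis is a judgment call.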

It's always great to hear different perspectives. The issue about Crossref metadata differences between work types was not one that I was personally aware of before. Crossref is just one source of our metadata, though (albeit a major one), so our coverage is not entirely dependent on it. And we keep the `type_crossref` information for cases that rely on it (and for easy backward compatibility as we make the transition to the new types).

Thanks again, we really appreciate the feedback!

-Jason Portenoy