Introducing OpenAlex Topics

Jason Portenoy

Feb 12, 2024, 2:21:27 PM
to OpenAlex users

Dear OpenAlex community,


This is the official announcement of our new Topics feature. Topics are a new and improved way of classifying OpenAlex works by what they are about.


  • Topics are live in the API, in each Work’s topics and primary_topic attributes. They will appear in the web interface and the data snapshot very soon.

  • Learn about Topics and how they are assigned here: https://help.openalex.org/how-it-works/topics. This is also where you can find links to methodology and code.

  • Technical documentation on topics in the API and data model is on the Topics and Work.topics pages.
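As a rough illustration of the first bullet, here is a minimal Python sketch that reads the topics and primary_topic attributes from a work record. Only those two attribute names come from this announcement; the sample record, the nested id and display_name keys, and the helper function names are assumptions made up for the example, so please check the technical documentation for the actual shape.

```python
from typing import Any, Optional

# Hypothetical, trimmed-down work record. Only the attribute names
# "topics" and "primary_topic" come from the announcement; the nested
# keys ("id", "display_name") and all values are illustrative assumptions.
sample_work: dict[str, Any] = {
    "id": "https://openalex.org/W0000000000",
    "primary_topic": {
        "id": "https://openalex.org/T00001",
        "display_name": "Example Topic A",
    },
    "topics": [
        {"id": "https://openalex.org/T00001", "display_name": "Example Topic A"},
        {"id": "https://openalex.org/T00002", "display_name": "Example Topic B"},
    ],
}


def primary_topic_name(work: dict[str, Any]) -> Optional[str]:
    """Return the primary topic's name, or None if the work has no topic."""
    topic = work.get("primary_topic")
    return topic.get("display_name") if topic else None


def topic_names(work: dict[str, Any]) -> list[str]:
    """Return the names of all topics on a work (empty list if none)."""
    return [t["display_name"] for t in (work.get("topics") or [])]
```

Both helpers tolerate a missing or empty value, since not every work is guaranteed to have a topic assigned.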


There are about 4,500 topics, and most works are assigned at least one of them based on their title, abstract, citations, and journal, or some combination of these, depending on what is available to us. Each topic also sits at the bottom of a four-level hierarchy of broader categories: the broadest level is the domain, followed by the field, then the subfield, and finally the topic itself. The domain, field, and subfield correspond to the categories used by Scopus’s ASJC system, except that they are applied at the level of Works rather than at the journal level. Please see the documentation page for more explanation and a helpful diagram.
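To make the hierarchy concrete, here is a small Python sketch that walks a topic from its domain down to the topic itself. It assumes each topic object carries nested domain, field, and subfield entries, each with a display_name; that shape, the sample values, and the function name are assumptions for illustration rather than a statement of the actual data model.

```python
from typing import Any

# Hypothetical topic object. The level names (domain, field, subfield)
# come from the announcement; the nesting and "display_name" keys are
# assumptions, and the values are placeholders rather than real categories.
sample_topic: dict[str, Any] = {
    "display_name": "Example Topic",
    "subfield": {"display_name": "Example Subfield"},
    "field": {"display_name": "Example Field"},
    "domain": {"display_name": "Example Domain"},
}


def hierarchy_path(topic: dict[str, Any]) -> str:
    """Join the four levels from broadest (domain) to narrowest (topic)."""
    levels = ("domain", "field", "subfield")
    names = [topic[level]["display_name"] for level in levels]
    names.append(topic["display_name"])  # the topic is the fourth level
    return " > ".join(names)


print(hierarchy_path(sample_topic))
```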


We developed the methods behind these topics in collaboration with CWTS, who used them in part to help with field normalization in their recently released Open Leiden Ranking. Learn more about this in their recent blog post: "An open approach for classifying research publications". The method involves clustering the citation network to find research communities, labeling the clusters with modern AI language models, and using deep-learning networks to expand the labeling to most works, even those not well represented in the citation network.


There are new API endpoints to get information about topics: the /topics endpoint is documented here, and the /subfields, /fields, and /domains endpoints will be documented soon.
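For anyone who wants to try the new endpoints from a script, here is a minimal Python sketch that builds their URLs. The four endpoint paths come from this message; the base URL https://api.openalex.org, the helper names, and the pass-through handling of query parameters are assumptions for the example, and the response format is not modeled here.

```python
import json
from typing import Any, Mapping, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.openalex.org"  # assumption: not stated in this post
ENDPOINTS = ("topics", "subfields", "fields", "domains")


def endpoint_url(entity: str, params: Optional[Mapping[str, Any]] = None) -> str:
    """Build the URL for one of the four endpoints named in the post.

    Query parameters are passed through unvalidated; which ones the API
    accepts is for the official documentation to say.
    """
    if entity not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {entity!r}")
    url = f"{BASE_URL}/{entity}"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url: str) -> Any:
    """Fetch a URL and decode the body as JSON (performs a network call)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling fetch_json(endpoint_url("topics")) would request the topics listing; nothing is fetched until you call it.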


We’re very excited about this new feature, as we think it’s a big improvement over previous ways of categorizing hundreds of millions of research publications. We know that it won’t be perfect, however, so we encourage you to try the topics out and tell us what you think of them, including anything you think should be corrected. This is a community effort! Send us your thoughts using the form at https://openalex.org/feedback


A note about Concepts

We are keeping the legacy Concepts, which were based on the MAG fields of study, and we will continue to assign them to incoming works. However, we will not be actively supporting or maintaining that system, and we have no plans to update the existing concepts or add new ones. We are planning to rename them from concepts to mag_concepts in the API and data model. We will announce when we plan to do this, but for now there are no breaking changes that will affect your code or application if you are using Concepts.


Thanks everyone, have a great week!

OpenAlex team
