Topics missing from many works in latest snapshot

Jason Augustyn

Feb 26, 2026, 11:23:45 AM
to OpenAlex Community
I am working with the latest snapshot and finding that around 27% of the documents are missing topics. A lot of these are datasets, letters, and other document types that might be hard to classify, but there are also a lot of articles and similar document types missing topics. For instance, there are over 32M articles without topics. This complicates using topics for bibliographic analysis.

Anyone have insights on why this might be the case?
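
For anyone who wants to reproduce a breakdown like this, here is a minimal sketch of counting works without topics per document type. It assumes the works records are JSON Lines with a `type` string and a `topics` list, and it treats a missing, null, or empty `topics` value as "no topics". The function name and the sample records are made up for illustration.

```python
import json
from collections import Counter


def missing_topic_share_by_type(lines):
    """Return {type: (missing, total, share)} for JSON Lines work records.

    A work counts as missing topics when its `topics` field is absent,
    null, or an empty list (an assumption about the record layout).
    """
    total = Counter()
    missing = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        work = json.loads(line)
        work_type = work.get("type") or "unknown"
        total[work_type] += 1
        if not work.get("topics"):
            missing[work_type] += 1
    return {t: (missing[t], n, missing[t] / n) for t, n in total.items()}


# Hypothetical records, not real snapshot data.
sample = [
    '{"id": "W1", "type": "article", "topics": [{"id": "T1"}]}',
    '{"id": "W2", "type": "article", "topics": []}',
    '{"id": "W3", "type": "dataset"}',
    '{"id": "W4", "type": "article", "topics": null}',
]

result = missing_topic_share_by_type(sample)
for work_type, (n_missing, n_total, share) in sorted(result.items()):
    print(f"{work_type}: {n_missing}/{n_total} missing ({share:.0%})")
```

On a real snapshot part file, the same function could be fed an open text handle, for example from `gzip.open(path, "rt")`, since it only iterates over lines.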

Christos Petrou

Feb 26, 2026, 5:21:20 PM
to Jason Augustyn, OpenAlex Community
I raised this with the OpenAlex team. The gap starts in October, peaks in November (I think >80% missing), and is rectified by January. It affects all types of papers, in both the snapshot and the API. Tough to do 2025 analysis with such a gap. Hoping for a fix.
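
A rough sketch of how a month-by-month pattern like this could be checked locally. It assumes each work record is a JSON line with a `publication_date` string in `YYYY-MM-DD` form and a `topics` list. Binning by publication date is an assumption; the same loop would work with another date field. The function name and the toy records are hypothetical.

```python
import json
from collections import defaultdict


def missing_topic_share_by_month(lines):
    """Return {"YYYY-MM": share of works without topics}, sorted by month."""
    counts = defaultdict(lambda: [0, 0])  # month -> [missing, total]
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        work = json.loads(line)
        date = work.get("publication_date") or ""
        month = date[:7] if len(date) >= 7 else "unknown"
        counts[month][1] += 1
        if not work.get("topics"):  # absent, null, or empty list
            counts[month][0] += 1
    return {m: miss / tot for m, (miss, tot) in sorted(counts.items())}


# Made-up toy records, only to show the shape of the output.
toy = [
    '{"id": "W1", "publication_date": "2025-10-03", "topics": [{"id": "T1"}]}',
    '{"id": "W2", "publication_date": "2025-10-19", "topics": []}',
    '{"id": "W3", "publication_date": "2025-11-07", "topics": []}',
    '{"id": "W4", "publication_date": "2025-11-21"}',
    '{"id": "W5", "publication_date": "2026-01-15", "topics": [{"id": "T2"}]}',
]

by_month = missing_topic_share_by_month(toy)
for month, share in by_month.items():
    print(f"{month}: {share:.0%} missing topics")
```

Plotting or printing these shares for the affected months would show where the gap starts and ends in a given snapshot.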
