--
You received this message because you are subscribed to the Google Groups "OpenAlex users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openalex-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openalex-users/245dd762-67b9-4d43-9d8f-bf30116e0680n%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Jason,
Thanks for your thoughts!
With MAG, "aggregating" concepts to the top-most level was problematic in my view: lower-level concepts often had multiple ancestors when tracing the full hierarchy, which resulted in very vague tagging. One example I remember was “alien”, which can be understood in terms of biology, immigration law, astronomy, or film, and could thus be traced to at least biology, sociology, and potentially another top-level field. I gather you have done some work on the concepts, and in this particular case the situation is much better now (https://api.openalex.org/concepts?filter=display_name.search:alien returns three well-defined concepts).
If works could always be tagged with top-level concepts unambiguously and with some certainty, this would definitely be useful. For the community, it would mean one solution that can be tested and applied consistently, instead of everyone reinventing the wheel. It seems to me that retrieving the top-level concepts of a work and, vice versa, retrieving all works that relate to a concept are both fairly common use cases.
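To make the multiple-ancestor problem concrete, here is a minimal Python sketch over a toy hierarchy. The parent links are invented for illustration and are not taken from MAG or OpenAlex; `PARENTS` and `top_level_roots` are hypothetical names.

```python
# Toy concept hierarchy: child -> list of direct parents.
# These links are made up for illustration, not real MAG/OpenAlex data.
PARENTS = {
    "Alien": ["Extraterrestrial life", "Immigration"],
    "Extraterrestrial life": ["Astrobiology"],
    "Astrobiology": ["Biology"],
    "Immigration": ["Sociology"],
    "Biology": [],
    "Sociology": [],
}


def top_level_roots(concept, parents=PARENTS):
    """Return every root concept reachable by following parent links upward."""
    roots = set()
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        direct_parents = parents.get(node, [])
        if direct_parents:
            stack.extend(direct_parents)
        else:
            # A node without parents is a top-level concept.
            roots.add(node)
    return roots


print(sorted(top_level_roots("Alien")))  # ['Biology', 'Sociology']
```

A single ambiguous low-level concept ends up under two unrelated top-level fields, which is the vagueness described above.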
Best,
Thomas
P.S.: Not sure why, but your message showed up only in my inbox and not in the related thread (https://groups.google.com/g/openalex-users/c/wyFD6svC0Qo).
From: Jason Priem <ja...@ourresearch.org>
Sent: Thursday, 11 August 2022 00:15
To: Casey Meyer <ca...@ourresearch.org>
Cc: Thomas Klebel <klebel...@gmail.com>; OpenAlex users <openale...@googlegroups.com>
Subject: Re: multiple doi assignments, incomplete concepts tagging in Works
Hi, I just wanted to expand a bit on what Casey already said about the concepts, to share the philosophy behind it.
Our goal from early on was to create something that was mostly compatible with MAG, and so that's the initial reason for the behavior you observe.
As Thomas notes above, the rather counterintuitive tagging behavior you've observed was actually quite rampant in MAG. Their approach created concept (aka "field-of-study," aka "topic") links on a concept-by-concept basis that completely ignored hierarchy. So that way, when you see a high-level concept like "Biology" applied to an article, it means that the tagger made a direct match between that article and that concept. This direct match is not "polluted" by any inference based on the tag hierarchy. This is why there were tons of MAG articles that matched on "Computational Biology" (for example) but not on its ancestor "Biology."
I'm not exactly sure why MAG opted for this approach, but it does have one very nice advantage: you can easily see the strength (as measured by the assignment algorithm) of each concept-to-work mapping. If you also want to include tags that can be logically assigned based on the hierarchy, you can do it yourself by looking up ancestors in the published tag tree. So MAG's approach is a bit more explicit, and with a bit of work on the user's part it supports both use cases: "show me the directly assigned concepts" and "show me both the directly assigned concepts and the logically assigned ancestor concepts."
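As a rough illustration of that do-it-yourself step, here is a minimal Python sketch. It assumes the ancestor lists have already been extracted from the published concept tree into a plain dictionary; `ANCESTORS`, `expand_with_ancestors`, and the score value are hypothetical and only serve the example.

```python
# Hypothetical lookup: concept -> all of its ancestors in the concept tree.
# In practice this would be built from the published hierarchy.
ANCESTORS = {
    "Computational biology": ["Biology"],
    "Biology": [],
}


def expand_with_ancestors(direct_scores, ancestors=ANCESTORS):
    """Combine directly assigned concepts with ancestors implied by the tree.

    direct_scores maps concept name -> score from the tagger.
    Inferred ancestors get score None, since the tagger never scored them.
    """
    tags = {
        concept: {"score": score, "direct": True}
        for concept, score in direct_scores.items()
    }
    for concept in direct_scores:
        for ancestor in ancestors.get(concept, []):
            # Keep the tagger's own entry if the ancestor was tagged directly.
            tags.setdefault(ancestor, {"score": None, "direct": False})
    return tags


# A work tagged only with the lower-level concept (score is made up).
work_concepts = {"Computational biology": 0.62}
for name, info in expand_with_ancestors(work_concepts).items():
    print(name, info)
```

Keeping a `direct` flag preserves both views: filtering on it gives the directly assigned concepts, while the full dictionary also includes the ancestors.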
But all that said, I agree MAG's approach (and now ours) is pretty confusing, and I think we may change it in future data dumps, depending on user feedback. It's not hard for us to assign tags in both ways (directly, and logically-from-the-tree), and that saves downstream users from having to do it. So if folks have a preference, please let us know, and we'll consider that carefully.
Best,
Jason
--
Jason Priem, CEO
OurResearch: We make software to help open science.
follow at @jasonpriem and @OurResearch_org