Feature request: Expand `cited_by_api_url` to the actual list of work ids when requested?

Rainer M Krug

Jul 3, 2024, 11:03:11 AM

to OpenAlex Community, Jason Portenoy, Steve Gruber

Hi

I am asking for thoughts from the user side as well as posting a feature request.

I do a lot of snowball searching with OpenAlex (forward and backward searches: finding the papers cited by, and the papers citing, a set of key works). This works fine, but there is one small annoyance:

For each key work, I have to send an additional query to get the works citing the key work. In contrast, when looking up the referenced works, they are there in the field `referenced_works`.
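To make the asymmetry concrete, here is a minimal sketch of the two lookups. It assumes a work record shaped like an OpenAlex API response with the fields `id`, `referenced_works` and `cited_by_api_url`, and the `cites:` filter on the `/works` endpoint; the helper names (`short_id`, `backward_ids`, `forward_query_url`) are made up for illustration.

```python
API = "https://api.openalex.org/works"

def short_id(work_id: str) -> str:
    """Reduce 'https://openalex.org/W42' to 'W42'."""
    return work_id.rstrip("/").rsplit("/", 1)[-1]

def backward_ids(work: dict) -> list[str]:
    """Backward step: the cited works are already in the record."""
    return [short_id(w) for w in work.get("referenced_works", [])]

def forward_query_url(work: dict) -> str:
    """Forward step: only a URL for a second, filtered query is available."""
    # Prefer the URL supplied in the record; otherwise build the filter by hand.
    url = work.get("cited_by_api_url")
    if url:
        return url
    return f"{API}?filter=cites:{short_id(work['id'])}"
```

The backward step needs no further request, while the forward step always costs at least one more call per key work.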

I do see the difference (`referenced_works` is static, while the set of citing works grows over time), so I am not asking for the citing work ids to be returned every time. But it would be useful to have an optional field, `cited_by_works`, which, when requested, returns a list of the ids of all works citing the key work (the equivalent of `referenced_works`).

I assume this would reduce the number of incoming API calls and make snowball searches much easier (and faster) for the user.

This could also be added to the web interface. I am not aware of any other academic database that has this feature!

Thanks,

Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

Orcid ID: 0000-0002-7490-0066

Department of Evolutionary Biology and Environmental Studies
University of Zürich
Office Y19-M-72
Winterthurerstrasse 190
8075 Zürich
Switzerland

Office: +41 (0)44 635 47 64
Cell: +41 (0)78 630 66 57
email: Raine...@uzh.ch
Rai...@krugs.de

PGP: 0x0F52F982

Jason Priem

Jul 5, 2024, 5:39:07 AM

to Rainer M Krug, OpenAlex Community, Jason Portenoy, Steve Gruber

Hi Rainer,

Thanks for getting in touch and for your suggestion! Unfortunately I don't think we'll be able to do that one, for both practical and conceptual reasons.

Practical first: for a given article w42, the set of works citing w42 can include many tens of thousands of items. The set of items cited by w42 is only a few hundred at most. So we can include those few hundred outgoing citations no problem within the API response for w42, but tens of thousands of incoming citations won't fit in a single API response. And we wouldn't create any kind of paging for a single entity, because that would break the concept of it being a single entity with a single URI.
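For reference, the filtered `/works` query is a list endpoint, so a long list of citing works can be walked page by page. The sketch below assumes OpenAlex's cursor paging (`cursor=*` on the first request, `meta.next_cursor` in each response, `per-page` up to 200) and the `select=id` parameter. The `fetch` argument is a hypothetical helper that performs the HTTP GET and returns the decoded JSON.

```python
from typing import Callable, Iterator

def citing_ids(work_id: str,
               fetch: Callable[[str, dict], dict],
               per_page: int = 200) -> Iterator[str]:
    """Yield the ids of all works citing `work_id`, one page at a time."""
    cursor = "*"  # assumed start value for cursor paging
    while cursor:
        params = {
            "filter": f"cites:{work_id}",
            "select": "id",        # ask for ids only to keep responses small
            "per-page": per_page,
            "cursor": cursor,
        }
        page = fetch("https://api.openalex.org/works", params)
        for item in page.get("results", []):
            yield item["id"]
        # A missing or null next_cursor ends the loop.
        cursor = page.get("meta", {}).get("next_cursor")
```

With the `requests` library, `fetch` could be as simple as `lambda url, params: requests.get(url, params=params).json()`.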

Conceptual: outgoing references are properties of the work. Incoming citations are, on the other hand, properties of other works. So outgoing references are included in the work response, but incoming citations require a filtered query to the /works endpoint. When we ask for incoming citations to w42, we're not really asking for properties of w42, we're asking about properties of other works--specifically, do their references sections point back to w42.

I'd say that both of these are reflections of the way the citation graph works. The outgoing citation, Merton's "pellet of peer recognition," is an act of the author, recorded in the paper. The incoming citation has no real existence at all; it's merely the tally of the bookkeeper (in this case, us).
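A small illustration of that bookkeeping, under the assumption that each work's reference list is already in hand as a plain dict (the function name and the ids are made up): the incoming side is simply the outgoing side inverted.

```python
from collections import defaultdict

def incoming_citations(references: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert {citing work: [cited works]} into {cited work: [citing works]}."""
    cited_by = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            # The "incoming citation" is recorded on the citing work only.
            cited_by[cited].append(citing)
    return dict(cited_by)
```

Nothing stored on w42 itself changes when a new paper cites it; only the tally derived from other works' reference lists does.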

But I'd be interested in hearing more about your UI suggestion! Are you thinking we might include a list of incoming citations on the work entity page? I'm open to that idea, although I'm a little unclear on the value, especially when there are many citing works. Here's an example from Semantic Scholar; is this what you had in mind? What problem would that solve?

Thanks again for taking the time to get in touch and share your suggestion! Sorry I can't be more help on this one.

Best,
Jason

--

Jason Priem, CEO

OurResearch: We make software to help open science.

follow at @jasonpriem and @OurResearch_org

Rainer M Krug

Jul 5, 2024, 5:59:44 AM

to Jason Priem, OpenAlex Community, Jason Portenoy, Steve Gruber

Dear Jason

Thanks for your response.

I understand your points and I see the problem with the practical one, although (from my naive position as an API user, not a developer) I could imagine a solution: an API call which returns only the ids as a download, e.g. as a zip file.

I also agree with the conceptual one and I see your argument.

Concerning GUI: Yes - that is what I am thinking about.

It would make a snowball search from the GUI possible, especially if one could do the same for multiple input papers at once. Also, it could feed into generated citation network graphs, which could be something really useful in the GUI: essentially a simplified VOSviewer-type graph which one could create online (and download the network data for further local analysis with e.g. VOSviewer or scripts in R, Python, etc.).

I could imagine a workflow in the GUI where one

1. has a search which returns a few results (this could also be based on provided DOIs or a normal search) [key-papers]

2. In a tab, one could see the papers cited in the key-papers

3. In another tab one could see the papers citing the key-papers (probably specifying the number of steps?)

4. Create a citation graph of the snowball search (tab 1 + tab 2)

5. Download the combined results from all tabs (key-papers + cited + citing) for a more detailed local analysis in different formats (VOSviewer, network data as two CSV files (nodes and edges), etc.)
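The export in step 5 could be sketched roughly as follows. This is only an illustration under assumptions: the cited and citing ids have already been collected into plain dicts, and the function names, the "role" labels and the column headers are invented here, not an existing OpenAlex or VOSviewer format.

```python
import csv
import io

def snowball_tables(key_ids, cited, citing):
    """Build node and edge rows for a one-step snowball search.

    cited:  key id -> ids of works it references
    citing: key id -> ids of works that cite it
    """
    nodes = {k: "key" for k in key_ids}
    edges = set()
    for k in key_ids:
        for target in cited.get(k, []):
            nodes.setdefault(target, "cited")   # first role seen is kept
            edges.add((k, target))              # edge direction: citing -> cited
        for source in citing.get(k, []):
            nodes.setdefault(source, "citing")
            edges.add((source, k))
    return sorted(nodes.items()), sorted(edges)

def to_csv(header, rows):
    """Render rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `to_csv(["id", "role"], nodes)` and `to_csv(["source", "target"], edges)` on the two returned lists gives the two files, which a script or a network tool could then load.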

This would combine the quick on-line screening of the data with the ease of continuing a more detailed analysis locally.

Cheers,

Rainer

Jason Priem

Jul 5, 2024, 6:13:32 AM

to Rainer M Krug, OpenAlex Community, Jason Portenoy, Steve Gruber

Thanks Rainer, I can see what that would look like now... thanks for the example. Agreed, it sounds cool! I can't make any promises about shipping it soon, but we'll definitely keep it in mind, especially if we get more requests!