API dash/hyphen bug


I Allison

Nov 17, 2023, 2:48:41 PM
to arXiv API
I'm trying to use the API to search by report number, but I've hit a bug where search terms containing dashes/hyphens are interpreted in a strange way. As an example, if you use the advanced search page to search for "Report Number: CERN-TH" (leaving everything else at the default settings), you currently get 6910 results, but if you try the same thing with the API...
You get 2 results. There's a GitHub issue describing the same problem, where they suggest the hyphen might be interpreted as a logical NOT, but if I try


I get 9163 results, so I think something else might be going on. Either way, is there any way to escape the hyphen to avoid this problem? Also, is there any way to include the report number in the results returned by the API?
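For anyone who wants to reproduce counts like these, here is a minimal sketch of how a report-number query could be built and how the total hit count could be read from the response. It assumes the public endpoint `http://export.arxiv.org/api/query`, the `rn:` field prefix, and the `opensearch:totalResults` element of the Atom feed; the helper names `build_query_url` and `total_results` are made up for illustration, and this is not necessarily the exact query used above.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumption: the public arXiv API endpoint.
API_URL = "http://export.arxiv.org/api/query"
# Assumption: the feed reports the hit count in the OpenSearch 1.1 namespace.
OPENSEARCH_NS = "{http://a9.com/-/spec/opensearch/1.1/}"


def build_query_url(search_query, max_results=1):
    """Build a query URL (hypothetical helper); ':' is left unescaped for readability."""
    params = {"search_query": search_query, "max_results": max_results}
    return API_URL + "?" + urllib.parse.urlencode(params, safe=":")


def total_results(atom_xml):
    """Return the total hit count from an Atom response, or None if it is absent."""
    root = ET.fromstring(atom_xml)
    node = root.find(OPENSEARCH_NS + "totalResults")
    return int(node.text) if node is not None else None
```

To run it against the live API, fetch `build_query_url("rn:CERN-TH")` with `urllib.request.urlopen(...).read()` and pass the bytes to `total_results`. Comparing that count with the advanced search page is a quick way to check whether a given escaping attempt changes anything.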

Jake Weiskoff

Nov 17, 2023, 3:09:24 PM
to arxi...@googlegroups.com
Looks like you'll want to use an HTML escape for the minus:


provides a more reasonable result set.

-Jake 


I Allison

Nov 17, 2023, 4:12:41 PM
to arXiv API
Hi Jake, I think that just searches for CERN in the report number. I get the same results for both of the following searches:
In the API docs it mentions that `&` is used to separate query params, so I think it probably interprets `search_query=rn:CERN` as the first parameter and `minus;TH` as the second.
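The splitting described here can be checked locally with Python's standard query-string parser. This is only a sketch of how a generic parser treats the string; the arXiv server may parse it differently, and the variable names are illustrative.

```python
from urllib.parse import parse_qsl, urlencode

# A raw '&' in the query string starts a new parameter, so the entity is cut in two.
raw = "search_query=rn:CERN&minus;TH"
split_params = parse_qsl(raw, keep_blank_values=True, separator="&")
# split_params == [("search_query", "rn:CERN"), ("minus;TH", "")]

# Percent-encoding the value keeps it together as one parameter.
encoded = urlencode({"search_query": "rn:CERN&minus;TH"})
# encoded == "search_query=rn%3ACERN%26minus%3BTH"

roundtrip = parse_qsl(encoded, separator="&")
# roundtrip == [("search_query", "rn:CERN&minus;TH")]
```

Percent-encoding only guarantees that the whole string reaches the server as a single `search_query` value. Whether the search engine then treats `&minus;` as a hyphen is a separate question that this sketch does not answer.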

Jake Weiskoff

Nov 17, 2023, 4:20:13 PM
to arxi...@googlegroups.com
I think the query is breaking the term apart at the hyphen, since I'm getting a similar number of results across other variants.

This is likely a shortcoming of the engine this older search is built upon. In the coming months we'll likely be putting out a user survey to guide the API construction going forward. It's currently running on a platform that's nearing its EOL, but the initial shift will likely include a 1:1 port. 

-Jake
