I've been trying to build a simple client for tracking new papers on arXiv - like the email announcements but filtered and more nicely formatted.
Querying on submittedDate comes close, but it is not quite right: it gives the wrong day for papers that were put on hold, since they are announced later than they were submitted.
I've seen other posts on the topic, but I don't think I've seen a clear 'canonical' solution to this problem.
I think the arXiv IDs are actually sequential by announcement date rather than by submission date. So in theory I could query for a few thousand of the most recent papers each day, sort them by ID, and cut the list off wherever I infer the new papers start. (Is there even a way to fetch the most recent papers by ID?)
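To make the ID-based idea concrete, here is a rough sketch in Python. It rests on assumptions I have not confirmed: that new-style IDs (YYMM.NNNNN) really do increase in announcement order, and that the client remembers the highest ID it saw on the previous run (so it is not fully stateless, just nearly so). The function names and the last_seen_id argument are made up for illustration; only the query endpoint and the search_query, sortBy, sortOrder, start and max_results parameters come from the public API as I understand it. Old-style IDs such as hep-th/9901001 are not handled.

```python
import re
from urllib.parse import urlencode

API_URL = "http://export.arxiv.org/api/query"

# New-style identifiers only: YYMM.NNNN or YYMM.NNNNN, optional version suffix.
_ID_RE = re.compile(r"(\d{4})\.(\d{4,5})(?:v\d+)?$")


def build_query_url(category, max_results=2000):
    """Build a query URL for the most recently submitted papers in a category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{API_URL}?{urlencode(params)}"


def id_key(entry_id):
    """Turn an ID or abs URL into a sortable (yymm, number) tuple."""
    match = _ID_RE.search(entry_id)
    if match is None:
        raise ValueError(f"not a new-style arXiv ID: {entry_id!r}")
    return int(match.group(1)), int(match.group(2))


def bare_id(entry_id):
    """Strip the URL prefix and version suffix, e.g. '2101.00003'."""
    match = _ID_RE.search(entry_id)
    if match is None:
        raise ValueError(f"not a new-style arXiv ID: {entry_id!r}")
    return f"{match.group(1)}.{match.group(2)}"


def newer_than(entry_ids, last_seen_id):
    """Return de-duplicated bare IDs above last_seen_id, in ascending ID order.

    ASSUMPTION: a larger ID means a later announcement.
    """
    cutoff = id_key(last_seen_id)
    fresh = {bare_id(e) for e in entry_ids if id_key(e) > cutoff}
    return sorted(fresh, key=id_key)
```

The entry IDs would come from the id element of each entry in the Atom feed returned by the URL above; parsing the feed is left out here. The weak point is still the cutoff: a paper released from hold should get a fresh, high ID under this assumption, but only if it also shows up in the window of results fetched by submittedDate.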
I could also maintain a large DB of all the papers, so that I can keep track of which papers are actually new and which ones are not. This seems very heavyweight for an app that could be stateless if there were an announcedDate metadata field. Is there something cleaner than either of these solutions?
I guess another option would be to silently drop any paper that was put on hold, but that seems fairly suboptimal.
Best regards,
Tatsu.