I've followed a previous thread's suggestion to grab new papers via the RSS feed. However, as reported in a previous thread (https://groups.google.com/g/arxiv-api/c/NXxbrhRHYEk), some items are missing. I noticed a fairly specific pattern in which of today's papers are missing, so I figured I might as well report it.
In the RSS feed for cs.CL, we have the following set of papers
The email lists quite a few more papers. Lining the two up, every new cs.CL paper in the RSS feed matches the email up until this point:
------------------------------------------------------------------------------
\\
arXiv:2311.03998
Date: Tue, 7 Nov 2023 13:54:01 GMT (11149kb,D)
Title: Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals
Authors: Sukannya Purkayastha, Anne Lauscher, Iryna Gurevych
Categories: cs.CL
Comments: Accepted at EMNLP Main Conference 2023
\\
  In many domains of argumentation, people's arguments are driven by so-called
attitude roots, i.e., underlying beliefs and world views, and their
corresponding attitude themes. Given the strength of these latent drivers of
arguments, recent work in psychology suggests that instead of directly
countering surface-level reasoning (e.g., falsifying given premises), one
should follow an argumentation style inspired by the Jiu-Jitsu 'soft' combat
system (Hornsey and Fielding, 2017): first, identify an arguer's attitude roots
and themes, and then choose a prototypical rebuttal that is aligned with those
drivers instead of invalidating those. In this work, we are the first to
explore Jiu-Jitsu argumentation for peer review by proposing the novel task of
attitude and theme-guided rebuttal generation. To this end, we enrich an
existing dataset for discourse structure in peer reviews with attitude roots,
attitude themes, and canonical rebuttals. To facilitate this process, we recast
established annotation concepts from the domain of peer reviews (e.g., aspects
a review sentence is relating to) and train domain-specific models. We then
propose strong rebuttal generation strategies, which we benchmark on our novel
dataset for the task of end-to-end attitude and theme-guided rebuttal
generation and two subtasks.
\\ ( https://arxiv.org/abs/2311.03998 , 11149kb)
------------------------------------------------------------------------------
\\
arXiv:2311.04020
Date: Tue, 7 Nov 2023 14:18:03 GMT (12497kb,D)
Title: Analyzing Film Adaptation through Narrative Alignment
Authors: Tanzir Pial, Shahreen Salim, Charuta Pethe, Allen Kim, Steven Skiena
Categories: cs.CL
Comments: 20 pages, 5 figures, 10 tables
\\
  Novels are often adapted into feature films, but the differences between the
two media usually require dropping sections of the source text from the movie
script. Here we study this screen adaptation process by constructing narrative
alignments using the Smith-Waterman local alignment algorithm coupled with
SBERT embedding distance to quantify text similarity between scenes and book
units. We use these alignments to perform an automated analysis of 40
adaptations, revealing insights into the screenwriting process concerning (i)
faithfulness of adaptation, (ii) importance of dialog, (iii) preservation of
narrative order, and (iv) gender representation issues reflective of the
Bechdel test.
\\ ( https://arxiv.org/abs/2311.04020 , 12497kb)
------------------------------------------------------------------------------
Here, 3998 is in the RSS list, but 4020 (and any papers beyond it) are not. I noticed that the two timestamps (13:54:01 and 14:18:03 GMT) fall just before and just after 14:00, and 14:00 is the arXiv announcement cutoff in EST. I wonder if there's some type of timezone issue where the RSS feed is cutting off at 14:00 GMT and dropping papers that were submitted after 14:00 GMT. A similar pattern seems to hold for cs.LG, where the RSS feed cuts off at 3996 and does not include 4007, and those two papers are also before and after the 14:00 GMT timestamp.
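For anyone who wants to check the same thing on their own feed, here is a rough sketch of the comparison in Python. It only uses the two Date fields quoted above. The 14:00 cutoff and the fixed UTC-5 offset for EST are my assumptions (the offset is valid for 7 Nov 2023, after daylight saving ended), and the helper names are made up for illustration; none of this reflects how the RSS feed is actually generated.

```python
from datetime import datetime, time, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: the cutoff is 14:00, and EST is a fixed UTC-5 offset on this date.
EST = timezone(timedelta(hours=-5), "EST")
CUTOFF = time(14, 0)


def parse_date_field(value: str) -> datetime:
    """Parse the timestamp part of an email 'Date:' field into an aware datetime."""
    return parsedate_to_datetime(value)


def before_cutoff(ts: datetime, tz: timezone) -> bool:
    """True if ts, viewed in tz, falls before 14:00 local time on its own day."""
    return ts.astimezone(tz).time() < CUTOFF


# Timestamps copied from the email excerpt (size suffix dropped).
papers = {
    "2311.03998": "Tue, 7 Nov 2023 13:54:01 GMT",
    "2311.04020": "Tue, 7 Nov 2023 14:18:03 GMT",
}

for arxiv_id, raw in papers.items():
    ts = parse_date_field(raw)
    print(
        arxiv_id,
        "before 14:00 GMT:", before_cutoff(ts, timezone.utc),
        "| before 14:00 EST:", before_cutoff(ts, EST),
    )
```

Under these assumptions, both papers fall before 14:00 EST, but only 3998 falls before 14:00 GMT, which is consistent with a cutoff being applied in GMT rather than EST.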
This may be completely off, but it seemed worth noting since there was no known root cause in the previous thread, and this seemed like a pretty distinct pattern.
Tatsu.