RSS missing records - maybe a timezone issue?

Tatsunori Hashimoto

Nov 8, 2023, 9:16:42 AM
to arXiv API
I've followed a previous thread's suggestion to grab new papers via the RSS feed. However, as reported in a previous thread (https://groups.google.com/g/arxiv-api/c/NXxbrhRHYEk), some items are missing. I noticed something fairly specific about the pattern of missing items in today's papers, so I figured I might as well report it.

The RSS feed for cs.CL lists the following set of papers:

<rdf:Seq>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03365"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03420"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03498"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03510"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03533"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03551"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03566"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03584"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03614"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03627"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03633"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03658"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03663"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03672"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03687"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03696"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03716"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03731"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03732"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03734"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03748"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03753"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03754"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03755"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03767"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03780"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03788"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03792"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03798"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03810"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03812"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03837"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03839"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03881"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03896"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03928"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03952"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03963"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03969"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03998"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2004.14254"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2007.04874"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2111.11104"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2203.03897"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2210.12040"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2211.13709"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2212.09462"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2301.10884"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2303.00333"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2303.03283"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2303.10093"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2303.15714"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2305.03353"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2305.14303"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2305.14907"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2305.19466"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2306.00802"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2307.04657"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2307.07697"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2307.07870"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2308.14132"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2310.06827"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2310.12798"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2310.18581"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2310.20246"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.01305"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.01767"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.02849"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03287"/>
</rdf:Seq>

The email digest has quite a few more papers. Lining the two up, every new cs.CL paper in the RSS feed matches the email up until this point:

------------------------------------------------------------------------------
\\
arXiv:2311.03998
Date: Tue, 7 Nov 2023 13:54:01 GMT   (11149kb,D)

Title: Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals
Authors: Sukannya Purkayastha, Anne Lauscher, Iryna Gurevych
Categories: cs.CL
Comments: Accepted at EMNLP Main Conference 2023
\\
  In many domains of argumentation, people's arguments are driven by so-called
attitude roots, i.e., underlying beliefs and world views, and their
corresponding attitude themes. Given the strength of these latent drivers of
arguments, recent work in psychology suggests that instead of directly
countering surface-level reasoning (e.g., falsifying given premises), one
should follow an argumentation style inspired by the Jiu-Jitsu 'soft' combat
system (Hornsey and Fielding, 2017): first, identify an arguer's attitude roots
and themes, and then choose a prototypical rebuttal that is aligned with those
drivers instead of invalidating those. In this work, we are the first to
explore Jiu-Jitsu argumentation for peer review by proposing the novel task of
attitude and theme-guided rebuttal generation. To this end, we enrich an
existing dataset for discourse structure in peer reviews with attitude roots,
attitude themes, and canonical rebuttals. To facilitate this process, we recast
established annotation concepts from the domain of peer reviews (e.g., aspects
a review sentence is relating to) and train domain-specific models. We then
propose strong rebuttal generation strategies, which we benchmark on our novel
dataset for the task of end-to-end attitude and theme-guided rebuttal
generation and two subtasks.
\\ ( https://arxiv.org/abs/2311.03998 ,  11149kb)
------------------------------------------------------------------------------
\\
arXiv:2311.04020
Date: Tue, 7 Nov 2023 14:18:03 GMT   (12497kb,D)

Title: Analyzing Film Adaptation through Narrative Alignment
Authors: Tanzir Pial, Shahreen Salim, Charuta Pethe, Allen Kim, Steven Skiena
Categories: cs.CL
Comments: 20 pages, 5 figures, 10 tables
\\
  Novels are often adapted into feature films, but the differences between the
two media usually require dropping sections of the source text from the movie
script. Here we study this screen adaptation process by constructing narrative
alignments using the Smith-Waterman local alignment algorithm coupled with
SBERT embedding distance to quantify text similarity between scenes and book
units. We use these alignments to perform an automated analysis of 40
adaptations, revealing insights into the screenwriting process concerning (i)
faithfulness of adaptation, (ii) importance of dialog, (iii) preservation of
narrative order, and (iv) gender representation issues reflective of the
Bechdel test.
\\ ( https://arxiv.org/abs/2311.04020 ,  12497kb)
------------------------------------------------------------------------------

Here 3998 is in the RSS list, but 4020 (and every paper after it) is not. I noticed that the two timestamps fall just before and just after 14:00, which is the arXiv announcement cutoff in EST. I wonder if there's some kind of timezone issue where the RSS feed cuts off at 14:00 GMT and drops papers that were submitted after 14:00 GMT. A similar pattern seems to hold for cs.LG, where the RSS feed cuts off at 3996 and does not include 4007, and those two papers are likewise timestamped before and after 14:00 GMT.
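
For anyone who wants to check the same pattern on another category, here is a rough Python sketch of the comparison described above. It is only an illustration: the two email entries and the RSS fragment are copied by hand from this post, the 14:00 GMT cutoff is the hypothesis being tested rather than a documented arXiv behaviour, and the helper names (`rss_ids`, `check_cutoff`) are made up for this example. It also compares time of day only, so it assumes all entries belong to the same announcement day.

```python
import re
from datetime import time
from email.utils import parsedate_to_datetime

# (arXiv ID, "Date:" header) pairs copied by hand from the email digest.
EMAIL_ENTRIES = [
    ("2311.03998", "Tue, 7 Nov 2023 13:54:01 GMT"),
    ("2311.04020", "Tue, 7 Nov 2023 14:18:03 GMT"),
]

# A small fragment of the RSS <rdf:Seq>; in practice, pass the full feed text.
RSS_FRAGMENT = """
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03969"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2311.03998"/>
<rdf:li rdf:resource="http://arxiv.org/abs/2004.14254"/>
"""

# Hypothesised cutoff (an assumption, not a documented value).
CUTOFF_GMT = time(14, 0)


def rss_ids(rss_text):
    """Collect the arXiv IDs referenced by the feed's abs/ URLs."""
    return set(re.findall(r"arxiv\.org/abs/(\d{4}\.\d{4,5})", rss_text))


def check_cutoff(entries, rss_text, cutoff=CUTOFF_GMT):
    """Return (id, submitted_before_cutoff, present_in_rss) for each entry."""
    listed = rss_ids(rss_text)
    rows = []
    for arxiv_id, stamp in entries:
        submitted = parsedate_to_datetime(stamp)
        rows.append((arxiv_id, submitted.time() < cutoff, arxiv_id in listed))
    return rows


if __name__ == "__main__":
    for arxiv_id, before, in_rss in check_cutoff(EMAIL_ENTRIES, RSS_FRAGMENT):
        print(
            arxiv_id,
            "before 14:00 GMT" if before else "after 14:00 GMT",
            "in RSS" if in_rss else "missing from RSS",
        )
```

If the hypothesis holds, the second and third columns should agree for every row: papers stamped before 14:00 GMT appear in the feed, and papers stamped after it do not.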

This may be completely off, but it seemed worth noting since there was no known root cause in the previous thread, and this seemed like a pretty distinct pattern.

Tatsu.