Dear arXiv team,
It looks like the new OAI-PMH implementation for the "arXiv"
metadataPrefix has a surprising new behavior: LaTeX in titles is
partially converted to Unicode. For example, in
https://oaipmh.arxiv.org/oai?verb=GetRecord&metadataPrefix=arXiv&identifier=oai:arXiv.org:2511.15569
the title contains LaTeX maths where the "\nu" macros have been
replaced by "ν".
I suspect this is the result of an over-eager Unicode translation that
was implemented to deal with LaTeX-encoded author names (which is the
typical way to support accented characters on arXiv). It is detrimental
when applied to math in titles, however, as the title is no longer valid
LaTeX (for default engine configurations that don’t handle literal
Greek characters in math environments).
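In case it helps with triage, here is a minimal Python sketch of how
affected titles could be detected. It assumes inline math is delimited
by single dollar signs only; the function name greek_in_math and the
sample titles are hypothetical and not copied from the record above.

```python
import re
import unicodedata

# Matches inline math delimited by single dollar signs. This is a
# simplification: \( ... \), $$ ... $$ and escaped \$ are not handled.
MATH_SPAN = re.compile(r"\$([^$]+)\$")


def greek_in_math(title: str) -> list[str]:
    """Return the literal Greek characters found inside $...$ spans."""
    found = []
    for span in MATH_SPAN.findall(title):
        for ch in span:
            # Unicode names of Greek letters start with "GREEK".
            if unicodedata.name(ch, "").startswith("GREEK"):
                found.append(ch)
    return found


# Hypothetical titles illustrating the converted and unconverted forms.
converted = "Bounds on $ν_e$ oscillations"
original = r"Bounds on $\nu_e$ oscillations"
print(greek_in_math(converted))
print(greek_in_math(original))
```

A non-empty result flags a title whose math would need the macros
restored before it compiles with a default LaTeX setup.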
This new behavior directly impacts downstream services that offer
citation features for the LaTeX ecosystem (such as
https://inspirehep.net, which I’m responsible for): users need to fix
the resulting compilation errors manually by undoing your automated
translation. I therefore strongly believe this should be treated as a
bug.
Do you agree, and if so, could you please fix it?
Best,
Micha
---
Micha Moskovic
INSPIRE Product Manager
CERN Scientific Information Service