Regarding papers licensed under the “arXiv.org - Non-exclusive license to distribute,”

たなかけん

Feb 24, 2026, 7:16:05 PM

to arXiv API Discussion

I work for a Japanese company.

Regarding papers licensed under the “arXiv.org - Non-exclusive license to distribute,” is it possible to use an API to download paper data and then use generative AI to extract data from the papers or summarize their content for our company's internal operations?

Of course, we absolutely will not use the paper data for training generative AI.

Also, is it possible to share the above data or summaries internally?

I apologize for the inconvenience, but I would greatly appreciate your guidance on whether the above actions are permissible.
Thank you for your consideration.

Yosuke Hanawa
SCREEN Holdings Co., Ltd.
https://ap.sansan.com/v/vc/zjovq6podumudm2t2goc6vuxjy/

Jake Weiskoff

Feb 25, 2026, 9:29:37 AM

to a...@arxiv.org

Dear Yosuke Hanawa,

arXiv cannot advise you on your legal obligations or risks. If your intent is to harvest the full text of the entire corpus, we ask that you instead use one of the bulk tools available to you, such as Kaggle's dataset (available for free) or the Amazon S3 buckets, which are "requester pays". See:

https://info.arxiv.org/help/bulk_data.html

Sincerely,

-Jake Weiskoff (he/him)

Project Manager, arXiv.org


たなかけん

Feb 26, 2026, 5:33:09 PM

to arXiv API Discussion, ja...@arxiv.org

Dear Jake Weiskoff,

Thank you for your response.
I understand that you are unable to provide advice regarding this matter.
Thank you for sharing information about Kaggle datasets and other resources. It is very helpful, and I will consider using them.

Thank you very much.

Sincerely,

Yosuke Hanawa
SCREEN Holdings Co., Ltd.
