Sorry for the delayed response -- it's the conference and vacation
time of the year.
For bulk access to arXiv data, please see
http://arxiv.org/help/bulk_data
and, for bulk PDF download from Amazon's cloud service (S3),
http://arxiv.org/help/bulk_data_s3
Cheers
Thorsten
On Thu, Jul 1, 2010 at 5:20 AM, ogrisel <olivier...@gmail.com> wrote:
> On Oct 23 2009, 1:36 am, Thorsten S <thorsten.schwan...@gmail.com>
> wrote:
>> On Thu, Sep 17, 2009 at 8:28 PM, Dave <daba...@gmail.com> wrote:
>>
>> > Martin, did you ever get a response?
>>
>> > On Aug 10, 1:21 pm, Martin <mku...@gmail.com> wrote:
>> >> Hi:
>>
>> >> I am putting together a batch PDF recording for I, Librarian. I wanted
>> >> to ask you how many API requests per second or per minute are tolerable.
>> >> PubMed for instance explicitly allows up to 3 requests per second.
>> >> What is your limit?
>>
>> >> Martin
>>
>> The arXiv API does not support bulk download of PDFs.
>>
>> We ask third-party services to be reasonable with the frequency of
>> search and metadata requests via the arXiv API. While there are no hard
>> limits at this time, if our servers get bogged down by certain clients
>> via the API, we will have to take some protective measures.
>>
>> For a full copy of (or particular subsets of) the PDFs for arXiv papers, we
>> are in the process of setting up a service in the cloud, which will
>> offer the option for bulk download. I'll let you know when that
>> becomes available.
>
> Hi Thorsten,
>
> Have you made any progress on this front? I would like to gain access
> to a corpus of around 1000 to 10000 papers from various arXiv
> categories to test algorithms for semantic document analysis and
> clustering.
>
> Best,
>
> --
> Olivier
Thank you very much, this is perfect.
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
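
For anyone landing on this thread with the original question about request
frequency: since no hard limit is stated above, a client can simply enforce
its own pause between search and metadata requests. Below is a minimal
sketch of that idea in Python. The names Throttle and fetch_all are
hypothetical helpers made up for illustration, and the 3-second interval in
the usage note is an assumed value, not a limit given in this thread. For
bulk PDFs, the links above remain the way to go.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait().

    Hypothetical helper: the interval is chosen by the caller and is
    not an official arXiv limit.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested offline
        self._sleep = sleep
        self._last = None      # time of the previous request; None before the first

    def wait(self):
        """Block until min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, throttle):
    """Fetch each URL in turn, pausing via the throttle before every request."""
    results = []
    for url in urls:
        throttle.wait()
        with urllib.request.urlopen(url) as response:
            results.append(response.read())
    return results
```

Usage would look like fetch_all(list_of_query_urls, Throttle(3.0)), where
list_of_query_urls is whatever set of API query URLs the client has built.
Keeping the throttle separate from the fetching code makes it easy to slow
the client down further if the servers appear to be struggling.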