how to increase max_results

fragk...@gmail.com

Apr 30, 2009, 3:33:13 AM
to arXiv api

Hi

I am wondering whether there is any way to increase the max_results
value.
It seems that the maximum number of results I can retrieve is 1000.

Can one increase this value? If not, can I retrieve, say, the first
10000 documents of some category by retrieving chunks of 1000?

many thanks

Frag.

julius.lucks

May 1, 2009, 5:29:08 PM
to arxi...@googlegroups.com
Hi Frag,

Thanks for your interest and the email. Right now the limit of 1000
results is enforced by the backend database that the API runs off of.

You should be able to receive chunks of documents by using the 'start'
parameter and incrementing that in successive calls - see http://export.arxiv.org/api_help/docs/user-manual.html#_calling_the_api
for more information.
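
As a rough illustration of the chunking idea above, here is a minimal sketch that builds one query URL per chunk by incrementing 'start'. The endpoint and the search_query, start and max_results parameters are the ones that appear in this thread; the helper name page_urls and its arguments are hypothetical, and the sketch only builds URLs without sending any request.

```python
from urllib.parse import urlencode

# Endpoint as quoted elsewhere in this thread.
API_URL = "http://export.arxiv.org/api/query"


def page_urls(search_query, total, page_size):
    """Yield one query URL per chunk (hypothetical helper).

    'start' is treated as the zero-based offset of the first result
    in each chunk, and 'max_results' as the size of that chunk.
    """
    for start in range(0, total, page_size):
        params = {
            "search_query": search_query,
            "start": start,
            # The last chunk may be smaller than page_size.
            "max_results": min(page_size, total - start),
        }
        # Keep ':' readable in the query, e.g. cat:cond-mat.
        yield f"{API_URL}?{urlencode(params, safe=':')}"


urls = list(page_urls("cat:cond-mat", total=2000, page_size=1000))
for url in urls:
    print(url)
```

Whether the server actually honors a given 'start' offset depends on the backend limits discussed in this thread.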

Depending on what your use case is though, you may want to look at the
ORE interface which allows for more bulk-type downloads. See http://www.openarchives.org/ore/1.0/toc

Thanks,

Julius

---------------------------------------------------------------------------------------
http://www.openwetware.org/wiki/User:Julius_B._Lucks
----------------------------------------------------------------------------------------

Fragkiskos Papadopoulos

May 1, 2009, 6:49:04 PM
to arxi...@googlegroups.com
Many thanks Julius, I'll check out the ORE interface.

However, I tried the 'start' parameter but could not make it work.

Say you want to get 2000 results. I thought that by setting start=0, max_results=1000 for the first chunk
and start=1001, max_results=2000 for the second, I would be able to get all 2000.

But it turns out that I do not get anything for the second chunk.

Perhaps the start value has to be less than (or equal to) max_results. Since max_results cannot be raised,
the 'start' index cannot exceed 1000, unless I am missing something? :)

thanks,

frag.

julius.lucks

May 1, 2009, 7:02:38 PM
to arxi...@googlegroups.com
Hi Frag,

Can you paste in the exact URLs you are trying?

Thanks,

Julius

Fragkiskos Papadopoulos

May 1, 2009, 7:39:07 PM
to arxi...@googlegroups.com
sure,

try for example this:

http://export.arxiv.org/api/query?search_query=cat:cond-mat&start=100...

It won't return any results. Am I doing something wrong?

many thanks!

julius.lucks

May 1, 2009, 7:43:03 PM
to arxi...@googlegroups.com
Hi Frag,

Actually, you are not doing anything wrong; this has to do with the way the backend is plugged into the API. Could you use the sortBy and sortOrder parameters of the API call to refine your search?

Maybe if you describe your use case, we can think of a work-around.

Thanks,

Julius

Fragkiskos Papadopoulos

May 1, 2009, 7:51:30 PM
to arxi...@googlegroups.com
I see :)

I am trying to get all the results (or only the authors and titles) from some category, say cond-mat.

Ideally I would like to restrict my search to a single time period, e.g. 1 year.

So I was trying to get all 2008 results (around 11000 entries) from the cond-mat category.

I appreciate the help a lot!

Paul Ginsparg

May 1, 2009, 7:59:25 PM
to arxi...@googlegroups.com
well actually he's doing something wrong with
start=1000&max_results=1100
since he presumably meant max_results=100
(i.e. from 1000 - 1099)

but there is a bug anyway:
start=900&max_results=10

gives ten results, but

start=995&max_results=10

gives only 5, and starts above 1000 give nothing, so it clearly
doesn't know how to start beyond 1000
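
To make the arithmetic in this message concrete, here is a small sketch. The first function shows which zero-based result indices a start/max_results pair asks for. The second is only a model of the behavior reported here, assuming the backend returns nothing at or beyond index 1000; both function names are hypothetical and this is not the actual server logic.

```python
def index_range(start, max_results):
    """Return the (first, last) zero-based indices a request asks for."""
    return start, start + max_results - 1


def results_under_cap(start, max_results, cap=1000):
    """Model of the observed bug: results at or past index `cap` are dropped.

    The cap of 1000 is an assumption taken from the behavior described
    in this thread.
    """
    return max(0, min(start + max_results, cap) - start)


print(index_range(1000, 100))       # indices requested by start=1000&max_results=100
print(results_under_cap(900, 10))   # entirely below the cap
print(results_under_cap(995, 10))   # straddles the cap
print(results_under_cap(1000, 100)) # starts at the cap
```

Under this model the second chunk of 100 should be requested as start=1000&max_results=100, since max_results is a count and not an end index.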

Fragkiskos Papadopoulos

May 1, 2009, 8:09:32 PM
to arxi...@googlegroups.com
Thanks Paul,

Yes, initially I tried what you describe, but I thought that was wrong too, since I did not get any results either,
for the reason you mention :)

Thorsten

May 4, 2009, 1:47:37 PM
to arXiv api
see my post today on new search parameters and other improvements

the problem described in this thread was fixed yesterday, 5/3/09

actually the limit was intentionally hard-coded, but at the very least
there should have been a useful error message instead of a silently
enforced upper limit. max_results is now limited to 30,000 and a
larger value will result in an error.

thanks for the feedback and for alerting us to the problem
T.
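
Since a max_results above the new limit now produces an error, a client may want to check the value before building a request. A minimal sketch, assuming the 30,000 limit stated in this message; the constant and function names are hypothetical and this check runs purely on the client side.

```python
# Limit as stated in this thread; the server may change it later.
MAX_RESULTS_LIMIT = 30_000


def checked_max_results(max_results, limit=MAX_RESULTS_LIMIT):
    """Return max_results unchanged, or raise if it exceeds the limit."""
    if max_results > limit:
        raise ValueError(
            f"max_results={max_results} exceeds the limit of {limit}; "
            "request the results in smaller chunks using 'start'"
        )
    return max_results


page_size = checked_max_results(11_000)
```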

Fragkiskos Papadopoulos

May 4, 2009, 4:03:04 PM
to arxi...@googlegroups.com
thanks a lot for all the help and super-fast responses!

frag.