Comparing arXiv API vs. NCBI Entrez


julius.lucks

Nov 14, 2007, 8:54:04 AM
to arxi...@googlegroups.com
Hello list,

For those of you who have experience with both the arXiv API and NCBI's Entrez system (for searching PubMed, for example), I would be interested to know what you think of the two approaches. In particular:

1.) Between the two, which one is easier to get started with?
2.) Which one is more powerful (i.e. easier to do advanced tasks - and can you please give examples)?
3.) Would you like to see a more uniform API to access both data sources?

I am especially interested in any API features that you might like to see (in either system) that would help you write your applications, particularly if these applications bridge the two data sources. For that matter, I'd also like to know what other systems you are using these APIs with, and what your ideas are on how to make them more interoperable.

I really appreciate your comments!

Julius

---------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------



Alf Eaton

Nov 14, 2007, 9:05:32 AM
to arXiv api
On Nov 14, 1:54 pm, "julius.lucks" <julius.lu...@gmail.com> wrote:
> Hello list,
>
> For those of you who have experience with both the arXiv API and
> NCBI's Entrez system (for searching pubmed for example), I would be
> interested to know what you think of the two approaches. In particular:
>
> 1.) Between the two, which one is easier to get started with?
> 2.) Which one is more powerful (i.e. easier to do advanced tasks -
> and can you please give examples)?
> 3.) Would you like to see a more uniform API to access both data
> sources?
>
> In particular I am interested in any API features that you might like
> to see (in either system) that would help you write your
> applications, especially if these applications bridge the two data
> sources. For that matter, I'd also like to know about what other
> systems you are using these API's with, and what your ideas are on
> how to make them more inter-operable.

Both are pretty straightforward REST-based APIs, so I don't think you
need to worry about making them compatible. In fact I think you've
done the right thing using OpenSearch and Atom. You already have the
parameters for paging, so there's not much else missing (apart from
the chronological ordering, obviously, but then PubMed doesn't do
relevance-ranked search so you're ahead there already).
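For readers who have not used the paging parameters mentioned above, here is a minimal sketch of how a paged arXiv query URL might be built. The endpoint and the parameter names (search_query, start, max_results) are assumed from the publicly documented arXiv API and may not match the service exactly as it stood when this thread was written; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the public arXiv API documentation.
ARXIV_API = "http://export.arxiv.org/api/query"


def arxiv_query_url(search_query, start=0, max_results=10):
    """Build the URL for one page of arXiv API results (hypothetical helper)."""
    params = {
        "search_query": search_query,  # e.g. "all:electron"
        "start": start,                # zero-based offset of the first result
        "max_results": max_results,    # page size
    }
    return ARXIV_API + "?" + urlencode(params)


def page_urls(search_query, total, page_size=10):
    """Yield one URL per page until `total` results are covered."""
    for start in range(0, total, page_size):
        yield arxiv_query_url(search_query, start, min(page_size, total - start))


if __name__ == "__main__":
    for url in page_urls("all:electron", total=25, page_size=10):
        print(url)
```

Each URL returns an Atom feed, so a client only has to step `start` forward to walk through a result set.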

alf

julius.lucks

Nov 14, 2007, 9:21:22 AM
to arxi...@googlegroups.com
>
> Both are pretty straightforward REST-based APIs, so I don't think you
> need to worry about making them compatible. In fact I think you've
> done the right thing using OpenSearch and Atom. You already have the
> parameters for paging, so there's not much else missing (apart from
> the chronological ordering, obviously, but then PubMed doesn't do
> relevance-ranked search so you're ahead there already).
>

What about having to use the NCBI Entrez history server in your
queries (i.e. to get more than summary information you have to do an
esearch -> parse_cookie -> efetch rather than just doing one call)?
Would you find a system that did this for you (i.e. masked the
history server) better? Would you find it useful if the NCBI Entrez
system returned results using OpenSearch and Atom?

What do you mean by 'PubMed doesn't do relevance-ranked search'?

Thanks - these comments are very useful.

Julius


Alf Eaton

Nov 14, 2007, 9:51:39 AM
to arXiv api
On Nov 14, 2:21 pm, "julius.lucks" <julius.lu...@gmail.com> wrote:

> What about having to use the NCBI Entrez history server in your
> queries (i.e. to get more than summary information you have to do an
> esearch -> parse_cookie -> efetch rather than just doing one call)?

It depends on what you want to do. If you just want the IDs for each
result, the EUtils way of doing it is useful because the responses are
smaller, but generally you want the full details for each result, which
means you have to make two calls for each query.
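The two-call pattern can be sketched roughly as follows. This is an illustration under assumptions, not a reference client: the base URL and the parameter names (usehistory, WebEnv, query_key, retstart, retmax) are taken from the NCBI E-utilities documentation, the function names are hypothetical, and error handling and rate limiting are left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Assumed base URL, taken from the NCBI E-utilities documentation.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pubmed"):
    """First call: run the search and ask NCBI to keep the result set."""
    params = {"db": db, "term": term, "usehistory": "y"}
    return EUTILS + "/esearch.fcgi?" + urlencode(params)


def parse_history(esearch_xml):
    """Pull the history-server handle (WebEnv, QueryKey) out of the ESearch XML."""
    root = ET.fromstring(esearch_xml)
    return root.findtext("WebEnv"), root.findtext("QueryKey")


def efetch_url(web_env, query_key, db="pubmed", retstart=0, retmax=20):
    """Second call: fetch full records for the stored result set."""
    params = {
        "db": db,
        "WebEnv": web_env,
        "query_key": query_key,
        "retstart": retstart,
        "retmax": retmax,
        "retmode": "xml",
    }
    return EUTILS + "/efetch.fcgi?" + urlencode(params)


def search_and_fetch(term, retmax=20):
    """Hide the history server behind a single function (needs network access)."""
    with urlopen(esearch_url(term)) as resp:
        web_env, query_key = parse_history(resp.read())
    with urlopen(efetch_url(web_env, query_key, retmax=retmax)) as resp:
        return resp.read()
```

A wrapper like `search_and_fetch` is one way a client library could mask the history server so that callers see a single request.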

> Would you find a system that did this for you (i.e. masked the
> history server) better? Would you find it useful if the NCBI Entrez
> system returned results using OpenSearch and Atom?

Probably, but it's not hard to work with the current system either:
http://hublog.hubmed.org/archives/001518.html

HubMed actually provides OpenSearch/Atom results, though it doesn't
take all the parameters at the moment (no reason it couldn't though,
if necessary):
http://www.hubmed.org/opensearch.xml
http://hubmed.macropus.org/feeds/atom.cgi?q=example
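Because both the arXiv API and a feed like this return Atom, the same parsing code can in principle serve both sources. Below is a minimal sketch using only the Python standard library; the sample feed is invented for illustration and is not real output from either service, and `parse_entries` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Invented sample feed, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example search results</title>
  <entry>
    <id>urn:example:1</id>
    <title>First example article</title>
    <link href="http://example.org/articles/1"/>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <title>Second example article</title>
    <link href="http://example.org/articles/2"/>
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Return a list of {id, title, link} dicts, one per Atom entry."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS).strip(),
            "link": link.get("href") if link is not None else None,
        })
    return entries


if __name__ == "__main__":
    for item in parse_entries(SAMPLE_FEED):
        print(item["id"], item["title"], item["link"])
```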

> What do you mean by 'PubMed doesn't do relevance-ranked search'?

PubMed only returns search results in reverse-chronological order, not
ordered by relevance.

alf
