Python API parsing with multiple authors

Andrea Zonca

Jul 23, 2011, 10:25:23 AM
to arXiv api
hi,
I've improved the Python parsing example available here:
http://arxiv.org/help/api/examples/python_arXiv_parsing_example.txt
so that it supports multiple authors; previously it was only getting
the first author.
The code is available on github:
https://github.com/zonca/python-parse-arxiv
relevant commit:
https://github.com/zonca/python-parse-arxiv/commit/3873e2bc74f1a27df801cda3fbee9ff46db12e02
regards,
andrea zonca

Thorsten S

Jul 24, 2011, 4:30:49 PM
to arxi...@googlegroups.com
Hi,

thanks for the suggestion.

Upon code inspection I note 2 things:

1)
The current example code on arXiv does not in fact display the "First
Author"; it displays the last author instead. This is with
feedparser 5.0.1.

2)
The comments in the code point to a limitation in feedparser 4.1.
# feedparser v4.1 only grabs the first author
author_string = entry.author

This limitation is not present in the current version (5.0.1) of
feedparser (I did not verify that it is in fact present in version
4.1), so the direct way to display all authors is something along
the lines of

print 'Authors: %s' % ', '.join(author.name for author in entry.authors)

without the "workaround" you suggest in your patch:
# change author -> contributors (because contributors is a list)
response = response.replace('author','contributor')

This renaming is unnecessary with current feedparser.
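
A side note on the renaming workaround quoted above: because it is a plain string replacement over the whole response, it also rewrites the word wherever it appears in titles or abstracts, not only in the tag names. A minimal sketch, using a hand-written fragment (the `response` string below is hypothetical, not real API output):

```python
# Hypothetical fragment of an Atom response, hand-written for illustration.
response = (
    "<entry>"
    "<summary>We thank the authors of feedparser.</summary>"
    "<author><name>First Author</name></author>"
    "</entry>"
)

# The workaround from the patch: a blanket textual replacement.
renamed = response.replace("author", "contributor")

# The tags are renamed, but so is the lowercase word inside the summary text.
print(renamed)
```

The capitalised "Author" inside the name is untouched only because `str.replace` is case-sensitive; the abstract text is not so lucky.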

So then the question is, should the example be fixed to correctly
display the first author, should it be modified to display all
authors, or should another example be added doing one or the other?
Also, what about affiliation? Your code removes the section that
handles affiliation. Why?

Cheers
T.

Thorsten S

Jul 24, 2011, 4:48:34 PM
to arxi...@googlegroups.com
On Sun, Jul 24, 2011 at 2:30 PM, Thorsten S
<thorsten....@gmail.com> wrote:
> Hi,
>
> thanks for the suggestion.
>
> Upon code inspection I note 2 things:
>
> 1)
> the current example code on arXiv does in fact not display the "First
> Author", but instead it displays the last author. This is with
> feedparser 5.0.1
>
> 2)
> The comments in the code point to a limitation in feedparser 4.1.
>    # feedparser v4.1 only grabs the first author
>    author_string = entry.author
>
> This limitation is not present in current versions (5.0.1) of
> feedparser (I did not verify that it is in fact present in version
> 4.1)

I verified that feedparser.py version 4.1 indeed does not handle
multiple authors.

feedparser.py version 5.0.1 corrects this deficiency, so the direct
way to print all author names is


print 'Authors: %s' % ', '.join(author.name for author in entry.authors)
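
For readers who want to try this without feedparser, here is a rough Python 3 sketch that collects every author (and an affiliation, where present) from an Atom entry using only the standard library. This swaps in `xml.etree.ElementTree` for feedparser, so it is not the method discussed in this thread. The `SAMPLE` feed and the helper name `authors_of` are made up for illustration; the two namespace URIs are the ones used in arXiv API responses.

```python
import xml.etree.ElementTree as ET

# Namespaces used in arXiv API Atom responses.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Hypothetical, hand-written feed for illustration (not real API output).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <title>An example title</title>
    <author>
      <name>First Author</name>
      <arxiv:affiliation>Example University</arxiv:affiliation>
    </author>
    <author>
      <name>Second Author</name>
    </author>
  </entry>
</feed>"""


def authors_of(entry):
    """Return (name, affiliation-or-None) pairs in document order."""
    result = []
    for author in entry.findall("atom:author", NS):
        name = author.findtext("atom:name", default="", namespaces=NS)
        affiliation = author.findtext(
            "arxiv:affiliation", default=None, namespaces=NS
        )
        result.append((name.strip(), affiliation))
    return result


root = ET.fromstring(SAMPLE)
for entry in root.findall("atom:entry", NS):
    pairs = authors_of(entry)
    # Python 3 form of the one-liner above.
    print("Authors: %s" % ", ".join(name for name, _ in pairs))
```

Because the authors come back in document order, the first and last author are simply `pairs[0]` and `pairs[-1]`.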

Cheers
T.
