Looking for lists of articles

Rod Page

Jan 8, 2010, 5:40:10 PM
to Taxonomic Literature
I've sent this message to TAXACOM, but I suspect members of this group
may also be a potential source of material.

As part of my efforts to track down articles in the Biodiversity
Heritage Library (BHL) I am searching for lists of articles and
bibliographies. The idea is to take information from these, such as
article title, volume number, and pagination, and look these up in
BHL. The results are available at http://biostor.org/ , which can be
accessed with a web browser, or using standard bibliographic tools
(such as EndNote http://biostor.org/endnote.php, Zotero http://biostor.org/zotero.php,
and Mendeley http://biostor.org/mendeley.php ).

Sometimes journals themselves publish lists of their articles
(e.g., http://biostor.org/reference/12611 ), and in other cases I've
found EndNote files on the web (typically for specific taxa). But
there are some huge holes in BioStor, which I'm seeking to fill.

I'm particularly interested in lists of articles from the Annals and
Magazine of Natural History, and the Proceedings of the U.S. National
Museum. Lists of articles for the Bulletins of the British Museum
would also be useful. You can gauge progress so far by viewing the
list of journals at http://biostor.org/journals.php

Regards

Rod

sau...@ira.uka.de

Jan 9, 2010, 3:26:16 AM
to taxo...@googlegroups.com
Hi Rod,

I think this is a great idea, especially for string-valued attributes
like authors, journal names, and article titles. I guess numerical
values like volume numbers and pagination need some special attention,
though, as OCR errors there could be as undetectable as they are
fatal: how do you ever figure out that a "5" was mistaken for a "6"
in some pagination? There are too many instances of volume numbers and
paginations to effectively rule something out based on lists, aren't there?

- Guido

Rod Page

Jan 9, 2010, 2:19:43 PM
to Taxonomic Literature
My approach so far is to accept that there will be some errors and
manually investigate them. Also I don't rely entirely on volume and
pagination when matching records to BHL content. I compute a word
alignment between the article title and the putative "hit" in BHL and
only accept the match if the alignment has a reasonable score.
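
A minimal sketch of that kind of check, for illustration only: it is not
the BioStor code. It assumes titles are compared as lowercase word tokens,
uses Python's difflib to find in-order matching runs of words (a heuristic
alignment, not a guaranteed optimal one), and scores a hit by the fraction
of title words that were aligned. The function names, the example titles,
and the 0.8 threshold are all made up for the example.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_alignment_score(title, candidate):
    """Fraction of the title's words that align, in order, with the candidate.

    `candidate` stands for the text of the putative hit (for example OCR
    text from the first page), so it may be longer than the title.
    """
    title_words = tokenize(title)
    candidate_words = tokenize(candidate)
    if not title_words or not candidate_words:
        return 0.0
    # Compare sequences of words rather than characters.
    matcher = SequenceMatcher(None, title_words, candidate_words, autojunk=False)
    aligned = sum(block.size for block in matcher.get_matching_blocks())
    return aligned / len(title_words)


def accept_match(title, candidate, threshold=0.8):
    """Accept the hit only if the alignment score reaches the threshold.

    The threshold is an arbitrary illustrative value.
    """
    return title_alignment_score(title, candidate) >= threshold


if __name__ == "__main__":
    # Invented example strings, not real records.
    print(accept_match("Notes on the genus Aus",
                       "NOTES ON THE GENUS AUS. By J. Smith"))
    print(accept_match("A revision of beetles",
                       "Catalogue of birds of Africa"))
```

Because the score is computed over words, a single misread character in
the OCR costs one word of the title rather than the whole match, and a
wrong volume or page number alone is not enough to produce an accepted hit.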

Regards

Rod
