Does NLTK Happen to Have a Library or Dictionary of Names?


StunnerAlpha

Oct 12, 2009, 7:38:54 AM
to nltk-users
I was wondering whether NLTK includes a library of names. What I need
to do is randomly draw a first name and a last name, ideally from a
dictionary or library. If NLTK doesn't have one, please let me know of
any other place I should look. Thanks in advance!

Steven Bird

Oct 12, 2009, 7:41:33 AM
to nltk-...@googlegroups.com
There's a list of first names (the Names Corpus), but no surnames. I
wonder if you can get surnames from an online phonebook. Since this
is a question about available corpora, I think the corpora list may be
a more helpful place to post it.

http://gandalf.aksis.uib.no/corpora/
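
For reference, here is a minimal sketch of the random draw using the Names Corpus for first names. It assumes NLTK is installed and the corpus has been fetched once with `nltk.download('names')`. The helper names `load_first_names` and `draw_full_name` are made up for illustration, and the surname list has to come from somewhere else, since NLTK does not ship one.

```python
import random


def load_first_names():
    """Return the first names in NLTK's Names Corpus as a list.

    Assumes NLTK is installed and nltk.download('names') has been run.
    """
    # Imported lazily so draw_full_name() stays usable without NLTK.
    from nltk.corpus import names
    # With no file id, words() covers both male.txt and female.txt;
    # a name listed in both files appears twice in the result.
    return list(names.words())


def draw_full_name(first_names, surnames, rng=random):
    """Pick one first name and one surname uniformly at random."""
    return rng.choice(first_names), rng.choice(surnames)
```

Usage would look something like `draw_full_name(load_first_names(), my_surnames)`, where `my_surnames` is any list of surnames you have collected yourself.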


Pedro Marcal

Oct 12, 2009, 1:54:37 PM
to nltk-...@googlegroups.com, Steven Bird
Hi StunnerAlpha, Steven,
If you follow Steven's suggestion for first names, you might then use the attached file for surnames. It is obvious how to extract them. I don't remember how I got this list; I probably extracted it from the Brown and Penn State Corpora.
Attachment: CD_dictSurnames.txt (149K)

StunnerAlpha

Oct 12, 2009, 9:00:21 PM
to nltk-users
Awesome, thanks guys.


John Francis Lee

Oct 13, 2009, 6:56:56 AM
to nltk-...@googlegroups.com
That link leads to the gandalf site, and one of the links towards the bottom of the page points to the Google search page for the site. Typing 'names corpora' there yields, in two clicks:

http://www.census.gov/genealogy/names/

That page has links to files containing all the first and last names recorded during the 1990 census, with frequency data as well.
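
Since those files carry frequency data, the draw can be weighted so that common names come up more often. The sketch below is only an illustration: it assumes each line is whitespace-separated with the name in the first column and a frequency in the second, which should be checked against the downloaded files. The function names and the sample rows are made up.

```python
import random


def parse_name_frequencies(lines):
    """Parse lines of the assumed form 'NAME FREQ ...' into (names, weights)."""
    names, weights = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Names are assumed to be upper case in the file; capitalize() gives
        # 'Smith' but also 'Mcdonald', so adjust if that matters.
        names.append(fields[0].capitalize())
        weights.append(float(fields[1]))
    return names, weights


def draw_weighted(names, weights, rng=random):
    """Draw one name with probability proportional to its weight."""
    return rng.choices(names, weights=weights, k=1)[0]


# Illustrative rows only, not copied from the census files.
sample = [
    "SMITH    1.0  1.0  1",
    "JOHNSON  0.8  1.8  2",
    "LEE      0.2  2.0  3",
]
surnames, surname_weights = parse_name_frequencies(sample)
print(draw_weighted(surnames, surname_weights))
```

Running the same two functions over a first-name file and a surname file gives a full name whose parts each follow the recorded frequencies.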

