Stoplist in WordSmith Tools 4

LIRD

Apr 17, 2008, 7:28:49 PM
to WordSmith Tools
Hi,

I am currently working on a corpus of annual reports, and there is a
lot of 'noise' coming up (in this case mostly company names), which I
want to exclude as I don't consider them lexical items. I know the
stoplist function allows me to do so, but is there a limit to how big
the stoplist may be?

An initial test with a stoplist of 30 items seemed to work quite well.
However, the Stoplist header said "6 Stoplist Items" (although it
'stopped' more than six words). Before I feed in my full stoplist, I
would like to check whether anyone has experience with larger
stoplists (mine will be approx. 4000 items or more) and how the
software handles them.

Thanks in advance

mi...@lexically.net

May 8, 2008, 4:27:16 AM
to WordSmith Tools
There's no limit on the size of stoplists or similar lists. In the
main Controller you will only see the first 50 items; beyond that you
see "etc.", but they are all loaded. Of course there is a cost in
memory, but 4000 should be OK.

How to test:
I suggest you try 4000 words taken from a wordlist, then write a
little text containing word 1, word 2000 and word 4000, plus one
other word not on your list, and see what wordlist gets generated.
If the whole stoplist has loaded, only that extra word should appear.

Cheers -- Mike
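
The test described above can be prepared outside WordSmith with a few lines of Python. This is only a sketch, not part of WordSmith Tools: the function names `build_test_text` and `surviving_words`, the placeholder entries `company0001` to `company4000`, and the extra word `dividend` are all made up for illustration, and the filter merely imitates what a stoplist is expected to do (case-insensitive matching is an assumption). The real check is still the wordlist that WordSmith itself produces.

```python
def build_test_text(stoplist, extra_word):
    """Return a short text holding the first, middle and last stoplist
    entries plus one word that is not on the stoplist."""
    if extra_word.lower() in {w.lower() for w in stoplist}:
        raise ValueError("extra_word must not be on the stoplist")
    middle = stoplist[len(stoplist) // 2 - 1]  # entry 2000 of 4000
    return " ".join([stoplist[0], middle, stoplist[-1], extra_word])


def surviving_words(text, stoplist):
    """Imitate a stoplist: keep only words that are not on the list
    (case-insensitive comparison is an assumption)."""
    stop = {w.lower() for w in stoplist}
    return [w for w in text.split() if w.lower() not in stop]


# Hypothetical 4000-item stoplist; in practice take the entries
# from a real wordlist, as suggested above.
stoplist = [f"company{i:04d}" for i in range(1, 4001)]

text = build_test_text(stoplist, "dividend")
print(text)
print(surviving_words(text, stoplist))
```

Save the generated stoplist and the test text as plain-text files in whatever format your WordSmith version expects, load both, and compare the resulting wordlist with the second printed line. If entries 1, 2000 or 4000 still show up in WordSmith's wordlist, part of the stoplist was not applied.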
