Calculating coverage

awk...@gmail.com

Oct 1, 2020, 4:34:36 PM
to WordSmith Tools
Hello,

Hoping there's an easy way to calculate coverage of a word list in a particular file or collection of files. 

Specifically, I would like to know what percentage of types in a particular file/corpus are included in a specific word list. 

Thanks!

Aaron

Mike Scott

Oct 1, 2020, 4:48:17 PM
to WordSmith Tools
All of them (100%), except that by default any strings containing a number get summarised as #.
Cheers

awk...@gmail.com

Oct 1, 2020, 4:56:37 PM
to WordSmith Tools
Hi Mike,

To clarify, I would like to know how much of a different text a word list covers. For instance, if I generate a word list from medical journals, I want to know what percentage of a sociology text is covered by that word list (or keyword list).

Hope that clarifies things. 

Mike Scott

Oct 2, 2020, 5:15:40 AM
to WordSmith Tools
Have you tried Detailed Consistency, Aaron? That might be what you're looking for.
Cheers

awk...@gmail.com

Oct 2, 2020, 6:14:05 AM
to WordSmith Tools
Thanks for that suggestion, Mike. It looks like that needs to be a word list, though; I'm looking to do it for my keyword list.
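
For readers who want to check the figure outside WordSmith, the coverage described in this thread (the percentage of types in a target text that also appear in a given word list or keyword list) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `tokenize` and `coverage` are hypothetical, the tokenizer is a simple regular expression rather than WordSmith's own, matching is case-insensitive, and strings containing digits are skipped rather than summarised as #.

```python
import re

def tokenize(text):
    # Assumption: a token is a run of letters or apostrophes, lowercased.
    # This is NOT WordSmith's tokenizer; digits are simply ignored here.
    return re.findall(r"[a-z']+", text.lower())

def coverage(text, word_list):
    """Return (type_coverage, token_coverage) as percentages.

    type_coverage:  share of distinct word forms in `text` found in `word_list`
    token_coverage: share of running words in `text` found in `word_list`
    """
    tokens = tokenize(text)
    types = set(tokens)
    listed = {w.lower() for w in word_list}

    covered_types = types & listed
    covered_tokens = sum(1 for t in tokens if t in listed)

    type_cov = 100.0 * len(covered_types) / len(types) if types else 0.0
    token_cov = 100.0 * covered_tokens / len(tokens) if tokens else 0.0
    return type_cov, token_cov

# Toy example: a made-up "sociology" sentence and a made-up keyword list.
sample_text = "The patient and the doctor discussed the treatment"
sample_list = ["the", "patient", "doctor", "treatment"]
type_cov, token_cov = coverage(sample_text, sample_list)
print(f"type coverage: {type_cov:.1f}%  token coverage: {token_cov:.1f}%")
```

The word list can come from any source, for example a keyword list exported to plain text with one word per line. Type coverage answers the question as originally asked; token coverage is included because the two figures usually differ a good deal.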