AntConc and Word from Microsoft


Gabriel Felley

Dec 16, 2025, 7:47:51 AM
to AntConc-Discussion
Hi Anthony, I've used AntConc to analyze the occurrences of a specific word in my text (180,000 words), but the results produced by AntConc differ from what Word shows me. How should I understand this discrepancy?
Thank you for any clarifying hints.
Regards
Gabriel

Laurence Anthony

Dec 16, 2025, 11:12:36 AM
to ant...@googlegroups.com
Hi Gabriel,

In Microsoft Word, the definition of a word is not exactly clear, but in AntConc, it is very precisely defined in the token settings. The default setting is to separate words like "doesn't" into "doesn" and "t", which is a common practice in corpus research. If you want a count that is closer to the count in Microsoft Word, I'd suggest you tokenize (or even better, POS tag) your files using a tool like my TagAnt tool, and then load the files into AntConc using the simple_word_pos_headword indexer in the corpus manager. If you do this, "doesn't" will probably get treated as one word or perhaps two words ("does" + "n't") depending on the tagger you use.
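
To illustrate how much the token definition alone can change a count, here is a minimal Python sketch. It is not AntConc's or Word's actual code: the function names are hypothetical, the letter-run rule is only an assumption that mimics the "doesn" + "t" behaviour described above, and splitting on whitespace is just a rough stand-in for a word-processor-style count.

```python
import re

def letter_run_tokens(text):
    # Assumption: a token is an unbroken run of letters, so an
    # apostrophe ends one token and starts the next
    # ("doesn't" -> "doesn", "t").
    return re.findall(r"[^\W\d_]+", text)

def whitespace_tokens(text):
    # Rough stand-in for a word-processor count: anything between
    # spaces is one word, punctuation included.
    return text.split()

sample = "She doesn't know why it doesn't work."

print(letter_run_tokens("doesn't"))    # ['doesn', 't']
print(len(letter_run_tokens(sample)))  # 9
print(len(whitespace_tokens(sample)))  # 7
```

Two contractions in one sentence already shift the total by two, so over 180,000 words a sizeable gap between the two tools is to be expected.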

I hope that helps!

Laurence.
