Very short list of keywords


Martha Partridge

Jun 28, 2022, 12:02:47 PM
to AntConc-Discussion
Hello,

I have just loaded my own corpora of teacher feedback comments into AntConc (Mac) and generated keyword lists. However, this produces a list of just 18 words. My target corpus is 11,676 tokens and my reference corpus is 18,679 tokens; both are UTF-8. I don't understand why the keyword list stops at 18. Do I need to lemmatise the corpora, or perhaps change something in the AntConc settings?

Thank you very much,

Martha

Laurence Anthony

Jun 28, 2022, 8:37:56 PM
to ant...@googlegroups.com
Hi Martha,

In AntConc, the default setting is to cut off the keyword list at a statistical threshold of p < 0.05 with a Bonferroni correction. If your target and reference corpora are similar, this might result in a low keyword count. You can check this by changing the keyword settings to show all keywords. Note, however, that the list will then include keywords that are not statistically significant.
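To illustrate why the corrected threshold can shrink the list so much, here is a minimal Python sketch. It is not AntConc's actual code: it assumes a two-term log-likelihood keyness statistic with one degree of freedom, and the word frequencies and the number of tests (`n_tests`) are hypothetical. Only the two corpus sizes come from this thread.

```python
import math

def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Two-term log-likelihood (G2) keyness score for one word."""
    total = freq_target + freq_ref
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    g2 = 0.0
    # A zero frequency contributes nothing (0 * log 0 is taken as 0).
    if freq_target > 0:
        g2 += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * g2

def p_value(g2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(g2 / 2.0))

def is_key(g2, alpha=0.05, n_tests=1):
    """Bonferroni: divide alpha by the number of words tested (n_tests)."""
    return p_value(g2) < alpha / n_tests

# Corpus sizes from the thread; the word frequencies (30 vs. 20) are made up.
g2 = log_likelihood(30, 20, 11676, 18679)
print("G2:", round(g2, 2), "p:", p_value(g2))
print("key without correction:", is_key(g2))
print("key with Bonferroni over 2000 hypothetical tests:", is_key(g2, n_tests=2000))
```

In this sketch the same word passes the uncorrected p < 0.05 test but fails once the threshold is divided by the number of words tested, which is the effect that can leave only a handful of keywords when the two corpora are similar.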

If you expect that your target and reference corpora are very different, you should end up with many hundreds or thousands of keywords, so perhaps you have not loaded your corpora correctly. My video on how to do keyword analysis might help:

I hope that helps.

Laurence.


###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Martha Partridge

Jun 29, 2022, 3:59:04 AM
to AntConc-Discussion
Hello Laurence,

Thank you so much for your swift reply. I think the issue is that the target and reference corpora are too similar, as you suggest; I am comparing sets of teacher feedback comments. I have changed the reference corpus to one of the pre-loaded ones and now have a much longer keyword list.

Thank you for the tutorial too - very useful.

Best,

Martha

Laurence Anthony

Jun 29, 2022, 5:20:40 AM
to ant...@googlegroups.com
Hi Martha,

That's good to know. If you remove the Bonferroni correction (which imposes a stricter threshold), you should find that you get more keywords as well.

Laurence.

