Effect size measures for Keyword analysis


Martha Partridge

Aug 2, 2022, 11:20:53 AM
to AntConc-Discussion
Hello Prof. Anthony,

I am using AntConc to extract keyword lists from several corpora to identify differences and similarities between them. I have been reading a lot about which effect size measure to use, and am still unsure. One challenge is that I have not been able to find an explanation of exactly what the abbreviations for the effect size measures in AntConc's 'Tool settings' mean. Is there anywhere that explains this, please?

Also, is there a measure for Odds Ratio, as this seems to be recommended?

Thank you very much,

Martha

Laurence Anthony

Aug 5, 2022, 9:46:50 AM
to AntConc-Discussion
Hi Martha,

I'm working now on a document to explain all the stats that are used in AntConc after finally getting permission to copy some unpublished work that I used as a basis for the stats. I'll update everybody when it is ready.

At the moment, there isn't an Odds Ratio measure in AntConc. I could easily add it, but there is a problem with such a measure when the word doesn't appear in the reference corpus. In this case, we get a division by zero, which is undefined or infinity depending on your perspective. Can you say where you saw Odds Ratio being recommended?
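
To make the division-by-zero problem concrete, here is a minimal sketch of one common way to compute an odds ratio for a word across two corpora. The function name `odds_ratio` and all counts are hypothetical and chosen only for illustration; this is not code from AntConc.

```python
def odds_ratio(freq_target, size_target, freq_ref, size_ref):
    """Odds of the word in the target corpus divided by its odds in the reference corpus."""
    odds_target = freq_target / (size_target - freq_target)
    odds_ref = freq_ref / (size_ref - freq_ref)
    # When freq_ref is 0, odds_ref is 0.0 and the division below fails.
    return odds_target / odds_ref


# Hypothetical counts: 50 hits in a 100,000-token target corpus,
# 10 hits in a 1,000,000-token reference corpus.
print(odds_ratio(50, 100_000, 10, 1_000_000))

# The same word, but absent from the reference corpus.
try:
    odds_ratio(50, 100_000, 0, 1_000_000)
except ZeroDivisionError:
    print("undefined: the word does not occur in the reference corpus")
```

With the first set of counts the ratio is about 50; with a zero reference count Python raises `ZeroDivisionError`, which is the undefined case described above.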

Laurence.

Martha Partridge

Aug 7, 2022, 3:48:12 AM
to AntConc-Discussion
Hello Laurence,

Thank you for your reply. It will be very helpful to know what the stats mean in exact terms, so I look forward to this upcoming work!

Thank you also for your advice about using LL. Some studies I have read use it, while others argue strongly against it (particularly Gabrielatos, 2018). Would it be OK if I cite your recommendation above as 'personal correspondence' in my current research? This is so that I have some evidence on which to base my decision to use LL as my stat. I can also cite some studies which have used it for keyness analysis.

The Odds Ratio is recommended in Pojanapunya & Watson-Todd (2018), and mentioned in Gabrielatos (2018) too.

Best,

Martha

Refs
Gabrielatos, C., 2018. Keyness analysis: Nature, metrics and techniques. In Corpus approaches to discourse (pp. 225-258). Routledge.

Pojanapunya, P. and Watson Todd, R., 2018. Log-likelihood and odds ratio: Keyness statistics for different purposes of keyword analysis. Corpus Linguistics and Linguistic Theory, 14(1), pp.133-167.

Laurence Anthony

Aug 7, 2022, 5:07:41 AM
to ant...@googlegroups.com
Hi Martha,

Yes, you can certainly use my name if you want. The paper by Gabrielatos is often mentioned, and while I know Costas, I find the paper quite problematic, especially his recommendation to use a measure with a fudge factor of 0.0000000000000001 (or however many zeros that is). In mathematical terms, this is very problematic. If you try the proposed effect size measure, you will find that it tends to rank extreme outliers in the data highly, which is exactly what the mathematics would predict. To avoid the division-by-zero problem, Odds Ratio would require some sort of fudge factor, too, which leads to exactly the same problem.
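
As a rough illustration of why a fudge factor pushes outliers to the top, the sketch below substitutes a tiny constant for a zero reference count in an odds ratio. It makes several assumptions: the constant is taken as written above (1e-16), the function `smoothed_odds_ratio`, the word labels, and the counts are all hypothetical, and this is not the measure proposed in Gabrielatos (2018) nor anything implemented in AntConc.

```python
FUDGE = 1e-16  # assumed value, copied from the figure quoted above


def smoothed_odds_ratio(freq_target, size_target, freq_ref, size_ref, fudge=FUDGE):
    """Odds ratio with a small constant substituted for a zero reference count."""
    ref = freq_ref if freq_ref > 0 else fudge
    odds_target = freq_target / (size_target - freq_target)
    odds_ref = ref / (size_ref - ref)
    return odds_target / odds_ref


TARGET_SIZE, REF_SIZE = 100_000, 1_000_000

# Hypothetical (target frequency, reference frequency) pairs.
counts = {
    "common_word": (500, 100),  # frequent in both corpora
    "rare_word": (1, 0),        # a single hit, absent from the reference
}

scores = {
    word: smoothed_odds_ratio(ft, TARGET_SIZE, fr, REF_SIZE)
    for word, (ft, fr) in counts.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
print(scores)
```

In this toy data, the word that occurs once in the target corpus and never in the reference corpus receives a score many orders of magnitude larger than the word with 500 occurrences, so the ranking is driven by the size of the constant rather than by the evidence in the data.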

My investigations of all the measures proposed led me to conclude that we still don't have a convincing effect size measure, and that ranking by log-likelihood is still the best approach, despite its own obvious flaws.
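
For comparison, here is a minimal sketch of a log-likelihood keyness score in the two-cell form that is widely used in corpus linguistics. It is an assumption that this matches the variant AntConc uses; the thread does not say. The function name `log_likelihood` and the counts are hypothetical.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-cell log-likelihood: 2 * sum(observed * ln(observed / expected))."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were spread evenly across both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        if observed > 0:  # a zero count contributes nothing (0 * ln 0 taken as 0)
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Hypothetical counts in a 100,000-token target and 1,000,000-token reference corpus.
print(log_likelihood(500, 100_000, 100, 1_000_000))  # frequent in both corpora
print(log_likelihood(1, 100_000, 0, 1_000_000))      # one hit, absent from reference
```

No constant is needed when the reference count is zero, because that term simply drops out of the sum. With these toy counts, the single-occurrence word scores far lower than the frequent word.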

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################


--
You received this message because you are subscribed to a topic in the Google Groups "AntConc-Discussion" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/antconc/0Pd8SGZjj7k/unsubscribe.
To unsubscribe from this group and all its topics, send an email to antconc+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/antconc/69c80f96-deeb-4712-927f-6b548a189484n%40googlegroups.com.

Martha Partridge

Aug 7, 2022, 2:00:59 PM
to AntConc-Discussion
That's interesting to hear, Laurence. I look forward to reading your work on this in future!

Thank you for your help,

Martha

Marites Querol

Aug 7, 2022, 10:21:17 PM
to ant...@googlegroups.com
Thanks so much Prof. Anthony,

I've been learning a lot, and I am enjoying what I learn from you and AntConc.
Thank you so much.

Best regards,

Marites

