Keyness analysis: Log-likelihood procedure, assumptions for statistics, and new alpha after Bonferroni correction

Hyeseung Jeong

Aug 19, 2024, 3:22:32 AM

to AntConc-Discussion

Dear Prof Anthony and all,

I've run a keyword analysis in AntConc with the following settings: 'Log-likelihood (4-term)' as the likelihood measure and 'p < 0.01 (6.63 with Bonferroni)' as the threshold.

To describe the analysis in the method section, I need to know:

1) the log-likelihood procedure
2) the assumptions of the statistics used for keyness
3) the adjusted alpha level after Bonferroni correction

Regarding the second point, it seems that general statistical assumptions such as random sampling or normally distributed data are not applicable (or not necessary) for keyness and other corpus analyses, because the purpose of inferential statistics in corpus analysis is not to reject a null hypothesis but to identify words and rank them. I got this impression from reading sources such as Baker (2006), Bestgen (2014), Rayson (2019), and Pojanapunya and Todd (2018). Am I understanding this correctly? If so, what assumptions apply specifically to corpus analysis statistics?

I would very much appreciate it if you could help me figure out how to describe these three points.

Thank you!

Best regards,

Hyeseung

*References*

Baker, P. (2006). Using corpora in discourse analysis. Continuum.
Bestgen, Y. (2014). Inadequacy of the chi-squared test to examine vocabulary differences between corpora. Literary and Linguistic Computing, 29(2), 164-170. https://doi.org/10.1093/llc/fqt020
Pojanapunya, P., & Todd, R. W. (2018). Log-likelihood and odds ratio: Keyness statistics for different purposes of keyword analysis. Corpus Linguistics and Linguistic Theory, 14(1), 133-167. https://doi.org/10.1515/cllt-2015-0030
Rayson, P. (2019). Corpus analysis of key words. In C. A. Chapelle (Ed.), The Concise Encyclopedia of Applied Linguistics (pp. 320-326). Wiley. https://www.wiley.com/en-gb/The+Concise+Encyclopedia+of+Applied+Linguistics-p-9781119147367

Laurence Anthony

Aug 20, 2024, 12:48:09 AM

to ant...@googlegroups.com

Hi Hyeseung,

These are all very good questions.

As for 1), check the AntConc help page (in the menu), which explains the exact procedure.
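
For readers who want to see the arithmetic, here is a minimal sketch of a log-likelihood (G2) keyness score computed over all four cells of a 2x2 contingency table. It assumes that "4-term" means summing over the four cells (word vs. all other words, target vs. reference corpus); the function name and the example frequencies are hypothetical, and the AntConc help page remains the authoritative description of what the tool actually does.

```python
import math


def log_likelihood_4term(freq_target, freq_ref, size_target, size_ref):
    """G2 for one word, summed over all four cells of the 2x2 table.

    Assumption: rows are (the word, all other words) and columns are
    (target corpus, reference corpus). Sizes are token counts.
    """
    observed = [
        [freq_target, freq_ref],
        [size_target - freq_target, size_ref - freq_ref],
    ]
    total = size_target + size_ref
    row_totals = [sum(row) for row in observed]
    col_totals = [size_target, size_ref]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a cell with zero observations contributes nothing
                expected = row_totals[i] * col_totals[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Hypothetical counts: 100 hits in a 10,000-token target corpus,
# 20 hits in a 10,000-token reference corpus.
print(log_likelihood_4term(100, 20, 10_000, 10_000))
```

A word with the same relative frequency in both corpora scores approximately 0; the score grows as the relative frequencies diverge.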

As for 2), I think the following papers are very useful:

https://www.researchgate.net/publication/301537550_Log-likelihood_and_odds_ratio_Keyness_statistics_for_different_purposes_of_keyword_analysis

https://osf.io/eb2n9/download/

As for 3), here's an explanation:

https://en.wikipedia.org/wiki/Bonferroni_correction

For keyness analysis, it means that you divide the alpha level by the number of types in your corpus.
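
As a rough illustration of that division, the sketch below computes a Bonferroni-adjusted alpha and the matching critical value, assuming the keyness statistic is compared against a chi-squared distribution with 1 degree of freedom (the usual approximation for a 2x2 log-likelihood test). The number of types (5,000) is a made-up example, and the function names are hypothetical. With 1 degree of freedom, the chi-squared critical value equals the square of the two-tailed normal quantile, so only the standard library is needed.

```python
from statistics import NormalDist


def bonferroni_alpha(alpha, n_tests):
    """Divide the nominal alpha by the number of comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def critical_value_df1(alpha):
    """Chi-squared critical value for 1 degree of freedom.

    Uses the identity chi2(1 df) = z**2, where z is the standard
    normal quantile at alpha / 2 (squared, so the sign drops out).
    """
    z = NormalDist().inv_cdf(alpha / 2)
    return z * z


n_types = 5000  # hypothetical number of word types in the corpus
adjusted = bonferroni_alpha(0.01, n_types)

print(adjusted)                      # adjusted alpha level
print(critical_value_df1(0.01))      # unadjusted threshold, about 6.63
print(critical_value_df1(adjusted))  # stricter threshold after correction
```

The unadjusted value of about 6.63 matches the figure shown in the threshold setting; the corrected threshold depends on how many types are actually tested.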

I hope that helps!

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################

--
You received this message because you are subscribed to the Google Groups "AntConc-Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antconc+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/antconc/1ddff12c-9571-4db1-9a1c-7d3116d61b46n%40googlegroups.com.

Hyeseung Jeong

Aug 20, 2024, 4:14:14 AM

to ant...@googlegroups.com

Thanks so much!
