Keyword log-likelihood 2-term vs 4-term formula


Sebastian Nex

Feb 2, 2023, 3:10:02 PM
to AntConc-Discussion
Dear Professor Anthony,

As you know, the standard measure for calculating keyness in AntConc is log-likelihood (4-term), which I thought was the same measure described in Rayson & Garside (2000).

However, after running some calculations with the Rayson online calculator and comparing them to the AntConc results, I've realized that the formula discussed in that paper is the 2-term log-likelihood calculation.

I've tried searching the corpus literature, browsing this Google Group, and googling for the formula for the 4-term log-likelihood, but I could not find anything at all.

This leads me to my question: Could you please point me to an article or a website, or tell me the specific formula for the log-likelihood (4-term) measure?

I've been trying to deepen my understanding of statistical calculations, and I would therefore be very grateful for any help.

Thank you.

Best,
Sebastian


Laurence Anthony

Feb 2, 2023, 7:20:50 PM
to ant...@googlegroups.com
Hi Sebastian,

As you say, the Rayson online calculator only uses two terms. This is obvious from the description they give:
"This equates to calculating log-likelihood G2 as follows: G2 = 2*((a*ln (a/E1)) + (b*ln (b/E2)))"

The 4-term version just completes the full calculation by including the components for (c-a) and (d-b) in the above.
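
As a rough illustration of the point above, here is a minimal Python sketch of both versions. It assumes the usual reading of the calculator's notation (a and b are the word's frequencies in the two corpora, c and d are the corpus sizes) and the standard expected values for a 2x2 contingency table. The function names and the E3/E4 definitions are assumptions for illustration and are not taken from AntConc's source.

```python
import math


def _term(observed: float, expected: float) -> float:
    """One O * ln(O / E) component; a zero observed count contributes 0."""
    if observed == 0:
        return 0.0
    return observed * math.log(observed / expected)


def log_likelihood(a: int, b: int, c: int, d: int, four_term: bool = True) -> float:
    """G2 for a word seen a times in a corpus of c tokens
    and b times in a corpus of d tokens."""
    total = c + d
    # Expected frequencies of the word itself in each corpus.
    e1 = c * (a + b) / total
    e2 = d * (a + b) / total
    g2 = _term(a, e1) + _term(b, e2)
    if four_term:
        # Expected frequencies of all other tokens in each corpus
        # (assumed standard 2x2 contingency-table expectations).
        rest = (c - a) + (d - b)
        e3 = c * rest / total
        e4 = d * rest / total
        g2 += _term(c - a, e3) + _term(d - b, e4)
    return 2 * g2


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(log_likelihood(100, 50, 10000, 20000, four_term=False))
    print(log_likelihood(100, 50, 10000, 20000, four_term=True))
```

Under these assumptions the two extra components can never be negative in total, so the 4-term score is never smaller than the 2-term score for the same counts.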

I recently received permission to include a summary of an unpublished paper by Andrew Hardie that explains all the stats used in AntConc. I'll be putting that on my website (and possibly embedding it in AntConc) in time for the next release.

I hope that helps.

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Sebastian Nex

Feb 3, 2023, 2:04:07 AM
to AntConc-Discussion
Dear Laurence,

Thank you for your incredibly swift reply.

I've tried reworking the calculation in various ways to include the (c-a) and (d-b) components, and I've gotten really close to the results that AntConc gives me with the following method, but I'm still slightly off.

That is:
E1 = c*(a+b) / (c+d)
E2 = d*(a+b) / (c+d)
LL1 = 2*((a*log(a/E1)) + (b*log(b/E2)))

E3 = (c-a)*(a+b) / (c+d-a-b)
E4 = (d-b)*(a+b) / (c+d-a-b)
LL2 = 2*((a*log(a/E3)) + (b*log(b/E4)))

LL = (LL1+LL2)/2

To be perfectly honest with you, this (intuitively) does not seem right to me at all, but the results I'm getting with it are the closest of all my attempts. Depending on the keyword I check against the AntConc results, it's either correct or off by at most 0.02.

I've tried so many different ways, but I have no idea where I'm going wrong.

I would love to include this calculation in my MA thesis - that's why I'm trying so hard to figure it out. I'd appreciate any and all further help.

Thank you.

Best,
Sebastian

Laurence Anthony

Feb 3, 2023, 5:25:25 AM
to ant...@googlegroups.com
Hi again,

You're trying to take an average, which is wrong. Attached is the complete equation using slightly different notation.

I hope that helps!

Laurence


summary_of_equations_2_term_vs_4_term.docx
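
The attached document is not reproduced in this thread. As a hedged sketch only, the standard 4-term log-likelihood for a 2x2 table can be written in the a, b, c, d notation used earlier in the thread; the attachment uses different notation, and the symbols E_3 and E_4 here are an assumption that does not follow the definitions in the attempt above. It is a single sum of four components, not an average of two partial scores:

```latex
\[
G^2 = 2\left[\, a\ln\frac{a}{E_1} + b\ln\frac{b}{E_2}
      + (c-a)\ln\frac{c-a}{E_3} + (d-b)\ln\frac{d-b}{E_4} \,\right]
\]
\[
E_1 = \frac{c\,(a+b)}{c+d}, \qquad
E_2 = \frac{d\,(a+b)}{c+d}, \qquad
E_3 = \frac{c\,\bigl[(c-a)+(d-b)\bigr]}{c+d}, \qquad
E_4 = \frac{d\,\bigl[(c-a)+(d-b)\bigr]}{c+d}
\]
```

The first two components are exactly the 2-term formula quoted from the Rayson calculator; the last two cover all the remaining tokens in each corpus.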

Sebastian Nex

Feb 4, 2023, 2:18:19 PM
to AntConc-Discussion

Dear Laurence,

Thank you so much! 

I really appreciate all the work you put into answering all these questions. You're a legend of the field!

Best,
Sebastian

Laurence Anthony

Feb 4, 2023, 7:25:30 PM
to ant...@googlegroups.com
You're very welcome. Thank you for the kind words!

Laurence.
