The p value and log likelihood in Keyword


darun...@gmail.com

Jan 20, 2017, 12:00:28 PM
to AntConc-Discussion
Dear Sir,

May I ask a few questions?
1) What is the default p value assigned to every word in the two corpora (under log-likelihood in the Keyword function)?
2) May I know the reason for the default p value assigned by AntConc?
3) Why do you recommend log-likelihood (as stated in the video) instead of chi-square? I am afraid that my PhD committee will ask.
4) Am I right to interpret the keyness value (I chose the log-likelihood test) in AntConc as follows:

The higher the keyness value, the more significant is the difference between the two frequency scores. A log likelihood of 3.8 or higher is significant at the level of p<0.05 and a keyness value of 6.6 or higher is significant at p<0.01. (I just adapted from Rayson and Garside, 2000)

I apologize for taking up your time. I am sorry if my questions are not suitable.
Thank you so much for kindly sending me instructions on how to filter out words associated with my focus. I look forward to hearing from you.

Kind regards
Darunee






Laurence Anthony

Jan 21, 2017, 11:57:49 PM
to ant...@googlegroups.com
Hi Darunee,

Let me answer your four questions:


1) What is the default p value assigned to every word in the two corpora (under log-likelihood in the Keyword function)?

AntConc shows all values by default. The ranking is by keyness value, so you can choose the top x words above a particular keyness value, which corresponds to a p value of your choice.
 
2) May I know the reason for the default p value assigned by AntConc?

No p-value is assigned by default.
 
3) Why do you recommend log-likelihood (as stated in the video) instead of chi-square? I am afraid that my PhD committee will ask.

Log-likelihood is generally considered to provide a better estimate of keyness for low-frequency items. A Google search on log-likelihood vs chi-squared will give you lots more discussion of this.
 
4) Am I right to interpret the keyness value (I chose the log-likelihood test) in AntConc as follows:

The higher the keyness value, the more significant is the difference between the two frequency scores. A log likelihood of 3.8 or higher is significant at the level of p<0.05 and a keyness value of 6.6 or higher is significant at p<0.01. (I just adapted from Rayson and Garside, 2000)

Yes.
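
For readers who want to see where a keyness value comes from, here is a minimal sketch of the two-cell log-likelihood calculation commonly attributed to Rayson and Garside (2000). It is an illustration only: the function name and arguments are hypothetical, and AntConc's own implementation may differ in detail.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Two-cell log-likelihood keyness for one word (illustrative sketch).

    freq_target / freq_ref: occurrences of the word in each corpus.
    size_target / size_ref: total number of tokens in each corpus.
    """
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    ll = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

# Hypothetical counts: 20 vs 10 occurrences in two 1,000-token corpora.
print(round(log_likelihood(20, 1000, 10, 1000), 2))
```

With these made-up counts the score stays below 3.84, so the difference would not count as significant at p<0.05 under the interpretation above.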
 
I apologize for taking up your time. I am sorry if my questions are not suitable.

Not at all. You're welcome!

Laurence.

darun...@gmail.com

Jan 22, 2017, 12:59:05 PM
to AntConc-Discussion
Dear Sir

Thank you so much for your help. I now feel more confident about writing chapters 4 and 5 of my thesis.


Kind regards
Darunee

Maria Cupery

Aug 28, 2017, 3:23:50 PM
to AntConc-Discussion
Hi Darunee,

I am doing keyword analyses myself and am working on determining a cutoff point. I have a question about this quote:
"The higher the keyness value, the more significant is the difference between the two frequency scores. A log likelihood of 3.8 or higher is significant at the level of p<0.05 and a keyness value of 6.6 or higher is significant at p<0.01. (I just adapted from Rayson and Garside, 2000)"

How did you determine that 6.6 is significant at p<0.01? I read the Rayson & Garside 2000 paper and couldn't find the reasoning. I'm wondering whether I have the wrong paper or missed something.

Thank you!
Maria


Lini You

Aug 30, 2017, 12:34:22 AM
to ant...@googlegroups.com
Dear Maria
I am so sorry for my late reply. You can read what Prof. Anthony replied to me. There are a lot of details about this. I can recommend a sample chapter in a book that focuses particularly on this issue. If you would like to talk with me directly, you are welcome to contact me at my email address (daru...@hotmail.com).

Thank you, and good luck with your thesis.

Kind regards
Darunee


Mura Nava

Aug 30, 2017, 9:28:20 AM
to AntConc-Discussion
Hi Maria
The correspondence between log-likelihood (LL) values and p values comes from comparing LL values to the chi-square distribution [https://www.medcalc.org/manual/chi-square-table.php].
Note that, as Rayson & Garside say, in corpus linguistics (CL) we don't need to compare LL values to the chi-square distribution in order to use them; CL generally uses the rankings of LL values instead, i.e. top 10, top 20, etc.
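
To make the comparison concrete, here is a minimal sketch that converts an LL value to a p value. It assumes 1 degree of freedom, which is the usual choice when one word is compared across two corpora; the function name is hypothetical. For 1 degree of freedom the upper tail of the chi-square distribution can be written with the complementary error function, so only the standard library is needed.

```python
import math

def p_value_from_ll(ll):
    """Upper-tail chi-square probability for an LL value, assuming df = 1."""
    # For df = 1, P(X > ll) = erfc(sqrt(ll / 2)).
    return math.erfc(math.sqrt(ll / 2.0))

# Cutoffs quoted in this thread, plus the more precise table values.
for cutoff in (3.8, 3.84, 6.6, 6.63):
    print(cutoff, round(p_value_from_ll(cutoff), 4))
```

The more precise table values of about 3.84 and 6.63 sit almost exactly on p = 0.05 and p = 0.01, which is where the rounded 3.8 and 6.6 figures come from.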

hope that helps
mura