Can we extract collocations from a corpus using N-gram in AntConc?


chau chi

Apr 7, 2024, 5:22:00 PM
to AntConc-Discussion
Dear Prof Anthony and everyone,

I am wondering whether the N-gram function in AntConc can be used to extract collocations, because N-gram will also extract lexical bundles.

If yes, what is the normal cut-off for the N-gram?

If not, what are some other methods to extract a list of collocations from a corpus?

Thank you for answering my questions.

Best regards,

Chi Cuong-Chau

Laurence Anthony

Apr 9, 2024, 3:32:41 AM
to ant...@googlegroups.com
Hi Chi,

The best way to extract collocations is with the collocation tool. I suggest you try that first.

Regards,

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



chau chi

Apr 9, 2024, 4:13:33 AM
to ant...@googlegroups.com
Thank you for replying.

I currently want to extract lists of verb-noun and adjective-noun combinations, but the collocate tool does not let me extract the full list. Could you explain in detail how to do this?

Thank you so much for your help

Best regards,

Chi

HCM City University of Education

Chau Cuong Chi 
chaucu...@gmail.com / 0707737757

On Tue, Apr 9, 2024 at 8:32 AM, Laurence Anthony <antho...@gmail.com> wrote:

Laurence Anthony

Apr 9, 2024, 4:17:02 AM
to ant...@googlegroups.com
Hi,

For this task, do the following:

1) Create a POS tagged corpus (e.g. using my TagAnt tool)
2) Load the corpus into AntConc through the corpus manager using the simple_word_pos_headword indexer
3) In the cluster tool, set the cluster size to 2 words
4) Search for:
*_V* *_N*
*_JJ* *_N*
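
For readers who want to see what these two wildcard searches match, here is a minimal Python sketch of the same idea outside AntConc. It is only an illustration, not AntConc's implementation: it assumes the tagged text is whitespace-separated `word_TAG` tokens with Penn Treebank style tags, and the function name `tagged_bigrams` and the sample sentence are made up for this example.

```python
from collections import Counter
from fnmatch import fnmatchcase


def tagged_bigrams(text, first_pattern, second_pattern):
    """Count adjacent token pairs whose word_TAG forms match two wildcard patterns."""
    tokens = text.split()
    counts = Counter()
    # Walk over every pair of neighbouring tokens (a cluster size of 2).
    for left, right in zip(tokens, tokens[1:]):
        if fnmatchcase(left, first_pattern) and fnmatchcase(right, second_pattern):
            counts[(left, right)] += 1
    return counts


# Hypothetical tagged sample (assumed word_TAG format).
sample = "She_PRP made_VBD progress_NN and_CC took_VBD strong_JJ action_NN ._."

verb_noun = tagged_bigrams(sample, "*_V*", "*_N*")
adj_noun = tagged_bigrams(sample, "*_JJ*", "*_N*")

for pair, freq in (verb_noun + adj_noun).most_common():
    print(freq, " ".join(pair))
```

Note that a 2-word cluster only captures directly adjacent pairs, so "took strong action" contributes an adjective-noun pair here but no verb-noun pair.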

I hope that helps!

Laurence.


chau chi

Apr 13, 2024, 7:27:39 AM
to AntConc-Discussion
Dear Dr Anthony,

I intend to use MI-score, t-score, Log-likelihood (LL), and log-Dice to calculate collocational strength in my study of collocations. The purpose is to see the differences between the rankings of LCs, thereby identifying their nuanced LC meanings. The MI-score threshold of ≥ 3 and t-score threshold of ≥ 4 will be adopted following Ackermann and Chen's study. However, for the other AMs, can you suggest which threshold I should follow?
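
To make the two thresholds concrete, here is a minimal Python sketch using the common textbook definitions: MI = log2(O / E) and t = (O - E) / sqrt(O), where O is the observed co-occurrence frequency and E = f_node × f_collocate / N is the frequency expected by chance. Whether a given tool uses exactly these forms (for example, with a span adjustment) is an assumption to check; the function names and the counts below are invented for illustration.

```python
import math


def expected_frequency(f_node, f_coll, n_tokens):
    """Co-occurrence frequency expected by chance if the two words were independent."""
    return f_node * f_coll / n_tokens


def mi_score(observed, f_node, f_coll, n_tokens):
    """Mutual information: log2 of observed over expected frequency."""
    return math.log2(observed / expected_frequency(f_node, f_coll, n_tokens))


def t_score(observed, f_node, f_coll, n_tokens):
    """t-score: (observed - expected) divided by the square root of observed."""
    expected = expected_frequency(f_node, f_coll, n_tokens)
    return (observed - expected) / math.sqrt(observed)


def passes_thresholds(observed, f_node, f_coll, n_tokens, min_mi=3.0, min_t=4.0):
    """Apply the MI >= 3 and t >= 4 cut-offs mentioned in the question."""
    return (mi_score(observed, f_node, f_coll, n_tokens) >= min_mi
            and t_score(observed, f_node, f_coll, n_tokens) >= min_t)


# Hypothetical counts: the pair occurs 30 times; the words occur 1000 and 500
# times in a corpus of 1,000,000 tokens.
mi = mi_score(30, 1000, 500, 1_000_000)
t = t_score(30, 1000, 500, 1_000_000)
print(round(mi, 2), round(t, 2), passes_thresholds(30, 1000, 500, 1_000_000))
```

MI tends to favour rare, exclusive pairs while the t-score favours frequent ones, which is why the two rankings can differ for the same list of candidates.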

Thank you for taking the time to answer my question.

Respectfully yours

Chi
On Tuesday, April 9, 2024 at 09:17:02 UTC+1, Laurence Anthony wrote:

Laurence Anthony

Apr 13, 2024, 8:49:10 AM
to ant...@googlegroups.com
Hi,

Log-Likelihood is a statistical measure, so you just need to decide on a suitable p value (e.g. p < 0.05). AntConc's default setting should be good here. It also applies a Bonferroni correction, which makes the threshold more conservative.
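
As a rough illustration of how a p value threshold and a Bonferroni correction interact, here is a minimal Python sketch. It uses the standard log-likelihood ratio (G2) over a 2×2 contingency table and the chi-square distribution with one degree of freedom. How AntConc computes this internally is not shown in this thread, so the function names, the `alpha` default, and the `n_tests` parameter are assumptions made for the example.

```python
import math


def log_likelihood(o11, f_node, f_coll, n_tokens):
    """G2 statistic for a node/collocate pair from a 2x2 contingency table."""
    observed = [
        o11,                                # node with collocate
        f_node - o11,                       # node without collocate
        f_coll - o11,                       # collocate without node
        n_tokens - f_node - f_coll + o11,   # neither word
    ]
    rows = [f_node, n_tokens - f_node]
    cols = [f_coll, n_tokens - f_coll]
    expected = [r * c / n_tokens for r in rows for c in cols]
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def p_value(g2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(g2, 0.0) / 2))


def is_significant(g2, alpha=0.05, n_tests=1):
    """Bonferroni correction: divide alpha by the number of candidates tested."""
    return p_value(g2) < alpha / n_tests


# Hypothetical counts, tested as one of 10,000 candidate collocates.
g2 = log_likelihood(30, 1000, 500, 1_000_000)
print(round(g2, 1), is_significant(g2, alpha=0.05, n_tests=10_000))
```

Dividing alpha by the number of tests is what makes the corrected threshold more conservative: with 10,000 candidates, each one must reach p < 0.000005 rather than p < 0.05.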

Log-Dice is an effect size measure, so there is no real cut-off as you have with statistical measures like Log-Likelihood. Here's a really good paper describing ways to interpret the Log-Dice value:


Laurence.
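
For reference, the Log-Dice point above can be sketched in a few lines of Python. This uses the commonly cited definition logDice = 14 + log2(2·O / (f_node + f_collocate)); whether AntConc counts frequencies in exactly this way (for example, within a search window) is an assumption, and the counts are invented for illustration.

```python
import math


def log_dice(observed, f_node, f_coll):
    """logDice: 14 plus log2 of the Dice coefficient of the two words."""
    return 14 + math.log2(2 * observed / (f_node + f_coll))


# Hypothetical counts: the pair occurs 30 times; the words occur 1000 and 500 times.
score = log_dice(30, 1000, 500)
doubled = log_dice(60, 1000, 500)
print(round(score, 2), round(doubled, 2))
```

The value has a theoretical maximum of 14 (the two words always occur together), and a difference of one point corresponds to a pair co-occurring twice as often relative to the words' frequencies. This makes it a scale for comparing and ranking pairs rather than a significance test with a cut-off.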


