Keyword List with Lemmatized Entries


張盛傑

Apr 13, 2016, 3:06:05 AM
to AntConc-discussion
Dear Mr. Anthony (and whoever knows the answer to the question),

I am a research assistant at a university. I am currently working on texts from senior high school English textbooks and college textbooks, and I want to generate keyword lists for them. When I created keyword lists in AntConc, I found that words belonging to the same lemma (for instance, store/stores) appear as separate entries. It seems more reasonable to combine these entries so that each lemma gets a single keyness value; after all, they are forms of the same word. Is there any way to generate a keyword list composed of lemmatized entries? Thank you very much!

Laurence Anthony

Apr 13, 2016, 3:16:35 AM
to ant...@googlegroups.com
Hi,

The simple answer is "yes".

1) Load a lemma list (e.g., one from my website) into AntConc via the word list tool settings menu.
2) Load your reference corpus in the main window of AntConc.
3) Create a lemma frequency list for your reference corpus in the word list tool and save it.
4) Load that lemma frequency list into AntConc as a reference word list via the keyword tool settings menu.
5) Load your target corpus in the main window of AntConc.
6) Generate a lemma frequency list for your target corpus in the word list tool.
7) Switch to the keyword tool and create a keyword list. The result will be lemma keywords.
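
For readers who want to see what the steps above compute, here is a minimal Python sketch of the same idea: fold word forms into lemmas first, then compare lemma frequencies between a target and a reference corpus. It is an illustration under stated assumptions, not AntConc's implementation. The names `LEMMA_MAP`, `lemma_counts`, `log_likelihood`, and `lemma_keywords` are hypothetical, the tiny lemma map stands in for a real lemma list, and log-likelihood is used as one common keyness statistic, so the values need not match what AntConc reports.

```python
import math
import re
from collections import Counter

# Hypothetical lemma map (word form -> lemma); a real lemma list is far larger.
LEMMA_MAP = {"stores": "store", "stored": "store", "books": "book"}


def lemma_counts(text, lemma_map):
    """Lowercase the text, split it into alphabetic tokens, and count lemmas."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(lemma_map.get(tok, tok) for tok in tokens)


def log_likelihood(a, b, c, d):
    """Log-likelihood for frequency a in a target corpus of size c
    versus frequency b in a reference corpus of size d."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def lemma_keywords(target_text, reference_text, lemma_map):
    """Return (lemma, target_frequency, keyness) rows, highest keyness first."""
    target = lemma_counts(target_text, lemma_map)
    reference = lemma_counts(reference_text, lemma_map)
    c, d = sum(target.values()), sum(reference.values())
    if c == 0 or d == 0:
        return []
    rows = []
    for lemma, a in target.items():
        b = reference.get(lemma, 0)
        # Keep only lemmas that are relatively more frequent in the target.
        if a / c > b / d:
            rows.append((lemma, a, log_likelihood(a, b, c, d)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = lemma_keywords(
    "The store stores books in the store",
    "The book is on the table and the books are old",
    LEMMA_MAP,
)
for lemma, freq, keyness in rows:
    print(lemma, freq, round(keyness, 3))
```

Because the mapping happens before counting, store and stores contribute to a single entry with a single keyness value, which is the effect the lemma list has in the steps above.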

I hope that helps.

Laurence.


###############################################################
Laurence ANTHONY, Ph.D.
Professor
Center for English Language Education in Science and Engineering (CELESE)
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################

--
You received this message because you are subscribed to the Google Groups "AntConc-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to antconc+u...@googlegroups.com.
To post to this group, send email to ant...@googlegroups.com.
Visit this group at https://groups.google.com/group/antconc.
For more options, visit https://groups.google.com/d/optout.

張盛傑

Apr 13, 2016, 3:24:47 AM
to AntConc-discussion
Dear Mr. Anthony,

Thank you for your kind and prompt reply! I will give it a try.




張盛傑

Apr 19, 2016, 4:48:41 AM
to AntConc-discussion
Dear Prof. Anthony,

Right after I posted my question, I found the YouTube video "AntConc 3.2.4 Tutorial 10: Working with lemmas", which contains the information I needed. Sorry for not searching thoroughly first.

Thank you also for your quick answer. However, in the end I failed to generate the lemma keyword list I wanted. For the "reference corpus" in your step (2), I actually used the word lists on your AntConc page (such as the BNC word frequency lists), and I am not sure whether this makes a difference.

Also, I noticed that you used the Brown corpus in the video, but I could not find it on the Net. How can I get access to corpora such as the BNC or Brown in a form that AntConc can handle?

Thank you very much, and I am looking forward to your reply.




JFlorian

Apr 19, 2016, 5:28:11 AM
to ant...@googlegroups.com

張盛傑

Apr 21, 2016, 2:47:58 AM
to AntConc-discussion
Dear Prof. Anthony,

I have been trying the BNC corpus for the past two days, but I got stuck halfway.

First, the original corpus is in XML format. Information on the Net told me that I simply have to resave the XML files as txt files, without any other change, to make them usable in AntConc. However, the saved txt files clearly do not look like "natural texts." I am not sure whether this is what causes the failure.

Also, I noticed that the version of AntConc in the video is different from the one I am using, and I am not sure whether that makes any difference. I did modify the files in exactly the same way as you did in the video.

The failure occurred when I loaded the lemmatized BNC word list as my reference corpus. AntConc said "One of more of the rank column values in file BNC_lemma list.txt is not a number." But isn't the first column the rank column? I have made sure that this column consists entirely of numbers, yet the error still occurs.
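
One way to double-check the rank column outside AntConc is a short script like the sketch below. It assumes the word list is tab-separated with the rank in the first column, which is an assumption about the file rather than a documented AntConc requirement, and the function name `find_bad_rank_lines` is hypothetical. It flags any line whose first field is not made up purely of ASCII digits, which would catch things that are easy to miss by eye, such as a header row, a stray space, or an invisible byte order mark at the start of the file. Whether any of these is the actual cause of the error above is not confirmed.

```python
def find_bad_rank_lines(lines, sep="\t"):
    """Return (line_number, first_field) for every non-blank line whose
    first column is not a plain ASCII integer."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        first = line.rstrip("\r\n").split(sep)[0]
        if not (first.isascii() and first.isdigit()):
            bad.append((number, first))
    return bad


# Made-up sample data: a byte order mark, a header row, and a trailing space.
sample = [
    "\ufeff1\t100\tthe\n",
    "2\t90\tbe\n",
    "Rank\tFreq\tWord\n",
    "3 \t80\tof\n",
    "\n",
]
for number, field in find_bad_rank_lines(sample):
    print(number, repr(field))
```

To run it on a real file, the lines can be read with `open(path, encoding="utf-8")`, where `path` is the location of the saved word list.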

I would really appreciate your reply. Thank you!








qwerty asdf

Apr 21, 2016, 3:48:48 AM
to ant...@googlegroups.com

Dear 張盛傑

Maybe you can use an XML to plain text converter and then process the converted file with AntConc. But let's wait for Prof. Anthony's answer.

I just googled, and there are several converters. This is one of them: https://www.browserling.com/tools/xml-to-text
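
As an alternative to an online converter, a few lines of Python can strip the tags locally. The sketch below uses the standard library's `xml.etree.ElementTree` to collect the text between tags and drop the markup. The function name `xml_to_text` is hypothetical, the sample string is made up, and skipping an element called `teiHeader` is an assumption about how the corpus files store their metadata; adjust `skip_tags` to match the files you actually have.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_string, skip_tags=("teiHeader",)):
    """Return the text content of an XML document without any tags.
    Elements whose tag name is in skip_tags are left out entirely."""
    root = ET.fromstring(xml_string)
    parts = []

    def walk(elem):
        if elem.tag in skip_tags:
            return
        if elem.text:
            parts.append(elem.text)
        for child in elem:
            walk(child)
            # Text that follows the child's closing tag belongs to the parent.
            if child.tail:
                parts.append(child.tail)

    walk(root)
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join("".join(parts).split())


# Made-up sample in the style of a word-tagged corpus file.
sample = (
    "<bncDoc><teiHeader><title>Sample header</title></teiHeader>"
    '<wtext><s n="1"><w hw="the">The </w><w hw="store">stores </w>'
    '<w hw="be">were </w><w hw="open">open</w><c>.</c></s></wtext></bncDoc>'
)
print(xml_to_text(sample))
```

The result can be written to a .txt file and loaded into AntConc like any other plain text.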

best,

Pri


張盛傑

Apr 21, 2016, 5:39:46 AM
to AntConc-discussion
Hi Pri,

Thank you so much for your reply. I just tried the converter you provided, but the conversion seemed to fail and the converter said "This XML file does not appear to have any style information associated with it. The document tree is shown below."

I think the problem may lie in the fact that the XML files are tagged texts. Can they be used in AntConc without any change in format?

Thank you again, and looking forward to Prof. Anthony's answer.




qwerty asdf

Apr 21, 2016, 7:37:15 AM
to ant...@googlegroups.com

Dear 張盛傑,


That is only one option; I am sure there are others. Of course, you can just wait for Prof. Anthony's answer, or you can keep looking for the right converter.

Pri

Philip Riccobono

Jan 7, 2017, 3:06:37 AM
to AntConc-Discussion
Hello, 

I was wondering if I could follow steps 1-6 to lemmatize a keyword list I have already created. 

Thank you,

Philip

Philip R.

Jan 7, 2017, 3:51:54 AM
to ant...@googlegroups.com
Dear Dr. Anthony,

I figured it out and lemmatized a keyword list by comparing it to the lemma list on your site.

Thanks




--
Thank you,

Philip Riccobono

M.Ed.
MA History

Laurence Anthony

Jan 7, 2017, 10:18:42 PM
to ant...@googlegroups.com
Great!

Laurence.


Philip R.

Jan 7, 2017, 10:26:29 PM
to ant...@googlegroups.com
Thanks again. The only issue is that the lemma words on the right were not numbered by occurrence as they are in your video.

Best,

Philip



Sidney Abramson

Mar 5, 2024, 4:20:12 PM
to AntConc-Discussion
Dear Mr. Anthony,
I am a student researching how various themes of classical anarchy have evolved in the lyrics of punk music, and I want to load one of your lemma lists to streamline my research process. However, in the version of the software that I am using (4.2.4 MacOS), the word tool settings don't allow me to load lemma lists. Is there a way to do so in the current version? Thank you!

WANG CAN

Jun 23, 2024, 2:16:12 PM
to AntConc-Discussion
I have the same problem as you, dear Sidney Abramson. Have you found a solution?


WANG CAN

Jun 25, 2024, 3:22:46 AM
to ant...@googlegroups.com
Thank you very much.

