The Setting of Contraction in AntConc


Nur Rizka Kadir

Nov 12, 2024, 4:07:32 AM
to AntConc-Discussion
Dear Prof. Anthony and Group Members,

I have a project analyzing slang usage in song lyrics, and I am wondering whether there is a setting in AntConc that ensures the contractions found in the lyrics (see the attached screenshots) do not affect the word frequency results and other analyses. I hope to get an answer from you or anyone here who is more experienced than I am.

Kind Regards,
Rizka
Attachments: Screenshot (509).png, Screenshot (510).png

Laurence Anthony

Nov 12, 2024, 9:50:40 AM
to ant...@googlegroups.com
Hi Rizka,

For this kind of work, I would recommend that you POS tag your data at the outset, e.g. using my TagAnt tool, and then load that into AntConc using the simple_word_pos_headword indexer in the Corpus Manager. If you do this, all the contractions would be treated systematically according to the POS tags.

I hope that helps!

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################


Nur Rizka Kadir

Nov 12, 2024, 10:44:33 AM
to ant...@googlegroups.com

Thank you so much for the feedback. I'll make sure to follow these instructions and will come back with more questions if anything does not work as expected.


Nur Rizka Kadir

Nov 13, 2024, 2:19:36 AM
to ant...@googlegroups.com
I have followed your suggestion, but I am still confused by the word frequency results (attached below). I would appreciate your insight into whether they are correct or whether I made a mistake somewhere in the process.

Attachment: Screenshot (511).png

Laurence Anthony

Nov 13, 2024, 4:08:05 AM
to ant...@googlegroups.com
Hi,

The results look a little odd to me. Why is "the" so infrequent? However, the tokenization does appear to be working correctly, with "n't" treated as an independent word.

Laurence.
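
For readers following along, here is a minimal Python sketch of what it means for "n't" to be counted as an independent word. It only illustrates the general idea, loosely in the style of Penn Treebank tokenization. It is not TagAnt's or AntConc's actual code: the patterns `TOKEN_RE` and `CLITIC_RE` and the helper names are assumptions made up for this example, and only straight apostrophes are handled.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only; a real tagger uses its own rules.
# A raw token is a run of letters (optionally with internal apostrophes)
# or any single non-space, non-letter character such as punctuation.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*|[^\sA-Za-z]")
# Clitics to split off the end of a word: n't, 's, 're, 've, 'll, 'd, 'm.
CLITIC_RE = re.compile(r"(n't|'(?:s|re|ve|ll|d|m))$", re.IGNORECASE)


def split_contraction(token):
    """Split a trailing clitic from its host word, e.g. don't -> do + n't."""
    match = CLITIC_RE.search(token)
    if match and match.start() > 0:
        return [token[:match.start()], match.group(1)]
    return [token]


def tokenize(text):
    """Return a flat token list with contractions split into two tokens."""
    tokens = []
    for raw in TOKEN_RE.findall(text):
        tokens.extend(split_contraction(raw))
    return tokens


if __name__ == "__main__":
    lyric = "I don't know, she's gone and I don't care"
    counts = Counter(token.lower() for token in tokenize(lyric))
    print(counts.most_common(5))
```

With this kind of splitting, "don't" contributes one count to "do" and one to "n't", which is why "n't" can appear as its own row in a word list.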

Nur Rizka Kadir

Nov 13, 2024, 4:21:54 AM
to ant...@googlegroups.com
I think the lyrics I used as data include commas, as in "(yeah, yeah)", hyphens, as in "Ah-ah-ah", and parentheses, as in "(Popular)". Because lyrics repeat these elements so often, they end up at the top of the word frequency list. My question is: do I have to delete these elements, or is it fine to keep them in my corpus data?

Laurence Anthony

Nov 13, 2024, 5:10:42 AM
to ant...@googlegroups.com
This is a good question. My suggestion would be to use the Global Settings->Filters option to ignore any 'words' that you don't want to include, like the punctuation here.

Laurence.
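
As a rough illustration of what such a filter does, the Python sketch below drops punctuation-only 'words' and a user-supplied exclusion list before counting. It mimics the effect outside AntConc and does not reproduce AntConc's own filter implementation; the function names `is_wordlike` and `filtered_frequencies` and the sample tokens are hypothetical.

```python
from collections import Counter


def is_wordlike(token):
    """Keep a token only if it contains at least one letter or digit."""
    return any(ch.isalnum() for ch in token)


def filtered_frequencies(tokens, extra_excluded=()):
    """Count tokens case-insensitively, skipping punctuation and excluded items."""
    excluded = {item.lower() for item in extra_excluded}
    return Counter(
        token.lower()
        for token in tokens
        if is_wordlike(token) and token.lower() not in excluded
    )


if __name__ == "__main__":
    # Sample tokens assumed for illustration, echoing the lyrics discussed above.
    tokens = ["(", "yeah", ",", "yeah", ")", "Ah", "-", "ah", "n't", "the"]
    print(filtered_frequencies(tokens, extra_excluded=["ah"]))
```

Filtering at count time, rather than deleting the punctuation from the source files, leaves the corpus intact, so the exclusion list can be changed later without re-tagging.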

Nur Rizka Kadir

Nov 13, 2024, 6:01:37 AM
to ant...@googlegroups.com
Thank you for this guidance. I created several lists of 'words' to exclude, which gave a clearer result than the previous version I attached.

Attachment: Screenshot (513).png

Laurence Anthony

Nov 13, 2024, 8:27:48 AM
to ant...@googlegroups.com