AntConc 4.3.1 Contractions

Yedo Ji

Aug 7, 2025, 9:06:22 PM
to AntConc-Discussion
Hi Dr. Anthony,

I hope all is well.

I am using AntConc 4.3.1 and the data I have include multiple contractions, such as couldn't, I'm, don't, etc.

I found that AntConc, with its default settings, treats whatever comes before and after the ' (apostrophe) as separate tokens, which means it counts not and n't as different tokens. Is there a way to make it count n't as the same word as not in a wordlist?

Thank you very much for your contributions to the field!

Best,
Yedo

Laurence Anthony

Aug 7, 2025, 9:12:03 PM
to ant...@googlegroups.com
Hi Yedo,

With the default settings, "don't" will be treated as "don" and "t" in the list. I recommend either of the following:

1) Use a tokenization tool (e.g. my TagAnt tool) to properly segment the raw texts. Then, load the texts into AntConc using the Corpus Manager with the "simple word, tag, headword" indexer turned on. This will then treat all tokenized words separately. I really should add this functionality directly into AntConc, but the models needed to tokenize different languages are quite big, so it would increase the size of the app considerably.

2) Edit the default token definition in AntConc to include the apostrophes.

Option 1) is the better choice because it is the most complete. However, option 2) is simple and easy to do, and the results are very transparent.
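
To illustrate the difference a token definition makes, here is a minimal Python sketch. It is not AntConc's actual implementation: the two regular expressions and the word_list helper are hypothetical stand-ins, assumed only for illustration, for a token definition without and with the apostrophe.

```python
import re
from collections import Counter

# Hypothetical stand-ins for a token definition (assumptions for illustration;
# AntConc's real settings and internals may differ).
LETTERS_ONLY = re.compile(r"[A-Za-z]+")            # apostrophe is NOT a token character
LETTERS_AND_APOSTROPHE = re.compile(r"[A-Za-z']+")  # apostrophe IS a token character


def word_list(text, pattern):
    """Count lower-cased tokens matched by the given pattern."""
    return Counter(m.group(0).lower() for m in pattern.finditer(text))


sample = "I don't know why they couldn't come. I'm not sure."

default_counts = word_list(sample, LETTERS_ONLY)
apostrophe_counts = word_list(sample, LETTERS_AND_APOSTROPHE)

# Without the apostrophe, "don't" is split into "don" and "t".
print(default_counts["don"], default_counts["t"])

# With the apostrophe, "don't" stays together as a single token.
print(apostrophe_counts["don't"], apostrophe_counts["not"])
```

Note that in this sketch, adding the apostrophe keeps "don't" whole, but it is still counted as its own word and not under "not".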

I hope that helps!

Laurence.



###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Yedo Ji

Aug 12, 2025, 7:01:09 PM
to ant...@googlegroups.com
Thank you. I successfully managed to tag the texts I have with TagAnt and am now able to load them in AntConc.

I have three follow-up questions:

1. Can I make AntConc differentiate 's_POS and 's_VBZ in a keyword list?
2. AntConc still calculates the frequency of not and n't separately. They show up as two different words in a wordlist. Is there a way I can make it treat them as one word?
3. Where can I change the token definition setting? (Although I won't need this function because I chose to do Option 1 you suggested.)

Thank you always for your help!

Best,
Yedo
