Problem with possessives


Amal Alzahrani

Jun 14, 2026, 2:43:06 AM
to AntConc-Discussion
Dear Dr. Anthony, 
I am having trouble with the software. In my study, I am trying to extract 4-word lexical bundles. However, the software counts the possessive "s" as a separate token. I searched but can't find the "token definition" setting to fix the problem. Is there a solution, or do I need to handle every possessive manually by checking the KWIC results?

My AntConc version is 4.3.1

Many thanks

Laurence Anthony

Jun 14, 2026, 9:12:22 PM
to ant...@googlegroups.com
Hi Amal,

I think you have two options:

1) I recommend you use a POS tagger to tag your data and then use the simple_word_pos_headword indexer in AntConc's corpus manager to split up the words.
2) When building a corpus without tagged data, you can edit the token definition in the corpus manager and set the apostrophe as a token character. If you do this, the words will no longer split in the results. However, every occurrence of an apostrophe will then be treated as part of a word, which may not be what you want.

I generally recommend option 1.
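
To illustrate the trade-off in option 2, here is a minimal Python sketch. It is not AntConc's actual tokenizer; the `tokenize` function and the character classes it uses are assumptions made purely for illustration.

```python
import re

def tokenize(text, token_chars="A-Za-z"):
    """Split text into tokens made only of the given characters.

    token_chars is the body of a regex character class. This is a
    hypothetical stand-in for a "token definition", not AntConc code.
    """
    pattern = re.compile(f"[{token_chars}]+")
    return pattern.findall(text.lower())

sentence = "The study's results don't match 'earlier' work."

# Letters only: the apostrophe acts as a separator, so "study's"
# splits into "study" and "s".
default_tokens = tokenize(sentence)

# Apostrophe added as a token character: possessives stay whole,
# but quotation marks also attach to words ("'earlier'").
apostrophe_tokens = tokenize(sentence, token_chars="A-Za-z'")

print(default_tokens)
print(apostrophe_tokens)
```

The second call keeps "study's" and "don't" intact, but the single quotes around "earlier" also become part of that token, which is the side effect described in option 2.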

I hope that helps.

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Amal Alzahrani

Jun 15, 2026, 10:23:09 PM
to AntConc-Discussion
Thank you so much for your response. 

I used TagAnt. However, when I loaded the output in AntConc, I got results containing numbers, commas, and other characters. The TagAnt output showed that POS tagging helps identify possessive forms more clearly, but it also produced additional punctuation- and number-based n-grams that required filtering. Is there any way to fix this? Or do you think I should write in my methodology that the tagged output was used specifically to check the possessive issue, while irrelevant n-grams containing punctuation, numerical references, and citation codes were excluded during manual cleaning?

Laurence Anthony

Jun 15, 2026, 11:13:42 PM
to ant...@googlegroups.com
Hi,

If you use the latest version of AntConc, you don’t need to manually edit the list. Those punctuation-based n-grams will be filtered out automatically.
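
For anyone on an older version who has to filter the list by hand, the idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not how AntConc filters n-grams internally; the `WORD` pattern, the `ngrams` and `keep_lexical` helpers, and the sample tokens are all hypothetical.

```python
import re

# Assumption: a "lexical" token is alphabetic, optionally with one
# internal apostrophe part (e.g. "study's"). Adjust to your own data.
WORD = re.compile(r"^[A-Za-z]+(?:'[A-Za-z]+)?$")

def ngrams(tokens, n=4):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def keep_lexical(grams):
    """Keep only n-grams in which every token looks like a word."""
    return [g for g in grams if all(WORD.match(t) for t in g)]

# Hypothetical token stream containing a number and a comma.
tokens = ["as", "shown", "in", "table", "3", ",",
          "the", "results", "of", "the", "study"]

all_grams = ngrams(tokens)
clean = keep_lexical(all_grams)

for gram in clean:
    print(" ".join(gram))
```

Any 4-gram that includes the number or the comma is dropped, so only sequences made entirely of word tokens remain.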

Laurence 


Amal Alzahrani

Jun 16, 2026, 7:57:14 PM
to AntConc-Discussion
Thank you so much! I am very excited to try the new version of AntConc. 