Searching for hyphens in AntPConc

Alex Lakaw

May 12, 2016, 5:10:20 AM
to AntConc-discussion
Dear Mr. Anthony,

At my university, we are using AntPConc for research with aligned corpora. At the moment, we are interested in hyphenated words, and we would like to search for the hyphen character itself to retrieve all hyphenated words in the corpora. Is there a way to search for hyphens alone in AntPConc? We tried the usual search terms, such as "-" and *-*, but none of them seems to work.

Any help would be much appreciated.

Many thanks,

Alex

Laurence Anthony

May 12, 2016, 9:57:14 PM
to ant...@googlegroups.com
Hi Alex,

I just tested AntPConc here, and *-* worked fine.

[Inline image 1: screenshot]

Laurence.




###############################################################
Laurence ANTHONY, Ph.D.
Professor
Center for English Language Education in Science and Engineering (CELESE)
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################

Alex Lakaw

May 16, 2016, 2:38:21 AM
to AntConc-discussion
Hi again Laurence,

Unfortunately, *-* does not work, neither for me nor for my colleagues. We can search for, e.g., middle-*, *-class, or back-* and get results, but the string *-* does not work.

We use the same version of AntPConc as you and have the same settings (judging by your screenshot). The files are aligned and work fine with other searches, just not the hyphen search.

Is there a way to define characters as words in AntPConc, as there is in AntConc? This might solve the problem, but I am not sure where to define things in AntPConc...

Very many thanks for your help!

Best, Alex

Laurence Anthony

May 16, 2016, 4:00:04 AM
to ant...@googlegroups.com
Hi,

Can you create a file called "test.txt" with the following content:
"This a hyphen-test file that contains a hyphen test and nothing else."

Then, load this as the target and reference corpus in AntPConc and search for "*-*". Let me know what results you get.
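If it is easier than using a text editor, the test file could be generated with a short script. This is only a sketch: the function name `write_test_file` is made up for illustration, and it assumes that the quotation marks are not part of the file content, that a trailing newline is harmless, and that UTF-8 is an acceptable encoding for AntPConc (none of this is stated in the thread).

```python
from pathlib import Path

# The sentence is copied verbatim from the message above.
# Assumption: the surrounding quotation marks are not part of the file.
TEST_SENTENCE = "This a hyphen-test file that contains a hyphen test and nothing else."


def write_test_file(path="test.txt"):
    """Write the one-line test corpus and return its path (hypothetical helper)."""
    target = Path(path)
    # Assumption: UTF-8 with a trailing newline is fine for loading into AntPConc.
    target.write_text(TEST_SENTENCE + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    created = write_test_file()
    print(created.read_text(encoding="utf-8"))
```

The same file can then be loaded as both the target and the reference corpus, as described above.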

Unfortunately, AntPConc doesn't let you set your own token definition. It uses the Unicode Letter and Mark classes internally: [\p{L}\p{M}]+
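To see what that character class covers, here is a rough Python sketch. It is not AntPConc's code: Python's built-in `re` module has no `\p{L}` syntax, so the sketch checks Unicode general categories through `unicodedata` instead, and the names `is_token_char` and `tokenize` are made up for illustration. A hyphen belongs to neither class, so under this definition alone "hyphen-test" would come out as two tokens. How AntPConc applies wildcard searches on top of this definition is not stated here.

```python
import unicodedata
from itertools import groupby


def is_token_char(ch):
    """True if ch is in a Unicode Letter (L*) or Mark (M*) category."""
    return unicodedata.category(ch)[0] in ("L", "M")


def tokenize(text):
    """Return maximal runs of Letter/Mark characters, mimicking [\\p{L}\\p{M}]+."""
    return [
        "".join(run)
        for is_token, run in groupby(text, key=is_token_char)
        if is_token
    ]


sentence = "This a hyphen-test file that contains a hyphen test and nothing else."
tokens = tokenize(sentence)
print(tokens)
# The hyphen is punctuation (category Pd), so no token contains it.
print(any("-" in token for token in tokens))  # False
```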

I hope that helps.

Laurence.

