Materials for writing regex


Juli Sarris

Apr 1, 2023, 6:23:37 AM
to AntConc-Discussion
Laurence, hi, I appreciate all the work you do to support us, particularly those of us who are working in under-resourced countries/universities - providing your tools at no charge is really such a public service.  Thank you!

I have been able to create my own corpus (I used AntFileConverter to prepare the files) and POS tag it using TagAnt.  Yay me!

However, to search my corpus in AntConc I need to write a regex, yes? Please note that I am a teacher of English. I am not a programmer. Can you or anyone point me in the direction of a very simple guide with examples of simple searches (e.g. search for water as a noun vs. water as a verb, or search for all infinitives in the corpus)? Or am I overthinking all of this?

Thanks so much for your help!!
Dr. Juli Sarris

Laurence Anthony

Apr 2, 2023, 9:53:24 PM
to ant...@googlegroups.com
Hi Juli,

You don't need to use regex to search your corpus (although it is an option). The easiest way to search your corpus is to use the same strategies that you use in an Internet search engine like Google, using so-called "wildcards". The most common wildcard is *, which you can use to search for variations of a word:
dog* ~> will match "dog", "dogs", and any other word beginning with "dog"

Please check the various wildcards available in the global settings menu. 

To search for nouns, you can use a search like the following:
*_N*

This means any word that is followed by a tag starting with "N".
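
For readers who want to see what such a wildcard search does, here is a minimal Python sketch. It is not how AntConc is implemented, and no code is needed to run the search in AntConc itself. It assumes the corpus is in word_TAG format with Penn Treebank-style tags; the sample sentence and the function name wildcard_search are made up for illustration, and Python's fnmatch wildcards may differ in detail from AntConc's.

```python
from fnmatch import fnmatchcase

# Hypothetical sample in word_TAG format (assumed Penn Treebank-style tags).
tagged_text = "I_PRP water_VBP the_DT plants_NNS with_IN water_NN"

def wildcard_search(text, pattern):
    """Return the whitespace-separated tokens that match a wildcard pattern.

    '*' stands for zero or more characters, as in a shell-style wildcard.
    """
    return [token for token in text.split() if fnmatchcase(token, pattern)]

# Any word whose tag starts with "N" (nouns).
nouns = wildcard_search(tagged_text, "*_N*")

# The word "water" tagged as a noun vs. tagged as a verb.
water_as_noun = wildcard_search(tagged_text, "water_N*")
water_as_verb = wildcard_search(tagged_text, "water_V*")

print(nouns)
print(water_as_noun)
print(water_as_verb)
```

The same idea covers the "water as a noun vs. water as a verb" question: fix the word and vary only the tag part of the pattern.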

I hope that helps!

Laurence.

###############################################################
Laurence ANTHONY, Ph.D.
Professor of Applied Linguistics
Faculty of Science and Engineering
Waseda University
3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
E-mail: antho...@gmail.com
WWW: http://www.laurenceanthony.net/
###############################################################



Juli Sarris

Apr 7, 2023, 4:53:31 AM
to ant...@googlegroups.com
Laurence, hi, thanks for the quick response. I guess this leads to another question, though. When I search the internet for support, I find advice like "use this example regex code to search for past tense verbs", and it's [a-zA-Z]*ed\b (which isn't even accurate, because it won't find irregulars). And you're telling me here that I could instead use something simple like *_VBD (I'm assuming we're all using the UPenn Treebank tagging codes). And I'm wondering if other people are just messing with my head or something? *sigh*
Dr. Juli Sarris


Laurence Anthony

Apr 7, 2023, 4:56:16 AM
to ant...@googlegroups.com
Hi Juli,

The Internet quote you sent is for a very simplistic example of searching for past tense when the data is not tagged. With tagged data, you don't need to use regex. A wildcard approach would work fine.
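
The difference between the two approaches can be sketched in Python. This is an illustration only, not AntConc's implementation: the sample sentence is made up, the tags are assumed to be Penn Treebank-style in word_TAG format, and Python's fnmatch stands in for AntConc's wildcard matching.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical sample sentence, untagged and tagged (word_TAG format assumed).
untagged = "She walked home and ate the red apple"
tagged = "She_PRP walked_VBD home_NN and_CC ate_VBD the_DT red_JJ apple_NN"

# Regex on untagged text: any run of letters ending in "ed".
# It picks up "red" (not a verb) and misses the irregular form "ate".
regex_hits = re.findall(r"[a-zA-Z]*ed\b", untagged)

# Wildcard on tagged text: any word carrying the past-tense tag VBD.
# It relies on the tagger, so irregular forms are found as well.
tag_hits = [token for token in tagged.split() if fnmatchcase(token, "*_VBD")]

print(regex_hits)
print(tag_hits)
```

The regex guesses at past tense from spelling alone, while the wildcard search reads the answer from the tags, which is why the simpler pattern works on tagged data.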

Laurence.


Juli Sarris

Apr 14, 2023, 4:58:23 AM
to ant...@googlegroups.com
Now I get it. Thank you!!!
Dr. Juli Sarris
