Hi everyone,
I've been using AntWordProfiler with the default token definition settings so far. Now I have a few texts in which words are split by hyphens (e.g., "unfortun-ately"). So that the token count isn't thrown off, I'd like the hyphen to count as a token character too. However, since I've used the default settings so far and want the analyses to stay as comparable as possible, I'd like to add the hyphen to the regular expression used for the default settings: (?<![\p{N}\p{L}])\p{L}+[\p{N}]*
How would I do that? Unfortunately, that exceeds my very basic knowledge of regular expressions.
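To show what I mean, here is a rough sketch of how I understand the default pattern to behave on such a word. It is only an approximation: Python's built-in `re` module has no `\p{L}` or `\p{N}` classes, so I'm assuming `[^\W\d_]` as a stand-in for letters and `\d` for numbers, and the `tokenize` helper is just a made-up name for illustration, not anything from AntWordProfiler itself.

```python
import re

# Approximation of the default pattern (?<![\p{N}\p{L}])\p{L}+[\p{N}]*
# Assumption: [^\W\d_] stands in for \p{L} and \d for \p{N}, because the
# standard re module does not support \p{...} property classes.
DEFAULT_TOKEN = re.compile(r"(?<![^\W_])[^\W\d_]+\d*")


def tokenize(text):
    """Return every token the approximated default pattern finds in text."""
    return DEFAULT_TOKEN.findall(text)


# The hyphen is not part of the pattern, so the word is cut in two.
print(tokenize("unfortun-ately"))
print(tokenize("unfortunately"))
```

With this approximation, the hyphenated form is counted as two tokens while the unhyphenated form is counted as one, which is the mismatch I'd like to avoid.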
Thanks a lot in advance!
Kind regards,
Theresa