Include apostrophes/hyphens and hide unused wordlists

Destiny Woods

Jun 2, 2018, 11:52:51 AM

to AntWordProfiler-Discussion

First off, I wanted to express my appreciation for this program. I've been creating a synthetic-phonics based reading curriculum (with a few other specific parameters), which entails over 160 different lessons, each with an associated wordlist added to the total words that the student can decode. Last week I was desperately wishing for a way to take a quote or a poem and automatically compare it against the wordlists, to see at which point a student would have mastered every word in the passage. I knew there had to be something out there, and was overjoyed to discover AntWordProfiler would let me do this - and that it ran on Linux! (I keep Windows VMs around, but native programs are always best if possible.) Even better, it could tell me which words I had forgotten to add to the lists somewhere, which is so helpful. :D

There are only two things I wonder how to do - if it's possible:

1) How do I include words such as "can't" and "don't" in the lists? My version seems to treat each as two separate words ("can" and "t", "don" and "t"), which is not so helpful. The same goes for hyphenated words. I'd like to be able to treat them as a single word. I found several questions about this on this list, but all seemed to be about encoding. By default every text file I create is UTF-8, so I haven't seen any weird characters; it just doesn't treat them as a single word.

2) Is there a way to hide unused wordlists in the results? Currently it spits back a result for every one of the 160 wordlists, when a single sentence couldn't possibly have words in every one of the lists, resulting in empty headers for each list that wasn't used. Since I'm mainly using this to find the latest wordlist that a word appears in, simply hiding any wordlist that wasn't used at all would make it a ton easier to find the last wordlist used.

Laurence Anthony

Jun 2, 2018, 5:31:10 PM

to antword...@googlegroups.com

Hi Destiny,

I'm glad you are finding my software to be useful.

1) How do I include words such as "can't" and "don't" in the lists? My version seems to treat each as two separate words ("can" and "t", "don" and "t"), which is not so helpful. The same goes for hyphenated words. I'd like to be able to treat them as a single word. I found several questions about this on this list, but all seemed to be about encoding. By default every text file I create is UTF-8, so I haven't seen any weird characters; it just doesn't treat them as a single word.

To do this, just edit the token definition in the settings menu. You will need to use a regular expression that matches the characters that will be accepted as part of a token. [a-zA-Z'] would be a simple example for just letters and an apostrophe.
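To see what a change like that does, here is a minimal sketch using Python's `re` module. It is not AntWordProfiler's own code, and the `tokenize` helper and sample sentence are made up for illustration. It treats a token as a run of one or more characters from the given class, and compares a letters-only class with one that also accepts an apostrophe and a hyphen (the hyphen is added here as an assumption, to cover the hyphenated-word case).

```python
import re


def tokenize(text: str, token_chars: str) -> list[str]:
    """Return every run of one or more characters drawn from token_chars.

    token_chars is a regex character class such as "[a-zA-Z']".
    This is a hypothetical helper, not part of AntWordProfiler.
    """
    return re.findall(f"{token_chars}+", text)


sample = "I can't find the well-known poem."

# Letters only: the apostrophe and hyphen act as word breaks.
letters_only = tokenize(sample, "[a-zA-Z]")

# Letters plus apostrophe and hyphen (a hyphen placed last in the
# class is taken literally rather than as a range).
with_apostrophe_hyphen = tokenize(sample, "[a-zA-Z'-]")

print(letters_only)
print(with_apostrophe_hyphen)
```

With the letters-only class, "can't" comes out as "can" and "t"; with the wider class it stays as one token. Note that a typographic apostrophe (’) is a different character from the plain one, so it would need to be added to the class separately if your texts use it.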

2) Is there a way to hide unused wordlists in the results? Currently it spits back a result for every one of the 160 wordlists, when a single sentence couldn't possibly have words in every one of the lists, resulting in empty headers for each list that wasn't used. Since I'm mainly using this to find the latest wordlist that a word appears in, simply hiding any wordlist that wasn't used at all would make it a ton easier to find the last wordlist used.

Unfortunately not. However, if you load the results into a text editor, it should be easy to quickly search and delete the headers that are empty.
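If doing that by hand gets tedious, the clean-up could be scripted. The following is only a rough sketch in Python under an assumed layout: it supposes the saved results put each list's header on its own line, followed by that list's words, with a blank line between lists. The real results layout may well differ, so the splitting rule, the `drop_empty_sections` name, and the sample text are all hypothetical and would need adjusting.

```python
import re


def drop_empty_sections(text: str) -> str:
    """Remove sections that consist of a header line and nothing else.

    Assumed (hypothetical) layout: sections are separated by blank
    lines, and the first line of each section is its header.
    """
    # Split on blank lines and discard fragments that are only whitespace.
    sections = [s.strip("\n") for s in re.split(r"\n\s*\n", text) if s.strip()]
    # Keep a section only if something follows its header line.
    kept = [s for s in sections if len(s.splitlines()) > 1]
    return "\n\n".join(kept) + "\n"


# Made-up sample in the assumed layout; "LIST 2" has no words under it.
sample = "LIST 1\ncat\nhat\n\nLIST 2\n\nLIST 3\ncan't\n"
cleaned = drop_empty_sections(sample)
print(cleaned)
```

After filtering, the last section left in the output is the latest wordlist that was actually used.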

I hope that helps.

Laurence.

Destiny Woods

Jun 2, 2018, 7:07:30 PM

to AntWordProfiler-Discussion

To do this, just edit the token definition in the settings menu. You will need to use a regular expression that matches the characters that will be accepted as part of a token. [a-zA-Z'] would be a simple example for just letters and an apostrophe.

Hmm, I don't have an option to do that in the settings menu. The settings menu has two options: Global Settings and Thesaurus Settings. If I choose Global Settings (I presume that Thesaurus Settings won't be very useful here), I get a dialog box with File Settings, Tag Settings, and Color Settings. File Settings only has something about showing full pathnames and whether to load internal level lists or not. Tag Settings lets me show or hide angle tags (I have absolutely no idea what those are or what that does). Color Settings looks like it's meant for choosing colors to highlight words (I have no idea how to make that work either, lol, not that it's relevant to my problem at the moment). There's absolutely nowhere that even mentions tokens. Is it because I'm running a Linux version?
