Hello all.
I have a question about calculating word frequency. We work with participants with aphasia, who often make errors. When they do, we put the intended word in [: target] if we know what the intention was. However, I do not want [: target] words counted in the frequency tally. For example, if someone said furry [: fairy] in one instance, and I am looking for a frequency count of the correctly spoken 'fairy,' I want the count for 'fairy' to be 0, ignoring the word in the target. I'd also like to count lemmas rather than inflected forms. In other words, if I'm looking for "stair," I want 'stairs' to be counted toward the frequency of 'stair.'
On the attached transcript [completely made up, by the way], it evaluates the %mor line but doesn't ignore the [: target] words like I thought it would. It does correctly count 'stair' even though the participant said 'stairs,' because it works from the %mor line. The frequency output for this command was:
Cinderella: 1
stair: 1
fairy: 1
However, as I said, I wouldn't want the incorrect furry [: fairy] to count. So, I tried:
freq -sm** -sm@* +t*PAR +sCinderella +sstair +sfairy
Now that I've told CLAN to stick to the speaker tier, it ignores 'stair' because 'stairs' was written, which isn't what we want. However, it correctly does not look inside the [: target] and correctly reports that 'fairy' was said 0 times. As an aside, I've also found that when I run the above command on transcripts, it sometimes gets the counts wrong. For this command, I get:
Cinderella: 1
stair: 0
fairy: 0
So basically, is there any way to tell CLAN to run the frequency analysis on the %mor tier [specifically, for lemmas] while ignoring [: target] words on the speaker tier?
In an ideal world, from the attached transcript, I'd be getting the frequency counts as:
Cinderella: 1
stair: 1
fairy: 0
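To make the counting rule concrete, here is a rough Python sketch of the logic I have in mind. It is not CLAN and not how FREQ works internally. The function name lemma_counts and the sample tiers are made up for illustration (they are not the attached transcript), and it assumes each spoken word lines up with one %mor token, or with as many tokens as there are words inside the [: target]:

```python
import re

# A spoken word, optionally followed by a "[: target]" replacement.
WORD_WITH_OPTIONAL_TARGET = re.compile(r"(\S+)(?:\s+\[:\s+([^\]]+)\])?")
PUNCTUATION = {".", "?", "!"}


def lemma_counts(main_line, mor_line, lemmas):
    """Count lemmas from the %mor tier, skipping words that carry a [: target]."""
    main_text = main_line.split(":", 1)[1]  # drop the "*PAR:" prefix
    # Keep only %mor items of the form "pos|stem..."; this drops punctuation.
    mor_tokens = [t for t in mor_line.split(":", 1)[1].split() if "|" in t]
    counts = {lemma: 0 for lemma in lemmas}
    position = 0
    for match in WORD_WITH_OPTIONAL_TARGET.finditer(main_text):
        word, target = match.group(1), match.group(2)
        if word in PUNCTUATION:
            continue
        # Assumption: a replaced word uses one %mor token per target word.
        width = len(target.split()) if target else 1
        for token in mor_tokens[position:position + width]:
            # "n|stair-PL" -> "stair"; "v|go&PAST" -> "go"
            lemma = re.split(r"[-&]", token.split("|", 1)[1])[0]
            if target is None and lemma in counts:
                counts[lemma] += 1
        position += width
    return counts


# Hypothetical tiers, invented for this example.
main = "*PAR:\tCinderella went down the stairs with the furry [: fairy] ."
mor = "%mor:\tn:prop|Cinderella v|go&PAST adv|down det:art|the n|stair-PL prep|with det:art|the n|fairy ."
print(lemma_counts(main, mor, ["Cinderella", "stair", "fairy"]))
# {'Cinderella': 1, 'stair': 1, 'fairy': 0}
```

The sketch does not handle clitics, compounds, retracings, or other CHAT codes, so it only shows the rule I am after: take the lemma from %mor, but skip any word that has a [: target] on the speaker tier.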
Thank you very much,
Brie