TTR differences between EVAL and FREQ


Veronica Fletcher

Mar 14, 2025, 3:06:11 PM
to chibolts
Hi all,
Our lab is noticing that we receive different Type and Token output values, depending on whether our transcripts are run through EVAL (which I believe uses FREQ counts to calculate TTR?) versus run directly through the FREQ command.

Here is some sample data from Transcript A.  The same file was used for both analyses:
FREQ_types (from EVAL): 166
FREQ_tokens (from EVAL): 596
Types (from FREQ): 189
Tokens (from FREQ): 593

What could be accounting for these differences? Apologies if this is a silly question - I could not find anything in the CLAN manual that would explain this.

Veronica
The Aphasia Network Lab, Northeastern University

Leonid Spektor

Mar 14, 2025, 3:23:20 PM
to chib...@googlegroups.com
Hi Veronica,

EVAL counts words on the %mor tier, while the default FREQ command counts words on the speaker tier. If your data contains a lot of contractions, the numbers will differ. For example, FREQ counts the word (can't) as 1 word. But because this word is represented on the %mor tier as (can) and (not), EVAL counts it as 2 words.

You can see the difference if you run the FREQ command on the %mor tier, for example with the command "freq +t%mor ..."
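
To make the contraction effect concrete, here is a rough Python sketch of the two counting strategies. It is only an illustration, not CLAN's actual algorithm: the function names, the sample utterance, and the sample %mor line are made up, and it assumes the convention in which a clitic on the %mor tier is joined to its host with a tilde (~).

```python
# Toy illustration only; CLAN's real tokenization rules are more involved.
PUNCTUATION = {".", "?", "!"}

def count_main_tier_tokens(utterance: str) -> int:
    """Count whitespace-separated words on a speaker tier, skipping end punctuation."""
    return len([w for w in utterance.split() if w not in PUNCTUATION])

def count_mor_tier_tokens(mor_line: str) -> int:
    """Count words on a %mor line, treating each ~-joined clitic as its own word."""
    items = [m for m in mor_line.split() if m not in PUNCTUATION]
    return sum(len(item.split("~")) for item in items)

# Hypothetical utterance and a hand-written %mor line for it (assumed format).
main_tier = "I can't go ."
mor_tier = "pro:sub|I mod|can~neg|not v|go ."

print(count_main_tier_tokens(main_tier))  # can't counted once
print(count_mor_tier_tokens(mor_tier))    # can and not counted separately
```

In this sketch the speaker tier yields 3 tokens and the %mor tier yields 4, which is the same kind of small token gap as 593 versus 596.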


Leonid.

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/chibolts/5dcaff54-e165-4563-b9de-45a2f1936c8bn%40googlegroups.com.

Veronica Fletcher

Mar 14, 2025, 3:44:37 PM
to chibolts
Hi Leonid,
Thanks for your response. I re-ran the file using FREQ on the %mor tier. This resolved the issue with # Tokens. However, I am still receiving a higher Types count with freq +t%mor (204) compared to EVAL (166). Any thoughts? Updated numbers below.

FREQ_types (from EVAL): 166
FREQ_tokens (from EVAL): 596
Types (from FREQ %mor): 204
Tokens (from FREQ %mor): 596

Types (from FREQ): 189
Tokens (from FREQ): 593
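
For reference, the type-token ratios implied by these counts can be worked out with a few lines of Python. This is just arithmetic on the numbers listed above, taking TTR as types divided by tokens; the dictionary labels are only for illustration.

```python
# (types, tokens) pairs copied from the counts reported in this post.
counts = {
    "EVAL": (166, 596),
    "FREQ %mor": (204, 596),
    "FREQ": (189, 593),
}

def ttr(types: int, tokens: int) -> float:
    """Type-token ratio: number of distinct words divided by total words."""
    return types / tokens

ratios = {name: round(ttr(t, n), 3) for name, (t, n) in counts.items()}
for name, value in ratios.items():
    print(f"{name}: TTR = {value}")
```

Because the token counts now match for EVAL and FREQ on %mor, the whole TTR gap between those two comes from the difference in Types.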

Veronica

Leonid Spektor

Mar 14, 2025, 4:37:02 PM
to chib...@googlegroups.com

If the EVAL FREQ_types value (166) differs from the FREQ Types value (204), but EVAL FREQ_tokens and FREQ Tokens are both 596, then I will need a small sample of your data that shows the difference, along with the command lines you used for both EVAL and FREQ, to figure this out. Please send those directly to me at spe...@andrew.cmu.edu.


Leonid.

Leonid Spektor

Mar 14, 2025, 4:48:26 PM
to chib...@googlegroups.com
Hi Veronica,

I forgot that EVAL counts lemmas, while FREQ counts full words.

If you want FREQ to produce the same result as EVAL, use the (+sm;*,o%) option to count only lemmas. The command is:

freq +sm;*,o%
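
As a rough illustration of why counting lemmas lowers the Types count while leaving Tokens unchanged, here is a small Python sketch. It is not CLAN code: the %mor items are invented, and it assumes a simplified item format of part-of-speech, a bar, the stem, and optional suffixes introduced by - or &.

```python
import re

def lemma_of(mor_item: str) -> str:
    """Return the stem of a %mor-style item, dropping the POS tag and any suffixes."""
    stem_and_suffixes = mor_item.split("|", 1)[-1]
    return re.split(r"[-&]", stem_and_suffixes, maxsplit=1)[0]

def count_types(mor_items):
    """Return (tokens, full-form types, lemma types) for a list of %mor-style items."""
    full_form_types = set(mor_items)
    lemma_types = {lemma_of(item) for item in mor_items}
    return len(mor_items), len(full_form_types), len(lemma_types)

# Hypothetical items: two inflected forms each of "dog" and "walk".
items = ["n|dog", "n|dog-PL", "v|walk", "v|walk-PAST", "v|walk-PAST"]
tokens, full_types, lemma_types = count_types(items)
print(tokens, full_types, lemma_types)
```

In this sketch there are 5 tokens either way, but 4 full-form types collapse to 2 lemma types, which mirrors the pattern of 204 versus 166 Types with 596 Tokens in both.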


Leonid.

Veronica Fletcher

Mar 15, 2025, 10:47:35 AM
to chib...@googlegroups.com
Thank you!