number of tokens for VOCD

Gordana Hrzica
Jul 15, 2015, 9:35:04 AM
to chib...@googlegroups.com
Dear all,

I would like to have a measure of vocabulary diversity for a number of transcripts of children's narratives. VOCD seems like the most reliable choice, and I would really like to use it. However, the narratives are rather short: most of them are between 80 and 170 tokens, but some have fewer than 50. I know that the default minimum for calculating VOCD is 50 tokens, but that it can also be set lower. I also believe that using the replacements option would give me some results. However, I do not know how reliable such results would be, or which of these two methods would give me the more appropriate measure.

I am sorry if I'm posting this to the wrong group. Perhaps it is more of a methodological than a technical question. I feel it is somewhere between the two worlds :). If it would be more appropriate for info-childes, I will post it there.

I would really appreciate your help on this one.

Brian MacWhinney
Jul 15, 2015, 12:30:34 PM
to chib...@googlegroups.com
Dear Gordana,

    You might want to double check the book by Malvern et al. on VOCD, but I believe that the minimum value is set at 50 because results are unstable for smaller files.  
    We plan to implement another measure of lexical diversity called MATTR (moving average TTR) that may be a bit better for your purposes.  Once it is ready, I will post a note to ChiBolts about this.
    One general point.  If all you care about is making within-group comparisons, then using small transcripts is not a huge problem.  However, if you were to compare lexical diversity numbers from small transcripts of the type you are describing with those from large transcripts, then even VOCD would run into problems.

—Brian MacWhinney
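
For readers unfamiliar with MATTR, the idea behind a moving-average TTR can be sketched as follows. This is only an illustration and not the CLAN implementation, which had not been released at the time of this thread. The function name `mattr`, the default window of 50 tokens, and the fallback to plain TTR for texts shorter than the window are all assumptions made for this example.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio (illustrative sketch).

    Slides a fixed-size window over the token list one token at a
    time, computes the type-token ratio (TTR) inside each window,
    and returns the mean of those ratios.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    n = len(tokens)
    if n == 0:
        return 0.0
    if n < window:
        # Assumption for this sketch: fall back to plain TTR when the
        # text is shorter than the window.
        return len(set(tokens)) / n
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(n - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    narrative = "the dog saw the cat and the cat saw the dog".split()
    print(mattr(narrative, window=5))
```

Because every window has the same size, the result depends less on total text length than raw TTR does. The window still has to fit inside the shortest transcript for values to be comparable across texts.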

Gordana Hrzica
Jul 15, 2015, 1:24:56 PM
to chib...@googlegroups.com
Dear Brian,

thank you very much for your answer. I will try to get hold of the book as soon as possible. And yes, I do need the measure for within-group comparisons. I am also looking forward to MATTR.

Also, I am sorry if this is explained somewhere else, but from the CLAN manual I did not understand exactly what the replacements are. I do understand (hopefully :) how the measure works, but not this part. I would really appreciate an explanation or perhaps a reference.

Best,
Gordana
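
One possible reading of the question above is sketched here, on the assumption that "replacements" refers to drawing the random token samples with replacement. That reading is not confirmed by the thread or taken from the CLAN manual. VOCD-style measures average the TTR over many random samples of a fixed size. If tokens are drawn without replacement, a sample cannot be larger than the transcript, which would explain a minimum token count. If they are drawn with replacement, any sample size is possible, but the same token can be picked more than once. The helper name `mean_ttr` and its parameters are hypothetical.

```python
import random


def mean_ttr(tokens, sample_size, trials=100, replacement=False, seed=0):
    """Average TTR over random samples of `sample_size` tokens.

    Illustrative only: this mimics the sampling step of a VOCD-style
    procedure and does not fit the D parameter.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if replacement:
            # Each draw is independent, so sample_size may exceed
            # len(tokens) and a token can be drawn repeatedly.
            sample = rng.choices(tokens, k=sample_size)
        else:
            # Raises ValueError if sample_size > len(tokens).
            sample = rng.sample(tokens, sample_size)
        total += len(set(sample)) / sample_size
    return total / trials


if __name__ == "__main__":
    short_text = [f"w{i % 25}" for i in range(40)]  # 40 tokens, 25 types
    print(mean_ttr(short_text, 50, replacement=True))
    try:
        mean_ttr(short_text, 50, replacement=False)
    except ValueError as err:
        print("without replacement:", err)
```

In this sketch, sampling with replacement always returns a value for a short transcript, but repeated draws of the same token tend to lower the sampled TTR, so values obtained this way may not be comparable with values obtained without replacement.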