calculating subject-verb diversity in CLAN

Risa Stiegler

Oct 9, 2023, 3:19:13 PM

to chibolts

Hello!

I am trying to calculate the number of unique subject-verb combinations (the subject-verb diversity) in a child's speech.

I'm able to use combo to find each instance of a child's utterance that has a subject and a verb (or participle):

combo +t*CHI +d7 +sg|SUBJ^*^m|part+m|v +g6 *.cha

I have 2 questions:
1) How can I exclude utterances that are marked with $RT on the %spa tier? (I want to exclude sentences where the child is directly imitating adult speech.)

2) Is there a way to take the output of this combo command and create a list of just the subject-verb combinations and their frequencies? The combo command outputs the main, mor, and gra tiers, and marks the subject and verb:

*CHI: a baby is swimming .
%mor⇔%gra: det:art|a⇔1|2|DET (1)n|baby⇔2|4|SUBJ aux|be&3s⇔3|4|AUX
(1)part|swim-presp⇔4|0|ROOT .⇔5|4|PUNCT

It would be great if CLAN could go through and pull "baby swim" instead of having a human do it.

I saw in the 2023 CHILDES update that you are working on calculating SVD automatically, so if there is a better way to do it than what I've come up with I would love to hear it!

Thank you so much!
Risa Stiegler

Leonid Spektor

Oct 9, 2023, 4:14:20 PM

to chib...@googlegroups.com

Hi,

At this time there is no way to extract just "baby swim" from the search result. But it would be easy to change COMBO or create a utility that can do that. I will have to talk to the person in charge of that work.
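For illustration only, here is a rough Python sketch of what such a utility might do with COMBO's output. This is not a CLAN feature. The function names (parse_tier, subject_verb_pairs) are made up for this example, and it assumes each %mor⇔%gra tier has been joined into a single string in the format shown in the question. It pairs each SUBJ word with its head when the head is tagged v or part, and it does not try to apply the counting rules from any published SVD procedure.

```python
import re
from collections import Counter

# COMBO prefixes matched words with a marker such as "(1)"; strip it before parsing.
MARKER = re.compile(r"^\(\d+\)")


def parse_tier(tier_text):
    """Map word index -> (pos, lemma, head, relation) for one %mor⇔%gra tier."""
    words = {}
    for token in tier_text.split():
        if "⇔" not in token:
            continue
        mor, gra = token.split("⇔", 1)
        mor = MARKER.sub("", mor)
        parts = gra.split("|")
        # Skip the tier label and punctuation, which have no pos|stem form.
        if "|" not in mor or len(parts) != 3:
            continue
        if not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        pos, rest = mor.split("|", 1)
        lemma = re.split(r"[-&~]", rest)[0]  # drop suffix and fusion marking
        words[int(parts[0])] = (pos, lemma, int(parts[1]), parts[2])
    return words


def subject_verb_pairs(tier_text):
    """Return (subject lemma, verb lemma) pairs where SUBJ points at a v or part."""
    words = parse_tier(tier_text)
    pairs = []
    for pos, lemma, head, relation in words.values():
        if relation != "SUBJ" or head not in words:
            continue
        head_pos, head_lemma = words[head][0], words[head][1]
        if head_pos.split(":")[0] in {"v", "part"}:
            pairs.append((lemma, head_lemma))
    return pairs


# Usage with the tier from the question, joined onto one line.
example = ("%mor⇔%gra: det:art|a⇔1|2|DET (1)n|baby⇔2|4|SUBJ "
           "aux|be&3s⇔3|4|AUX (1)part|swim-presp⇔4|0|ROOT .⇔5|4|PUNCT")
counts = Counter(subject_verb_pairs(example))
for (subject, verb), n in counts.items():
    print(subject, verb, n)  # prints: baby swim 1
```

Feeding every matched tier through subject_verb_pairs and into one Counter would give the frequency list, and the number of keys in that Counter would be the count of unique combinations.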

To exclude the %spa lines that have $RT you would need to run KWAL first and then run COMBO. For example:

kwal +t@ +t% -s$RT +d +f filename.cha

combo +t*CHI +d7 +sg|SUBJ^*^m|part+m|v filename.kwal.cex
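If it is more convenient to do the filtering outside CLAN, the same idea can be sketched in Python. This is only a minimal illustration, not how KWAL works internally. It assumes standard CHAT layout (main tiers start with "*", dependent tiers with "%", headers with "@", continuation lines with a tab), and the function names are hypothetical.

```python
def has_code(block, tier, code):
    """True if the named dependent tier in this utterance block contains the code."""
    in_tier = False
    for line in block:
        if not line.startswith("\t"):  # a non-continuation line starts a new tier
            in_tier = line.startswith(tier)
        if in_tier and code in line:
            return True
    return False


def drop_coded_utterances(lines, tier="%spa:", code="$RT"):
    """Drop each utterance (main tier plus dependent tiers) coded with `code`."""
    kept, block = [], []
    for line in lines:
        if line.startswith(("*", "@")):  # a new utterance or header closes the block
            if not has_code(block, tier, code):
                kept.extend(block)
            block = []
        if line.startswith("@"):
            kept.append(line)  # headers are always kept
        else:
            block.append(line)
    if not has_code(block, tier, code):
        kept.extend(block)
    return kept


# Usage on a tiny made-up transcript.
transcript = [
    "@Begin",
    "*CHI:\ta baby is swimming .",
    "%spa:\t$RT",
    "*CHI:\tdoggie run .",
    "@End",
]
for line in drop_coded_utterances(transcript):
    print(line)
```

The filtered lines could then be written to a new .cha file and passed to COMBO as usual.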

I do not know anything about calculating SVD automatically, so I will leave it to someone else to answer.

Leonid.

Brian MacWhinney

Oct 9, 2023, 5:21:15 PM

to ChiBolts, Risa Stiegler

Dear Risa,
Leonid can write some code to do this. However, I think it would be useful to make sure that people are tracking subject-verb diversity in a well-defined and consistent manner. People in Pam Hadley’s lab at Illinois considered this quite important based on work Hadley, McKenna, and Rispoli reported in 2018 in AJSLP. Is this what you are trying to compute? We have already worked a bit with Pam’s lab on this and it would be best to make sure that we implement ways of doing this consistently.

— Brian MacWhinney
Teresa Heinz Professor of Cognitive Psychology,
Language Technologies and Modern Languages, CMU

Risa Stiegler

Oct 10, 2023, 12:27:16 PM

to chibolts

Hi all,

Yes, we are calculating Subject-Verb Diversity specifically based on the methods of Hadley, McKenna, & Rispoli (2018), using their supplemental materials as our guide! So a tool that you have created with Dr. Hadley's lab would be perfect for us as well.

Thanks!
Risa
