MLU giving odd results

Gordon, Peter

Oct 9, 2024, 5:47:28 PM
to chib...@googlegroups.com, Peter Gordon
I just taught a class where students do a simple MLU analysis to get used to CHILDES. As I was doing it in class, I noticed that the MLUs for Adam did not look right. His MLU for the first sample was 4.176, despite his having mostly single-word utterances. Any thoughts?

Peter

mlu +tchi childes/Eng-NA/Brown/Adam/*.cha
Wed Oct  9 17:38:29 2024
mlu (29-Oct-2020) is conducting analyses on:
  ONLY dependent tiers matching: %MOR;
****************************************




_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020304.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 1239, morphemes = 5174
	Ratio of morphemes over utterances = 4.176
	Standard deviation = 2.946





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020318.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 1272, morphemes = 5062
	Ratio of morphemes over utterances = 3.980
	Standard deviation = 2.767





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020403.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 830, morphemes = 3964
	Ratio of morphemes over utterances = 4.776
	Standard deviation = 3.062





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020415.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 774, morphemes = 2870
	Ratio of morphemes over utterances = 3.708
	Standard deviation = 2.546





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020430.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 837, morphemes = 3679
	Ratio of morphemes over utterances = 4.395
	Standard deviation = 3.146





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020512.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 810, morphemes = 3392
	Ratio of morphemes over utterances = 4.188
	Standard deviation = 3.216





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020603.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 849, morphemes = 4548
	Ratio of morphemes over utterances = 5.357
	Standard deviation = 4.005





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020617.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 635, morphemes = 4197
	Ratio of morphemes over utterances = 6.609
	Standard deviation = 4.429





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020701.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 853, morphemes = 4596
	Ratio of morphemes over utterances = 5.388
	Standard deviation = 3.860





_________________________________________________________________
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
From file <childes/Eng-NA/Brown/Adam/020714.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
	Number of: utterances = 912, morphemes = 5096
	Ratio of morphemes over utterances = 5.588
	Standard deviation = 4.284
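
As a quick sanity check on the output above, the reported ratios can be recomputed from the utterance and morpheme counts. The sketch below is an editorial illustration in Python, not part of CLAN; the dictionary name `reported` is made up for this example, and the numbers are copied from the pasted output. If the ratios agree, the division itself is fine and any inflation must come from the morpheme counts.

```python
# (utterances, morphemes, ratio printed by mlu), copied from the output above.
# The name `reported` is hypothetical and used only for this illustration.
reported = {
    "020304": (1239, 5174, 4.176),
    "020318": (1272, 5062, 3.980),
    "020403": (830, 3964, 4.776),
    "020415": (774, 2870, 3.708),
    "020430": (837, 3679, 4.395),
    "020512": (810, 3392, 4.188),
    "020603": (849, 4548, 5.357),
    "020617": (635, 4197, 6.609),
    "020701": (853, 4596, 5.388),
    "020714": (912, 5096, 5.588),
}

def recompute(utterances, morphemes):
    """MLU as printed by mlu: total morphemes divided by total utterances."""
    return morphemes / utterances

for name, (utts, morphs, printed) in reported.items():
    ratio = recompute(utts, morphs)
    # Flag any file whose recomputed ratio differs from the printed one.
    status = "ok" if abs(ratio - printed) < 0.001 else "MISMATCH"
    print(f"{name}: {ratio:.3f} (printed {printed:.3f}) {status}")
```

Every ratio matches the printed value to three decimals, so the question is why each utterance is credited with roughly four or more morphemes.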



--

Peter Gordon
Pronouns: He/His/Him
Associate Professor
Biobehavioral Sciences and Human Development
Teachers College, Columbia University
525 West 120th Street, Box 306
New York, NY 10027
Email  pgo...@tc.edu |  p: (212) 678-8162


Nan Bernstein Ratner

Oct 9, 2024, 6:01:03 PM
to chib...@googlegroups.com, Peter Gordon
I could be wrong but it looks like your command put in ALL of Adam's files? *.cha?

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/CAJE3P%2B_TiyPA_-oQBAMppkUF6mP6OBpSZUjK3W_KoJDK5BcK7g%40mail.gmail.com.

Gordon, Peter

Oct 9, 2024, 6:15:52 PM
to Nan Bernstein Ratner, chib...@googlegroups.com
Yes, it does them sequentially. I've always done it that way.

Sarah Surrain

Oct 9, 2024, 6:18:14 PM
to chib...@googlegroups.com, Nan Bernstein Ratner, chib...@googlegroups.com

I tried the MLU command with the same file (020304) and replicated the same result as Peter.

Could it have to do with how the MLU command handles utterances with repeated words?

For example, these are some of the longest utterances in this transcript:

*CHI:    bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer .

*CHI:    put dirt up (.) put dirt up (.) put dirt up .

*CHI:    look look look look .
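
One rough way to probe the repeated-word idea is to count the word tokens in these utterances with and without immediate repetitions. The Python sketch below is only an illustration under stated assumptions: it is not how CLAN's mlu works internally, the helper names `words` and `collapse_repeats` are invented here, and it treats anything that is not purely alphabetic (such as the pause marker `(.)` or the final `.`) as a non-word.

```python
import re

# Main-tier text of the three utterances quoted above (speaker code removed).
utterances = [
    "bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer "
    "bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer bulldozer .",
    "put dirt up (.) put dirt up (.) put dirt up .",
    "look look look look .",
]

def words(utterance):
    """Keep alphabetic tokens only; pause markers and the terminator are dropped."""
    return [t for t in utterance.split() if re.fullmatch(r"[A-Za-z']+", t)]

def collapse_repeats(tokens):
    """Drop a token when it is identical to the token immediately before it."""
    return [t for i, t in enumerate(tokens) if i == 0 or t != tokens[i - 1]]

for utt in utterances:
    toks = words(utt)
    print(f"{len(toks):2d} words, {len(collapse_repeats(toks)):2d} after collapsing: {toks[0]} ...")
```

Note that this simple collapse only removes back-to-back copies of a single word, so the repeated phrase "put dirt up" is left at its full length. A handful of long repetitive utterances like these can raise an average, but whether they account for a mean above 4 depends on how common they are in the transcript.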

 

Sarah Surrain, Ph.D.
Postdoctoral Research Fellow (she/her/ella)

Children’s Learning Institute
McGovern Medical School at UTHealth
7000 Fannin St | 2460A | Houston, TX 77030 
713-500-3826
www.childrenslearninginstitute.org 
https://sarahsurrain.com/


Brian Macwhinney

Oct 9, 2024, 8:45:45 PM
to ChiBolts
Peter,
The shift to tagging with UD radically alters the meaning of MLU, because UD outputs all the grammatical features inherent in a stem. This is really important for crosslinguistic analysis, but it is certainly a change. Take a look at the shape of the %mor line in those files.
However, I am glad you called my attention to this, because a few of those features should not be getting into the output and I need to fix this.

If you want to stick with the 1973-2023 version of MLU, you could either rely on MLU in words, which is actually pretty close, or work with the older tagging of the corpora, which you can get from https://childes.talkbank.org/access/Eng-NA/ (click on the link in the second line).

—Brian

Brian Macwhinney

Oct 9, 2024, 10:14:42 PM
to ChiBolts
Peter,
Leonid just now reminded me that I was about to work on this by creating a filter that would disregard all of the features that UD is creating except for those that align with Brown 1973. I’ll work on this tomorrow. Sorry about the hassle.
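
To make the idea of such a filter concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the token shape (`pos|stem-Feature-Feature`), the feature labels, and the allowlist `BROWN_LIKE` are invented for this example and are not the actual %mor format or the filter that CLAN will use. It only shows how counting every feature as a morpheme inflates MLU, and how restricting the count to an allowlist brings it back down.

```python
# Hypothetical allowlist of feature labels standing in for the inflections
# that a Brown (1973) style count would keep. These labels are assumptions.
BROWN_LIKE = {"Plur", "Past", "Prog", "3S", "Poss"}

def count_morphemes(token, allowed=None):
    """Count 1 for the stem, plus 1 per feature (or per allowlisted feature).

    Assumes a hypothetical 'pos|stem-Feat1-Feat2' token with no hyphens
    inside the stem itself.
    """
    _, _, analysis = token.partition("|")
    stem, *features = analysis.split("-")
    if allowed is not None:
        features = [f for f in features if f in allowed]
    return 1 + len(features)

def mlu(utterances, allowed=None):
    """Total morphemes divided by the number of utterances."""
    total = sum(count_morphemes(tok, allowed) for utt in utterances for tok in utt)
    return total / len(utterances)

# Four invented utterances, three of them a single word long.
sample = [
    ["noun|bulldozer-Sing"],
    ["verb|look-Fin-Imp"],
    ["noun|truck-Plur"],
    ["verb|put-Fin-Imp", "noun|dirt-Sing", "adv|up"],
]

print(mlu(sample))              # 3.25 when every feature counts as a morpheme
print(mlu(sample, BROWN_LIKE))  # 1.75 when only allowlisted features count
```

With these made-up tokens, mostly single-word utterances still average above three morphemes when all features are counted, which is the same kind of inflation seen in the Adam output.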

—Brian MacWhinney
