error running MLU in Italian

Kelly P

Nov 21, 2021, 4:57:28 PM
to chibolts
Hi Everyone!

Thanks for adding me to your group.  I am working on a series of classroom recordings that have both Italian and English.  I am totally new to this software and I tried to look through the manuals and old chats but I can't figure out the answer.

Is there a way that I can get the system to analyze Italian only?  Right now the freq and MLU analyses are picking up both languages but we are only interested in Italian.

Thank you so much for any help you can give!
Be well,

Kelly Paciaroni

Brian Macwhinney

Nov 21, 2021, 7:53:57 PM
to ChiBolts, Kelly P
Dear Kelly,
If you want to analyze only one language in a code-switched corpus, you need to mark up your transcript according to the principles in section 16.1 of the CHAT manual, “Code-switching”. Once you have done this, you can use the +s switch in programs like FREQ or KWAL to pull out utterances that begin with codes like [- ita] or [- eng], or perhaps words with markings such as @s.
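
As a rough illustration of what selecting utterances by language precode means, here is a minimal Python sketch. It is not how CLAN implements +s. The helper name utterances_in and the *TEA line are made up for illustration, and it assumes that an utterance with no precode is in the first language listed on the @Languages line.

```python
import re

# Matches a CHAT main tier: "*SPK:", whitespace, then an optional
# language precode such as "[- eng]" at the start of the utterance.
MAIN_TIER = re.compile(
    r"^\*(?P<spk>[A-Z0-9]+):\s+(?:\[- (?P<lang>[a-z]{3})\]\s+)?(?P<text>.*)$"
)

def utterances_in(lines, lang, default_lang):
    """Return (speaker, text) pairs for utterances in `lang`.

    Assumption: an utterance without a precode is in `default_lang`,
    i.e. the first language on the @Languages line.
    """
    selected = []
    for line in lines:
        m = MAIN_TIER.match(line)
        if m is None:
            continue  # headers and dependent tiers are skipped
        if (m.group("lang") or default_lang) == lang:
            selected.append((m.group("spk"), m.group("text")))
    return selected

# Hypothetical sample; only the *CLE line comes from this thread.
sample = [
    "@Languages:\tita, eng",
    "*CLE:\t[- eng] how do you say sad ?",
    "*TEA:\tsi dice triste .",
]
italian_only = utterances_in(sample, "ita", default_lang="ita")
```

With this sample, the *CLE utterance is left out of italian_only because it carries the [- eng] precode.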

— Brian MacWhinney

Kelly Paciaroni

Nov 24, 2021, 5:02:16 PM
to Brian Macwhinney, ChiBolts
Hi Brian,

Thank you so much for the suggestions.  I read that section of the manual but now when I run a check on the transcript, the system says that the symbol is not declared in the depfile for lines like this one:

*CLE: [-eng] how do you say sad ?

I listed the languages at the beginning of the document in this way:

@Languages: ita, eng

Do you know what I am doing incorrectly?
Thank you and Happy Thanksgiving!

Kelly

Leonid Spektor

Nov 24, 2021, 6:47:34 PM
to chib...@googlegroups.com
Kelly,

All language precodes must have a space character between the '-' character and the language name. In your example it would be [- eng]. The tier would then be:

*CLE: [- eng] how do you say sad ?

Also, all postcodes must have a space character after the '+' character, like so: [+ code]. In fact, most codes in between [...] must have a space character after the first code-identifying character(s).
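
To spot this kind of problem before running CHECK, something like the following Python sketch could help. It is only an illustration, not part of CLAN. The function names are hypothetical, and it only covers codes that start with '-' or '+', not the other bracketed codes.

```python
import re

# A bracketed code whose first character is '-' or '+' followed
# immediately by a non-space character, e.g. "[-eng]" or "[+code]".
MISSING_SPACE = re.compile(r"\[([-+])(?=[^\s\]])([^\]]*)\]")

def find_unspaced_codes(tier):
    """Return the bracketed codes in `tier` that lack a space after '-' or '+'."""
    return [m.group(0) for m in MISSING_SPACE.finditer(tier)]

def add_missing_space(tier):
    """Insert the missing space, turning "[-eng]" into "[- eng]"."""
    return MISSING_SPACE.sub(lambda m: f"[{m.group(1)} {m.group(2)}]", tier)

bad_tier = "*CLE: [-eng] how do you say sad ?"
flagged = find_unspaced_codes(bad_tier)
fixed_tier = add_missing_space(bad_tier)
```

Here flagged holds the offending code and fixed_tier is the tier with the space restored. Review any automatic change by hand before saving the transcript.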



Leonid.

Kelly P

Nov 26, 2021, 10:48:56 AM
to chibolts
Hi Leonid,

Thank you so much! That worked. I ran the freq and everything in English is still appearing as @eng, though it isn't being counted in the "total number of different item types used." Is there any way to keep English from appearing in the analyses at all?

Thank you again and be well,
Kelly

Leonid Spektor

Nov 26, 2021, 11:27:47 AM
to ChiBolts
Hi Kelly,

I assume that you meant that "@s:eng" words are still appearing in the output. The "@eng" form is not legal. Try adding the "-s*@s:eng" option to your command line to stop "@s:eng" words from being counted in the analyses. If this does not work, then I need an example of your transcript file to see exactly how you code language attributes. Please email it to me directly at spe...@andrew.cmu.edu.
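
To show the idea behind a wildcard exclusion such as *@s:eng, here is a small Python sketch. It uses Python's fnmatch as a stand-in, so CLAN's own wildcard rules may differ in detail. The word list and the function name drop_matching are made up for illustration.

```python
from fnmatch import fnmatchcase

def drop_matching(words, pattern):
    """Return the words that do NOT match the shell-style wildcard `pattern`."""
    return [w for w in words if not fnmatchcase(w, pattern)]

# Hypothetical tokens from a mixed Italian/English utterance.
words = ["come", "si", "dice", "sad@s:eng", "?"]
kept = drop_matching(words, "*@s:eng")
```

Any word ending in @s:eng is dropped, and the remaining words are what a frequency count would see.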


Leonid.
